Re: new objtool warnings again...

2016-09-23 Thread Linus Torvalds
On Fri, Sep 23, 2016 at 2:16 PM, Josh Poimboeuf  wrote:
>
> I just started seeing this problem today.  I suspect it's a ccache
> issue, since it only showed up after ccache was updated.

Ahh, I didn't even notice that ccache got updated, but yeah, that makes sense.

Linus


[PATCH] [media] VPU: mediatek: Fix return value in case of error

2016-09-23 Thread Christophe JAILLET
If 'dma_alloc_coherent()' returns NULL, 'vpu_alloc_ext_mem()' will
return 0 which means success.
Return -ENOMEM instead.

Signed-off-by: Christophe JAILLET 
---
 drivers/media/platform/mtk-vpu/mtk_vpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu.c b/drivers/media/platform/mtk-vpu/mtk_vpu.c
index c9bf58c97878..3edb5ed852e6 100644
--- a/drivers/media/platform/mtk-vpu/mtk_vpu.c
+++ b/drivers/media/platform/mtk-vpu/mtk_vpu.c
@@ -674,7 +674,7 @@ static int vpu_alloc_ext_mem(struct mtk_vpu *vpu, u32 fw_type)
 					       GFP_KERNEL);
 	if (!vpu->extmem[fw_type].va) {
 		dev_err(dev, "Failed to allocate the extended program memory\n");
-		return PTR_ERR(vpu->extmem[fw_type].va);
+		return -ENOMEM;
 	}
 
 	/* Disable extend0. Enable extend1 */
-- 
2.7.4
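
The reason the old code reported success can be sketched in userspace. This is only an illustration: ptr_err() is a simplified stand-in for the kernel's PTR_ERR(), and alloc_that_fails() and the two alloc_ext_mem_*() helpers are hypothetical names, not code from the driver.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's PTR_ERR(): the pointer value is
 * reinterpreted as an errno-style integer. */
static long ptr_err(const void *ptr)
{
    return (long)ptr;
}

/* Hypothetical allocator that signals failure with NULL, the way
 * dma_alloc_coherent() does. */
static void *alloc_that_fails(void)
{
    return NULL;
}

/* Pre-patch pattern: a NULL pointer converts to 0, so the failure is
 * reported as success. */
static int alloc_ext_mem_buggy(void)
{
    void *va = alloc_that_fails();

    if (!va)
        return (int)ptr_err(va);
    return 0;
}

/* Post-patch pattern: a NULL result is mapped to an explicit errno. */
static int alloc_ext_mem_fixed(void)
{
    void *va = alloc_that_fails();

    if (!va)
        return -ENOMEM;
    return 0;
}
```

With a failing allocator, alloc_ext_mem_buggy() returns 0 while alloc_ext_mem_fixed() returns -ENOMEM, which is the difference the patch is after.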



Re: new objtool warnings again...

2016-09-23 Thread Linus Torvalds
On Fri, Sep 23, 2016 at 2:14 PM, Josh Poimboeuf  wrote:
>
> Is this with your latest pushed master branch?  I have F24, but I don't
> see the warning.

It is with the latest branch, but I was wrong - it doesn't show up for
"allmodconfig", it only shows up for my fairly-minimal config on this
laptop.

> In any case, I'll come up with a patch for you to test.

I can send you my odd config in case you want it, but I'm assuming you
just went "ahh, I know what's up" and don't even need it.

Linus


Re: new objtool warnings again...

2016-09-23 Thread Josh Poimboeuf
On Fri, Sep 23, 2016 at 02:06:03PM -0700, Linus Torvalds wrote:
> On Fri, Sep 23, 2016 at 1:33 PM, Linus Torvalds
>  wrote:
> >
> > So this code is clearly missing the magic to tell gcc that the asm
> > needs a frame pointer.
> 
> Independently of that, the objtool build seems racy or somehow
> fragile. I've now twice gotten into a situation where I end up getting
> 
>   cat: /home/torvalds/v2.6/linux/tools/objtool/.fixdep.o.d: No such file or directory
>   make[4]: *** [/home/torvalds/v2.6/linux/tools/objtool/fixdep.o] Error 1
>   make[3]: *** [/home/torvalds/v2.6/linux/tools/objtool/fixdep-in.o] Error 2
>   make[2]: *** [fixdep] Error 2
>   make[1]: *** [objtool] Error 2
>   make: *** [tools/objtool] Error 2
> 
> with just the right timings, and then ccache ends up remembering that
> as a build failure and causing that to be "sticky" even across "git
> clean -dqfx" builds (and the "ccache -C" clears it).
> 
> Adding Michal to the cc, in case he can see what the problem is.

I just started seeing this problem today.  I suspect it's a ccache
issue, since it only showed up after ccache was updated.  I "fixed" it
with:

  yum downgrade ccache

I'll open a bug...

-- 
Josh


Re: [PATCH 1/1] mm/percpu.c: correct max_distance calculation for pcpu_embed_first_chunk()

2016-09-23 Thread zijun_hu
On 2016/9/24 3:23, Tejun Heo wrote:
> On Sat, Sep 24, 2016 at 02:20:24AM +0800, zijun_hu wrote:
>> From: zijun_hu 
>>
>> correct max_distance from (base of the highest group + ai->unit_size)
>> to (base of the highest group + the group size)
>>
>> Signed-off-by: zijun_hu 
> 
> Nacked-by: Tejun Heo 
> 
> Thanks.
>
Frankly, the current max_distance is wrong: it doesn't represent the range
spanned by the areas owned by the groups.
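
The two calculations being argued over can be compared with a small sketch. It assumes that a group's size is nr_units * unit_size, which the thread does not spell out, and struct group plus both helper functions are hypothetical names used only for illustration. It shows how the numbers differ and does not settle which one the allocator should use.

```c
#include <stddef.h>

/* Hypothetical, simplified description of one percpu group. */
struct group {
    size_t base_offset; /* offset of the group's area from the lowest base */
    size_t nr_units;    /* number of units in the group */
};

/* Distance computed as (base of the highest group + unit_size). */
static size_t span_with_unit_size(const struct group *g, size_t n,
                                  size_t unit_size)
{
    size_t max = 0;

    for (size_t i = 0; i < n; i++)
        if (g[i].base_offset > max)
            max = g[i].base_offset;
    return max + unit_size;
}

/* Distance computed as (base of the highest group + that group's size),
 * with the group size assumed to be nr_units * unit_size. */
static size_t span_with_group_size(const struct group *g, size_t n,
                                   size_t unit_size)
{
    size_t max = 0, size = 0;

    for (size_t i = 0; i < n; i++) {
        if (g[i].base_offset >= max) {
            max = g[i].base_offset;
            size = g[i].nr_units * unit_size;
        }
    }
    return max + size;
}
```

For two groups of four units each, at offsets 0 and 0x100000 with a unit size of 0x8000, the first helper gives 0x108000 and the second gives 0x120000.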




Re: [PATCH] softirq: let ksoftirqd do its job

2016-09-23 Thread Peter Zijlstra
On Fri, Sep 23, 2016 at 06:51:04PM +0200, Jesper Dangaard Brouer wrote:

> This is your git tree, right:
>  https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git/
> 
> Doesn't look like you pushed it yet, or do I need to look at a specific
> branch?

I mainly work from a local quilt queue which I feed to mingo. I
occasionally push out to get build-bot coverage or have people look at
bits I poked together.

That said, I'll try and do a push later tonight.

Do note however, that git tree is a complete wipe and rebuild, don't
expect any kind of continuity from it.


Re: [PATCH] docs-rst: add inter-document cross references

2016-09-23 Thread Mauro Carvalho Chehab
Hi Jon,

Em Wed, 21 Sep 2016 15:44:05 -0600
Jonathan Corbet  escreveu:

> ...and now I'm thinking that's maybe about enough in docs for 4.9...:)

I finished handling the plain text files that, IMHO, should be in
either the user or the development-process book.

As you feel there is enough material for 4.9, I'll
postpone their submission until early in the 4.10-rc cycle.

Anyway, if you want to take a sneak peek, the patches are in this tree:

https://git.linuxtv.org//mchehab/experimental.git/log/?h=lkml-books

and the html books are at:
https://mchehab.fedorapeople.org/user/
https://mchehab.fedorapeople.org/development-process/

ePub at:
https://mchehab.fedorapeople.org/user/epub/
https://mchehab.fedorapeople.org/development-process/epub/

PDF and LaTeX at:
https://mchehab.fedorapeople.org/user/latex/
https://mchehab.fedorapeople.org/development-process/latex/

There will probably be issues with the PDF, as Sphinx usually requires manual
work to fix problems with PDF output and, in several cases, raw LaTeX
commands inside the rst files. I also had to patch a LaTeX config locally
to avoid an out-of-memory error when building the user's book.

The last patch in this tree is the RFC patch that adds the MAINTAINERS file
to the user's book.

In total, 42 files were converted into one of the two books,
out of a total of 151 files in Documentation/, plus 2 files in /.

There, I opted to use symlinks instead of moving files. There is an
issue with that, though: it is harder to identify which files are
part of the Sphinx build and which aren't. OK, we could
write some sort of script to identify the undocumented files, but
that is way more complex than doing a
$ find . -maxdepth 1 -type f

(or doing an ls there and looking at the files inside it)

So, IMHO, we should be moving the files instead of symlinking them.
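
As a rough illustration of why symlinks get in the way of that check, here is a minimal sketch of what `find . -maxdepth 1 -type f` tests for each entry. The function name count_regular_files() is hypothetical and the 4096-byte path buffer is an arbitrary assumption.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Count the entries of `dir` that are regular files, without following
 * symlinks. Returns -1 if the directory cannot be opened. */
static int count_regular_files(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int count = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        char path[4096];
        struct stat st;

        snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
        /* lstat() describes the link itself, so a symlink to a regular
         * file is not counted as a regular file. */
        if (lstat(path, &st) == 0 && S_ISREG(st.st_mode))
            count++;
    }
    closedir(d);
    return count;
}
```

A directory holding one plain file and one symlink to it counts as 1, so a symlinked document looks different from a moved one to this kind of listing.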


Thanks,
Mauro

--

The following changes since commit 17e9217d41e18293c82772b4da544f25e62c342e:

  Merge branch 'doc/4.9' into docs-next (2016-09-21 15:55:06 -0600)

are available in the git repository at:

  git://linuxtv.org/mchehab/experimental.git lkml-books

for you to fetch changes up to c8b07684c0278d7f9d0e30f575eb4be3a2da4c3b:

  docs-rst: user: add MAINTAINERS (2016-09-23 17:39:01 -0300)


Mauro Carvalho Chehab (33):
  Documentation/applying-patches.txt: fix a bad external link
  REPORTING-BUGS: convert to ReST markup
  README: convert it to ReST markup
  Documentation/kernel-parameters.txt: convert to ReST markup
  docs-rst: add documents to development-process
  docs-rst: create an user's manual book
  Documentation/adding-syscalls.txt: convert it to ReST markup
  Documentation/bad_memory.txt: convert it to ReST markup
  Documentation/basic_profiling.rst: convert to ReST markup
  Documentation/binfmt_misc.txt: convert it to ReST markup
  Documentation/serial-console.txt: convert it to ReST markup
  Documentation/braille-console: convert it to ReST markup
  Documentation/BUG-HUNTING: convert to ReST markup
  Documentation/CodeOfConflict: add it to the development-process book
  Documentation/devices.rst: convert it to ReST markup
  Documentation/dynamic-debug-howto.txt: convert it to ReST markup
  Documentation/initrd.txt: convert to ReST markup
  Documentation/init.txt: convert to ReST markup
  Documentation/magic-number.txt: convert it to ReST markup
  Documentation/md.txt: Convert to ReST markup
  Documentation/module-signing.txt: convert to ReST markup
  Documentation/mono.txt: convert to ReST markup
  Documentation/java.txt: convert to ReST markup
  Documentation/oops-tracing.txt: convert to ReST markup
  Documentation/parport.txt: convert to ReST markup
  Documentation/ramoops.txt: convert it to ReST format
  Documentation/sysfs-rules.txt: convert it to ReST markup
  Documentation/sysrq.txt: convert to ReST markup
  Documentation/unicode.txt: convert it to ReST markup
  Documentation/VGA-softcursor.txt: convert to ReST markup
  Documentation/volatile-considered-harmful.txt: convert to ReST markup
  Documentation/parport.txt: fix table to show on LaTeX
  docs-rst: user: add MAINTAINERS

 Documentation/BUG-HUNTING  |  164 +--
 Documentation/CodeOfConflict   |1 +
 Documentation/SecurityBugs |   12 +-
 Documentation/VGA-softcursor.txt   |   73 +-
 Documentation/adding-syscalls.txt  |  269 ++---
 Documentation/applying-patches.txt |2 +-
 Documentation/bad_memory.txt   |   26 +-
 Documentation/basic_profiling.txt  |   59 +-
 Documentation/binfmt_misc.txt  |  134 ++-
 Documentation/braille-console.txt  |   30 +-
 Documentation/conf.py 

Re: new objtool warnings again...

2016-09-23 Thread Josh Poimboeuf
On Fri, Sep 23, 2016 at 01:33:45PM -0700, Linus Torvalds wrote:
> Josh,
> 
>  the current F24 toolchain causes
> 
> kernel/signal.o: warning: objtool: .altinstr_replacement+0x54: call without frame pointer save/setup
> 
> during a regular allmodconfig build.
> 
> Doing an objdump says:
> 
> ...
>   54:   e8 00 00 00 00  callq  59 <.altinstr_replacement+0x59>
> 55: R_X86_64_PC32   copy_user_generic_string-0x4
>   59:   e8 00 00 00 00  callq  5e <.altinstr_replacement+0x5e>
> 5a: R_X86_64_PC32   copy_user_enhanced_fast_string-0x4
> ...
> 
> so it seems to come from the alternative_call_2() in copy_user_generic().
> 
> It's somewhere in copy_siginfo_to_user(), so I assume it's just the
> 
> if (from->si_code < 0)
> return __copy_to_user(to, from, sizeof(siginfo_t))
> ? -EFAULT : 0;
> 
> case.  Looking at the code generation, it looks like the frame pointer
> generation in that function has been moved down past this code, so the
> objtool warning seems to be correct, but this indicates that gcc has
> decided that we don't need a frame for that alternative_call_2()
> thing.
> 
> So this code is clearly missing the magic to tell gcc that the asm
> needs a frame pointer.
> 
> What was that magic again? Mind sending a patch?

Is this with your latest pushed master branch?  I have F24, but I don't
see the warning.

In any case, I'll come up with a patch for you to test.

-- 
Josh


Re: net/sunrpc/stats.c:204: undefined reference to `_GLOBAL_OFFSET_TABLE_'

2016-09-23 Thread Nicolas Pitre
On Thu, 22 Sep 2016, kbuild test robot wrote:

> Hi Nicolas,
> 
> FYI, the error/warning still remains.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   7d1e042314619115153a0f6f06e4552c09a50e13
> commit: 461a5e51060c93f5844113f4be9dba513cc92830 do_div(): generic optimization for constant divisor on 32-bit machines
> date:   10 months ago
> config: microblaze-mmu_defconfig (attached as .config)
> compiler: microblaze-linux-gcc (GCC) 6.2.0
> reproduce:
> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 461a5e51060c93f5844113f4be9dba513cc92830
> # save the attached .config to linux build tree
> make.cross ARCH=microblaze 
> 
> All errors (new ones prefixed by >>):
> 
>net/built-in.o: In function `rpc_print_iostats':
> >> net/sunrpc/stats.c:204: undefined reference to `_GLOBAL_OFFSET_TABLE_'
>scripts/link-vmlinux.sh: line 52:  5714 Segmentation fault  ${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} -T ${lds} ${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group ${1}

The problem must be at your end, especially if the toolchain exhibits 
segmentation faults.

Following exactly the instructions above, I get:

[...]
  LD  vmlinux
  SORTEX  vmlinux
  SYSMAP  System.map
  OBJCOPY arch/microblaze/boot/linux.bin
  Building modules, stage 2.
  MODPOST 3 modules
  CC  crypto/drbg.mod.o
  CC  crypto/echainiv.mod.o
  CC  crypto/jitterentropy_rng.mod.o
  LD [M]  crypto/jitterentropy_rng.ko
  LD [M]  crypto/drbg.ko
  LD [M]  crypto/echainiv.ko
Kernel: arch/microblaze/boot/linux.bin is ready  (#1)


Nicolas


[PATCH] f2fs: remove dirty inode pages in error path

2016-09-23 Thread Jaegeuk Kim
When getting EIO while handling orphan inodes, we can get some dirty node
pages. Then, f2fs_write_node_pages() called by iput(node_inode) will try
to flush node pages. But in this case, we should prevent that, since
we will try again from the start.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index e7bb153..fbded38 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1892,6 +1892,7 @@ free_root_inode:
 	dput(sb->s_root);
 	sb->s_root = NULL;
 free_node_inode:
+	truncate_inode_pages_final(NODE_MAPPING(sbi));
 	mutex_lock(&sbi->umount_mutex);
 	release_ino_entry(sbi, true);
 	f2fs_leave_shrinker(sbi);
-- 
2.8.3
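
The ordering argument in the commit message can be modelled with a toy cache. Everything here is a hypothetical simplification: struct node_cache, discard_pages(), put_cache() and fail_mount() only mimic the roles the message gives to the node mapping, truncate_inode_pages_final() and iput(), and none of this is f2fs code.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for an inode's page cache. */
struct node_cache {
    int dirty_pages; /* pages waiting to be written back */
    int flushed;     /* pages written out when the cache was released */
};

/* Plays the role of truncate_inode_pages_final(): drop cached pages
 * without writing them. */
static void discard_pages(struct node_cache *c)
{
    c->dirty_pages = 0;
}

/* Plays the role of the final iput(): anything still dirty is written. */
static void put_cache(struct node_cache *c)
{
    c->flushed += c->dirty_pages;
    c->dirty_pages = 0;
}

/* Error path of a failed mount attempt; returns how many pages were
 * flushed on the way out. */
static int fail_mount(struct node_cache *c, bool discard_first)
{
    if (discard_first)
        discard_pages(c);
    put_cache(c);
    return c->flushed;
}
```

Starting with three dirty pages, fail_mount() flushes three pages when discard_first is false and none when it is true, which is the behaviour the added call is meant to produce before the mount is retried.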



[PATCH] drm/nouveau/secboot/gm20b: Fix return value in case of error

2016-09-23 Thread Christophe JAILLET
If 'ioremap()' returns 0, 'gm20b_tegra_read_wpr()' will return 0 as well,
which means success.
Return -ENOMEM instead

Signed-off-by: Christophe JAILLET 
---
Not sure that -ENOMEM is the best value.
I've taken it because it is often used in such a case.
---
 drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
index d5395ebfe8d3..d88db933b3fd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
@@ -142,7 +142,7 @@ gm20b_tegra_read_wpr(struct gm200_secboot *gsb)
 	mc = ioremap(TEGRA_MC_BASE, 0xd00);
 	if (!mc) {
 		nvkm_error(&sb->subdev, "Cannot map Tegra MC registers\n");
-		return PTR_ERR(mc);
+		return -ENOMEM;
 	}
 	gsb->wpr_addr = ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_0) |
 	      ((u64)ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_HI_0) << 32);
-- 
2.7.4



Re: new objtool warnings again...

2016-09-23 Thread Linus Torvalds
On Fri, Sep 23, 2016 at 1:33 PM, Linus Torvalds
 wrote:
>
> So this code is clearly missing the magic to tell gcc that the asm
> needs a frame pointer.

Independently of that, the objtool build seems racy or somehow
fragile. I've now twice gotten into a situation where I end up getting

  cat: /home/torvalds/v2.6/linux/tools/objtool/.fixdep.o.d: No such file or directory
  make[4]: *** [/home/torvalds/v2.6/linux/tools/objtool/fixdep.o] Error 1
  make[3]: *** [/home/torvalds/v2.6/linux/tools/objtool/fixdep-in.o] Error 2
  make[2]: *** [fixdep] Error 2
  make[1]: *** [objtool] Error 2
  make: *** [tools/objtool] Error 2

with just the right timings, and then ccache ends up remembering that
as a build failure and causing that to be "sticky" even across "git
clean -dqfx" builds (and the "ccache -C" clears it).

Adding Michal to the cc, in case he can see what the problem is.

 Linus


[PATCH cgroup/for-4.8-fixes] cgroup: fix invalid controller enable rejections with cgroup namespace

2016-09-23 Thread Tejun Heo
From 9157056da8f8c4a6305f15619e269f164b63a6de Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Fri, 23 Sep 2016 16:55:49 -0400

On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it.  The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").

When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace.  This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.

Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.

While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.

Signed-off-by: Tejun Heo 
Reported-by: Evgeny Vereshchagin 
Cc: Serge E. Hallyn 
Cc: Aditya Kali 
Cc: Eric W. Biederman 
Cc: sta...@vger.kernel.org # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: https://github.com/systemd/systemd/pull/3589#issuecomment-249089541
---
Hello,

I applied this patch to cgroup/for-4.8-fixes as I wanted it to get
exposure ASAP as it's pretty late in the devel cycle.  If I messed up
something, please let me know.

Thanks.

 kernel/cgroup.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d1c51b7..0d4ee1e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3446,9 +3446,28 @@ static ssize_t cgroup_subtree_control_write(struct 
kernfs_open_file *of,
 * Except for the root, subtree_control must be zero for a cgroup
 * with tasks so that child cgroups don't compete against tasks.
 */
-   if (enable && cgroup_parent(cgrp) && !list_empty(&cgrp->cset_links)) {
-   ret = -EBUSY;
-   goto out_unlock;
+   if (enable && cgroup_parent(cgrp)) {
+   struct cgrp_cset_link *link;
+
+   /*
+* Because namespaces pin csets too, @cgrp->cset_links
+* might not be empty even when @cgrp is empty.  Walk and
+* verify each cset.
+*/
+   spin_lock_irq(&css_set_lock);
+
+   ret = 0;
+   list_for_each_entry(link, &cgrp->cset_links, cset_link) {
+   if (css_set_populated(link->cset)) {
+   ret = -EBUSY;
+   break;
+   }
+   }
+
+   spin_unlock_irq(&css_set_lock);
+
+   if (ret)
+   goto out_unlock;
}
 
/* save and update control masks and prepare csses */
@@ -3899,7 +3918,9 @@ void cgroup_file_notify(struct cgroup_file *cfile)
  * cgroup_task_count - count the number of tasks in a cgroup.
  * @cgrp: the cgroup in question
  *
- * Return the number of tasks in the cgroup.
+ * Return the number of tasks in the cgroup.  The returned number can be
+ * higher than the actual number of tasks due to css_set references from
+ * namespace roots and temporary usages.
  */
 static int cgroup_task_count(const struct cgroup *cgrp)
 {
-- 
2.7.4
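
The difference between the old and new emptiness tests can be sketched
in plain userspace C.  This is only a model under stated assumptions:
struct model_cset and both helper names are hypothetical stand-ins, not
the kernel's css_set, list_head or css_set_populated(), and no locking
is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a css_set: only the task count matters here. */
struct model_cset {
	int nr_tasks;             /* 0 when the set is merely pinned */
	struct model_cset *next;  /* next set linked to the same cgroup */
};

/* Old-style test: any linked set at all makes the cgroup look busy. */
static bool busy_if_list_nonempty(const struct model_cset *head)
{
	return head != NULL;
}

/* New-style test: busy only if some linked set actually holds tasks. */
static bool busy_if_any_populated(const struct model_cset *head)
{
	for (const struct model_cset *c = head; c; c = c->next)
		if (c->nr_tasks > 0)
			return true;
	return false;
}

/* A set pinned by a namespace root but holding no tasks. */
static void emptiness_demo(void)
{
	struct model_cset pinned = { 0, NULL };
	struct model_cset live = { 2, &pinned };

	assert(busy_if_list_nonempty(&pinned));   /* old test: false positive */
	assert(!busy_if_any_populated(&pinned));  /* new test: seen as empty */
	assert(busy_if_any_populated(&live));     /* real tasks still block */
}
```

Calling emptiness_demo() exercises the case from the changelog: a
pinned but taskless set trips the list-based test and passes the
per-set test.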



[PATCH 1/2] bpf samples: fix compiler errors with sockex2 and sockex3

2016-09-23 Thread Naveen N. Rao
These samples fail to compile as 'struct flow_keys' conflicts with
definition in net/flow_dissector.h. Fix the same by renaming the
structure used in the sample.

Signed-off-by: Naveen N. Rao 
---
 samples/bpf/sockex2_kern.c | 10 +-
 samples/bpf/sockex3_kern.c |  8 
 samples/bpf/sockex3_user.c |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/samples/bpf/sockex2_kern.c b/samples/bpf/sockex2_kern.c
index ba0e177..44e5846 100644
--- a/samples/bpf/sockex2_kern.c
+++ b/samples/bpf/sockex2_kern.c
@@ -14,7 +14,7 @@ struct vlan_hdr {
__be16 h_vlan_encapsulated_proto;
 };
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -59,7 +59,7 @@ static inline __u32 ipv6_addr_hash(struct __sk_buff *ctx, 
__u64 off)
 }
 
 static inline __u64 parse_ip(struct __sk_buff *skb, __u64 nhoff, __u64 
*ip_proto,
-struct flow_keys *flow)
+struct bpf_flow_keys *flow)
 {
__u64 verlen;
 
@@ -83,7 +83,7 @@ static inline __u64 parse_ip(struct __sk_buff *skb, __u64 
nhoff, __u64 *ip_proto
 }
 
 static inline __u64 parse_ipv6(struct __sk_buff *skb, __u64 nhoff, __u64 
*ip_proto,
-  struct flow_keys *flow)
+  struct bpf_flow_keys *flow)
 {
*ip_proto = load_byte(skb,
  nhoff + offsetof(struct ipv6hdr, nexthdr));
@@ -96,7 +96,7 @@ static inline __u64 parse_ipv6(struct __sk_buff *skb, __u64 
nhoff, __u64 *ip_pro
return nhoff;
 }
 
-static inline bool flow_dissector(struct __sk_buff *skb, struct flow_keys 
*flow)
+static inline bool flow_dissector(struct __sk_buff *skb, struct bpf_flow_keys 
*flow)
 {
__u64 nhoff = ETH_HLEN;
__u64 ip_proto;
@@ -198,7 +198,7 @@ struct bpf_map_def SEC("maps") hash_map = {
 SEC("socket2")
 int bpf_prog2(struct __sk_buff *skb)
 {
-   struct flow_keys flow;
+   struct bpf_flow_keys flow;
struct pair *value;
u32 key;
 
diff --git a/samples/bpf/sockex3_kern.c b/samples/bpf/sockex3_kern.c
index 41ae2fd..95907f8 100644
--- a/samples/bpf/sockex3_kern.c
+++ b/samples/bpf/sockex3_kern.c
@@ -61,7 +61,7 @@ struct vlan_hdr {
__be16 h_vlan_encapsulated_proto;
 };
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -88,7 +88,7 @@ static inline __u32 ipv6_addr_hash(struct __sk_buff *ctx, 
__u64 off)
 }
 
 struct globals {
-   struct flow_keys flow;
+   struct bpf_flow_keys flow;
 };
 
 struct bpf_map_def SEC("maps") percpu_map = {
@@ -114,14 +114,14 @@ struct pair {
 
 struct bpf_map_def SEC("maps") hash_map = {
.type = BPF_MAP_TYPE_HASH,
-   .key_size = sizeof(struct flow_keys),
+   .key_size = sizeof(struct bpf_flow_keys),
.value_size = sizeof(struct pair),
.max_entries = 1024,
 };
 
 static void update_stats(struct __sk_buff *skb, struct globals *g)
 {
-   struct flow_keys key = g->flow;
+   struct bpf_flow_keys key = g->flow;
struct pair *value;
 
value = bpf_map_lookup_elem(&hash_map, &key);
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index d4184ab..3fcfd8c4 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -7,7 +7,7 @@
 #include 
 #include 
 
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -49,7 +49,7 @@ int main(int argc, char **argv)
(void) f;
 
for (i = 0; i < 5; i++) {
-   struct flow_keys key = {}, next_key;
+   struct bpf_flow_keys key = {}, next_key;
struct pair value;
 
sleep(1);
-- 
2.9.3



[PATCH 2/2] bpf samples: update tracex5 sample to use __seccomp_filter

2016-09-23 Thread Naveen N. Rao
seccomp_phase1() does not exist anymore. Instead, update sample to use
__seccomp_filter(). While at it, set max locked memory to unlimited.

Signed-off-by: Naveen N. Rao 
---
I am not completely sure if __seccomp_filter is the right place to hook
in. This works for me though. Please review.

Thanks,
Naveen


 samples/bpf/tracex5_kern.c | 16 +++-
 samples/bpf/tracex5_user.c |  3 +++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c
index f95f232..fd12d71 100644
--- a/samples/bpf/tracex5_kern.c
+++ b/samples/bpf/tracex5_kern.c
@@ -19,20 +19,18 @@ struct bpf_map_def SEC("maps") progs = {
.max_entries = 1024,
 };
 
-SEC("kprobe/seccomp_phase1")
+SEC("kprobe/__seccomp_filter")
 int bpf_prog1(struct pt_regs *ctx)
 {
-   struct seccomp_data sd;
-
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   int sc_nr = (int)PT_REGS_PARM1(ctx);
 
/* dispatch into next BPF program depending on syscall number */
-   bpf_tail_call(ctx, &progs, sd.nr);
+   bpf_tail_call(ctx, &progs, sc_nr);
 
/* fall through -> unknown syscall */
-   if (sd.nr >= __NR_getuid && sd.nr <= __NR_getsid) {
+   if (sc_nr >= __NR_getuid && sc_nr <= __NR_getsid) {
char fmt[] = "syscall=%d (one of get/set uid/pid/gid)\n";
-   bpf_trace_printk(fmt, sizeof(fmt), sd.nr);
+   bpf_trace_printk(fmt, sizeof(fmt), sc_nr);
}
return 0;
 }
@@ -42,7 +40,7 @@ PROG(__NR_write)(struct pt_regs *ctx)
 {
struct seccomp_data sd;
 
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] == 512) {
char fmt[] = "write(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
@@ -55,7 +53,7 @@ PROG(__NR_read)(struct pt_regs *ctx)
 {
struct seccomp_data sd;
 
-   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+   bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] > 128 && sd.args[2] <= 1024) {
char fmt[] = "read(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
diff --git a/samples/bpf/tracex5_user.c b/samples/bpf/tracex5_user.c
index a04dd3c..36b5925 100644
--- a/samples/bpf/tracex5_user.c
+++ b/samples/bpf/tracex5_user.c
@@ -6,6 +6,7 @@
 #include 
 #include "libbpf.h"
 #include "bpf_load.h"
+#include <sys/resource.h>
 
 /* install fake seccomp program to enable seccomp code path inside the kernel,
  * so that our kprobe attached to seccomp_phase1() can be triggered
@@ -27,8 +28,10 @@ int main(int ac, char **argv)
 {
FILE *f;
char filename[256];
+   struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+   setrlimit(RLIMIT_MEMLOCK, &r);
 
if (load_bpf_file(filename)) {
printf("%s", bpf_log_buf);
-- 
2.9.3
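
For reference, the memlock change in tracex5_user.c can be tried on its
own with the sketch below.  Unlike the sample, it checks the return
value, since raising the hard limit may be refused for unprivileged
callers.  The helper names raise_memlock_limit() and memlock_demo() are
made up for this illustration and are not part of the samples.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Try to lift RLIMIT_MEMLOCK to unlimited, as the sample does before
 * loading its BPF object.  Returns 0 on success, -1 on failure.
 */
static int raise_memlock_limit(void)
{
	struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };

	if (setrlimit(RLIMIT_MEMLOCK, &r) != 0) {
		perror("setrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	return 0;
}

/* Read the limit back: on success the soft limit must be unlimited. */
static void memlock_demo(void)
{
	struct rlimit cur;
	int rc = raise_memlock_limit();

	assert(getrlimit(RLIMIT_MEMLOCK, &cur) == 0);
	assert(rc == -1 || cur.rlim_cur == RLIM_INFINITY);
}
```

A caller would typically invoke raise_memlock_limit() once, before any
map or program is created, and decide whether a failure is fatal.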



Re: [PATCH v4] staging: ion: Fix a coding style issue

2016-09-23 Thread Laura Abbott

On 09/23/2016 11:03 AM, Antti Keränen wrote:

This patch fixes the alignment of an allocation flag block comment
and moves the comments before each #define.



Acked-by: Laura Abbott 


Signed-off-by: Antti Keränen 
---
In addition to fixing the alignment issue, this version of the patch moves
the comments from after the define lines to before the define lines.

 drivers/staging/android/uapi/ion.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/android/uapi/ion.h 
b/drivers/staging/android/uapi/ion.h
index 647f130..14cd873 100644
--- a/drivers/staging/android/uapi/ion.h
+++ b/drivers/staging/android/uapi/ion.h
@@ -52,18 +52,18 @@ enum ion_heap_type {
  * allocation flags - the lower 16 bits are used by core ion, the upper 16
  * bits are reserved for use by the heaps themselves.
  */
-#define ION_FLAG_CACHED 1  /*
-* mappings of this buffer should be
-* cached, ion will do cache
-* maintenance when the buffer is
-* mapped for dma
-   */
-#define ION_FLAG_CACHED_NEEDS_SYNC 2   /*
-* mappings of this buffer will created
-* at mmap time, if this is set
-* caches must be managed
-* manually
-*/
+
+/*
+ * mappings of this buffer should be cached, ion will do cache maintenance
+ * when the buffer is mapped for dma
+ */
+#define ION_FLAG_CACHED 1
+
+/*
+ * mappings of this buffer will created at mmap time, if this is set
+ * caches must be managed manually
+ */
+#define ION_FLAG_CACHED_NEEDS_SYNC 2

 /**
  * DOC: Ion Userspace API





[PATCH 3/3] bpf powerpc: add support for bpf constant blinding

2016-09-23 Thread Naveen N. Rao
In line with similar support for other architectures by Daniel Borkmann.

'MOD Default X' from test_bpf without constant blinding:
84 bytes emitted from JIT compiler (pass:3, flen:7)
d58a4688 + :
   0:   nop
   4:   nop
   8:   std r27,-40(r1)
   c:   std r28,-32(r1)
  10:   xor r8,r8,r8
  14:   xor r28,r28,r28
  18:   mr  r27,r3
  1c:   li  r8,66
  20:   cmpwi   r28,0
  24:   bne 0x0030
  28:   li  r8,0
  2c:   b   0x0044
  30:   divwu   r9,r8,r28
  34:   mullw   r9,r28,r9
  38:   subf    r8,r9,r8
  3c:   rotlwi  r8,r8,0
  40:   li  r8,66
  44:   ld  r27,-40(r1)
  48:   ld  r28,-32(r1)
  4c:   mr  r3,r8
  50:   blr

... and with constant blinding:
140 bytes emitted from JIT compiler (pass:3, flen:11)
dbd6ab24 + :
   0:   nop
   4:   nop
   8:   std r27,-40(r1)
   c:   std r28,-32(r1)
  10:   xor r8,r8,r8
  14:   xor r28,r28,r28
  18:   mr  r27,r3
  1c:   lis r2,-22834
  20:   ori r2,r2,36083
  24:   rotlwi  r2,r2,0
  28:   xori    r2,r2,36017
  2c:   xoris   r2,r2,42702
  30:   rotlwi  r2,r2,0
  34:   mr  r8,r2
  38:   rotlwi  r8,r8,0
  3c:   cmpwi   r28,0
  40:   bne 0x004c
  44:   li  r8,0
  48:   b   0x007c
  4c:   divwu   r9,r8,r28
  50:   mullw   r9,r28,r9
  54:   subf    r8,r9,r8
  58:   rotlwi  r8,r8,0
  5c:   lis r2,-17137
  60:   ori r2,r2,39065
  64:   rotlwi  r2,r2,0
  68:   xori    r2,r2,39131
  6c:   xoris   r2,r2,48399
  70:   rotlwi  r2,r2,0
  74:   mr  r8,r2
  78:   rotlwi  r8,r8,0
  7c:   ld  r27,-40(r1)
  80:   ld  r28,-32(r1)
  84:   mr  r3,r8
  88:   blr

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit64.h  |  9 +
 arch/powerpc/net/bpf_jit_comp64.c | 36 +---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index 038e00b..62fa758 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -39,10 +39,10 @@
 #ifndef __ASSEMBLY__
 
 /* BPF register usage */
-#define SKB_HLEN_REG   (MAX_BPF_REG + 0)
-#define SKB_DATA_REG   (MAX_BPF_REG + 1)
-#define TMP_REG_1  (MAX_BPF_REG + 2)
-#define TMP_REG_2  (MAX_BPF_REG + 3)
+#define SKB_HLEN_REG   (MAX_BPF_JIT_REG + 0)
+#define SKB_DATA_REG   (MAX_BPF_JIT_REG + 1)
+#define TMP_REG_1  (MAX_BPF_JIT_REG + 2)
+#define TMP_REG_2  (MAX_BPF_JIT_REG + 3)
 
 /* BPF to ppc register mappings */
 static const int b2p[] = {
@@ -62,6 +62,7 @@ static const int b2p[] = {
/* frame pointer aka BPF_REG_10 */
[BPF_REG_FP] = 31,
/* eBPF jit internal registers */
+   [BPF_REG_AX] = 2,
[SKB_HLEN_REG] = 25,
[SKB_DATA_REG] = 26,
[TMP_REG_1] = 9,
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 3ec29d6..0fe98a5 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -974,21 +974,37 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
int pass;
int flen;
struct bpf_binary_header *bpf_hdr;
+   struct bpf_prog *org_fp = fp;
+   struct bpf_prog *tmp_fp;
+   bool bpf_blinded = false;
 
if (!bpf_jit_enable)
-   return fp;
+   return org_fp;
+
+   tmp_fp = bpf_jit_blind_constants(org_fp);
+   if (IS_ERR(tmp_fp))
+   return org_fp;
+
+   if (tmp_fp != org_fp) {
+   bpf_blinded = true;
+   fp = tmp_fp;
+   }
 
flen = fp->len;
addrs = kzalloc((flen+1) * sizeof(*addrs), GFP_KERNEL);
-   if (addrs == NULL)
-   return fp;
+   if (addrs == NULL) {
+   fp = org_fp;
+   goto out;
+   }
+
+   memset(&cgctx, 0, sizeof(struct codegen_context));
 
-   cgctx.idx = 0;
-   cgctx.seen = 0;
/* Scouting faux-generate pass 0 */
-   if (bpf_jit_build_body(fp, 0, &cgctx, addrs))
+   if (bpf_jit_build_body(fp, 0, &cgctx, addrs)) {
/* We hit something illegal or unsupported. */
+   fp = org_fp;
goto out;
+   }
 
/*
 * Pretend to build prologue, given the features we've seen.  This will
@@ -1003,8 +1019,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
bpf_hdr = bpf_jit_binary_alloc(alloclen, &image, 4,
bpf_jit_fill_ill_insns);
-   if (!bpf_hdr)
+   if (!bpf_hdr) {
+   fp = org_fp;
goto out;
+   }
 
code_base = (u32 *)(image + FUNCTION_DESCR_SIZE);
 
@@ -1041,6 +1059,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 out:
kfree(addrs);
+
+   if (bpf_blinded)
+   bpf_jit_prog_release_other(fp, fp == org_fp ? tmp_fp : org_fp);
+
return fp;
 }
 
-- 
2.9.3



Re: [RFC PATCH 0/6] perf: Add AUX data sampling

2016-09-23 Thread Peter Zijlstra
On Fri, Sep 23, 2016 at 10:19:43AM -0700, Andi Kleen wrote:
> On Fri, Sep 23, 2016 at 01:49:17PM +0200, Peter Zijlstra wrote:
> > On Fri, Sep 23, 2016 at 02:27:20PM +0300, Alexander Shishkin wrote:
> > > Hi Peter,
> > > 
> > > This is an RFC, I'm not sending the tooling bits in this series,
> > > although they can be found here [1].
> > > 
> > > This series introduces AUX data sampling for perf events, which in
> > > case of our instruction/branch tracing PMUs like Intel PT, BTS, CS
> > > ETM means execution flow history leading up to a perf event's
> > > overflow.
> > 
> > This fails to explain _WHY_ this is a good thing to have. What kind of
> > analysis does this enable, and is that fully implemented in [1] (I
> > didn't look).
> 
> Think of it as a super LBR. (Near) all things LBR can do, PT can do
> with much more branches for each sample.

Clarify the 'near'? Should we then not expose it as a BRANCH_STACK?
Expand on the down-sides of that.

> Also long term execution recording of PT normally doesn't work well
> because the sustained bandwidth is too high for perf and the disk to
> keep up
> 
> Currently the main solution we have for that is the snapshot mode, but it
> requires explicit instrumentation for someone to trigger snapshots.
> 
> Sampling PT is an alternative that works for many use cases, and does
> not rely on instrumentation.

List a few use-cases on either side of that divide ?


This really isn't rocket science, patches should come with
justification, try and sell this stuff. Don't try and skimp on that.


[PATCH 1/3] bpf powerpc: introduce accessors for using the tmp local stack space

2016-09-23 Thread Naveen N. Rao
While at it, ensure that the location of the local save area is
consistent whether or not we setup our own stackframe. This property is
utilised in the next patch that adds support for tail calls.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit64.h  | 16 +---
 arch/powerpc/net/bpf_jit_comp64.c | 79 ++-
 2 files changed, 55 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index 5046d6f..a1645d7 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -16,22 +16,25 @@
 
 /*
  * Stack layout:
+ * Ensure the top half (upto local_tmp_var) stays consistent
+ * with our redzone usage.
  *
  * [   prev sp ] <-
  * [   nv gpr save area] 8*8   |
+ * [tail_call_cnt  ] 8 |
+ * [local_tmp_var  ] 8 |
  * fp (r31) -->[   ebpf stack space] 512   |
- * [  local/tmp var space  ] 16|
  * [ frame header  ] 32/112|
  * sp (r1) --->[stack pointer  ] --
  */
 
-/* for bpf JIT code internal usage */
-#define BPF_PPC_STACK_LOCALS   16
 /* for gpr non volatile registers BPG_REG_6 to 10, plus skb cache registers */
 #define BPF_PPC_STACK_SAVE (8*8)
+/* for bpf JIT code internal usage */
+#define BPF_PPC_STACK_LOCALS   16
 /* Ensure this is quadword aligned */
-#define BPF_PPC_STACKFRAME (STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS + \
-                            MAX_BPF_STACK + BPF_PPC_STACK_SAVE)
+#define BPF_PPC_STACKFRAME (STACK_FRAME_MIN_SIZE + MAX_BPF_STACK + \
+                            BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
 
 #ifndef __ASSEMBLY__
 
@@ -65,6 +68,9 @@ static const int b2p[] = {
    [TMP_REG_2] = 10
 };
 
+/* PPC NVR range -- update this if we ever use NVRs below r24 */
+#define BPF_PPC_NVR_MIN        24
+
 /* Assembly helpers */
 #define DECLARE_LOAD_FUNC(func)        u64 func(u64 r3, u64 r4);       \
                u64 func##_negative_offset(u64 r3, u64 r4);            \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 6073b78..5f8c91f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -58,6 +58,35 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
    return ctx->seen & SEEN_FUNC || bpf_is_seen_register(ctx, BPF_REG_FP);
 }
 
+/*
+ * When not setting up our own stackframe, the redzone usage is:
+ *
+ *              [       prev sp         ] <-------------
+ *              [         ...           ]               |
+ * sp (r1) ---> [    stack pointer      ] --------------
+ *              [   nv gpr save area    ] 8*8
+ *              [    tail_call_cnt      ] 8
+ *              [    local_tmp_var      ] 8
+ *              [   unused red zone     ] 208 bytes protected
+ */
+static int bpf_jit_stack_local(struct codegen_context *ctx)
+{
+   if (bpf_has_stack_frame(ctx))
+       return STACK_FRAME_MIN_SIZE + MAX_BPF_STACK;
+   else
+       return -(BPF_PPC_STACK_SAVE + 16);
+}
+
+static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
+{
+   if (reg >= BPF_PPC_NVR_MIN && reg < 32)
+       return (bpf_has_stack_frame(ctx) ? BPF_PPC_STACKFRAME : 0)
+                           - (8 * (32 - reg));
+
+   pr_err("BPF JIT is asking about unknown registers");
+   BUG();
+}
+
+
 static void bpf_jit_emit_skb_loads(u32 *image, struct codegen_context *ctx)
 {
    /*
@@ -100,9 +129,8 @@ static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64
 static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
 {
    int i;
-   bool new_stack_frame = bpf_has_stack_frame(ctx);
 
-   if (new_stack_frame) {
+   if (bpf_has_stack_frame(ctx)) {
        /*
         * We need a stack frame, but we don't necessarily need to
         * save/restore LR unless we call other functions
@@ -122,9 +150,7 @@ static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
     */
    for (i = BPF_REG_6; i <= BPF_REG_10; i++)
        if (bpf_is_seen_register(ctx, i))
-           PPC_BPF_STL(b2p[i], 1,
-               (new_stack_frame ? BPF_PPC_STACKFRAME : 0) -
-                   (8 * (32 - b2p[i])));
+           PPC_BPF_STL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, b2p[i]));
 
    /*
     * Save additional non-volatile regs if we cache skb
@@ -132,22 +158,21 @@ static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
     */
    if (ctx->seen & SEEN_SKB) {
        PPC_BPF_STL(b2p[SKB_HLEN_REG], 1,
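
The offset arithmetic behind the two accessors can be checked in isolation. What follows is a minimal user-space sketch, not the kernel code: the helper names stack_local(), stack_offsetof() and below_prev_sp() are made up for illustration, and STACK_FRAME_MIN_SIZE is assumed to be 32 (the layout comment lists 32/112).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed constants for illustration; the kernel headers are authoritative. */
#define STACK_FRAME_MIN_SIZE  32        /* assumption: the 32-byte frame header case */
#define MAX_BPF_STACK         512
#define BPF_PPC_STACK_LOCALS  16
#define BPF_PPC_STACK_SAVE    (8 * 8)
#define BPF_PPC_STACKFRAME    (STACK_FRAME_MIN_SIZE + MAX_BPF_STACK + \
                               BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
#define BPF_PPC_NVR_MIN       24

/* Offset of local_tmp_var relative to r1, with or without our own frame. */
static int stack_local(bool has_frame)
{
    if (has_frame)
        return STACK_FRAME_MIN_SIZE + MAX_BPF_STACK;
    return -(BPF_PPC_STACK_SAVE + 16);
}

/* Offset of the save slot for non-volatile GPR `reg` relative to r1. */
static int stack_offsetof(bool has_frame, int reg)
{
    assert(reg >= BPF_PPC_NVR_MIN && reg < 32);
    return (has_frame ? BPF_PPC_STACKFRAME : 0) - 8 * (32 - reg);
}

/* Distance of an r1-relative offset below the caller's stack pointer. */
static int below_prev_sp(bool has_frame, int off)
{
    return (has_frame ? BPF_PPC_STACKFRAME : 0) - off;
}
```

With these assumed values, local_tmp_var sits 80 bytes below the previous stack pointer in both cases (r1 + 544 inside a 624-byte frame, r1 - 80 in the red zone), which is the consistency the commit message refers to.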

[PATCH 2/3] bpf powerpc: implement support for tail calls

2016-09-23 Thread Naveen N. Rao
Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
programs. This can be achieved either by:
(1) retaining the stack setup by the first eBPF program and having all
subsequent eBPF programs re-using it, or,
(2) by unwinding/tearing down the stack and having each eBPF program
deal with its own stack as it sees fit.

To ensure that this does not create loops, there is a limit to how many
tail calls can be done (currently 32). This requires the JIT'ed code to
maintain a count of the number of tail calls done so far.

Approach (1) is simple, but requires every eBPF program to have (almost)
the same prologue/epilogue, regardless of whether they need it. This is
inefficient for small eBPF programs which may sometimes not need a
prologue at all. As such, to minimize the impact of the tail call
implementation, we use approach (2) here which needs each eBPF program
in the chain to use its own prologue/epilogue. This is not ideal when
many tail calls are involved and when all the eBPF programs in the chain
have similar prologue/epilogue. However, the impact is restricted to
programs that do tail calls. Individual eBPF programs are not affected.

We maintain the tail call count in a fixed location on the stack and
updated tail call count values are passed in through this. The very
first eBPF program in a chain sets this up to 0 (the first 2
instructions). Subsequent tail calls skip the first two eBPF JIT
instructions to maintain the count. For programs that don't do tail
calls themselves, the first two instructions are NOPs.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/ppc-opcode.h |   2 +
 arch/powerpc/net/bpf_jit.h|   2 +
 arch/powerpc/net/bpf_jit64.h  |   1 +
 arch/powerpc/net/bpf_jit_comp64.c | 149 +++---
 4 files changed, 126 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 127ebf5..54ff8ce 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -236,6 +236,7 @@
 #define PPC_INST_STWU          0x94000000
 #define PPC_INST_MFLR          0x7c0802a6
 #define PPC_INST_MTLR          0x7c0803a6
+#define PPC_INST_MTCTR         0x7c0903a6
 #define PPC_INST_CMPWI         0x2c000000
 #define PPC_INST_CMPDI         0x2c200000
 #define PPC_INST_CMPW          0x7c000000
@@ -250,6 +251,7 @@
 #define PPC_INST_SUB           0x7c000050
 #define PPC_INST_BLR           0x4e800020
 #define PPC_INST_BLRL          0x4e800021
+#define PPC_INST_BCTR          0x4e800420
 #define PPC_INST_MULLD         0x7c0001d2
 #define PPC_INST_MULLW         0x7c0001d6
 #define PPC_INST_MULHWU        0x7c000016
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index d5301b6..89f7007 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -40,6 +40,8 @@
 #define PPC_BLR()              EMIT(PPC_INST_BLR)
 #define PPC_BLRL()             EMIT(PPC_INST_BLRL)
 #define PPC_MTLR(r)            EMIT(PPC_INST_MTLR | ___PPC_RT(r))
+#define PPC_BCTR()             EMIT(PPC_INST_BCTR)
+#define PPC_MTCTR(r)           EMIT(PPC_INST_MTCTR | ___PPC_RT(r))
 #define PPC_ADDI(d, a, i)      EMIT(PPC_INST_ADDI | ___PPC_RT(d) |           \
                                     ___PPC_RA(a) | IMM_L(i))
 #define PPC_MR(d, a)           PPC_OR(d, a, a)
diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index a1645d7..038e00b 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -88,6 +88,7 @@ DECLARE_LOAD_FUNC(sk_load_byte);
 #define SEEN_FUNC  0x1000 /* might call external helpers */
 #define SEEN_STACK 0x2000 /* uses BPF stack */
 #define SEEN_SKB   0x4000 /* uses sk_buff */
+#define SEEN_TAILCALL  0x8000 /* uses tail calls */
 
 struct codegen_context {
/*
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 5f8c91f..3ec29d6 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "bpf_jit64.h"
 
@@ -77,6 +78,11 @@ static int bpf_jit_stack_local(struct codegen_context *ctx)
        return -(BPF_PPC_STACK_SAVE + 16);
 }
 
+static int bpf_jit_stack_tailcallcnt(struct codegen_context *ctx)
+{
+   return bpf_jit_stack_local(ctx) + 8;
+}
+
 static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
 {
    if (reg >= BPF_PPC_NVR_MIN && reg < 32)
@@ -102,33 +108,25 @@ static void bpf_jit_emit_skb_loads(u32 *image, struct codegen_context *ctx)
    PPC_BPF_LL(b2p[SKB_DATA_REG], 3, offsetof(struct sk_buff, data));
 }
 
-static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func)
+static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
 {
-#ifdef 
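
The counting scheme described in the commit message can be modelled in a few lines of C. This is a rough user-space model, assuming the limit of 32 stated in the message; struct frame, prog_entry(), try_tail_call() and run_self_tail_calling_chain() are hypothetical names, and the exact comparison used by the real JIT may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TAIL_CALLS 32       /* limit quoted in the commit message */

/* Stand-in for the fixed stack slot that holds the tail call count. */
struct frame {
    unsigned int tail_call_cnt;
};

/* Normal entry into the first program of a chain: the count starts at 0. */
static void prog_entry(struct frame *f)
{
    assert(f != NULL);
    f->tail_call_cnt = 0;
}

/*
 * Attempt a tail call. On success the count is bumped and the callee is
 * entered past its count-initialising instructions, so the updated value
 * in the slot survives. Once the limit is reached the call is refused.
 */
static bool try_tail_call(struct frame *f)
{
    if (f->tail_call_cnt >= MAX_TAIL_CALLS)
        return false;
    f->tail_call_cnt++;
    return true;
}

/* A program that always tail-calls itself; returns how many calls succeeded. */
static unsigned int run_self_tail_calling_chain(void)
{
    struct frame f;
    unsigned int done = 0;

    prog_entry(&f);
    while (try_tail_call(&f))
        done++;
    return done;
}
```

In this model a program that keeps tail-calling itself stops after 32 tail calls instead of looping forever, because no later entry point resets the slot.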

new objtool warnings again...

2016-09-23 Thread Linus Torvalds
Josh,

 the current F24 toolchain causes

kernel/signal.o: warning: objtool: .altinstr_replacement+0x54:
call without frame pointer save/setup

during a regular allmodconfig build.

Doing an objdump says:

...
  54:   e8 00 00 00 00          callq  59 <.altinstr_replacement+0x59>
                        55: R_X86_64_PC32       copy_user_generic_string-0x4
  59:   e8 00 00 00 00          callq  5e <.altinstr_replacement+0x5e>
                        5a: R_X86_64_PC32       copy_user_enhanced_fast_string-0x4
...

so it seems to come from the alternative_call_2() in copy_user_generic().

It's somewhere in copy_siginfo_to_user(), so I assume it's just the

        if (from->si_code < 0)
                return __copy_to_user(to, from, sizeof(siginfo_t))
                        ? -EFAULT : 0;

case.  Looking at the code generation, it looks like the frame pointer
generation in that function has been moved down past this code, so the
objtool warning seems to be correct, but this indicates that gcc has
decided that we don't need a frame for that alternative_call_2()
thing.

So this code is clearly missing the magic to tell gcc that the asm
needs a frame pointer.

What was that magic again? Mind sending a patch?

Linus
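
For reading the objdump excerpt above: in an unlinked object file the rel32 field of each call is still zero, so objdump prints the address of the next instruction as the target, and the relocation entry names the symbol that will be patched in later. A small sketch of that arithmetic follows; call_target() and resolve_pc32() are hypothetical helper names, and any symbol address passed in is made up.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 near call: opcode 0xe8 followed by a 32-bit displacement. */
#define CALL_REL32_LEN 5

/* The displacement is relative to the address of the next instruction. */
static uint64_t call_target(uint64_t insn_addr, int32_t rel32)
{
    return insn_addr + CALL_REL32_LEN + (uint64_t)(int64_t)rel32;
}

/*
 * Value stored for an R_X86_64_PC32 relocation: S + A - P, where S is the
 * symbol address, A the addend and P the address of the patched field.
 * The field sits one byte into the call, and the -0x4 addend accounts for
 * the displacement being measured from the end of the 4-byte field.
 */
static int32_t resolve_pc32(uint64_t sym, int64_t addend, uint64_t place)
{
    int64_t value = (int64_t)(sym + (uint64_t)addend - place);

    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t)value;
}
```

With a zero displacement, the calls at 0x54 and 0x59 appear to target 0x59 and 0x5e, matching the listing. Once the relocation at 0x55 is resolved against some symbol address S with addend -4, the same computation lands exactly on S.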


[PATCH 2/3] ARM: pxa: ezx: use the new pxa_camera platform_data

2016-09-23 Thread Robert Jarzmik
pxa_camera has transitioned from a soc_camera driver to a standalone
v4l2 driver. Amend the device declaration accordingly.

Signed-off-by: Robert Jarzmik 
---
 arch/arm/mach-pxa/ezx.c | 176 
 1 file changed, 72 insertions(+), 104 deletions(-)

diff --git a/arch/arm/mach-pxa/ezx.c b/arch/arm/mach-pxa/ezx.c
index 34ad0a89d4a9..0b8300e6fca3 100644
--- a/arch/arm/mach-pxa/ezx.c
+++ b/arch/arm/mach-pxa/ezx.c
@@ -17,14 +17,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 #include 
@@ -723,6 +723,42 @@ static struct platform_device a780_gpio_keys = {
 };
 
 /* camera */
+static struct regulator_consumer_supply camera_dummy_supplies[] = {
+   REGULATOR_SUPPLY("vdd", "0-005d"),
+};
+
+static struct regulator_init_data camera_dummy_initdata = {
+   .consumer_supplies = camera_dummy_supplies,
+   .num_consumer_supplies = ARRAY_SIZE(camera_dummy_supplies),
+   .constraints = {
+       .valid_ops_mask = REGULATOR_CHANGE_STATUS,
+   },
+};
+
+static struct fixed_voltage_config camera_dummy_config = {
+   .supply_name    = "camera_vdd",
+   .microvolts     = 2800000,
+   .gpio           = GPIO50_nCAM_EN,
+   .enable_high    = 0,
+   .init_data      = &camera_dummy_initdata,
+};
+
+static struct platform_device camera_supply_dummy_device = {
+   .name   = "reg-fixed-voltage",
+   .id     = 1,
+   .dev    = {
+       .platform_data = &camera_dummy_config,
+   },
+};
+static int a780_camera_reset(struct device *dev)
+{
+   gpio_set_value(GPIO19_GEN1_CAM_RST, 0);
+   msleep(10);
+   gpio_set_value(GPIO19_GEN1_CAM_RST, 1);
+
+   return 0;
+}
+
 static int a780_camera_init(void)
 {
    int err;
@@ -731,73 +767,36 @@ static int a780_camera_init(void)
     * GPIO50_nCAM_EN is active low
     * GPIO19_GEN1_CAM_RST is active on rising edge
     */
-   err = gpio_request(GPIO50_nCAM_EN, "nCAM_EN");
-   if (err) {
-       pr_err("%s: Failed to request nCAM_EN\n", __func__);
-       goto fail;
-   }
-
    err = gpio_request(GPIO19_GEN1_CAM_RST, "CAM_RST");
    if (err) {
        pr_err("%s: Failed to request CAM_RST\n", __func__);
-       goto fail_gpio_cam_rst;
+       return err;
    }
 
-   gpio_direction_output(GPIO50_nCAM_EN, 1);
    gpio_direction_output(GPIO19_GEN1_CAM_RST, 0);
-
-   return 0;
-
-fail_gpio_cam_rst:
-   gpio_free(GPIO50_nCAM_EN);
-fail:
-   return err;
-}
-
-static int a780_camera_power(struct device *dev, int on)
-{
-   gpio_set_value(GPIO50_nCAM_EN, !on);
-   return 0;
-}
-
-static int a780_camera_reset(struct device *dev)
-{
-   gpio_set_value(GPIO19_GEN1_CAM_RST, 0);
-   msleep(10);
-   gpio_set_value(GPIO19_GEN1_CAM_RST, 1);
+   a780_camera_reset(NULL);
 
return 0;
 }
 
 struct pxacamera_platform_data a780_pxacamera_platform_data = {
    .flags  = PXA_CAMERA_MASTER | PXA_CAMERA_DATAWIDTH_8 |
-       PXA_CAMERA_PCLK_EN | PXA_CAMERA_MCLK_EN,
+       PXA_CAMERA_PCLK_EN | PXA_CAMERA_MCLK_EN |
+       PXA_CAMERA_PCP,
    .mclk_10khz = 5000,
+   .sensor_i2c_adapter_id = 0,
+   .sensor_i2c_address = 0x5d,
 };
 
-static struct i2c_board_info a780_camera_i2c_board_info = {
-   I2C_BOARD_INFO("mt9m111", 0x5d),
-};
-
-static struct soc_camera_link a780_iclink = {
-   .bus_id         = 0,
-   .flags          = SOCAM_SENSOR_INVERT_PCLK,
-   .i2c_adapter_id = 0,
-   .board_info     = &a780_camera_i2c_board_info,
-   .power          = a780_camera_power,
-   .reset          = a780_camera_reset,
-};
-
-static struct platform_device a780_camera = {
-   .name   = "soc-camera-pdrv",
-   .id     = 0,
-   .dev    = {
-       .platform_data = &a780_iclink,
+static struct i2c_board_info a780_i2c_board_info[] = {
+   {
+       I2C_BOARD_INFO("mt9m111", 0x5d),
    },
 };
 
 static struct platform_device *a780_devices[] __initdata = {
    &a780_gpio_keys,
+   &camera_supply_dummy_device,
 };
 
 static void __init a780_init(void)
@@ -811,19 +810,19 @@ static void __init a780_init(void)
pxa_set_stuart_info(NULL);
 
pxa_set_i2c_info(NULL);
+   i2c_register_board_info(0, ARRAY_AND_SIZE(a780_i2c_board_info));
 
pxa_set_fb_info(NULL, _fb_info_1);
 
pxa_set_keypad_info(_keypad_platform_data);
 
-   if (a780_camera_init() == 0) {
+   if (a780_camera_init() == 0)
        pxa_set_camera_info(&a780_pxacamera_platform_data);
-       platform_device_register(&a780_camera);
-   }
 
pwm_add_table(ezx_pwm_lookup, ARRAY_SIZE(ezx_pwm_lookup));
platform_add_devices(ARRAY_AND_SIZE(ezx_devices));
platform_add_devices(ARRAY_AND_SIZE(a780_devices));
+   regulator_has_full_constraints();
 }
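
The regulator consumer entry REGULATOR_SUPPLY("vdd", "0-005d") refers to the sensor by its I2C device name, which the I2C core builds from the adapter number and the client address using the "%d-%04x" pattern. A small sketch of that naming follows; i2c_dev_name() and supply_matches() are hypothetical helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format "<adapter>-<address as 4 hex digits>"; returns snprintf()'s result. */
static int i2c_dev_name(char *buf, size_t len, int adapter_id, unsigned int addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d-%04x", adapter_id, addr);
}

/* True if a regulator supply's device name refers to the given I2C client. */
static bool supply_matches(const char *supply_dev_name, int adapter_id,
                           unsigned int addr)
{
    char name[16];

    i2c_dev_name(name, sizeof(name), adapter_id, addr);
    return strcmp(name, supply_dev_name) == 0;
}
```

Adapter 0 and address 0x5d give "0-005d", which lines up with .sensor_i2c_adapter_id = 0 and .sensor_i2c_address = 0x5d in the platform data of this patch.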
 
 

[PATCH 3/3] ARM: pxa: em-x270: use the new pxa_camera platform_data

2016-09-23 Thread Robert Jarzmik
pxa_camera has transitioned from a soc_camera driver to a standalone
v4l2 driver. Amend the device declaration accordingly.

Signed-off-by: Robert Jarzmik 
---
 arch/arm/mach-pxa/em-x270.c | 85 ++---
 1 file changed, 27 insertions(+), 58 deletions(-)

diff --git a/arch/arm/mach-pxa/em-x270.c b/arch/arm/mach-pxa/em-x270.c
index 03354c21e1f2..2d7762ab1d70 100644
--- a/arch/arm/mach-pxa/em-x270.c
+++ b/arch/arm/mach-pxa/em-x270.c
@@ -34,8 +34,6 @@
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 
@@ -969,81 +967,52 @@ static int em_x270_sensor_init(void)
        return ret;
 
    gpio_direction_output(cam_reset, 0);
-
-   em_x270_camera_ldo = regulator_get(NULL, "vcc cam");
-   if (em_x270_camera_ldo == NULL) {
-       gpio_free(cam_reset);
-       return -ENODEV;
-   }
-
-   ret = regulator_enable(em_x270_camera_ldo);
-   if (ret) {
-       regulator_put(em_x270_camera_ldo);
-       gpio_free(cam_reset);
-       return ret;
-   }
-
    gpio_set_value(cam_reset, 1);
 
    return 0;
 }
 
-struct pxacamera_platform_data em_x270_camera_platform_data = {
-   .flags  = PXA_CAMERA_MASTER | PXA_CAMERA_DATAWIDTH_8 |
-   PXA_CAMERA_PCLK_EN | PXA_CAMERA_MCLK_EN,
-   .mclk_10khz = 2600,
+static struct regulator_consumer_supply camera_dummy_supplies[] = {
+   REGULATOR_SUPPLY("vdd", "0-005d"),
 };
 
-static int em_x270_sensor_power(struct device *dev, int on)
-{
-   int ret;
-   int is_on = regulator_is_enabled(em_x270_camera_ldo);
-
-   if (on == is_on)
-       return 0;
-
-   gpio_set_value(cam_reset, !on);
-
-   if (on)
-       ret = regulator_enable(em_x270_camera_ldo);
-   else
-       ret = regulator_disable(em_x270_camera_ldo);
-
-   if (ret)
-       return ret;
-
-   gpio_set_value(cam_reset, on);
-
-   return 0;
-}
-
-static struct i2c_board_info em_x270_i2c_cam_info[] = {
-   {
-       I2C_BOARD_INFO("mt9m111", 0x48),
+static struct regulator_init_data camera_dummy_initdata = {
+   .consumer_supplies = camera_dummy_supplies,
+   .num_consumer_supplies = ARRAY_SIZE(camera_dummy_supplies),
+   .constraints = {
+       .valid_ops_mask = REGULATOR_CHANGE_STATUS,
    },
 };
 
-static struct soc_camera_link iclink = {
-   .bus_id         = 0,
-   .power          = em_x270_sensor_power,
-   .board_info     = &em_x270_i2c_cam_info[0],
-   .i2c_adapter_id = 0,
+static struct fixed_voltage_config camera_dummy_config = {
+   .supply_name    = "camera_vdd",
+   .input_supply   = "vcc cam",
+   .microvolts     = 2800000,
+   .gpio           = GPIO56_MT9M111_nOE,
+   .enable_high    = 0,
+   .init_data      = &camera_dummy_initdata,
 };
 
-static struct platform_device em_x270_camera = {
-   .name   = "soc-camera-pdrv",
-   .id     = -1,
+static struct platform_device camera_supply_dummy_device = {
+   .name   = "reg-fixed-voltage",
+   .id     = 1,
    .dev    = {
-       .platform_data = &iclink,
+       .platform_data = &camera_dummy_config,
    },
 };
 
+struct pxacamera_platform_data em_x270_camera_platform_data = {
+   .flags  = PXA_CAMERA_MASTER | PXA_CAMERA_DATAWIDTH_8 |
+   PXA_CAMERA_PCLK_EN | PXA_CAMERA_MCLK_EN,
+   .mclk_10khz = 2600,
+   .sensor_i2c_adapter_id = 0,
+   .sensor_i2c_address = 0x5d,
+};
+
 static void  __init em_x270_init_camera(void)
 {
-   if (em_x270_sensor_init() == 0) {
+   if (em_x270_sensor_init() == 0)
        pxa_set_camera_info(&em_x270_camera_platform_data);
-       platform_device_register(&em_x270_camera);
-   }
 }
 #else
 static inline void em_x270_init_camera(void) {}
-- 
2.1.4
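
Note on the patch above: the regulator consumer entry
REGULATOR_SUPPLY("vdd", "0-005d") appears to name the sensor by its I2C
device name, which the I2C core builds from the adapter number and the
client address in "%d-%04x" form. The sketch below is plain userspace C,
not kernel code, and sketch_i2c_dev_name() is a hypothetical helper, but
it shows why .sensor_i2c_adapter_id = 0 and .sensor_i2c_address = 0x5d
would line up with "0-005d".

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): format an I2C client device
 * name as "<adapter number>-<address as 4 hex digits>".
 */
static int sketch_i2c_dev_name(char *buf, size_t len,
			       int adapter_id, unsigned int addr)
{
	return snprintf(buf, len, "%d-%04x", adapter_id, addr);
}
```

If the adapter id or the address in the platform data changed, the
string in the consumer supply table would presumably have to change with
it, otherwise the sensor would not find its "vdd" supply.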



Re: [PATCH] vme: fake: mark symbols static where possible

2016-09-23 Thread Martyn Welch
On 23 September 2016 at 14:38, Baoyou Xie  wrote:
> We get 4 warnings when building kernel with W=1:
> drivers/vme/bridges/vme_fake.c:384:6: warning: no previous prototype for 
> 'fake_lm_check' [-Wmissing-prototypes]
> drivers/vme/bridges/vme_fake.c:619:6: warning: no previous prototype for 
> 'fake_vmewrite8' [-Wmissing-prototypes]
> drivers/vme/bridges/vme_fake.c:649:6: warning: no previous prototype for 
> 'fake_vmewrite16' [-Wmissing-prototypes]
> drivers/vme/bridges/vme_fake.c:679:6: warning: no previous prototype for 
> 'fake_vmewrite32' [-Wmissing-prototypes]
>
> In fact, these functions are only used in the file in which they are
> declared and don't need a declaration, but can be made static.
> so this patch marks these functions with 'static'.
>
> Signed-off-by: Baoyou Xie 

Acked-by: Martyn Welch 

> ---
>  drivers/vme/bridges/vme_fake.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vme/bridges/vme_fake.c b/drivers/vme/bridges/vme_fake.c
> index ebf35d3..29ac74f 100644
> --- a/drivers/vme/bridges/vme_fake.c
> +++ b/drivers/vme/bridges/vme_fake.c
> @@ -381,8 +381,8 @@ static int fake_master_get(struct vme_master_resource 
> *image, int *enabled,
>  }
>
>
> -void fake_lm_check(struct fake_driver *bridge, unsigned long long addr,
> -   u32 aspace, u32 cycle)
> +static void fake_lm_check(struct fake_driver *bridge, unsigned long long 
> addr,
> + u32 aspace, u32 cycle)
>  {
> struct vme_bridge *fake_bridge;
> unsigned long long lm_base;
> @@ -616,8 +616,8 @@ static ssize_t fake_master_read(struct 
> vme_master_resource *image, void *buf,
> return retval;
>  }
>
> -void fake_vmewrite8(struct fake_driver *bridge, u8 *buf,
> -unsigned long long addr, u32 aspace, u32 cycle)
> +static void fake_vmewrite8(struct fake_driver *bridge, u8 *buf,
> +  unsigned long long addr, u32 aspace, u32 cycle)
>  {
> int i;
> unsigned long long start, end, offset;
> @@ -646,8 +646,8 @@ void fake_vmewrite8(struct fake_driver *bridge, u8 *buf,
>
>  }
>
> -void fake_vmewrite16(struct fake_driver *bridge, u16 *buf,
> -   unsigned long long addr, u32 aspace, u32 cycle)
> +static void fake_vmewrite16(struct fake_driver *bridge, u16 *buf,
> +   unsigned long long addr, u32 aspace, u32 cycle)
>  {
> int i;
> unsigned long long start, end, offset;
> @@ -676,8 +676,8 @@ void fake_vmewrite16(struct fake_driver *bridge, u16 *buf,
>
>  }
>
> -void fake_vmewrite32(struct fake_driver *bridge, u32 *buf,
> -   unsigned long long addr, u32 aspace, u32 cycle)
> +static void fake_vmewrite32(struct fake_driver *bridge, u32 *buf,
> +   unsigned long long addr, u32 aspace, u32 cycle)
>  {
> int i;
> unsigned long long start, end, offset;
> --
> 2.7.4
>
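
Note on the patch above: a minimal sketch of the two usual ways to keep
-Wmissing-prototypes quiet, assuming a plain C translation unit. The
function names are hypothetical and unrelated to vme_fake.c. A helper
used only in its own file is marked static; a function meant to be
called from elsewhere gets a prototype first (normally from a header).

```c
/*
 * File-local helper: 'static' gives it internal linkage, so the compiler
 * does not expect a previous prototype.
 */
static unsigned int sketch_sum_bytes(const unsigned char *buf,
				     unsigned int len)
{
	unsigned int i, sum = 0;

	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Exported function: the prototype would normally live in a header. */
unsigned int sketch_checksum(const unsigned char *buf, unsigned int len);

unsigned int sketch_checksum(const unsigned char *buf, unsigned int len)
{
	/* Keep only the low 8 bits of the byte sum. */
	return sketch_sum_bytes(buf, len) & 0xffu;
}
```

Making the helper static also lets the compiler inline it or warn when
it becomes unused, which is a side benefit of the change.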


[PATCH 1/3] ARM: pxa: mioa701: use the new pxa_camera platform_data

2016-09-23 Thread Robert Jarzmik
pxa_camera has transitioned from a soc_camera driver to a standalone
v4l2 driver. Amend the device declaration accordingly.

Signed-off-by: Robert Jarzmik 
---
 arch/arm/mach-pxa/mioa701.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c
index 38a96a193dc4..8a5d0491e73c 100644
--- a/arch/arm/mach-pxa/mioa701.c
+++ b/arch/arm/mach-pxa/mioa701.c
@@ -57,7 +57,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mioa701.h"
 
@@ -627,6 +626,8 @@ struct pxacamera_platform_data 
mioa701_pxacamera_platform_data = {
.flags  = PXA_CAMERA_MASTER | PXA_CAMERA_DATAWIDTH_8 |
PXA_CAMERA_PCLK_EN | PXA_CAMERA_MCLK_EN,
.mclk_10khz = 5000,
+   .sensor_i2c_adapter_id = 0,
+   .sensor_i2c_address = 0x5d,
 };
 
 static struct i2c_board_info __initdata mioa701_pi2c_devices[] = {
@@ -643,12 +644,6 @@ static struct i2c_board_info mioa701_i2c_devices[] = {
},
 };
 
-static struct soc_camera_link iclink = {
-   .bus_id = 0, /* Match id in pxa27x_device_camera in device.c */
-   .board_info = &mioa701_i2c_devices[0],
-   .i2c_adapter_id = 0,
-};
-
 struct i2c_pxa_platform_data i2c_pdata = {
.fast_mode = 1,
 };
@@ -684,7 +679,6 @@ MIO_SIMPLE_DEV(mioa701_sound, "mioa701-wm9713", 
NULL)
 MIO_SIMPLE_DEV(mioa701_board,"mioa701-board",  NULL)
 MIO_SIMPLE_DEV(wm9713_acodec,"wm9713-codec",   NULL);
 MIO_SIMPLE_DEV(gpio_vbus,"gpio-vbus",  &gpio_vbus_data);
-MIO_SIMPLE_DEV(mioa701_camera,   "soc-camera-pdrv", &iclink);
 
 static struct platform_device *devices[] __initdata = {
&mioa701_gpio_keys,
@@ -696,7 +690,6 @@ static struct platform_device *devices[] __initdata = {
_dev,
,
&gpio_vbus,
-   &mioa701_camera,
&mioa701_board,
 };
 
@@ -761,6 +754,7 @@ static void __init mioa701_machine_init(void)
platform_add_devices(devices, ARRAY_SIZE(devices));
gsm_init();
 
+   i2c_register_board_info(0, ARRAY_AND_SIZE(mioa701_i2c_devices));
i2c_register_board_info(1, ARRAY_AND_SIZE(mioa701_pi2c_devices));
pxa_set_i2c_info(&i2c_pdata);
pxa27x_set_i2c_power_info(NULL);
@@ -769,6 +763,7 @@ static void __init mioa701_machine_init(void)
regulator_register_always_on(0, "fixed-5.0V", fixed_5v0_consumers,
 ARRAY_SIZE(fixed_5v0_consumers),
 5000000);
+   regulator_has_full_constraints();
 }
 
 static void mioa701_machine_exit(void)
-- 
2.1.4
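
Note on the patch above: the .mclk_10khz field name suggests the master
clock is given in units of 10 kHz, so 5000 would mean 50 MHz. The helper
below is a hypothetical userspace sketch of that conversion under that
assumption, not the driver's code.

```c
/*
 * Hypothetical helper: convert a master clock given in units of 10 kHz
 * (as the field name mclk_10khz suggests) to Hz.
 */
static unsigned long sketch_mclk_hz(unsigned long mclk_10khz)
{
	return mclk_10khz * 10000UL;
}
```

Under the same assumption, a value of 2600 would correspond to 26 MHz.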



Re: [PATCH 0/2] Ajust lockdep static allocations

2016-09-23 Thread Babu Moger


On 9/23/2016 3:17 PM, Peter Zijlstra wrote:
> On Fri, Sep 23, 2016 at 02:57:39PM -0500, Babu Moger wrote:
>>>> We checked again. Yes, It goes in .bss section. But in sparc we have
>>>> to fit .text, .data, .bss in 7 permanent TLBs(that is totally 28MB).
>>>> It was fine so far.  But the commit 1413c0389333 ("lockdep: Increase
>>>> static allocations") added extra 4MB which makes it go beyond 28MB.
>>>> That is causing system boot up problems in sparc.
>>>
>>> *sigh*, why didn't you start with that :/
>>
>> Yes.  We know it.  This is a limitation. Changing this limit in our
>> hardware is a much bigger change which we cannot address right away.
>> So, we are trying to come up with a solution which can work for all. I
>> will re-post the patches with  CONFIG_BASE_SMALL option if there is no
>> objections.
>
> OK, so double check BASE_SMALL doesn't imply other things you cannot
> live with, Sparc64 isn't a dinky system. If BASE_SMALL works for you
> then good, otherwise do a PROVE_LOCKING_SMALL symbol that is not user
> selectable and have SPARC select that. Use the invisible Help for that
> symbol to explain all this again.

Thanks. Will work on it.

>>>> CCing David Miller and Rob Gardner. They might be able to explain
>>>> more if you have any more questions.
>>>
>>> Nah, I think I remember enough of how the Sparc MMU works to see reason.



Re: [RFC PATCH 1/6] perf: Move mlock accounting to ring buffer allocation

2016-09-23 Thread Peter Zijlstra
On Fri, Sep 23, 2016 at 10:26:15AM -0700, Andi Kleen wrote:
> > Afaict there's no actual need to hide the AUX buffer for this sampling
> > stuff; the user knows about all this and can simply mmap() the AUX part.
> > The sample could either point to locations in the AUX buffer, or (as I
> > think this code does) memcpy bits out.
> 
> This would work for perf, but not for the core dump case below.
> 
> > Ideally we'd pass the AUX-event into the syscall, that way you avoid all
> > the find_aux_event crud. I'm not sure we want to overload the group_fd
> > thing more (its already very hard to create counter groups in a cgroup
> > for example) ..
> > 
> > Coredump was mentioned somewhere, but I'm not sure I've seen
> > code/interfaces for that. How was that envisioned to work?
> 
> The idea was to have a rlimit that enables PT running as a ring buffer
> in the background.  If something crashes the ring buffer is dumped
> as part of the core dump, and then gdb can tell you how you crashed.
> This extends what gdb already does explicitly today using perf
> API calls.

Well, we could 'force' inject a VMA into the process's address space, we
do that for a few other things as well. It also makes for less
exceptions with the actual core dumping.

But the worry I have is the total amount of pinned memory. If you want
to inherit this on fork(), as is a reasonable expectation, then it's
possible to quickly exceed the total amount of pinnable memory.

At which point we _should_ start failing fork(), which is a somewhat
unexpected, and undesirable side-effect.

Ideally we'd unpin the old buffers and repin the new buffers on context
switch, but that's impossible since faulting needs scheduling,
recursion, we lose.

I really want to see something sensible before we go do that.


[PATCH] nfp: bpf: improve handling for disabled BPF syscall

2016-09-23 Thread Arnd Bergmann
I stumbled over a new warning during randconfig testing,
with CONFIG_BPF_SYSCALL disabled:

drivers/net/ethernet/netronome/nfp/nfp_net_offload.c: In function 
'nfp_net_bpf_offload':
drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: '*((void 
*)+4)' may be used uninitialized in this function 
[-Werror=maybe-uninitialized]
drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: 
'res.n_instr' may be used uninitialized in this function 
[-Werror=maybe-uninitialized]

As far as I can tell, this is a false positive caused by the compiler
getting confused about a function that is partially inlined, but it's
easy to avoid while improving the code:

The nfp_bpf_jit() stub helper for that configuration is unusual as it
is defined in a header file but not marked 'static inline'. By moving
the compile-time check into the caller using the IS_ENABLED() macro,
we can remove that stub and simplify the nfp_net_bpf_offload_prepare()
function enough to unconfuse the compiler.

Fixes: 7533fdc0f77f ("nfp: bpf: add hardware bpf offload")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h | 10 --
 drivers/net/ethernet/netronome/nfp/nfp_net_offload.c |  3 +++
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h 
b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
index fc220cd04115..87aa8a3e9112 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -192,20 +192,10 @@ struct nfp_bpf_result {
bool dense_mode;
 };
 
-#ifdef CONFIG_BPF_SYSCALL
 int
 nfp_bpf_jit(struct bpf_prog *filter, void *prog, enum nfp_bpf_action_type act,
unsigned int prog_start, unsigned int prog_done,
unsigned int prog_sz, struct nfp_bpf_result *res);
-#else
-int
-nfp_bpf_jit(struct bpf_prog *filter, void *prog, enum nfp_bpf_action_type act,
-   unsigned int prog_start, unsigned int prog_done,
-   unsigned int prog_sz, struct nfp_bpf_result *res)
-{
-   return -ENOTSUPP;
-}
-#endif
 
 int nfp_prog_verify(struct nfp_prog *nfp_prog, struct bpf_prog *prog);
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
index 43f42f842eda..8acfb631a0ea 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -148,6 +148,9 @@ nfp_net_bpf_offload_prepare(struct nfp_net *nn,
unsigned int max_mtu;
int ret;
 
+   if (!IS_ENABLED(CONFIG_BPF_SYSCALL))
+   return -ENOTSUPP;
+
ret = nfp_net_bpf_get_act(nn, cls_bpf);
if (ret < 0)
return ret;
-- 
2.9.0
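
Note on the patch above: a minimal userspace sketch of the pattern it
describes, assuming a compile-time constant in place of the kernel's
IS_ENABLED(CONFIG_BPF_SYSCALL). All SKETCH_ and sketch_ names are
hypothetical. The point is that the caller bails out early with an
ordinary C condition, so no #ifdef stub of the JIT function is needed
and the compiler still type-checks the guarded path.

```c
/* Hypothetical stand-in for IS_ENABLED(CONFIG_BPF_SYSCALL); 0 = disabled. */
#ifndef SKETCH_BPF_SYSCALL
#define SKETCH_BPF_SYSCALL 0
#endif

/* The kernel-internal ENOTSUPP value, copied here for the sketch. */
#define SKETCH_ENOTSUPP 524

struct sketch_result {
	unsigned int n_instr;
};

/* Hypothetical JIT step: a single definition, no config-dependent stub. */
static int sketch_jit(struct sketch_result *res)
{
	res->n_instr = 42;
	return 0;
}

/* The caller performs the compile-time check before touching 'res'. */
static int sketch_offload_prepare(struct sketch_result *res)
{
	if (!SKETCH_BPF_SYSCALL)
		return -SKETCH_ENOTSUPP;

	return sketch_jit(res);
}
```

Because the condition is a constant, the compiler can drop the dead
branch entirely while still parsing it, which is the usual argument for
this style over #ifdef blocks.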



[RFC 0/2] Add device tree property and quirk for supporting sdhci

2016-09-23 Thread Zach Brown
Some board configurations can not support sd highspeed mode due to the distance
between the card slot and the controller. The card and controller report that
they are capable of highspeed however, so we need a mechanism for specifying
that the setup is incapable of supporting highspeed mode.

The first patch adds documentation about a new devicetree property
sd-broken-highspeed.

The second patch keeps the sd controller and card from going into highspeed
mode when the property is set.

Chen Yee Chew (1):
  sdhci: Prevent SD from doing high-speed timing when
sd-broken-highspeed property is set

Zach Brown (1):
  sdhci: Add device tree property sd-broken-highspeed

 Documentation/devicetree/bindings/mmc/mmc.txt | 2 ++
 arch/arm/boot/dts/ni-77D5.dts | 1 -
 drivers/mmc/host/sdhci-pltfm.c| 3 +++
 drivers/mmc/host/sdhci.c  | 3 ++-
 drivers/mmc/host/sdhci.h  | 2 ++
 5 files changed, 9 insertions(+), 2 deletions(-)

-- 
2.7.4



[RFC 1/2] sdhci: Add device tree property sd-broken-highspeed

2016-09-23 Thread Zach Brown
Certain board configurations can make highspeed malfunction due to
timing issues. In these cases a way is needed to force the controller
and card into standard speed even if they otherwise appear to be capable
of highspeed.

The sd-broken-highspeed property will let the sdhci driver know that
highspeed will not work.

Signed-off-by: Zach Brown 
---
 Documentation/devicetree/bindings/mmc/mmc.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/mmc/mmc.txt 
b/Documentation/devicetree/bindings/mmc/mmc.txt
index 8a37782..59332ea 100644
--- a/Documentation/devicetree/bindings/mmc/mmc.txt
+++ b/Documentation/devicetree/bindings/mmc/mmc.txt
@@ -52,6 +52,8 @@ Optional properties:
 - no-sdio: controller is limited to send sdio cmd during initialization
 - no-sd: controller is limited to send sd cmd during initialization
 - no-mmc: controller is limited to send mmc cmd during initialization
+- sd-broken-highspeed: Highspeed is broken, even if the controller and card
+  themselves claim they support highspeed.
 
 *NOTE* on CD and WP polarity. To use common for all SD/MMC host controllers 
line
 polarity properties, we have to fix the meaning of the "normal" and "inverted"
-- 
2.7.4



[RFC 2/2] sdhci: Prevent SD from doing highspeed timing when sd-broken-highspeed property is set

2016-09-23 Thread Zach Brown
From: Chen Yee Chew 

When the sd-broken-highspeed property is set the sdhci driver will not
go into highspeed mode even if the controller and card appear to
otherwise support highspeed mode.

This is useful in cases where the controller and card support highspeed,
but the board configuration or some other issue makes highspeed
impossible.

Signed-off-by: Chen Yee Chew 
Reviewed-by: Keng Soon Cheah 
Reviewed-by: Joe Hershberger 
Signed-off-by: Zach Brown 
---
 drivers/mmc/host/sdhci-pltfm.c | 3 +++
 drivers/mmc/host/sdhci.c   | 3 ++-
 drivers/mmc/host/sdhci.h   | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-pltfm.c b/drivers/mmc/host/sdhci-pltfm.c
index ad49bfa..7482706 100644
--- a/drivers/mmc/host/sdhci-pltfm.c
+++ b/drivers/mmc/host/sdhci-pltfm.c
@@ -87,6 +87,9 @@ void sdhci_get_of_property(struct platform_device *pdev)
if (of_get_property(np, "broken-cd", NULL))
host->quirks |= SDHCI_QUIRK_BROKEN_CARD_DETECTION;
 
+   if (of_get_property(np, "sd-broken-highspeed", NULL))
+   host->quirks2 |= SDHCI_QUIRK2_BROKEN_HISPD;
+
if (of_get_property(np, "no-1-8-v", NULL))
host->quirks2 |= SDHCI_QUIRK2_NO_1_8_V;
 
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 4805566..4b0969c 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -3274,7 +3274,8 @@ int sdhci_setup_host(struct sdhci_host *host)
if (host->quirks2 & SDHCI_QUIRK2_HOST_NO_CMD23)
mmc->caps &= ~MMC_CAP_CMD23;
 
-   if (host->caps & SDHCI_CAN_DO_HISPD)
+   if ((host->caps & SDHCI_CAN_DO_HISPD) &&
+   !(host->quirks2 & SDHCI_QUIRK2_BROKEN_HISPD))
mmc->caps |= MMC_CAP_SD_HIGHSPEED | MMC_CAP_MMC_HIGHSPEED;
 
if ((host->quirks & SDHCI_QUIRK_BROKEN_CARD_DETECTION) &&
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index c722cd2..3d0fdda 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -424,6 +424,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_ACMD23_BROKEN (1<<14)
 /* Broken Clock divider zero in controller */
 #define SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN (1<<15)
+/* Highspeed is broken even if it appears otherwise */
+#define SDHCI_QUIRK2_BROKEN_HISPD  (1<<16)
 
int irq;/* Device IRQ */
void __iomem *ioaddr;   /* Mapped address */
-- 
2.7.4
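
Note on the patch above: a userspace sketch of the capability masking in
the sdhci.c hunk. The bit values are illustrative only, except that
SKETCH_QUIRK2_BROKEN_HISPD mirrors the (1<<16) used in the patch, and
sketch_highspeed_caps() is a hypothetical helper. Highspeed is
advertised only when the controller reports the capability and the quirk
is not set.

```c
#include <stdint.h>

/* Illustrative bit values, not the real SDHCI/MMC register definitions. */
#define SKETCH_CAN_DO_HISPD		(1u << 0)
#define SKETCH_QUIRK2_BROKEN_HISPD	(1u << 16)
#define SKETCH_CAP_SD_HIGHSPEED		(1u << 0)
#define SKETCH_CAP_MMC_HIGHSPEED	(1u << 1)

/* Return the highspeed capability bits the host would advertise. */
static uint32_t sketch_highspeed_caps(uint32_t host_caps, uint32_t quirks2)
{
	uint32_t mmc_caps = 0;

	if ((host_caps & SKETCH_CAN_DO_HISPD) &&
	    !(quirks2 & SKETCH_QUIRK2_BROKEN_HISPD))
		mmc_caps |= SKETCH_CAP_SD_HIGHSPEED | SKETCH_CAP_MMC_HIGHSPEED;

	return mmc_caps;
}
```

With the quirk set, the result is 0 even for a capable controller, which
matches the intent stated in the commit message.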



[RFC 2/2] sdhci: Prevent SD from doing highspeed timing when sd-broken-highspeed property is set

2016-09-23 Thread Zach Brown
From: Chen Yee Chew 

When the sd-broken-highspeed property is set the sdhci driver will not
go into highspeed mode even if the controller and card appear to
otherwise support highspeed mode.

This is useful in cases where the controller and card support highspeed,
but the board configuration or some other issue make highspeed
impossible.

Signed-off-by: Chen Yee Chew 
Reviewed-by: Keng Soon Cheah 
Reviewed-by: Joe Hershberger 
Signed-off-by: Zach Brown 
---
 drivers/mmc/host/sdhci-pltfm.c | 3 +++
 drivers/mmc/host/sdhci.c   | 3 ++-
 drivers/mmc/host/sdhci.h   | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-pltfm.c b/drivers/mmc/host/sdhci-pltfm.c
index ad49bfa..7482706 100644
--- a/drivers/mmc/host/sdhci-pltfm.c
+++ b/drivers/mmc/host/sdhci-pltfm.c
@@ -87,6 +87,9 @@ void sdhci_get_of_property(struct platform_device *pdev)
if (of_get_property(np, "broken-cd", NULL))
host->quirks |= SDHCI_QUIRK_BROKEN_CARD_DETECTION;
 
+   if (of_get_property(np, "sd-broken-highspeed", NULL))
+   host->quirks2 |= SDHCI_QUIRK2_BROKEN_HISPD;
+
if (of_get_property(np, "no-1-8-v", NULL))
host->quirks2 |= SDHCI_QUIRK2_NO_1_8_V;
 
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 4805566..4b0969c 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -3274,7 +3274,8 @@ int sdhci_setup_host(struct sdhci_host *host)
if (host->quirks2 & SDHCI_QUIRK2_HOST_NO_CMD23)
mmc->caps &= ~MMC_CAP_CMD23;
 
-   if (host->caps & SDHCI_CAN_DO_HISPD)
+   if ((host->caps & SDHCI_CAN_DO_HISPD) &&
+   !(host->quirks2 & SDHCI_QUIRK2_BROKEN_HISPD))
mmc->caps |= MMC_CAP_SD_HIGHSPEED | MMC_CAP_MMC_HIGHSPEED;
 
if ((host->quirks & SDHCI_QUIRK_BROKEN_CARD_DETECTION) &&
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index c722cd2..3d0fdda 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -424,6 +424,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_ACMD23_BROKEN (1<<14)
 /* Broken Clock divider zero in controller */
 #define SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN (1<<15)
+/* Highspeed is broken even if it appears otherwise */
+#define SDHCI_QUIRK2_BROKEN_HISPD  (1<<16)
 
int irq;/* Device IRQ */
void __iomem *ioaddr;   /* Mapped address */
-- 
2.7.4
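
For readers following the series, the gating that the sdhci.c hunk introduces can be sketched in plain userspace C. This is only an illustration, not the driver code: the sketch_* names and the bit values chosen for the capability flags are assumptions made up for the example; only the (1<<16) quirk bit is taken from the patch above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's flags; values are illustrative. */
#define SKETCH_CAN_DO_HISPD        (1u << 21) /* controller claims highspeed */
#define SKETCH_QUIRK2_BROKEN_HISPD (1u << 16) /* same bit as in the patch */
#define SKETCH_CAP_SD_HIGHSPEED    (1u << 0)
#define SKETCH_CAP_MMC_HIGHSPEED   (1u << 1)

/*
 * Return the highspeed capability bits to advertise. Highspeed is only
 * advertised when the controller claims it AND the quirk is not set.
 */
static uint32_t sketch_highspeed_caps(uint32_t host_caps, uint32_t quirks2)
{
	if ((host_caps & SKETCH_CAN_DO_HISPD) &&
	    !(quirks2 & SKETCH_QUIRK2_BROKEN_HISPD))
		return SKETCH_CAP_SD_HIGHSPEED | SKETCH_CAP_MMC_HIGHSPEED;

	return 0;
}
```

A board that sets the quirk gets no highspeed bits back even when the controller claims support, which is the behaviour the commit message describes.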



Re: [PATCH v2] signals: Avoid unnecessary taking of sighand->siglock

2016-09-23 Thread Stas Sergeev

23.09.2016 19:56, Waiman Long wrote:

> When running certain database workload on a high-end system with many
> CPUs, it was found that spinlock contention in the sigprocmask syscalls
> became a significant portion of the overall CPU cycles as shown below.

Hi, I was recently facing the same problem, and my solution
was to extract swapcontext() from libtask - it has better semantics
and does not do sigprocmask. However much you hack sigprocmask,
it is still faster to just not call it at all.
Alternatively, perhaps the speed-up can be achieved if the
current mask is exported to glibc via vdso.
Just my 2 cents.


Re: [PATCH 0/2] Ajust lockdep static allocations

2016-09-23 Thread Peter Zijlstra
On Fri, Sep 23, 2016 at 02:57:39PM -0500, Babu Moger wrote:

>  We checked again. Yes, It goes in .bss section. But in sparc we have
>  to fit .text, .data, .bss in 7 permanent TLBs(that is totally 28MB).
>  It was fine so far.  But the commit 1413c0389333 ("lockdep: Increase
>  static allocations") added extra 4MB which makes it go beyond 28MB.
>  That is causing system boot up problems in sparc. 

*sigh*, why didn't you start with that :/

> Yes.  We know it.  This is a limitation. Changing this limit in our
> hardware is a much bigger change which we cannot address right away.
> So, we are trying to come up with a solution which can work for all. I
> will re-post the patches with  CONFIG_BASE_SMALL option if there is no
> objections.

OK, so double check BASE_SMALL doesn't imply other things you cannot
live with, Sparc64 isn't a dinky system. If BASE_SMALL works for you
then good, otherwise do a PROVE_LOCKING_SMALL symbol that is not user
selectable and have SPARC select that. Use the invisible Help for that
symbol to explain all this again.

>  CCing David Miller and Rob Gardner. They might be able to explain
>  more if you have any more questions.

Nah, I think I remember enough of how the Sparc MMU works to see reason.


RE: [PATCH 2/2] radix-tree: Fix optimisation problem

2016-09-23 Thread Matthew Wilcox
From: linus...@gmail.com [mailto:linus...@gmail.com] On Behalf Of Linus Torvalds
> On Thu, Sep 22, 2016 at 11:53 AM, Matthew Wilcox
>  wrote:
> >
> >   Change the test suite to compile with -O2, and
> > fix the optimisation problem by passing 'entry' through entry_to_node()
> > so gcc knows this isn't a plain pointer.
> 
> Ugh. I really don't like this patch very much.
> 
> Wouldn't it be cleaner to just fix "get_slot_offset()" instead? As it
> is, looking at the code, I suspect that it's really hard to convince
> people that there isn't some other place this might happen. Because
> the "pointer subtraction followed by pointer addition" pattern is all
> hidden in these inline functions.
> 
> Or at least add a big comment about why this is the only such case.
> 
> Because without that, the code now looks very bad.

That's fair.  I looked at all the other callers of get_slot_offset, and all the 
others are using a real slot pointer.  radix_tree_descend() really is the 
outlier here.  I think the real problem is that the types in the tree are 
wrong; instead of storing void *, we should be storing uintptr_t.  But fixing 
that is a little beyond the scope of -rc8.  Here's a slightly better version 
which asserts that the passed pointer really is a pointer.

(attached as well, I have no idea whether this patch will get mangled)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 1b7bf73..368f641 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -91,9 +91,15 @@ static inline bool is_sibling_entry(struct radix_tree_node 
*parent, void *node)
 }
 #endif
 
+/*
+ * The slot pointer must be a real pointer as GCC will optimise
+ * through inlined functions and may deduce that
+ * parent->slots + get_slot_offset(parent, slot) == slot
+ */
 static inline unsigned long get_slot_offset(struct radix_tree_node *parent,
 void **slot)
 {
+   BUG_ON(radix_tree_exception(slot));
return slot - parent->slots;
 }
 
@@ -101,11 +107,12 @@ static unsigned int radix_tree_descend(struct 
radix_tree_node *parent,
struct radix_tree_node **nodep, unsigned long index)
 {
unsigned int offset = (index >> parent->shift) & RADIX_TREE_MAP_MASK;
-   void **entry = rcu_dereference_raw(parent->slots[offset]);
+   void *entry = rcu_dereference_raw(parent->slots[offset]);
 
 #ifdef CONFIG_RADIX_TREE_MULTIORDER
if (radix_tree_is_internal_node(entry)) {
-   unsigned long siboff = get_slot_offset(parent, entry);
+   unsigned long siboff = get_slot_offset(parent,
+   (void **)entry_to_node(entry));
if (siboff < RADIX_TREE_MAP_SIZE) {
offset = siboff;
entry = rcu_dereference_raw(parent->slots[offset]);
@@ -113,7 +120,7 @@ static unsigned int radix_tree_descend(struct 
radix_tree_node *parent,
}
 #endif
 
-   *nodep = (void *)entry;
+   *nodep = entry;
return offset;
 }
 
diff --git a/tools/testing/radix-tree/Makefile 
b/tools/testing/radix-tree/Makefile
index 3b53046..9d0919ed 100644
--- a/tools/testing/radix-tree/Makefile
+++ b/tools/testing/radix-tree/Makefile
@@ -1,5 +1,5 @@
 
-CFLAGS += -I. -g -Wall -D_LGPL_SOURCE
+CFLAGS += -I. -g -O2 -Wall -D_LGPL_SOURCE
 LDFLAGS += -lpthread -lurcu
 TARGETS = main
 OFILES = main.o radix-tree.o linux.o test.o tag_check.o find_next_bit.o \


for-linus.diff
Description: for-linus.diff
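
The tagged-entry idea discussed in this message can be sketched in userspace C. This is a simplified illustration under stated assumptions, not the kernel code: the sketch_* names are hypothetical, a single low tag bit stands in for the kernel's internal-node encoding, and assert() stands in for BUG_ON(). It shows why the tag has to be stripped before the pointer subtraction, so that the offset is computed from a real slot pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAP_SIZE 64
#define SKETCH_TAG ((uintptr_t)1) /* hypothetical low-bit tag */

struct sketch_node {
	void *slots[SKETCH_MAP_SIZE];
};

/* Mark a pointer as a tagged (non-plain) entry. */
static void *sketch_tag(void *p)
{
	return (void *)((uintptr_t)p | SKETCH_TAG);
}

static bool sketch_is_tagged(const void *entry)
{
	return ((uintptr_t)entry & SKETCH_TAG) != 0;
}

/* Strip the tag, giving back a real pointer. */
static void *sketch_untag(void *entry)
{
	return (void *)((uintptr_t)entry & ~SKETCH_TAG);
}

/* Only valid for real slot pointers, hence the assertion. */
static size_t sketch_slot_offset(struct sketch_node *parent, void **slot)
{
	assert(!sketch_is_tagged(slot));
	return (size_t)(slot - parent->slots);
}

/* A sibling entry is a tagged pointer to another slot in the same node. */
static size_t sketch_sibling_offset(struct sketch_node *parent, void *entry)
{
	return sketch_slot_offset(parent, (void **)sketch_untag(entry));
}
```

The design point is the same one the comment in the patch makes: the offset helper is only ever handed untagged pointers, and the one caller that holds a tagged entry converts it explicitly first.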


[PATCH] mlx5: Add ndo_poll_controller() implementation

2016-09-23 Thread Calvin Owens
This implements ndo_poll_controller in net_device_ops for mlx5, which is
necessary to use netconsole with this driver.

Signed-off-by: Calvin Owens 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 2459c7f..439476f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2786,6 +2786,20 @@ static void mlx5e_tx_timeout(struct net_device *dev)
schedule_work(&priv->tx_timeout_work);
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/* Fake "interrupt" called by netpoll (eg netconsole) to send skbs without
+ * reenabling interrupts.
+ */
+static void mlx5e_netpoll(struct net_device *dev)
+{
+   struct mlx5e_priv *priv = netdev_priv(dev);
+   int i, nr_sq = priv->params.num_channels * priv->params.num_tc;
+
+   for (i = 0; i < nr_sq; i++)
+   napi_schedule(priv->txq_to_sq_map[i]->cq.napi);
+}
+#endif
+
 static const struct net_device_ops mlx5e_netdev_ops_basic = {
.ndo_open= mlx5e_open,
.ndo_stop= mlx5e_close,
@@ -2805,6 +2819,9 @@ static const struct net_device_ops mlx5e_netdev_ops_basic 
= {
.ndo_rx_flow_steer   = mlx5e_rx_flow_steer,
 #endif
.ndo_tx_timeout  = mlx5e_tx_timeout,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+   .ndo_poll_controller = mlx5e_netpoll,
+#endif
 };
 
 static const struct net_device_ops mlx5e_netdev_ops_sriov = {
@@ -2836,6 +2853,9 @@ static const struct net_device_ops mlx5e_netdev_ops_sriov 
= {
.ndo_set_vf_link_state   = mlx5e_set_vf_link_state,
.ndo_get_vf_stats= mlx5e_get_vf_stats,
.ndo_tx_timeout  = mlx5e_tx_timeout,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+   .ndo_poll_controller = mlx5e_netpoll,
+#endif
 };
 
 static int mlx5e_check_required_hca_cap(struct mlx5_core_dev *mdev)
-- 
2.9.3



Re: [PATCH 0/2] Ajust lockdep static allocations

2016-09-23 Thread Rob Gardner

On 09/23/2016 01:57 PM, Babu Moger wrote:
> On 9/23/2016 10:40 AM, Peter Zijlstra wrote:
>> On Fri, Sep 23, 2016 at 10:15:46AM -0500, Babu Moger wrote:
>>>>> Correct, We can't boot with lockdep. Sorry I did not make
>>>>> that clear. We have a limit on static size of the kernel.
>>>> This stuff should be in .bss not .data. It should not affect the
>>>> static size at all. Or am I misunderstanding things?
>>> Here it is.
>>> $ ./scripts/bloat-o-meter vmlinux.lockdep.small vmlinux.lockdep.big
>> What does bloat-o-meter have to do with things? The static image size is
>> not dependent on .bss, right?
>
> Peter,
> We checked again. Yes, It goes in .bss section. But in sparc we have
> to fit .text, .data, .bss in 7 permanent TLBs(that is totally 28MB).
> It was fine so far.  But the commit 1413c0389333 ("lockdep: Increase
> static allocations") added extra 4MB which makes it go beyond 28MB.
> That is causing system boot up problems in sparc. Yes. We know it.
> This is a limitation. Changing this limit in our hardware is a much
> bigger change which we cannot address right away. So, we are trying
> to come up with a solution which can work for all. I will re-post the
> patches with CONFIG_BASE_SMALL option if there is no objections.
>
> CCing David Miller and Rob Gardner. They might be able to explain
> more if you have any more questions. Here is the discussion thread
> if you guys want to look at history.
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1237642.html

Yes, perhaps I can help clarify the problem Babu is seeing. It is true 
that stuff in bss doesn't increase the static size of the file that 
contains the kernel. But it does increase the kernel's memory footprint. 
And as the system is booting, all the kernel's code, data, and bss, must 
have locked translations in the TLB so that we don't get TLB misses on 
kernel code and data. Current sparc chips have 8 TLB entries available 
that may be locked down, and with a 4mb page size, this gives a maximum 
of 32mb of memory that can be covered. One of these is used for kexec (I 
think), so that leaves 28mb. It sounds to me like Babu is saying that 
the change in question has increased the size of bss data so this limit 
is exceeded, thus causing boot problems, and he proposes to somewhat 
reduce the added space to alleviate this problem.
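
The arithmetic behind that limit can be written out as a small C sketch. The constants are the figures quoted in this message (8 lockable entries, one reserved, 4 MB pages); the sketch_* names and the section sizes used in the usage note are hypothetical and only serve the illustration.

```c
#include <stdbool.h>

/* Figures quoted in the message above. */
#define SKETCH_LOCKABLE_TLB_ENTRIES 8u /* entries that may be locked down */
#define SKETCH_RESERVED_TLB_ENTRIES 1u /* one entry used elsewhere */
#define SKETCH_PAGE_SIZE_MB         4u /* 4 MB pages */

/* Memory, in MB, that the remaining locked entries can cover. */
static unsigned int sketch_kernel_budget_mb(void)
{
	return (SKETCH_LOCKABLE_TLB_ENTRIES - SKETCH_RESERVED_TLB_ENTRIES) *
	       SKETCH_PAGE_SIZE_MB;
}

/* True if .text + .data + .bss (all in MB) fit under the budget. */
static bool sketch_image_fits(unsigned int text_mb, unsigned int data_mb,
			      unsigned int bss_mb)
{
	return text_mb + data_mb + bss_mb <= sketch_kernel_budget_mb();
}
```

With made-up section sizes of 12, 6 and 10 MB the image just fits in the 28 MB budget; growing .bss by 4 MB to 14 MB brings the total to 32 MB, which no longer fits.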


And also as Babu says, changing to a larger page size is very tricky. 
Not only do different sparc cpus support a different set of h/w page 
sizes, the effects of changing this are quite far reaching and would 
affect a lot of code.


Rob Gardner




Re: [RFC PATCH] PM / OPP: Don't support OPP if it provides supported-hw but platform does not

2016-09-23 Thread Dave Gerlach

Hi,
On 09/23/2016 03:07 PM, Dave Gerlach wrote:
> The OPP framework allows each OPP to set a opp-supported-hw property
> which provides values that are matched against supported_hw values
> provided by the platform to limit support for certain OPPs on specific
> hardware. Currently, if the platform does not set supported_hw values,
> all OPPs are interpreted as supported, even if they have provided their
> own opp-supported-hw values.
>
> If an OPP has provided opp-supported-hw, it is indicating that there is
> some specific hardware configuration it is supported by. These constraints
> should be honored, and if no supported_hw has been provided by the
> platform, there is no way to determine if that OPP is actually supported,
> so it should be marked as not supported.
>
> Signed-off-by: Dave Gerlach
> ---

Currently only the sti-cpufreq and forthcoming ti-cpufreq [1] drivers are
making use of dev_pm_opp_set_supported_hw, so maybe nobody has seen this
yet, or the framework was designed to work as it does.

I would think that if an OPP provides a set of constraints that define
when it is supported and we can't tell if we can meet those, we should
disable an OPP rather than enable it. Otherwise, what was the point of
providing constraints?

Regards,
Dave

[1] http://www.spinics.net/lists/arm-kernel/msg527921.html

>  drivers/base/power/opp/of.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
> index 1dfd3dd92624..af9f8968 100644
> --- a/drivers/base/power/opp/of.c
> +++ b/drivers/base/power/opp/of.c
> @@ -71,8 +71,18 @@ static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
> u32 version;
> int ret;
>
> -   if (!opp_table->supported_hw)
> -   return true;
> +   if (!opp_table->supported_hw) {
> +   /*
> +* In the case that no supported_hw has been set by the
> +* platform but there is an opp-supported-hw value set for
> +* an OPP then the OPP should not be enabled as there is
> +* no way to see if the hardware supports it.
> +*/
> +   if (of_find_property(np, "opp-supported-hw", NULL))
> +   return false;
> +   else
> +   return true;
> +   }
>
> while (count--) {
> ret = of_property_read_u32_index(np, "opp-supported-hw", count,





[RFC PATCH] PM / OPP: Don't support OPP if it provides supported-hw but platform does not

2016-09-23 Thread Dave Gerlach
The OPP framework allows each OPP to set a opp-supported-hw property
which provides values that are matched against supported_hw values
provided by the platform to limit support for certain OPPs on specific
hardware. Currently, if the platform does not set supported_hw values,
all OPPs are interpreted as supported, even if they have provided their
own opp-supported-hw values.

If an OPP has provided opp-supported-hw, it is indicating that there is
some specific hardware configuration it is supported by. These constraints
should be honored, and if no supported_hw has been provided by the
platform, there is no way to determine if that OPP is actually supported,
so it should be marked as not supported.

Signed-off-by: Dave Gerlach 
---
 drivers/base/power/opp/of.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index 1dfd3dd92624..af9f8968 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -71,8 +71,18 @@ static bool _opp_is_supported(struct device *dev, struct 
opp_table *opp_table,
u32 version;
int ret;
 
-   if (!opp_table->supported_hw)
-   return true;
+   if (!opp_table->supported_hw) {
+   /*
+* In the case that no supported_hw has been set by the
+* platform but there is an opp-supported-hw value set for
+* an OPP then the OPP should not be enabled as there is
+* no way to see if the hardware supports it.
+*/
+   if (of_find_property(np, "opp-supported-hw", NULL))
+   return false;
+   else
+   return true;
+   }
 
while (count--) {
ret = of_property_read_u32_index(np, "opp-supported-hw", count,
-- 
2.9.3
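
The decision this RFC proposes can be summarised as a small truth-table sketch in userspace C. It is an illustration only: the function and parameter names are hypothetical, and the versions_match flag is an assumed stand-in for the result of the existing matching loop over the opp-supported-hw values, which the patch does not change.

```c
#include <stdbool.h>

/*
 * platform_has_supported_hw: the platform registered supported_hw values.
 * opp_has_supported_hw_prop: the OPP node carries opp-supported-hw.
 * versions_match:            outcome of the existing matching loop.
 */
static bool sketch_opp_is_supported(bool platform_has_supported_hw,
				    bool opp_has_supported_hw_prop,
				    bool versions_match)
{
	/*
	 * Without platform values there is nothing to match against, so an
	 * OPP that states constraints cannot be verified and is rejected.
	 */
	if (!platform_has_supported_hw)
		return !opp_has_supported_hw_prop;

	return versions_match;
}
```

Before the change, the first branch would have returned true unconditionally; the only behavioural difference is for OPPs that carry the property on platforms that never set supported_hw.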



Re: [PATCH v2 1/3] dt-bindings: display: display-timing: Add property to configure sync drive edge

2016-09-23 Thread Rob Herring
On Thu, Sep 22, 2016 at 01:35:24PM +0300, Peter Ujfalusi wrote:
> There are display panels which demands that the sync signal is driven on
> different edge than the pixel data.
> With the syncclk-active property we can specify the clk edge to be used to
> drive the sync signal. When the property is missing it indicates that the
> sync is driven on the same edge as the pixel data.
> 
> Signed-off-by: Peter Ujfalusi 
> CC: Rob Herring 
> CC: Mark Rutland 
> CC: devicet...@vger.kernel.org
> ---
>  .../devicetree/bindings/display/panel/display-timing.txt  | 8 
> 
>  1 file changed, 8 insertions(+)

Acked-by: Rob Herring 


[GIT PULL] Btrfs

2016-09-23 Thread Chris Mason
Hi Linus,

We have two fixes in my for-linus-4.8 branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git 
for-linus-4.8

Josef is fixing a problem when quotas are enabled with his latest ENOSPC
rework, and Jeff is adding more checks into the subvol ioctls to avoid
tripping up lookup_one_len.

Josef Bacik (1) commits (+3/-6):
Btrfs: handle quota reserve failure properly

Jeff Mahoney (1) commits (+12/-0):
btrfs: ensure that file descriptor used with subvol ioctls is a dir

Total: (2) commits (+15/-6)

 fs/btrfs/extent-tree.c |  9 +++--
 fs/btrfs/ioctl.c   | 12 
 2 files changed, 15 insertions(+), 6 deletions(-)


Re: [PATCH 1/3] watchdog: bindings: dw_wdt: add reset lines

2016-09-23 Thread Rob Herring
On Thu, Sep 22, 2016 at 09:02:30AM +0200, Steffen Trumtrar wrote:
> Document the reset lines holding the watchdog core in reset.
> 
> Signed-off-by: Steffen Trumtrar 
> Cc: Wim Van Sebroeck 
> Cc: Rob Herring 
> Cc: Mark Rutland 
> Cc: linux-watch...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> ---
>  Documentation/devicetree/bindings/watchdog/dw_wdt.txt | 5 +
>  1 file changed, 5 insertions(+)

Acked-by: Rob Herring 


Re: [PATCH 2/2] PM / OPP: Multiple regulators aren't supported yet

2016-09-23 Thread Rob Herring
On Wed, Sep 21, 2016 at 03:02:50PM +0530, Viresh Kumar wrote:
> Multiple regulators per device aren't supported yet by the kernel code
> and the bindings provided in documentation aren't sufficient to handle
> that case (as there is no way for kernel code to link multiple
> voltage/current values to a power supply).

What do you mean? Because the supplies are in the cpu node?

Rob


Re: [PATCH RT 3/7] net: add back the missing serialization in ip_send_unicast_reply()

2016-09-23 Thread Steven Rostedt
On Fri, 23 Sep 2016 15:50:11 -0400
Steven Rostedt  wrote:

> 3.2.82-rt119-rc1 stable review patch.
> If anyone has any objections, please let me know.

I once again forgot to fix the umlaut before sending it via quilt. I'll
have to try to fix quilt when I get a chance.

-- Steve

> 
> --
> 
> From: Sebastian Andrzej Siewior 
> 
> Some time ago Sami PietikÀinen reported a crash on -RT in
> ip_send_unicast_reply() which was later fixed by Nicholas Mc Guire
> (v3.12.8-rt11). Later (v3.18.8) the code was reworked and I dropped the
> patch. As it turns out it was a mistake.
> I have reports that the same crash is possible with a similar backtrace.
> It seems that vanilla protects access to this_cpu_ptr() via
> local_bh_disable(). This does not work on -RT since we can have
> NET_RX and NET_TX running in parallel on the same CPU.
> This brings back the old locks.
> 
> |Unable to handle kernel NULL pointer dereference at virtual address 0010
> |PC is at __ip_make_skb+0x198/0x3e8
> |[] (__ip_make_skb) from [] (ip_push_pending_frames+0x20/0x40)
> |[] (ip_push_pending_frames) from [] (ip_send_unicast_reply+0x210/0x22c)
> |[] (ip_send_unicast_reply) from [] (tcp_v4_send_reset+0x190/0x1c0)
> |[] (tcp_v4_send_reset) from [] (tcp_v4_do_rcv+0x22c/0x288)
> |[] (tcp_v4_do_rcv) from [] (release_sock+0xb4/0x150)
> |[] (release_sock) from [] (tcp_close+0x240/0x454)
> |[] (tcp_close) from [] (inet_release+0x74/0x7c)
> |[] (inet_release) from [] (sock_release+0x30/0xb0)
> |[] (sock_release) from [] (sock_close+0x1c/0x24)
> |[] (sock_close) from [] (__fput+0xe8/0x20c)
> |[] (__fput) from [] (fput+0x18/0x1c)
> |[] (fput) from [] (task_work_run+0xa4/0xb8)
> |[] (task_work_run) from [] (do_work_pending+0xd0/0xe4)
> |[] (do_work_pending) from [] (work_pending+0xc/0x20)
> |Code: e3530001 8a01 e3a00040 ea11 (e5973010)
> 
> Cc: stable...@vger.kernel.org
> Signed-off-by: Sebastian Andrzej Siewior 
> Signed-off-by: Steven Rostedt 
> ---
>  net/ipv4/tcp_ipv4.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index b4e0eb49f56d..e6345b547922 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -61,6 +61,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -575,6 +576,7 @@ int tcp_v4_gso_send_check(struct sk_buff *skb)
>   return 0;
>  }
>  
> +static DEFINE_LOCAL_IRQ_LOCK(tcp_sk_lock);
>  /*
>   *   This routine will send an RST to the other tcp.
>   *
> @@ -659,8 +661,11 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
>  
>   net = dev_net(skb_dst(skb)->dev);
>   arg.tos = ip_hdr(skb)->tos;
> +
> + local_lock(tcp_sk_lock);
>   ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,
> &arg, arg.iov[0].iov_len);
> + local_unlock(tcp_sk_lock);
>  
>   TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>   TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS);
> @@ -734,8 +739,10 @@ static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
>   if (oif)
>   arg.bound_dev_if = oif;
>   arg.tos = tos;
> + local_lock(tcp_sk_lock);
>   ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,
> &arg, arg.iov[0].iov_len);
> + local_unlock(tcp_sk_lock);
>  
>   TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  }



Re: [PATCH 0/2] Ajust lockdep static allocations

2016-09-23 Thread Babu Moger


On 9/23/2016 10:40 AM, Peter Zijlstra wrote:

On Fri, Sep 23, 2016 at 10:15:46AM -0500, Babu Moger wrote:


 Correct, We can't boot with lockdep. Sorry I did not make that clear.
We have a limit on static size of the kernel.

This stuff should be in .bss not .data. It should not affect the static
size at all. Or am I misunderstanding things?

  Here it is.
$ ./scripts/bloat-o-meter vmlinux.lockdep.small vmlinux.lockdep.big

What does bloat-o-meter have to do with things? The static image size is
not dependent on .bss, right?


 Peter,
 We checked again. Yes, it goes in the .bss section. But on sparc we have to
 fit .text, .data and .bss in 7 permanent TLBs (28MB in total). It was fine so
 far, but commit 1413c0389333 ("lockdep: Increase static allocations") added
 an extra 4MB, which makes it go beyond 28MB. That is causing system boot
 problems on sparc. Yes, we know this is a limitation. Changing this limit in
 our hardware is a much bigger change which we cannot address right away. So
 we are trying to come up with a solution which can work for all. I will
 re-post the patches with the CONFIG_BASE_SMALL option if there are no
 objections.

 CCing David Miller and Rob Gardner. They might be able to explain more if you
 have any more questions. Here is the discussion thread if you want to look at
 the history.

 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1237642.html
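
The size constraint in this mail can be written out as a small check. This is a sketch only: the 4MB per-entry size is derived from the "7 permanent TLBs, 28MB" figures above, and the helper names (image_budget_mb, image_fits) and any example image size passed to them are hypothetical, not values from the thread.

```c
#include <stdbool.h>

#define TLB_ENTRIES	7u	/* permanent TLB entries, from the mail */
#define TLB_ENTRY_MB	4u	/* 28MB / 7 entries, derived from the mail */

/* Total space available for .text + .data + .bss, in MB. */
static unsigned int image_budget_mb(void)
{
	return TLB_ENTRIES * TLB_ENTRY_MB;
}

/* True if a static image of the given size still fits the budget. */
static bool image_fits(unsigned int text_data_bss_mb)
{
	return text_data_bss_mb <= image_budget_mb();
}
```

Any image that was within 4MB of the 28MB budget before the lockdep change no longer fits once the extra 4MB of .bss is added, even though .bss takes no space in the on-disk image.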




Re: [PATCH 1/2] PM / OPP: compatible is an optional property

2016-09-23 Thread Rob Herring
On Fri, Sep 23, 2016 at 10:45:26AM +0530, Viresh Kumar wrote:
> On 22-09-16, 12:24, Stephen Boyd wrote:
> > On 09/21/2016 02:32 AM, Viresh Kumar wrote:
> > > It was never compulsory to have a compatible string in the OPP table.
> > > Fix the documentation to mark it optional.
> > >

NAK.

> > > Also update its description a bit.
> > >
> > > Signed-off-by: Viresh Kumar 
> > > ---
> > 
> > Why? I'd prefer the compatible string to be required so we know what
> > sort of node it is.

Agreed.

> Okay, the code doesn't have any checks for it then and that needs to be fixed.

Why? The kernel is not a DT validator.
 
> Just for my clarity, for platforms with special OPP bindings and so a different
> compatible string like: "operating-points-v2-XYZ", should the compatible string
> contain both "operating-points-v2" and the above one? It would be easier to
> check for "operating-points-v2" in that case from core code.

That would imply operating-points-v2-XYZ has extra properties or is 
different in some way. If an OS only understanding operating-points-v2 
will work, then yes it should have both. If not, then no.

Rob
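
The fallback rule above can be illustrated with a small lookup over a compatible list. This is a sketch under assumptions: toy_is_compatible is a hypothetical helper, not a kernel or devicetree library function, and the list is modeled as a plain array of strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'wanted' appears anywhere in the node's compatible list.
 * A generic consumer would ask for "operating-points-v2" and only match
 * nodes that list it, either alone or as a fallback entry.
 */
static bool toy_is_compatible(const char *const *compat, size_t n,
			      const char *wanted)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(compat[i], wanted) == 0)
			return true;
	}
	return false;
}
```

A node listing both "operating-points-v2-XYZ" and "operating-points-v2" matches a generic consumer; a node listing only the specific string does not, which is the right outcome when the generic handling would not work for it.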


[PATCH RT 7/7] Linux 3.2.82-rt119-rc1

2016-09-23 Thread Steven Rostedt
3.2.82-rt119-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)" 

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 4e32122c6b30..bd54f56d15eb 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt118
+-rt119-rc1
-- 
2.8.1




[PATCH RT 6/7] fs/dcache: incremental fixup of the retry routine

2016-09-23 Thread Steven Rostedt
3.2.82-rt119-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

It has been pointed out by tglx that on UP the non-RT task could spin
its entire time slice because the lock owner is preempted. This won't
happen on !RT. So we go back to "chill" if cond_resched() did not
work.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 fs/dcache.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index bd9bd649c390..bea5589bc957 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -38,8 +38,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include "internal.h"
 
 /*
@@ -513,10 +511,11 @@ kill_it:
if (parent == dentry) {
/* the task with the highest priority won't schedule */
r = cond_resched();
-   if (!r && (rt_task(current) || dl_task(current)))
+   if (!r)
cpu_chill();
-   } else
+   } else {
dentry = parent;
+   }
goto repeat;
}
 }
-- 
2.8.1




[PATCH RT 1/7] timers: wakeup all timer waiters

2016-09-23 Thread Steven Rostedt
3.2.82-rt119-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

The base lock is dropped during the invocation of the timer. That means
it is possible that we have one waiter while timer1 is running and once
this one has finished, we get another waiter while timer2 is running. Since
we wake up only one waiter it is possible that we miss the other one.
This will probably heal itself over time because most of the time we
complete timers without an active wake up.
To avoid the scenario where we don't wake up all waiters at once,
wake_up_all() is used.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index badd2d2066dc..3beac326b447 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -932,7 +932,7 @@ static void wait_for_running_timer(struct timer_list *timer)
   base->running_timer != timer);
 }
 
-# define wakeup_timer_waiters(b)   wake_up(&(b)->wait_for_running_timer)
+# define wakeup_timer_waiters(b)   wake_up_all(&(b)->wait_for_running_timer)
 #else
 static inline void wait_for_running_timer(struct timer_list *timer)
 {
-- 
2.8.1
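
The difference the commit message describes can be shown with a toy counter model. This is a sketch only: toy_waitqueue, toy_wake_up and toy_wake_up_all are hypothetical names that model the behaviour as the commit message states it (one waiter woken versus all of them), not the real kernel wait-queue implementation.

```c
/* Toy wait queue: only tracks how many waiters are still asleep. */
struct toy_waitqueue {
	unsigned int sleeping;
};

/* Wake at most one waiter; returns the number of waiters woken. */
static unsigned int toy_wake_up(struct toy_waitqueue *wq)
{
	if (wq->sleeping == 0)
		return 0;
	wq->sleeping--;
	return 1;
}

/* Wake every waiter; returns the number of waiters woken. */
static unsigned int toy_wake_up_all(struct toy_waitqueue *wq)
{
	unsigned int woken = wq->sleeping;

	wq->sleeping = 0;
	return woken;
}
```

With two waiters queued, the wake-one variant leaves one of them asleep until some later wake-up happens, which is the missed waiter the patch is about.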




[PATCH RT 5/7] fs/dcache: resched/chill only if we make no progress

2016-09-23 Thread Steven Rostedt
3.2.82-rt119-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Upstream commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in
dput()") changed the condition _when_ cpu_relax() / cond_resched() was
invoked. This change was adapted in -RT into mostly the same thing
except that if cond_resched() did nothing we had to do cpu_chill() to
force the task off CPU for a tiny little bit in case the task had RT
priority and did not want to leave the CPU.
This change resulted in a performance regression (in my testcase the
build time on /dev/shm increased from 19min to 24min). The reason is
that with this change cpu_chill() was invoked even if dput() made progress
(dentry_kill() returned a different dentry) instead of only if we were
trying this operation on the same dentry over and over again.

This patch brings back the old behavior: cond_resched() & chill only
if we make no progress. A little improvement is to invoke
cpu_chill() only if we are an RT task (and avoid the sleep otherwise).
Otherwise the scheduler should remove us from the CPU if we make no
progress.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 fs/dcache.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 0089bd3c86ce..bd9bd649c390 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -38,6 +38,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "internal.h"
 
 /*
@@ -465,6 +467,8 @@ relock:
  */
 void dput(struct dentry *dentry)
 {
+   struct dentry *parent;
+
if (!dentry)
return;
 
@@ -502,9 +506,19 @@ repeat:
return;
 
 kill_it:
-   dentry = dentry_kill(dentry, 1);
-   if (dentry)
+   parent = dentry_kill(dentry, 1);
+   if (parent) {
+   int r;
+
+   if (parent == dentry) {
+   /* the task with the highest priority won't schedule */
+   r = cond_resched();
+   if (!r && (rt_task(current) || dl_task(current)))
+   cpu_chill();
+   } else
+   dentry = parent;
goto repeat;
+   }
 }
 EXPORT_SYMBOL(dput);
 
-- 
2.8.1
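
The "back off only when no progress is made" idea can be modeled in userspace. This is a simplified sketch under assumptions: toy_dentry, toy_dentry_kill and toy_dput_backoffs are hypothetical, the model walks all the way up the parent chain (the real dput() can return earlier), and the counter merely marks where cond_resched()/cpu_chill() would be called.

```c
#include <stddef.h>

struct toy_dentry {
	struct toy_dentry *parent;	/* NULL at the top of the chain */
	int busy_retries;		/* how often the kill must be retried */
};

/*
 * Returns the same dentry when the kill has to be retried (no progress),
 * otherwise the parent to continue with (NULL when the chain ends).
 */
static struct toy_dentry *toy_dentry_kill(struct toy_dentry *d)
{
	if (d->busy_retries > 0) {
		d->busy_retries--;
		return d;
	}
	return d->parent;
}

/* Count how often the loop would back off while releasing the chain. */
static unsigned int toy_dput_backoffs(struct toy_dentry *d)
{
	unsigned int backoffs = 0;

	while (d) {
		struct toy_dentry *next = toy_dentry_kill(d);

		if (next == d)
			backoffs++;	/* no progress: resched/chill here */
		else
			d = next;	/* progress: move on, no back-off */
	}
	return backoffs;
}
```

A chain that never needs a retry produces no back-offs at all, which matches the intent of avoiding the sleep on the fast path.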




[PATCH RT 4/7] net: add a lock around icmp_sk()

2016-09-23 Thread Steven Rostedt
3.2.82-rt119-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

It looks like the this_cpu_ptr() access in icmp_sk() is protected with
local_bh_disable(). To avoid missing serialization in -RT I am adding
a local lock here. No crash has been observed; this is just a precaution.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 net/ipv4/icmp.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 028eb47226e8..e232225f7446 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -76,6 +76,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -202,6 +203,8 @@ static const struct icmp_control icmp_pointers[NR_ICMP_TYPES+1];
  *
  * On SMP we have one ICMP socket per-cpu.
  */
+static DEFINE_LOCAL_IRQ_LOCK(icmp_sk_lock);
+
 static struct sock *icmp_sk(struct net *net)
 {
return net->ipv4.icmp_sk[smp_processor_id()];
@@ -213,12 +216,14 @@ static inline struct sock *icmp_xmit_lock(struct net *net)
 
local_bh_disable();
 
+   local_lock(icmp_sk_lock);
sk = icmp_sk(net);
 
if (unlikely(!spin_trylock(&sk->sk_lock.slock))) {
/* This can happen if the output path signals a
 * dst_link_failure() for an outgoing ICMP packet.
 */
+   local_unlock(icmp_sk_lock);
local_bh_enable();
return NULL;
}
@@ -228,6 +233,7 @@ static inline struct sock *icmp_xmit_lock(struct net *net)
 static inline void icmp_xmit_unlock(struct sock *sk)
 {
spin_unlock_bh(&sk->sk_lock.slock);
+   local_unlock(icmp_sk_lock);
 }
 
 /*
@@ -298,6 +304,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
struct sock *sk;
struct sk_buff *skb;
 
+   local_lock(icmp_sk_lock);
sk = icmp_sk(dev_net((*rt)->dst.dev));
if (ip_append_data(sk, fl4, icmp_glue_bits, icmp_param,
   icmp_param->data_len+icmp_param->head_len,
@@ -320,6 +327,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
skb->ip_summed = CHECKSUM_NONE;
ip_push_pending_frames(sk, fl4);
}
+   local_unlock(icmp_sk_lock);
 }
 
 /*
-- 
2.8.1



