Re: [PATCH v2 1/1] usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers

2018-03-05 Thread Bin Liu
On Mon, Mar 05, 2018 at 08:35:10PM +0200, Ivaylo Dimitrov wrote:
> Hi,
> 
> On  5.03.2018 19:38, Bin Liu wrote:
> >On Wed, Feb 28, 2018 at 01:59:43PM -0800, Tony Lindgren wrote:
> >>* Merlijn Wajer  [180227 22:29]:
> >>>Without pm_runtime_{get,put}_sync calls in place, reading
> >>>vbus status via /sys causes the following error:
> >>>
> >>>Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
> >>>pgd = b333e822
> >>>[fa0ab060] *pgd=48011452(bad)
> >>>
> >>>[] (musb_default_readb) from [] (musb_vbus_show+0x58/0xe4)
> >>>[] (musb_vbus_show) from [] (dev_attr_show+0x20/0x44)
> >>>[] (dev_attr_show) from [] (sysfs_kf_seq_show+0x80/0xdc)
> >>>[] (sysfs_kf_seq_show) from [] (seq_read+0x250/0x448)
> >>>[] (seq_read) from [] (__vfs_read+0x1c/0x118)
> >>>[] (__vfs_read) from [] (vfs_read+0x90/0x144)
> >>>[] (vfs_read) from [] (SyS_read+0x3c/0x74)
> >>>[] (SyS_read) from [] (ret_fast_syscall+0x0/0x54)
> >>>
> >>>Solution was suggested by Tony Lindgren .
> >>>
> >>>Signed-off-by: Merlijn Wajer 
> >>
> >>Thanks for fixing this. Assuming it passes Bin's tests:
> >>
> >>Acked-by: Tony Lindgren 
> >
> >Applied and sent out. Thanks.
> >
> 
> What about stable kernels? Shouldn't this be fixed there as well?

Yes, it should, but I need to figure out which stable kernels to send it
to, and I am busy with something else at the moment. So I will send it to
stable some time later.

Regards,
-Bin.
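
For readers following the thread, here is a minimal userspace sketch of the pattern the patch applies: take a power reference before touching a register and drop it afterwards. Everything in it (struct fake_dev, fake_pm_get_sync(), fake_pm_put_sync(), fake_readb(), vbus_show()) is a hypothetical stand-in, not the musb driver code; the -1 return only models the bus fault seen when the device is powered down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a device whose registers are only readable while powered. */
struct fake_dev {
	int usage_count;	/* runtime-PM style reference count */
	uint8_t devctl;		/* register holding the VBUS bits */
};

static void fake_pm_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
}

static void fake_pm_put_sync(struct fake_dev *dev)
{
	assert(dev->usage_count > 0);	/* every put must pair with a get */
	dev->usage_count--;
}

/* Returns the register value, or -1 to model the fault on a powered-down device. */
static int fake_readb(const struct fake_dev *dev)
{
	return dev->usage_count > 0 ? dev->devctl : -1;
}

/* The sysfs-style reader: hold a power reference for the duration of the read. */
static int vbus_show(struct fake_dev *dev)
{
	int val;

	fake_pm_get_sync(dev);
	val = fake_readb(dev);
	fake_pm_put_sync(dev);
	return val;
}
```

The reference count returns to its previous value after the read, so the device can power down again once nobody else holds a reference.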


Re: "x86/boot/compressed/64: Prepare trampoline memory" breaks boot on Zotac CI-321

2018-03-05 Thread Heiner Kallweit
On 05.03.2018 at 09:19, Kirill A. Shutemov wrote:
> On Sat, Mar 03, 2018 at 12:46:28PM +0100, Heiner Kallweit wrote:
>> On 03.03.2018 at 11:02, Ingo Molnar wrote:
>>>
>>> * Heiner Kallweit  wrote:
>>>
>>>> On 03.03.2018 at 00:50, Dexuan-Linux Cui wrote:
>>>>> On Fri, Mar 2, 2018 at 12:57 PM, Heiner Kallweit wrote:
>>>>>
>>>>> Recently my Mini PC Zotac CI-321 started to reboot immediately before
>>>>> anything was written to the console.
>>>>>
>>>>> Bisecting led to b91993a87aff "x86/boot/compressed/64: Prepare
>>>>> trampoline memory" being the change breaking boot.
>>>>>
>>>>> If you need any more information, please let me know.
>>>>>
>>>>> Rgds, Heiner
>>>>>
>>>>>
>>>>> This may fix the issue: https://lkml.org/lkml/2018/2/13/668
>>>>>
>>>>> Kirill posted a v2 patchset 3 days ago and I suppose the patchset should
>>>>> include the fix.
>>>>>
>>>> Thanks for the link. I bisected based on the latest next kernel including
>>>> v2 of the patchset (IOW - the potential fix is included already).
>>>
>>> Are you sure? b91993a87aff is the old patch-set - which I just removed from
>>> -next and which should thus be gone in the Monday iteration of -next.
>>>
>>> I have not merged v2 in -tip yet, did it get applied via some other tree?
>>>
>>> Thanks,
>>>
>>> Ingo
>>>
>> I wanted to apply the fix mentioned in the link but found that the statement
>> was already movq.
>> Hence my (most likely wrong) assumption that it was v2.
>> I'll re-test once v2 is out and let you know.
> 
> movq fix is unrelated to the problem.
> 
> Please check if current linux-next plus this patchset causes a problem for
> you:
> 
> http://lkml.kernel.org/r/20180227154217.69347-1-kirill.shute...@linux.intel.com
> 

linux-next from today boots fine with the patchset applied.

Rgds, Heiner



Re: [RFC, PATCH 18/22] x86/mm: Handle allocation of encrypted pages

2018-03-05 Thread Dave Hansen
On 03/05/2018 08:26 AM, Kirill A. Shutemov wrote:
> -#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
> - alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
>  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> +#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr)		\
> +({									\
> +	struct page *page;						\
> +	gfp_t gfp = movableflags | GFP_HIGHUSER;			\
> +	if (vma_is_encrypted(vma))					\
> +		page = __alloc_zeroed_encrypted_user_highpage(gfp, vma, vaddr); \
> +	else								\
> +		page = alloc_page_vma(gfp | __GFP_ZERO, vma, vaddr);	\
> +	page;								\
> +})

This is pretty darn ugly and also adds a big old branch into the hottest
path in the page allocator.

It's also really odd that you strip __GFP_ZERO and then go ahead and
zero the encrypted page unconditionally.  It really makes me wonder if
this is the right spot to be doing this.

Can we not, for instance do it inside alloc_page_vma()?


[PATCH] thermal: of: Allow selection of thermal governor in DT

2018-03-05 Thread Amit Kucheria
From: Ram Chandrasekar 

There is currently no way for the governor to be selected for each thermal
zone in devicetree. This results in the default governor being used for all
thermal zones even though no such restriction exists in the core code.

Add support for specifying the thermal governor to be used for a thermal
zone in the devicetree. The devicetree config should specify the governor
name as a string that matches one of the available governors. If not
specified, we maintain the current behaviour of using the default governor.

Signed-off-by: Ram Chandrasekar 
Signed-off-by: Amit Kucheria 
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 8 
 drivers/thermal/of-thermal.c  | 6 ++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt 
b/Documentation/devicetree/bindings/thermal/thermal.txt
index 1719d47..fced9d3 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -168,6 +168,14 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- thermal-governor: Thermal governor to be used for this thermal zone.
+   Expected values are:
+   "step_wise": Use step wise governor.
+   "fair_share": Use fair share governor.
+   "user_space": Use user space governor.
+   "power_allocator": Use power allocator governor.
+  Type: string
+
 - sustainable-power:   An estimate of the sustainable power (in mW) that the
   Type: unsigned   thermal zone can dissipate at the desired
   Size: one cell   control temperature.  For reference, the
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index e09f035..a884b01 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -974,6 +974,7 @@ int __init of_parse_thermal_zones(void)
struct thermal_zone_params *tzp;
int i, mask = 0;
u32 prop;
+   const char *governor_name;
 
tz = thermal_of_build_thermal_zone(child);
if (IS_ERR(tz)) {
@@ -996,6 +997,11 @@ int __init of_parse_thermal_zones(void)
/* No hwmon because there might be hwmon drivers registering */
tzp->no_hwmon = true;
 
+   if (!of_property_read_string(child, "thermal-governor",
+   &governor_name))
+   strlcpy(tzp->governor_name, governor_name,
+   THERMAL_NAME_LENGTH);
+
if (!of_property_read_u32(child, "sustainable-power", &prop))
tzp->sustainable_power = prop;
 
-- 
2.7.4



Re: inconsistent lock state with usbnet/asix usb ethernet and xhci

2018-03-05 Thread Eric Dumazet
On Mon, 2018-03-05 at 12:46 +0100, Oliver Neukum wrote:
> On Mon, 2018-03-05 at 08:45 +0100, Marek Szyprowski wrote:
> > Hi Oliver,
> > 
> > On 2018-02-27 17:07, Oliver Neukum wrote:
> > > > On Tuesday, 27.02.2018, at 07:13 -0800, Eric Dumazet wrote:
> > > > On Tue, 2018-02-27 at 07:09 -0800, Eric Dumazet wrote:
> > > > > 
> > > > > Note that for this one, it seems we also could perform stats
> > > > > updates in
> > > > > BH context, since skb is queued via defer_bh()
> > > > > 
> > > > > But simplicity wins I guess.
> > > > 
> > > > Thinking more about this, I am not sure we have any guarantee
> > > > that TX
> > > > and RX can not run on multiple cpus.
> > > > 
> > > > Using an unique syncp is not going to be safe, even if we make
> > > > lockdep
> > > > happy enough with the local_irq save/restore.
> > > 
> > > Unfortunately you are right. It is not guaranteed for some
> > > hardware.
> > 
> > Does it mean that the fix proposed by Eric is not the proper
> > solution?
> 
> For asix it should work, but asix is unlikely to be the only driver
> with that issue. 32 bit receives less testing nowadays.

Yes, although the lockdep part could be enforced on 64-bit if we really
care.

I will send a patch using two different syncp instances (one for RX, one
for TX).
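
As an illustration of the "one sync object per direction" idea, here is a minimal single-threaded userspace model. It is a sketch under stated assumptions, not the actual patch: struct stats_sync, struct dir_stats, struct dev_stats and the helper names are hypothetical, and the memory barriers and atomics a real sequence counter needs are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of a sequence counter guarding 64-bit stats. */
struct stats_sync {
	unsigned int seq;	/* odd while an update is in progress */
};

struct dir_stats {
	struct stats_sync syncp;
	uint64_t packets;
	uint64_t bytes;
};

/* One block per direction, so RX and TX writers never share a counter. */
struct dev_stats {
	struct dir_stats rx;
	struct dir_stats tx;
};

static void stats_update_begin(struct stats_sync *s)
{
	assert((s->seq & 1) == 0);	/* a second concurrent writer would break this */
	s->seq++;
}

static void stats_update_end(struct stats_sync *s)
{
	s->seq++;
}

static void stats_add(struct dir_stats *d, uint64_t len)
{
	stats_update_begin(&d->syncp);
	d->packets++;
	d->bytes += len;
	stats_update_end(&d->syncp);
}

/* Reader: retry until an even, unchanged sequence number is observed. */
static uint64_t stats_read_bytes(const struct dir_stats *d)
{
	unsigned int start;
	uint64_t bytes;

	do {
		start = d->syncp.seq;
		bytes = d->bytes;
	} while ((start & 1) || start != d->syncp.seq);

	return bytes;
}
```

With a single shared counter, an RX update and a TX update running on different CPUs would both act as writers on the same sequence number; splitting it means each counter has only the writers of its own direction.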




[PATCH v11 1/7] powerpc: io.h: move iomap.h include so that it can use readq/writeq defs

2018-03-05 Thread Logan Gunthorpe
Subsequent patches in this series make use of the readq and writeq
defines in iomap.h. However, as is, they get missed on the powerpc
platform because the include comes before the defines. This patch
moves the include down to fix this.

Signed-off-by: Logan Gunthorpe 
Acked-by: Michael Ellerman 
Reviewed-by: Andy Shevchenko 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Suresh Warrier 
Cc: "Oliver O'Halloran" 
---
 arch/powerpc/include/asm/io.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 422f99cf9924..af074923d598 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -33,8 +33,6 @@ extern struct pci_dev *isa_bridge_pcidev;
 #include 
 #include 
 
-#include 
-
 #ifdef CONFIG_PPC64
 #include 
 #endif
@@ -663,6 +661,8 @@ static inline void name at  
\
 #define writel_relaxed(v, addr)writel(v, addr)
 #define writeq_relaxed(v, addr)writeq(v, addr)
 
+#include 
+
 #ifdef CONFIG_PPC32
 #define mmiowb()
 #else
-- 
2.11.0



[PATCH v11 7/7] ntb: ntb_hw_switchtec: Cleanup 64bit IO defines to use the common header

2018-03-05 Thread Logan Gunthorpe
Clean up the ifdefs which conditionally defined the io{read|write}64
functions in favour of the new common io-64-nonatomic-lo-hi header.

Signed-off-by: Logan Gunthorpe 
Cc: Jon Mason 
---
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 30 +-
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 
b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index f624ae27eabe..d2a1e746b335 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include <linux/io-64-nonatomic-lo-hi.h>
 
 MODULE_DESCRIPTION("Microsemi Switchtec(tm) NTB Driver");
 MODULE_VERSION("0.1");
@@ -35,35 +36,6 @@ module_param(use_lut_mws, bool, 0644);
 MODULE_PARM_DESC(use_lut_mws,
 "Enable the use of the LUT based memory windows");
 
-#ifndef ioread64
-#ifdef readq
-#define ioread64 readq
-#else
-#define ioread64 _ioread64
-static inline u64 _ioread64(void __iomem *mmio)
-{
-   u64 low, high;
-
-   low = ioread32(mmio);
-   high = ioread32(mmio + sizeof(u32));
-   return low | (high << 32);
-}
-#endif
-#endif
-
-#ifndef iowrite64
-#ifdef writeq
-#define iowrite64 writeq
-#else
-#define iowrite64 _iowrite64
-static inline void _iowrite64(u64 val, void __iomem *mmio)
-{
-   iowrite32(val, mmio);
-   iowrite32(val >> 32, mmio + sizeof(u32));
-}
-#endif
-#endif
-
 #define SWITCHTEC_NTB_MAGIC 0x45CC0001
 #define MAX_MWS 128
 
-- 
2.11.0
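
For reference, the fallback that the removed ifdefs provided can be sketched in userspace as two 32-bit accesses, low word first. This is only an illustration of the lo-hi technique: read32() and write32() are hypothetical stand-ins that touch plain memory rather than MMIO, and the pair of accesses is not atomic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for 32-bit MMIO accessors; here they touch plain memory. */
static uint32_t read32(const volatile uint32_t *addr)
{
	return *addr;
}

static void write32(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Read a 64-bit value as two 32-bit accesses, low word first. */
static uint64_t read64_lo_hi(const volatile uint32_t *addr)
{
	uint64_t low = read32(addr);
	uint64_t high = read32(addr + 1);

	return low | (high << 32);
}

/* Write a 64-bit value as two 32-bit accesses, low word first. */
static void write64_lo_hi(uint64_t val, volatile uint32_t *addr)
{
	write32((uint32_t)val, addr);
	write32((uint32_t)(val >> 32), addr + 1);
}
```

Widening the high word to 64 bits before shifting matters: shifting a 32-bit value left by 32 is undefined in C.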



Re: [PATCH 2/3] tpm: reduce poll sleep time between send() and recv() in tpm_transmit()

2018-03-05 Thread Mimi Zohar
On Mon, 2018-03-05 at 20:01 +0200, Jarkko Sakkinen wrote:
> On Mon, Mar 05, 2018 at 12:56:33PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Mar 02, 2018 at 12:26:35AM +0530, Nayna Jain wrote:
> > > 
> > > 
> > > On 03/01/2018 02:52 PM, Jarkko Sakkinen wrote:
> > > > On Wed, Feb 28, 2018 at 02:18:27PM -0500, Nayna Jain wrote:
> > > > > In tpm_transmit, after send(), the code checks for status in a loop
> > > > Maybe splitting hairs now, but please just use the actual function name
> > > > instead of send(). It makes the commit log a more useful asset.
> > > Sure, will do.
> > > > 
> > > > > - tpm_msleep(TPM_TIMEOUT);
> > > > > + tpm_msleep(TPM_TIMEOUT_POLL);
> > > > What about just calling schedule()?
> > > I'm not sure what you mean by "schedule()".  Are you suggesting instead of
> > > using usleep_range(),  using something with an even finer grain construct?
> > > 
> > > Thanks & Regards,
> > >  - Nayna
> > 
> > kernel/sched/core.c
> 
> The question I'm trying to ask is: is it better to sleep for such a short
> time, or to just ask the scheduler to schedule something else after each
> iteration?

I still don't understand why scheduling some work would be an
improvement.  We still need to loop, testing for the TPM command to
complete.

According to the schedule_hrtimeout_range() function comment,
schedule_hrtimeout_range() is both power and performance friendly.
What we didn't realize is that the hrtimer *uses* the maximum range
to calculate the sleep time, but *may* return earlier based on the
minimum time.

This patch minimizes the tpm_msleep().  The subsequent patch in this
patch set shows that 1 msec isn't fine enough granularity.  I still
think calling usleep_range() is the right solution, but it needs to be
at a finer granularity than tpm_msleep() provides.

Mimi
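
To make the trade-off concrete, here is a minimal userspace sketch of a "check status, sleep briefly, check again" loop. It is an assumption-laden illustration, not the TPM driver: struct fake_dev, fake_dev_done() and poll_until_done() are hypothetical, and nanosleep() stands in for the kernel's short-sleep helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical device model: reports completion after a fixed number of polls. */
struct fake_dev {
	int polls_until_done;
};

static bool fake_dev_done(struct fake_dev *dev)
{
	if (dev->polls_until_done > 0)
		dev->polls_until_done--;
	return dev->polls_until_done == 0;
}

/*
 * Poll until the device reports completion or max_polls is reached,
 * sleeping sleep_us microseconds between checks. Returns the number of
 * polls used, or -1 on timeout.
 */
static int poll_until_done(struct fake_dev *dev, long sleep_us, int max_polls)
{
	struct timespec ts = {
		.tv_sec = sleep_us / 1000000L,
		.tv_nsec = (sleep_us % 1000000L) * 1000L,
	};
	int polls;

	for (polls = 1; polls <= max_polls; polls++) {
		if (fake_dev_done(dev))
			return polls;
		nanosleep(&ts, NULL);
	}
	return -1;
}
```

The sleep interval bounds how long a finished command can go unnoticed: the worst-case added latency is roughly one interval, which is why a finer granularity helps short commands.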



[PATCH 4/7] perf mmap: Using stored 'overwrite' in perf_mmap__consume

2018-03-05 Thread kan . liang
From: Kan Liang 

The 'overwrite' flag is set at initialization and is not changed afterwards.
Use the stored value instead of the perf_mmap__consume() parameter.
The parameter will be discarded later.

No functional change.

Signed-off-by: Kan Liang 
---
 tools/perf/util/mmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index da9e68b..c4e41d2 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -121,9 +121,9 @@ void perf_mmap__put(struct perf_mmap *map)
perf_mmap__munmap(map);
 }
 
-void perf_mmap__consume(struct perf_mmap *map, bool overwrite)
+void perf_mmap__consume(struct perf_mmap *map, bool overwrite __maybe_unused)
 {
-   if (!overwrite) {
+   if (!map->overwrite) {
u64 old = map->prev;
 
perf_mmap__write_tail(map, old);
-- 
2.4.11



[PATCH 7/7] perf tools: Refine perf_mmap__read_init

2018-03-05 Thread kan . liang
From: Kan Liang 

There is no need to pass the 'start' and 'end' boilerplate to
perf_mmap__read_init(); the data is stored in struct perf_mmap.

Discard the parameters.

Signed-off-by: Kan Liang 
---
 tools/perf/arch/x86/tests/perf-time-to-tsc.c |  3 +--
 tools/perf/builtin-kvm.c |  3 +--
 tools/perf/builtin-top.c |  3 +--
 tools/perf/builtin-trace.c   |  3 +--
 tools/perf/tests/backward-ring-buffer.c  |  3 +--
 tools/perf/tests/bpf.c   |  3 +--
 tools/perf/tests/code-reading.c  |  3 +--
 tools/perf/tests/keep-tracking.c |  3 +--
 tools/perf/tests/mmap-basic.c|  3 +--
 tools/perf/tests/openat-syscall-tp-fields.c  |  3 +--
 tools/perf/tests/perf-record.c   |  3 +--
 tools/perf/tests/sw-clock.c  |  3 +--
 tools/perf/tests/switch-tracking.c   |  3 +--
 tools/perf/tests/task-exit.c |  3 +--
 tools/perf/util/mmap.c   | 10 ++
 tools/perf/util/mmap.h   |  3 +--
 tools/perf/util/python.c |  3 +--
 17 files changed, 18 insertions(+), 40 deletions(-)

diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c 
b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
index 17cf7fc..a7c9f60 100644
--- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c
+++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
@@ -61,7 +61,6 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, 
int subtest __maybe
u64 test_tsc, comm1_tsc, comm2_tsc;
u64 test_time, comm1_time = 0, comm2_time = 0;
struct perf_mmap *md;
-   u64 end, start;
 
threads = thread_map__new(-1, getpid(), UINT_MAX);
CHECK_NOT_NULL__(threads);
@@ -112,7 +111,7 @@ int test__perf_time_to_tsc(struct test *test 
__maybe_unused, int subtest __maybe
 
for (i = 0; i < evlist->nr_mmaps; i++) {
md = &evlist->mmap[i];
-   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   if (perf_mmap__read_init(md, false) < 0)
continue;
 
while ((event = perf_mmap__read_event(md)) != NULL) {
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index e9f69b8..2bcc599 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -746,14 +746,13 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat 
*kvm, int idx,
struct perf_evlist *evlist = kvm->evlist;
union perf_event *event;
struct perf_mmap *md;
-   u64 end, start;
u64 timestamp;
s64 n = 0;
int err;
 
*mmap_time = ULLONG_MAX;
md = &evlist->mmap[idx];
-   err = perf_mmap__read_init(md, false, &start, &end);
+   err = perf_mmap__read_init(md, false);
if (err < 0)
return (err == -EAGAIN) ? 0 : -1;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index eb19cf9..9fc13c4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -817,11 +817,10 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
struct perf_session *session = top->session;
union perf_event *event;
struct machine *machine;
-   u64 end, start;
int ret;
 
md = opts->overwrite ? &evlist->overwrite_mmap[idx] : &evlist->mmap[idx];
-   if (perf_mmap__read_init(md, opts->overwrite, &start, &end) < 0)
+   if (perf_mmap__read_init(md, opts->overwrite) < 0)
return;
 
while ((event = perf_mmap__read_event(md)) != NULL) {
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c71ef7ba..86cdc98 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2473,10 +2473,9 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
for (i = 0; i < evlist->nr_mmaps; i++) {
union perf_event *event;
struct perf_mmap *md;
-   u64 end, start;
 
md = &evlist->mmap[i];
-   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   if (perf_mmap__read_init(md, false) < 0)
continue;
 
while ((event = perf_mmap__read_event(md)) != NULL) {
diff --git a/tools/perf/tests/backward-ring-buffer.c 
b/tools/perf/tests/backward-ring-buffer.c
index e0eae10..15b80b2 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -33,9 +33,8 @@ static int count_samples(struct perf_evlist *evlist, int 
*sample_count,
for (i = 0; i < evlist->nr_mmaps; i++) {
struct perf_mmap *map = &evlist->overwrite_mmap[i];
union perf_event *event;
-   u64 start, end;
 
-   perf_mmap__read_init(map, true, &start, &end);
+   perf_mmap__read_init(map, true);
while ((event = perf_mmap__read_event(map)) != NULL) {
const u32 

[PATCH 5/7] perf tools: Refine perf_mmap__consume

2018-03-05 Thread kan . liang
From: Kan Liang 

There is no need to pass 'overwrite' to perf_mmap__consume().
Discard the parameter.

Signed-off-by: Kan Liang 
---
 tools/perf/arch/x86/tests/perf-time-to-tsc.c | 2 +-
 tools/perf/builtin-kvm.c | 4 ++--
 tools/perf/builtin-top.c | 2 +-
 tools/perf/builtin-trace.c   | 2 +-
 tools/perf/tests/code-reading.c  | 2 +-
 tools/perf/tests/keep-tracking.c | 2 +-
 tools/perf/tests/mmap-basic.c| 2 +-
 tools/perf/tests/openat-syscall-tp-fields.c  | 2 +-
 tools/perf/tests/perf-record.c   | 2 +-
 tools/perf/tests/sw-clock.c  | 2 +-
 tools/perf/tests/switch-tracking.c   | 2 +-
 tools/perf/tests/task-exit.c | 2 +-
 tools/perf/util/mmap.c   | 6 +++---
 tools/perf/util/mmap.h   | 2 +-
 tools/perf/util/python.c | 2 +-
 15 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c 
b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
index 7f82d91..a9bc77d 100644
--- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c
+++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
@@ -134,7 +134,7 @@ int test__perf_time_to_tsc(struct test *test 
__maybe_unused, int subtest __maybe
comm2_time = sample.time;
}
 next_event:
-   perf_mmap__consume(md, false);
+   perf_mmap__consume(md);
}
perf_mmap__read_done(md);
}
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index d2703d3b..165c0446 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -760,7 +760,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat 
*kvm, int idx,
while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
err = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
if (err) {
-   perf_mmap__consume(md, false);
+   perf_mmap__consume(md);
pr_err("Failed to parse sample\n");
return -1;
}
@@ -770,7 +770,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat 
*kvm, int idx,
 * FIXME: Here we can't consume the event, as 
perf_session__queue_event will
 *point to it, and it'll get possibly overwritten by 
the kernel.
 */
-   perf_mmap__consume(md, false);
+   perf_mmap__consume(md);
 
if (err) {
pr_err("Failed to enqueue sample: %d\n", err);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index bb4f9fa..11b4a41 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -879,7 +879,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
} else
++session->evlist->stats.nr_unknown_events;
 next_event:
-   perf_mmap__consume(md, opts->overwrite);
+   perf_mmap__consume(md);
}
 
perf_mmap__read_done(md);
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 1a93deb..abc855d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2492,7 +2492,7 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
 
trace__handle_event(trace, event, &sample);
 next_event:
-   perf_mmap__consume(md, false);
+   perf_mmap__consume(md);
 
if (interrupted)
goto out_disable;
diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index 03ed8c7..f7c199a 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -420,7 +420,7 @@ static int process_events(struct machine *machine, struct 
perf_evlist *evlist,
 
while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
ret = process_event(machine, evlist, event, state);
-   perf_mmap__consume(md, false);
+   perf_mmap__consume(md);
if (ret < 0)
return ret;
}
diff --git a/tools/perf/tests/keep-tracking.c b/tools/perf/tests/keep-tracking.c
index 4590d8f..1f1db59 100644
--- a/tools/perf/tests/keep-tracking.c
+++ b/tools/perf/tests/keep-tracking.c
@@ -42,7 +42,7 @@ static int find_comm(struct perf_evlist *evlist, const char 
*comm)
(pid_t)event->comm.tid == getpid() &&
strcmp(event->comm.comm, comm) == 0)
found += 1;
-   perf_mmap__consume(md, false);
+   

[PATCH 6/7] perf tools: Refine perf_mmap__read_event

2018-03-05 Thread kan . liang
From: Kan Liang 

There is no need to pass 'overwrite', 'start' and 'end' to
perf_mmap__read_event().
Discard the parameters.

Signed-off-by: Kan Liang 
---
 tools/perf/arch/x86/tests/perf-time-to-tsc.c | 2 +-
 tools/perf/builtin-kvm.c | 2 +-
 tools/perf/builtin-top.c | 2 +-
 tools/perf/builtin-trace.c   | 2 +-
 tools/perf/tests/backward-ring-buffer.c  | 2 +-
 tools/perf/tests/bpf.c   | 2 +-
 tools/perf/tests/code-reading.c  | 2 +-
 tools/perf/tests/keep-tracking.c | 2 +-
 tools/perf/tests/mmap-basic.c| 2 +-
 tools/perf/tests/openat-syscall-tp-fields.c  | 2 +-
 tools/perf/tests/perf-record.c   | 2 +-
 tools/perf/tests/sw-clock.c  | 2 +-
 tools/perf/tests/switch-tracking.c   | 2 +-
 tools/perf/tests/task-exit.c | 2 +-
 tools/perf/util/mmap.c   | 8 +---
 tools/perf/util/mmap.h   | 4 +---
 tools/perf/util/python.c | 2 +-
 17 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c 
b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
index a9bc77d..17cf7fc 100644
--- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c
+++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
@@ -115,7 +115,7 @@ int test__perf_time_to_tsc(struct test *test 
__maybe_unused, int subtest __maybe
if (perf_mmap__read_init(md, false, &start, &end) < 0)
continue;
 
-   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(md)) != NULL) {
struct perf_sample sample;
 
if (event->header.type != PERF_RECORD_COMM ||
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 165c0446..e9f69b8 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -757,7 +757,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat 
*kvm, int idx,
if (err < 0)
return (err == -EAGAIN) ? 0 : -1;
 
-   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(md)) != NULL) {
err = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
if (err) {
perf_mmap__consume(md);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 11b4a41..eb19cf9 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -824,7 +824,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
if (perf_mmap__read_init(md, opts->overwrite, &start, &end) < 0)
return;
 
-   while ((event = perf_mmap__read_event(md, opts->overwrite, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(md)) != NULL) {
ret = perf_evlist__parse_sample(evlist, event, &sample);
if (ret) {
pr_err("Can't parse sample, err = %d\n", ret);
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index abc855d..c71ef7ba 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2479,7 +2479,7 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
if (perf_mmap__read_init(md, false, &start, &end) < 0)
continue;
 
-   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(md)) != NULL) {
struct perf_sample sample;
 
++trace->nr_events;
diff --git a/tools/perf/tests/backward-ring-buffer.c 
b/tools/perf/tests/backward-ring-buffer.c
index e0b1b41..e0eae10 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -36,7 +36,7 @@ static int count_samples(struct perf_evlist *evlist, int 
*sample_count,
u64 start, end;
 
perf_mmap__read_init(map, true, &start, &end);
-   while ((event = perf_mmap__read_event(map, true, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(map)) != NULL) {
const u32 type = event->header.type;
 
switch (type) {
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 09c9c9f..384c20f 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -183,7 +183,7 @@ static int do_test(struct bpf_object *obj, int 
(*func)(void),
if (perf_mmap__read_init(md, false, &start, &end) < 0)
continue;
 
-   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
+   while ((event = perf_mmap__read_event(md)) != NULL) {
const u32 type = event->header.type;
 
if (type == 

Re: [RFC, PATCH 13/22] mm, rmap: Free encrypted pages once mapcount drops to zero

2018-03-05 Thread Dave Hansen
On 03/05/2018 08:26 AM, Kirill A. Shutemov wrote:
> @@ -1292,6 +1308,12 @@ static void page_remove_anon_compound_rmap(struct page 
> *page)
>   __mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, -nr);
>   deferred_split_huge_page(page);
>   }
> +
> + anon_vma = page_anon_vma(page);
> + if (anon_vma_encrypted(anon_vma)) {
> + int keyid = anon_vma_keyid(anon_vma);
> + free_encrypt_page(page, keyid, compound_order(page));
> + }
>  }

It's not covered in the description and I'm too lazy to dig into it, so:
Without this code, where do they get freed?  Why does it not cause any
problems to free them here?


[PATCH 3/7] perf mmap: Using the stored data in perf_mmap__read_event

2018-03-05 Thread kan . liang
From: Kan Liang 

Use the 'start', 'end' and 'overwrite' values stored in struct perf_mmap
instead of the parameters of perf_mmap__read_event().
The parameters will be discarded later.

No functional change.

Signed-off-by: Kan Liang 
---
 tools/perf/util/mmap.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 4cb3614..da9e68b 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -76,8 +76,8 @@ static union perf_event *perf_mmap__read(struct perf_mmap 
*map,
  * perf_mmap__read_done()
  */
 union perf_event *perf_mmap__read_event(struct perf_mmap *map,
-   bool overwrite,
-   u64 *startp, u64 end)
+   bool overwrite __maybe_unused,
+   u64 *startp, u64 end __maybe_unused)
 {
union perf_event *event;
 
@@ -91,13 +91,14 @@ union perf_event *perf_mmap__read_event(struct perf_mmap 
*map,
return NULL;
 
/* non-overwirte doesn't pause the ringbuffer */
-   if (!overwrite)
-   end = perf_mmap__read_head(map);
+   if (!map->overwrite)
+   map->end = perf_mmap__read_head(map);
 
-   event = perf_mmap__read(map, startp, end);
+   event = perf_mmap__read(map, &map->start, map->end);
+   *startp = map->start;
 
-   if (!overwrite)
-   map->prev = *startp;
+   if (!map->overwrite)
+   map->prev = map->start;
 
return event;
 }
-- 
2.4.11



Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Kees Cook
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada wrote:
> Sorry for chiming in late.
>
> I noticed this thread today,
> honestly, the commit made me upset.
>
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.
>
>
> So, we can write:
>
>
> struct __randomize_layout path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> };
>
>
>   instead of
>
>
> struct path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> } __randomize_layout;

Ugh. I had tried this after the struct _name_, not after "struct"
itself. This does fix it, though it remains fragile, as you mention.

> If we force the former notation,
> the undefined __randomize_layout results in a build error
> instead of silent broken code generation.
>
>
> It is true somebody can still place
> __randomize_layout after the closing brace,
> but can we check this by coccicheck or checkpatch.pl?
> (we can describe it in coding style documentation, of course)
>
>
> IMHO, we should not (ab)use include/linux/kconfig.h
> to bring in misc things.

I'm happy to send a patch that reverts the other changes and relocates
all the markings...

Linus, how would you like this to go?

-Kees

-- 
Kees Cook
Pixel Security
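
The two attribute placements discussed above can be tried in userspace with any type attribute. In this minimal sketch, demo_layout_attr is a hypothetical stand-in for __randomize_layout, and aligned(16) is used only because its effect is easy to observe; it assumes a GCC-compatible compiler in C11 mode or later.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for __randomize_layout: any type attribute shows
 * the syntax. aligned(16) is used because its effect is easy to check.
 */
#define demo_layout_attr __attribute__((aligned(16)))

/*
 * Attribute between "struct" and the tag. If the macro were undefined,
 * this would read "struct demo_layout_attr path_a { ... }" and fail to
 * build.
 */
struct demo_layout_attr path_a {
	void *mnt;
	void *dentry;
};

/*
 * Attribute after the closing brace. If the macro were undefined, this
 * would still compile, silently declaring a variable named
 * demo_layout_attr of type struct path_b.
 */
struct path_b {
	void *mnt;
	void *dentry;
} demo_layout_attr;

/* With the macro defined, both placements apply the attribute to the type. */
static_assert(_Alignof(struct path_a) == _Alignof(struct path_b),
	      "both placements apply the attribute");
```

The difference only shows up when the macro is missing, which is the failure mode the thread is about: the first form turns it into a build error, the second hides it.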


Re: [PATCH v12 02/11] mm, swap: Add infrastructure for saving page metadata on swap

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> If a processor supports special metadata for a page, for example ADI
> version tags on SPARC M7, this metadata must be saved when the page is
> swapped out. The same metadata must be restored when the page is swapped
> back in. This patch adds two new architecture specific functions -
> arch_do_swap_page() to be called when a page is swapped in, and
> arch_unmap_one() to be called when a page is being unmapped for swap
> out. These architecture hooks allow page metadata to be saved if the
> architecture supports it.

I still think silently squishing cacheline-level hardware data into
page-level software data structures is dangerous.

But, you seem rather determined to do it this way.  I don't think this
will _hurt_ anyone else, though, other than needlessly cluttering up the
code.


Re: [PATCH v12 09/11] mm: Allow arch code to override copy_highpage()

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> +#ifndef __HAVE_ARCH_COPY_HIGHPAGE
> +
>  static inline void copy_highpage(struct page *to, struct page *from)
>  {
>   char *vfrom, *vto;
> @@ -248,4 +250,6 @@ static inline void copy_highpage(struct page *to, struct 
> page *from)
>   kunmap_atomic(vfrom);
>  }
>  
> +#endif

I think we prefer that these are CONFIG_* options.


Applied "regmap: debugfs: Disambiguate dummy debugfs file name" to the regmap tree

2018-03-05 Thread Mark Brown
The patch

   regmap: debugfs: Disambiguate dummy debugfs file name

has been applied to the regmap tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From a430ab205d29e7d1537b220fcf989b8080d8267f Mon Sep 17 00:00:00 2001
From: Fabio Estevam 
Date: Mon, 5 Mar 2018 15:52:09 -0300
Subject: [PATCH] regmap: debugfs: Disambiguate dummy debugfs file name

Commit 9b947a13e7f6 ("regmap: use debugfs even when no device")
allows the usage of regmap debugfs even when there is no device
associated, which causes several warnings like this:

(NULL device *): Failed to create debugfs directory

This happens when the debugfs file name is 'dummy'.

The first dummy debugfs creation works fine, but subsequent creations
fail as they have all the same name.

Disambiguate the 'dummy' debugfs file name by adding a suffix entry,
so that the names become dummy0, dummy1, dummy2, etc.

Signed-off-by: Fabio Estevam 
Signed-off-by: Mark Brown 
---
 drivers/base/regmap/regmap-debugfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c
index 7eb512ba2828..e3e7b91cc421 100644
--- a/drivers/base/regmap/regmap-debugfs.c
+++ b/drivers/base/regmap/regmap-debugfs.c
@@ -25,6 +25,7 @@ struct regmap_debugfs_node {
struct list_head link;
 };
 
+static unsigned int dummy_index;
 static struct dentry *regmap_debugfs_root;
 static LIST_HEAD(regmap_debugfs_early_list);
 static DEFINE_MUTEX(regmap_debugfs_early_lock);
@@ -573,6 +574,11 @@ void regmap_debugfs_init(struct regmap *map, const char *name)
name = devname;
}
 
+   if (!strcmp(name, "dummy")) {
+   name = kasprintf(GFP_KERNEL, "dummy%d", dummy_index);
+   dummy_index++;
+   }
+
map->debugfs = debugfs_create_dir(name, regmap_debugfs_root);
if (!map->debugfs) {
dev_warn(map->dev, "Failed to create debugfs directory\n");
-- 
2.16.2
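
The disambiguation described in the changelog above can be modelled
outside the kernel. This is only a minimal user-space sketch, not the
regmap code: unique_debugfs_name() and its caller-supplied buffer are
hypothetical, and snprintf() stands in for the kernel's kasprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counter shared by all nameless maps, mirroring dummy_index in the patch. */
static unsigned int dummy_index;

/*
 * Hypothetical helper: copy `name` into `out`, appending a running index
 * when the name is the placeholder "dummy". Returns 0 on success and -1
 * if the result does not fit in `outlen` bytes.
 */
static int unique_debugfs_name(const char *name, char *out, size_t outlen)
{
	int n;

	assert(name && out);
	if (strcmp(name, "dummy") == 0)
		n = snprintf(out, outlen, "dummy%u", dummy_index++);
	else
		n = snprintf(out, outlen, "%s", name);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Unlike the patch, the sketch checks whether formatting succeeded; the
kernel version would presumably need an equivalent check on the
kasprintf() return value.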



[PATCH v2 3/6] PCI: hv: serialize the present/eject work items

2018-03-05 Thread Dexuan Cui
When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.

The first message is offloaded to hv_eject_device_work(), and the second
is offloaded to pci_devices_present_work(). Both the paths can be running
list_del(&hpdev->list_entry), causing general protection fault, because
system_wq can run them concurrently.

The patch eliminates the race condition.

Signed-off-by: Dexuan Cui 
Tested-by: Adrian Suhov 
Tested-by: Chris Valean 
Cc: Vitaly Kuznetsov 
Cc: Jack Morgenstein 
Cc: sta...@vger.kernel.org
Cc: Stephen Hemminger 
Cc: K. Y. Srinivasan 
---
 drivers/pci/host/pci-hyperv.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 04edb24c92ee..aaee41faf55f 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -461,6 +461,8 @@ struct hv_pcibus_device {
struct retarget_msi_interrupt retarget_msi_interrupt_params;
 
spinlock_t retarget_msi_interrupt_lock;
+
+   struct workqueue_struct *wq;
 };
 
 /*
@@ -1770,7 +1772,7 @@ static void hv_pci_devices_present(struct hv_pcibus_device *hbus,
spin_unlock_irqrestore(&hbus->device_list_lock, flags);
 
get_hvpcibus(hbus);
-   schedule_work(&dr_wrk->wrk);
+   queue_work(hbus->wq, &dr_wrk->wrk);
 }
 
 /**
@@ -1845,7 +1847,7 @@ static void hv_pci_eject_device(struct hv_pci_dev *hpdev)
get_pcichild(hpdev, hv_pcidev_ref_pnp);
INIT_WORK(&hpdev->wrk, hv_eject_device_work);
get_hvpcibus(hpdev->hbus);
-   schedule_work(&hpdev->wrk);
+   queue_work(hpdev->hbus->wq, &hpdev->wrk);
 }
 
 /**
@@ -2460,11 +2462,17 @@ static int hv_pci_probe(struct hv_device *hdev,
spin_lock_init(&hbus->retarget_msi_interrupt_lock);
sema_init(&hbus->enum_sem, 1);
init_completion(&hbus->remove_event);
+   hbus->wq = alloc_ordered_workqueue("hv_pci_%x", 0,
+  hbus->sysdata.domain);
+   if (!hbus->wq) {
+   ret = -ENOMEM;
+   goto free_bus;
+   }
 
ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
 hv_pci_onchannelcallback, hbus);
if (ret)
-   goto free_bus;
+   goto destroy_wq;
 
hv_set_drvdata(hdev, hbus);
 
@@ -2533,6 +2541,8 @@ static int hv_pci_probe(struct hv_device *hdev,
hv_free_config_window(hbus);
 close:
vmbus_close(hdev->channel);
+destroy_wq:
+   destroy_workqueue(hbus->wq);
 free_bus:
free_page((unsigned long)hbus);
return ret;
@@ -2612,6 +2622,7 @@ static int hv_pci_remove(struct hv_device *hdev)
irq_domain_free_fwnode(hbus->sysdata.fwnode);
put_hvpcibus(hbus);
wait_for_completion(&hbus->remove_event);
+   destroy_workqueue(hbus->wq);
free_page((unsigned long)hbus);
return 0;
 }
-- 
2.7.4
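
The ordering guarantee the changelog relies on can be pictured with a
toy model. This is a single-threaded user-space sketch, not the kernel
workqueue API and not the pci-hyperv handlers: struct device_model,
struct ordered_queue, and every function name here are hypothetical.
It only shows that when two work items run strictly one after the
other, the second one sees the result of the first and the list entry
is unlinked exactly once.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

struct device_model {
	int on_list;      /* 1 while the device is linked on the bus list */
	int del_count;    /* how many times the entry was unlinked */
};

typedef void (*work_fn)(struct device_model *);

/* Toy ordered queue: items run strictly one at a time, in FIFO order. */
struct ordered_queue {
	work_fn fn[MAX_WORK];
	struct device_model *arg[MAX_WORK];
	size_t head, tail;
};

/* Append a work item; returns 0 on success, -1 when the queue is full. */
static int queue_item(struct ordered_queue *q, work_fn fn,
		      struct device_model *dev)
{
	assert(q && fn && dev);
	if (q->tail == MAX_WORK)
		return -1;
	q->fn[q->tail] = fn;
	q->arg[q->tail] = dev;
	q->tail++;
	return 0;
}

/* Run every queued item to completion before starting the next one. */
static void run_queue(struct ordered_queue *q)
{
	assert(q);
	while (q->head < q->tail) {
		q->fn[q->head](q->arg[q->head]);
		q->head++;
	}
}

/* Both handlers unlink the device only if it is still linked. */
static void unlink_if_present(struct device_model *dev)
{
	if (dev->on_list) {
		dev->on_list = 0;
		dev->del_count++;
	}
}

static void eject_work(struct device_model *dev)   { unlink_if_present(dev); }
static void present_work(struct device_model *dev) { unlink_if_present(dev); }
```

Queueing eject_work() and then present_work() for the same device and
calling run_queue() leaves del_count at 1. With concurrent execution,
both handlers could observe on_list == 1 before either clears it, which
is the kind of double unlink the patch is meant to rule out.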


Re: [PATCH 1/2] checkpatch: add check for tag Co-Developed-by

2018-03-05 Thread Joe Perches
On Mon, 2018-03-05 at 14:58 +1100, Tobin C. Harding wrote:
> From: Joe Perches 

I still think this "Co-Developed-by" stuff is unnecessary.

> Recently signature tag Co-Developed-by was added to the
> kernel (Documentation/process/5.Posting.rst). checkpatch.pl doesn't know
> about it yet. All prior tags used all lowercase characters except for first
> character. Checks for this format had to be re-worked to allow for the
> new tag.
> 
> Cc: Greg Kroah-Hartman 
> 
> Reviewed-by: Greg Kroah-Hartman 
> Signed-off-by: Tobin C. Harding 
> ---
>  scripts/checkpatch.pl | 58 
> +++
>  1 file changed, 35 insertions(+), 23 deletions(-)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 3d4040322ae1..fbe2ae2d035f 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -461,16 +461,18 @@ our $logFunctions = qr{(?x:
>   seq_vprintf|seq_printf|seq_puts
>  )};
>  
> -our $signature_tags = qr{(?xi:
> - Signed-off-by:|
> - Acked-by:|
> - Tested-by:|
> - Reviewed-by:|
> - Reported-by:|
> - Suggested-by:|
> - To:|
> - Cc:
> -)};
> +our @valid_signatures = (
> + "Signed-off-by:",
> + "Acked-by:",
> + "Tested-by:",
> + "Reviewed-by:",
> + "Reported-by:",
> + "Suggested-by:",
> + "Co-Developed-by:",
> + "To:",
> + "Cc:"
> +);
> +my $signature_tags = "(?x:" . join('|', @valid_signatures) . ")";
>  
>  our @typeListMisordered = (
>   qr{char\s+(?:un)?signed},
> @@ -2193,6 +2195,17 @@ sub pos_last_openparen {
>   return length(expand_tabs(substr($line, 0, $last_openparen))) + 1;
>  }
>  
> +sub get_preferred_sign_off {
> + my ($sign_off) = @_;
> +
> + foreach my $sig (@valid_signatures) {
> + if (lc($sign_off) eq lc($sig)) {
> + return $sig;
> + }
> + }
> + return "";
> +}
> +
>  sub process {
>   my $filename = shift;
>  
> @@ -2499,35 +2512,34 @@ sub process {
>   my $sign_off = $2;
>   my $space_after = $3;
>   my $email = $4;
> - my $ucfirst_sign_off = ucfirst(lc($sign_off));
> + my $preferred_sign_off = ucfirst(lc($sign_off));
>  
> - if ($sign_off !~ /$signature_tags/) {
> + if ($sign_off !~ /$signature_tags/i) {
>   WARN("BAD_SIGN_OFF",
>"Non-standard signature: $sign_off\n" . 
> $herecurr);
> - }
> - if (defined $space_before && $space_before ne "") {
> + } elsif ($sign_off !~ /$signature_tags/) {
> + $preferred_sign_off = 
> get_preferred_sign_off($sign_off);
>   if (WARN("BAD_SIGN_OFF",
> -  "Do not use whitespace before 
> $ucfirst_sign_off\n" . $herecurr) &&
> +  "'$preferred_sign_off' is the 
> preferred signature form\n" . $herecurr) &&
>   $fix) {
> - $fixed[$fixlinenr] =
> - "$ucfirst_sign_off $email";
> + $fixed[$fixlinenr] = 
> "$preferred_sign_off $email";
>   }
>   }
> - if ($sign_off =~ /-by:$/i && $sign_off ne 
> $ucfirst_sign_off) {
> + if (defined $space_before && $space_before ne "") {
>   if (WARN("BAD_SIGN_OFF",
> -  "'$ucfirst_sign_off' is the preferred 
> signature form\n" . $herecurr) &&
> +  "Do not use whitespace before 
> $preferred_sign_off\n" . $herecurr) &&
>   $fix) {
>   $fixed[$fixlinenr] =
> - "$ucfirst_sign_off $email";
> + "$preferred_sign_off $email";
>   }
> -
>   }
> +
>   if (!defined $space_after || $space_after ne " ") {
>   if (WARN("BAD_SIGN_OFF",
> -  "Use a single space after 
> $ucfirst_sign_off\n" . $herecurr) &&
> +  "Use a single space after 
> $preferred_sign_off\n" . $herecurr) &&
>   $fix) {
>   $fixed[$fixlinenr] =
> - "$ucfirst_sign_off $email";
> + "$preferred_sign_off $email";
>   }
>   }
>  
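
For readers less familiar with Perl, the lookup that the quoted
get_preferred_sign_off() performs can be sketched in C. The reason a
table is needed at all is that ucfirst(lc(...)) would turn
"Co-Developed-by:" into "Co-developed-by:". This is an illustration
only: preferred_sign_off() is a hypothetical name, it returns NULL where
the Perl version returns an empty string, and strcasecmp() is the POSIX
function from <strings.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp(), POSIX */

/* Canonical spellings, copied from @valid_signatures in the quoted patch. */
static const char *const valid_signatures[] = {
	"Signed-off-by:",
	"Acked-by:",
	"Tested-by:",
	"Reviewed-by:",
	"Reported-by:",
	"Suggested-by:",
	"Co-Developed-by:",
	"To:",
	"Cc:",
};

/*
 * Return the canonical spelling of `tag`, matched case-insensitively,
 * or NULL when the tag is not in the table.
 */
static const char *preferred_sign_off(const char *tag)
{
	size_t i;
	size_t n = sizeof(valid_signatures) / sizeof(valid_signatures[0]);

	assert(tag);
	for (i = 0; i < n; i++) {
		if (strcasecmp(tag, valid_signatures[i]) == 0)
			return valid_signatures[i];
	}
	return NULL;
}
```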


Re: [PATCH v12 02/11] mm, swap: Add infrastructure for saving page metadata on swap

2018-03-05 Thread Khalid Aziz

On 03/05/2018 12:20 PM, Dave Hansen wrote:

> On 02/21/2018 09:15 AM, Khalid Aziz wrote:
>> If a processor supports special metadata for a page, for example ADI
>> version tags on SPARC M7, this metadata must be saved when the page is
>> swapped out. The same metadata must be restored when the page is swapped
>> back in. This patch adds two new architecture specific functions -
>> arch_do_swap_page() to be called when a page is swapped in, and
>> arch_unmap_one() to be called when a page is being unmapped for swap
>> out. These architecture hooks allow page metadata to be saved if the
>> architecture supports it.
>
> I still think silently squishing cacheline-level hardware data into
> page-level software data structures is dangerous.
>
> But, you seem rather determined to do it this way.  I don't think this
> will _hurt_ anyone else, though other than needlessly cluttering up the
> code.


Hello Dave,

Thanks for taking the time to look at this patch and providing feedback.

ADI data is per page data and is held in the spare bits in the RAM. It
is loaded into the cache when data is loaded from RAM and flushed out to
the spare bits in the RAM when data is flushed from the cache. SPARC
allows one tag for each ADI-block-sized chunk of data, and the ADI block
size is the same as the cacheline size. When a page is loaded into RAM
from swap space, all of the associated ADI data for the page must also be
loaded into RAM, so it looks like page-level data, and storing it in a
page-level software data structure makes sense. I am open to other
suggestions though.


Thanks,
Khalid


Re: [PATCH v12 02/11] mm, swap: Add infrastructure for saving page metadata on swap

2018-03-05 Thread Dave Hansen
On 03/05/2018 11:29 AM, Khalid Aziz wrote:
> ADI data is per page data and is held in the spare bits in the RAM. It
> is loaded into the cache when data is loaded from RAM and flushed out to
> spare bits in the RAM when data is flushed from cache. Sparc allows one
> tag for each ADI block size of data and ADI block size is same as
> cacheline size.

Which does not square with your earlier assertion "ADI data is per page
data".  It's per-cacheline data.  Right?

> When a page is loaded into RAM from swap space, all of
> the associated ADI data for the page must also be loaded into the RAM,
> so it looks like page level data and storing it in page level software
> data structure makes sense. I am open to other suggestions though.

Do you have a way to tell that data is not being thrown away?  Like if
the ADI metadata is different for two different cachelines within a
single page?
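
One way to picture the question is a per-page save area that keeps one
tag per ADI block instead of a single value per page. The following is
a user-space sketch under stated assumptions, not the code from the
series: the 8192-byte page size, the 64-byte block size, the
one-byte-per-tag storage, and all identifiers are assumptions made for
illustration, and the "hardware" tags are just an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES      8192u   /* assumed page size */
#define ADI_BLOCK_BYTES 64u     /* assumed ADI block (cacheline) size */
#define TAGS_PER_PAGE   (PAGE_BYTES / ADI_BLOCK_BYTES)

/* One saved tag per ADI block, so nothing is collapsed to a single value. */
struct page_tags {
	unsigned char tag[TAGS_PER_PAGE];
};

/* Copy every block's tag out of the (simulated) hardware state. */
static void save_page_tags(struct page_tags *dst, const unsigned char *hw_tags)
{
	assert(dst && hw_tags);
	memcpy(dst->tag, hw_tags, TAGS_PER_PAGE);
}

/* Write every saved tag back to the (simulated) hardware state. */
static void restore_page_tags(unsigned char *hw_tags,
			      const struct page_tags *src)
{
	assert(hw_tags && src);
	memcpy(hw_tags, src->tag, TAGS_PER_PAGE);
}

/* Return 1 when all blocks in the page carry the same tag, 0 otherwise. */
static int page_tags_uniform(const struct page_tags *t)
{
	size_t i;

	assert(t);
	for (i = 1; i < TAGS_PER_PAGE; i++)
		if (t->tag[i] != t->tag[0])
			return 0;
	return 1;
}
```

A structure like this is attached to a page but still holds
cacheline-granular data, so two cachelines with different tags survive
a save and restore. A check such as page_tags_uniform() would only be
needed by a scheme that stores a single tag per page.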


Re: Regression in IPMI on 4.15.6

2018-03-05 Thread Corey Minyard

On 03/05/2018 01:31 PM, Corey Minyard wrote:

On 03/05/2018 01:07 PM, Laura Abbott wrote:

On 03/02/2018 05:46 AM, Corey Minyard wrote:

On 02/28/2018 01:07 PM, Corey Minyard wrote:

On 02/28/2018 08:17 AM, Corey Minyard wrote:

On 02/28/2018 07:53 AM, Corey Minyard wrote:

On 02/27/2018 05:55 PM, Laura Abbott wrote:

Hi,

Fedora got a bug report of a crash in IPMI on 4.15.6
https://bugzilla.redhat.com/show_bug.cgi?id=1549316
Unfortunately, it's only a screenshot but it's fairly
clear. It looks like a panic in the error handling path
in platform_device_unregister. Any ideas?





You may also run into another issue.  You can pull the
individual patch at

https://github.com/cminyard/linux-ipmi.git 
c8a1972e77dbe321ce5ce0247056e727234cbaec


Actually, it needed a few more tweaks.  Can you do change
426fa6179dae677134dfb37b21d057819418515b
instead?  It's "ipmi: Fix some error cleanup issues"

I can send you patches, if you like.  If you could test and get back
to me, that would be great.


Laura, have you had a chance to test this?  I'd like to get it in soon,
if possible.

Thanks,

-corey



I think "ipmi: Re-use existing macros for built-in properties" is broken:




That particular patch requires some new stuff.  I was just wanting you
to pull that individual patch, not the whole branch.  I can just send the
two patches, if you like.


Or, I just pulled in 4.15.6 and cherry picked those two patches to:

https://github.com/cminyard/linux-ipmi.git fix-pdev-unreg

Hopefully that makes things easier.

-corey



-corey


In file included from ./include/linux/acpi.h:28:0,
 from ./include/linux/ipmi.h:21,
 from drivers/char/ipmi/ipmi_dmi.c:7:
drivers/char/ipmi/ipmi_dmi.c: In function ‘dmi_add_platform_ipmi’:
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro ‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:79:15: note: in expansion of macro ‘PROPERTY_ENTRY_U8’
   p[pidx++] = PROPERTY_ENTRY_U8("ipmi-type", si_type);
   ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro ‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:81:14: note: in expansion of macro ‘PROPERTY_ENTRY_U8’
  p[pidx++] = PROPERTY_ENTRY_U8("slave-addr", slave_addr);
  ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro ‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:82:14: note: in expansion of macro ‘PROPERTY_ENTRY_U8’
  p[pidx++] = PROPERTY_ENTRY_U8("addr-source", SI_SMBIOS);
  ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:246:2: note: in expansion of macro ‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u16, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:107:15: note: in expansion of macro ‘PROPERTY_ENTRY_U16’
   p[pidx++] = PROPERTY_ENTRY_U16("i2c-addr", base_addr);
   ^~

I don't think that macro is actually a replacement?

Thanks,
Laura



BTW, the IPMI setup in your system is incorrect.  SMBIOS says it's at a
memory address, but it's at an I/O address.  And the address given
doesn't appear to be a valid address, the value read doesn't appear
to be a valid value.

-corey



for that fix.

-corey


Yeah, this is fixed by 174134ac7602 "ipmi_si: Fix error
handling of platform device" in mainstream.

I guess I need to request a backport of this.

Thanks for reporting.

-corey



Thanks,
Laura


Re: [PATCH v4 13/24] fpga: region: add compat_id support

2018-03-05 Thread Alan Tull
On Thu, Mar 1, 2018 at 12:17 AM, Wu Hao  wrote:
> On Wed, Feb 28, 2018 at 04:55:15PM -0600, Alan Tull wrote:
>> On Tue, Feb 13, 2018 at 3:24 AM, Wu Hao  wrote:
>>
>> Hi Hao,
>
> Hi Alan,
>
> Thanks for the review.
>
>>
>> > This patch introduces a compat_id member and sysfs interface for each
>> > fpga-region, e.g userspace applications could read the compat_id
>> > from the sysfs interface for compatibility checking before PR.
>> >
>> > Signed-off-by: Wu Hao 
>> > ---
>> >  Documentation/ABI/testing/sysfs-class-fpga-region |  5 +
>> >  drivers/fpga/fpga-region.c| 19 +++
>> >  include/linux/fpga/fpga-region.h  | 13 +
>> >  3 files changed, 37 insertions(+)
>> >  create mode 100644 Documentation/ABI/testing/sysfs-class-fpga-region
>> >
>> > diff --git a/Documentation/ABI/testing/sysfs-class-fpga-region 
>> > b/Documentation/ABI/testing/sysfs-class-fpga-region
>> > new file mode 100644
>> > index 000..419d930
>> > --- /dev/null
>> > +++ b/Documentation/ABI/testing/sysfs-class-fpga-region
>> > @@ -0,0 +1,5 @@
>> > +What:  /sys/class/fpga_region//compat_id
>> > +Date:  February 2018
>> > +KernelVersion: 4.16
>> > +Contact:   Wu Hao 
>> > +Description:   FPGA region id for compatibility check.

It would be helpful to add some explanation here that although the
intended function of compat_id is set, the way the actual value is
defined or calculated is set by the layer that is creating the FPGA
region.

>> > diff --git a/drivers/fpga/fpga-region.c b/drivers/fpga/fpga-region.c
>> > index 660a91b..babec96 100644
>> > --- a/drivers/fpga/fpga-region.c
>> > +++ b/drivers/fpga/fpga-region.c
>> > @@ -162,6 +162,24 @@ int fpga_region_program_fpga(struct fpga_region 
>> > *region)
>> >  }
>> >  EXPORT_SYMBOL_GPL(fpga_region_program_fpga);
>> >
>> > +static ssize_t compat_id_show(struct device *dev,
>> > + struct device_attribute *attr, char *buf)
>> > +{
>> > +   struct fpga_region *region = to_fpga_region(dev);
>>
>> This looks good, but not all users of FPGA are going to use compat_id.
>> How would you feel about making it a pointer in struct fpga_region?
>> With compat_id as a pointer, could check for non-null compat_id
>> pointer and return an error if it wasn't initialized.
>
> It sounds good to me.
>
> if (!region->compat_id)
> return -ENOENT;

Yes, thanks!

Alan
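
The pointer-based variant agreed on above could look roughly like the
following. This is a user-space sketch, not the fpga-region code:
struct region_model, struct region_compat_id, and
compat_id_show_model() are hypothetical stand-ins, and snprintf()
replaces the sprintf() into a sysfs buffer used in the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

struct region_compat_id {
	uint64_t id_h;   /* high 64 bits */
	uint64_t id_l;   /* low 64 bits */
};

struct region_model {
	struct region_compat_id *compat_id;   /* NULL when the region has none */
};

/*
 * Format the id as 32 hex digits plus a newline and return the number of
 * characters written, or return -ENOENT when the region was set up
 * without a compat_id.
 */
static int compat_id_show_model(const struct region_model *region,
				char *buf, size_t len)
{
	assert(region && buf);
	if (!region->compat_id)
		return -ENOENT;

	return snprintf(buf, len, "%016llx%016llx\n",
			(unsigned long long)region->compat_id->id_h,
			(unsigned long long)region->compat_id->id_l);
}
```

With this shape, a region that never sets compat_id reports an error to
the reader instead of printing an all-zero id that could be mistaken
for a real value.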

>
>>
>> > +
>> > +   return sprintf(buf, "%016llx%016llx\n",
>> > +  (unsigned long long)region->compat_id.id_h,
>> > +  (unsigned long long)region->compat_id.id_l);
>> > +}
>> > +
>> > +static DEVICE_ATTR_RO(compat_id);
>> > +
>> > +static struct attribute *fpga_region_attrs[] = {
>> > +   &dev_attr_compat_id.attr,
>> > +   NULL,
>> > +};
>> > +ATTRIBUTE_GROUPS(fpga_region);
>> > +
>> >  int fpga_region_register(struct fpga_region *region)
>> >  {
>> > struct device *dev = region->parent;
>> > @@ -226,6 +244,7 @@ static int __init fpga_region_init(void)
>> > if (IS_ERR(fpga_region_class))
>> > return PTR_ERR(fpga_region_class);
>> >
>> > +   fpga_region_class->dev_groups = fpga_region_groups;
>> > fpga_region_class->dev_release = fpga_region_dev_release;
>> >
>> > return 0;
>> > diff --git a/include/linux/fpga/fpga-region.h 
>> > b/include/linux/fpga/fpga-region.h
>> > index 423c87e..bf97dcc 100644
>> > --- a/include/linux/fpga/fpga-region.h
>> > +++ b/include/linux/fpga/fpga-region.h
>> > @@ -6,6 +6,17 @@
>> >  #include 
>> >
>> >  /**
>> > + * struct fpga_region_compat_id - FPGA Region id for compatibility check
>> > + *
>> > + * @id_h: high 64bit of the compat_id
>> > + * @id_l: low 64bit of the compat_id
>> > + */
>> > +struct fpga_region_compat_id {
>> > +   u64 id_h;
>> > +   u64 id_l;
>>
>> I guess each user will choose how to define these bits.
>
> Yes.
>
>>
>> > +};
>> > +
>> > +/**
>> >   * struct fpga_region - FPGA Region structure
>> >   * @dev: FPGA Region device
>> >   * @parent: parent device
>> > @@ -13,6 +24,7 @@
>> >   * @bridge_list: list of FPGA bridges specified in region
>> >   * @mgr: FPGA manager
>> >   * @info: FPGA image info
>> > + * @compat_id: FPGA region id for compatibility check.
>> >   * @priv: private data
>> >   * @get_bridges: optional function to get bridges to a list
>> >   * @groups: optional attribute groups.
>> > @@ -24,6 +36,7 @@ struct fpga_region {
>> > struct list_head bridge_list;
>> > struct fpga_manager *mgr;
>> > struct fpga_image_info *info;
>> > +   struct fpga_region_compat_id compat_id;
>>
>> Here it would be a pointer instead.
>
> Yes. Will update this patch.
>
> Thanks
> Hao
>
>>
>> Alan
>>
>> > void *priv;
>> > int (*get_bridges)(struct fpga_region *region);
>> > const struct attribute_group **groups;
>> > --
>> > 2.7.4
>> >

Re: usb: musb: "(null)" in sysfs mode file after disabling a gadget (and at other times, system hangs)

2018-03-05 Thread Merlijn Wajer
Hi Bin,

On 05/03/18 20:28, Bin Liu wrote:

> The musb udc driver sets the state to b_idle without checking a
> gadget driver, this should be cleaned up. I have add this in my backlog.
> But if this issue doesn't bother you much right now, I will make the
> action low priority and address it later whenever I got time. (likely
> not very soon, I have a hand full of musb driver bugs to fix...)

I can try to fix it this (or next) week. Do you have a pointer for me?

Cheers,
Merlijn





[PATCH 003/103] sched, treewide: Replace hardcoded nice values with MIN_NICE/MAX_NICE

2018-03-05 Thread micky387
From: Dongsheng Yang 

Replace various -20/+19 hardcoded nice values with MIN_NICE/MAX_NICE.

Signed-off-by: Dongsheng Yang 
Acked-by: Tejun Heo 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/ff13819fd09b7a5dba5ab5ae797f2e7019bdfa17.1394532288.git.yangds.f...@cn.fujitsu.com
Cc: de...@driverdev.osuosl.org
Cc: devicet...@vger.kernel.org
Cc: fcoe-de...@open-fcoe.org
Cc: linux...@de.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-s...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: nbd-gene...@lists.sourceforge.net
Cc: ocfs2-de...@oss.oracle.com
Cc: openipmi-develo...@lists.sourceforge.net
Cc: qla2xxx-upstr...@qlogic.com
Cc: linux-a...@vger.kernel.org
[ Consolidated the patches, twiddled the changelog. ]
Signed-off-by: Ingo Molnar 

Change-Id: I00a4ccd66fcc206211f462245d98d35a853f8264
---
 drivers/block/loop.c  | 2 +-
 drivers/block/nbd.c   | 2 +-
 drivers/block/pktcdvd.c   | 2 +-
 drivers/char/ipmi/ipmi_si_intf.c  | 2 +-
 drivers/s390/crypto/ap_bus.c  | 2 +-
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 4 ++--
 drivers/scsi/bnx2i/bnx2i_hwi.c| 2 +-
 drivers/scsi/fcoe/fcoe.c  | 2 +-
 drivers/scsi/ibmvscsi/ibmvfc.c| 2 +-
 drivers/scsi/ibmvscsi/ibmvscsi.c  | 2 +-
 drivers/scsi/lpfc/lpfc_hbadisc.c  | 2 +-
 drivers/scsi/qla2xxx/qla_os.c | 2 +-
 fs/ocfs2/cluster/heartbeat.c  | 2 +-
 kernel/workqueue.c| 6 +++---
 mm/huge_memory.c  | 2 +-
 15 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 333458c..029e43c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -547,7 +547,7 @@ static int loop_thread(void *data)
struct loop_device *lo = data;
struct bio *bio;
 
-   set_user_nice(current, -20);
+   set_user_nice(current, MIN_NICE);
 
while (!kthread_should_stop() || !bio_list_empty(&lo->lo_bio_list)) {
 
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index d593fa5..f1a2da8 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -533,7 +533,7 @@ static int nbd_thread(void *data)
struct nbd_device *nbd = data;
struct request *req;
 
-   set_user_nice(current, -20);
+   set_user_nice(current, MIN_NICE);
while (!kthread_should_stop() || !list_empty(&nbd->waiting_queue)) {
/* wait for something to do */
wait_event_interruptible(nbd->waiting_wq,
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index caddb5d..14a8075 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -1471,7 +1471,7 @@ static int kcdrwd(void *foobar)
struct packet_data *pkt;
long min_sleep_time, residue;
 
-   set_user_nice(current, -20);
+   set_user_nice(current, MIN_NICE);
set_freezable();
 
for (;;) {
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index a67ac2a..fc22dec 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -992,7 +992,7 @@ static int ipmi_thread(void *data)
struct timespec busy_until;
 
ipmi_si_set_not_busy(&busy_until);
-   set_user_nice(current, 19);
+   set_user_nice(current, MAX_NICE);
while (!kthread_should_stop()) {
int busy_wait;
 
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 6f512fa..b30ffb8 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -1755,7 +1755,7 @@ static int ap_poll_thread(void *data)
int requests;
struct ap_device *ap_dev;
 
-   set_user_nice(current, 19);
+   set_user_nice(current, MAX_NICE);
while (1) {
if (ap_suspend_flag)
return 0;
diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index aad5535..ff08516 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -471,7 +471,7 @@ static int bnx2fc_l2_rcv_thread(void *arg)
struct fcoe_percpu_s *bg = arg;
struct sk_buff *skb;
 
-   set_user_nice(current, -20);
+   set_user_nice(current, MIN_NICE);
set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
schedule();
@@ -610,7 +610,7 @@ int bnx2fc_percpu_io_thread(void *arg)
struct bnx2fc_work *work, *tmp;
LIST_HEAD(work_list);
 
-   set_user_nice(current, -20);
+   set_user_nice(current, MIN_NICE);
set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
schedule();
diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index a28b03e..a95ea80 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -1870,7 +1870,7 @@ int 

Re: [PATCH] uprobe: add support for overlayfs

2018-03-05 Thread Josef Bacik
On Tue, Feb 27, 2018 at 04:40:14PM -0800, Howard McLauchlan wrote:
> uprobes cannot successfully attach to binaries located in a directory
> mounted with overlayfs.
> 
> To verify, create directories for mounting overlayfs
> (upper,lower,work,merge), move some binary into merge/ and use readelf
> to obtain some known instruction of the binary. I used /bin/true and the
> entry instruction(0x13b0):
> 
>   $ mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merge
>   $ cd /sys/kernel/debug/tracing
>   $ echo 'p:true_entry PATH_TO_MERGE/merge/true:0x13b0' > uprobe_events
>   $ echo 1 > events/uprobes/true_entry/enable
> 
> This returns 'bash: echo: write error: Input/output error' and dmesg
> tells us 'event trace: Could not enable event true_entry'
> 
> This change makes create_trace_uprobe() look for the real inode of a
> dentry. In the case of normal filesystems, this simplifies to just
> returning the inode. In the case of overlayfs(and similar fs) we will
> obtain the underlying dentry and corresponding inode, upon which uprobes
> can successfully register.
> 
> Running the example above with the patch applied, we can see that the
> uprobe is enabled and will output to trace as expected.
> 
> Signed-off-by: Howard McLauchlan 

Reviewed-by: Josef Bacik 

Thanks,

Josef


Re: [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it

2018-03-05 Thread Kees Cook
On Mon, Mar 5, 2018 at 11:42 AM, Dave Hansen
 wrote:
> On 03/05/2018 11:34 AM, Kees Cook wrote:
>> Boris, Andy, and Dave (Hansen), you've all looked at this; would you
>> be willing to give an Ack on the x86 parts? (Though I do now see a new
>> comment from Dave was just sent.) And if not, what changes would you
>> like to see?
>
> I think it could definitely use another cleanup and de-#ifdef'ing pass.
> It seems to have inherited the style from the original code and it's a
> bit more than we're used to in mainline.

There are a few places it could be minimized, that's true. It looked
like it might not be worth it, but the places I see are:

include/linux/compiler.h:
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+/* Poison value points to the unused hole in the virtual memory map */
+# define STACKLEAK_POISON -0xBEEF
+# define STACKLEAK_POISON_CHECK_DEPTH 128
+#endif

This doesn't need an #ifdef wrapper...


arch/x86/kernel/process_64.c and arch/x86/kernel/process_32.c:
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+   p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+   2 * sizeof(unsigned long);
+#endif

This could be made into a helper function, maybe, in processor.h? Like:

#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
# define save_lowest_stack(p) do { \
p->thread.lowest_stack = (unsigned long)task_stack_page(p) + \
  2 * sizeof(unsigned long); \
} while (0)
#else
# define save_lowest_stack(p) do { } while (0)
#endif

And the uses in process_*.c would be:

save_lowest_stack(p);

?
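
For comparison, the same helper could also be written as a static
inline function, which keeps argument type checking in both
configurations. This is a user-space sketch only: struct task_model,
struct thread_model, and the stack_page field are hypothetical
stand-ins for the task structure and task_stack_page(p), and
CONFIG_GCC_PLUGIN_STACKLEAK is defined by hand so that the enabled
branch is the one compiled.

```c
#include <assert.h>

/* Defined here so the sketch exercises the enabled branch. */
#define CONFIG_GCC_PLUGIN_STACKLEAK 1

struct thread_model {
	unsigned long lowest_stack;
};

struct task_model {
	struct thread_model thread;
	unsigned long stack_page;   /* stand-in for task_stack_page(p) */
};

#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
/* Record the lowest usable stack address, two words above the stack base. */
static inline void save_lowest_stack(struct task_model *p)
{
	assert(p);
	p->thread.lowest_stack = p->stack_page + 2 * sizeof(unsigned long);
}
#else
/* Feature compiled out: callers need no #ifdef of their own. */
static inline void save_lowest_stack(struct task_model *p)
{
	(void)p;
}
#endif
```

Either form lets process_64.c and process_32.c call save_lowest_stack(p)
unconditionally, which is the point of the de-#ifdef'ing pass.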


And "fs/proc: Show STACKLEAK metrics in the /proc file system" could
maybe be adjusted too?

It doesn't seem like a lot of savings, but what do you think?

One new thing did pop out at me in this review, track_stack() likely
shouldn't live in fs/exec.c. It has nothing to do with exec(). There
aren't a lot of good places, but maybe a better place would be
mm/util.c. (A whole new source file seems like overkill.)

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH AUTOSEL for 4.9 124/219] ARM: dts: rockchip: disable arm-global-timer for rk3188

2018-03-05 Thread Sasha Levin
On Mon, Mar 05, 2018 at 02:19:42AM +0300, Alexander Kochetkov wrote:
>Hello, Sasha!
>
>Following 2 patches must be applied together with the patch:
>5e0a39d0f727b35c8b7ef56ba0724c8ceb006297 clocksource/drivers/rockchip_timer: Implement clocksource timer
>627988a66aee3c845aa2f1f874a3ddba8adb89d9 ARM: dts: rockchip: Add timer entries to rk3188 SoC

Hey Alexander,

A bit too much for stable, I'll just drop the patch. Thanks!

-- 

Thanks,
Sasha

Re: [PATCH 1/7] genalloc: track beginning of allocations

2018-03-05 Thread J Freyensee

.
.


On 2/28/18 12:06 PM, Igor Stoppa wrote:

+
+/**
+ * gen_pool_dma_alloc() - allocate special memory from the pool for DMA usage
+ * @pool: pool to allocate from
+ * @size: number of bytes to allocate from the pool
+ * @dma: dma-view physical address return value.  Use NULL if unneeded.
+ *
+ * Allocate the requested number of bytes from the specified pool.
+ * Uses the pool allocation function (with first-fit algorithm by default).
+ * Can not be used in NMI handler on architectures without
+ * NMI-safe cmpxchg implementation.
+ *
+ * Return:
+ * * address of the memory allocated   - success
+ * * NULL  - error
+ */
+void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size, dma_addr_t *dma);
+


OK, so gen_pool_dma_alloc() is defined here, which I believe is the API
line being drawn for this series.


so,
.
.
.


  
  /**

- * gen_pool_dma_alloc - allocate special memory from the pool for DMA usage
+ * gen_pool_dma_alloc() - allocate special memory from the pool for DMA usage
   * @pool: pool to allocate from
   * @size: number of bytes to allocate from the pool
   * @dma: dma-view physical address return value.  Use NULL if unneeded.
@@ -342,14 +566,15 @@ EXPORT_SYMBOL(gen_pool_alloc_algo);
   * Uses the pool allocation function (with first-fit algorithm by default).
   * Can not be used in NMI handler on architectures without
   * NMI-safe cmpxchg implementation.
+ *
+ * Return:
+ * * address of the memory allocated   - success
+ * * NULL  - error
   */
  void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size, dma_addr_t *dma)
  {
unsigned long vaddr;
  
-	if (!pool)

-   return NULL;
-
Why is this being removed?  I don't believe this code was getting
removed from your v17 series patches.

vaddr = gen_pool_alloc(pool, size);
if (!vaddr)
return NULL;
@@ -362,10 +587,10 @@ void *gen_pool_dma_alloc(struct gen_pool *pool, size_t 
size, dma_addr_t *dma)
  EXPORT_SYMBOL(gen_pool_dma_alloc);
  


Otherwise, looks good,

Reviewed-by: Jay Freyensee 


Re: [RFC, PATCH 18/22] x86/mm: Handle allocation of encrypted pages

2018-03-05 Thread Dave Hansen
On 03/05/2018 08:26 AM, Kirill A. Shutemov wrote:
> kmap_atomic_keyid() would map the page with the specified KeyID.
> For now it's dummy implementation that would be replaced later.

I think you need to explain the tradeoffs here.  We could just change
the linear map around, but you don't.  Why?


Re: Regression in IPMI on 4.15.6

2018-03-05 Thread Laura Abbott

On 03/02/2018 05:46 AM, Corey Minyard wrote:

On 02/28/2018 01:07 PM, Corey Minyard wrote:

On 02/28/2018 08:17 AM, Corey Minyard wrote:

On 02/28/2018 07:53 AM, Corey Minyard wrote:

On 02/27/2018 05:55 PM, Laura Abbott wrote:

Hi,

Fedora got a bug report of a crash in IPMI on 4.15.6
https://bugzilla.redhat.com/show_bug.cgi?id=1549316
Unfortunately, it's only a screenshot but it's fairly
clear. It looks like a panic in the error handling path
in platform_device_unregister. Any ideas?





You may also run into another issue.  You can pull the
individual patch at

https://github.com/cminyard/linux-ipmi.git 
c8a1972e77dbe321ce5ce0247056e727234cbaec


Actually, it needed a few more tweaks.  Can you do change
426fa6179dae677134dfb37b21d057819418515b
instead?  It's "ipmi: Fix some error cleanup issues"

I can send you patches, if you like.  If you could test and get back
to me, that would be great.


Laura, have you had a chance to test this?  I'd like to get it in soon,
if possible.

Thanks,

-corey



I think "ipmi: Re-use existing macros for built-in properties" is broken:

In file included from ./include/linux/acpi.h:28:0,
 from ./include/linux/ipmi.h:21,
 from drivers/char/ipmi/ipmi_dmi.c:7:
drivers/char/ipmi/ipmi_dmi.c: In function ‘dmi_add_platform_ipmi’:
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro 
‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:79:15: note: in expansion of macro 
‘PROPERTY_ENTRY_U8’
   p[pidx++] = PROPERTY_ENTRY_U8("ipmi-type", si_type);
   ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro 
‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:81:14: note: in expansion of macro 
‘PROPERTY_ENTRY_U8’
  p[pidx++] = PROPERTY_ENTRY_U8("slave-addr", slave_addr);
  ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:244:2: note: in expansion of macro 
‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u8, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:82:14: note: in expansion of macro 
‘PROPERTY_ENTRY_U8’
  p[pidx++] = PROPERTY_ENTRY_U8("addr-source", SI_SMBIOS);
  ^
./include/linux/property.h:236:1: error: expected expression before ‘{’ token
 {   \
 ^
./include/linux/property.h:246:2: note: in expansion of macro 
‘PROPERTY_ENTRY_INTEGER’
  PROPERTY_ENTRY_INTEGER(_name_, u16, _val_)
  ^~
drivers/char/ipmi/ipmi_dmi.c:107:15: note: in expansion of macro 
‘PROPERTY_ENTRY_U16’
   p[pidx++] = PROPERTY_ENTRY_U16("i2c-addr", base_addr);
   ^~

I don't think that macro is actually a replacement?

Thanks,
Laura



BTW, the IPMI setup in your system is incorrect.  SMBIOS says it's at a
memory address, but it's at an I/O address.  And the address given
doesn't appear to be a valid address, the value read doesn't appear
to be a valid value.

-corey



for that fix.

-corey


Yeah, this is fixed by 174134ac7602 "ipmi_si: Fix error
handling of platform device" in mainstream.

I guess I need to request a backport of this.

Thanks for reporting.

-corey



Thanks,
Laura













Re: [RFC, PATCH 19/22] x86/mm: Implement free_encrypt_page()

2018-03-05 Thread Dave Hansen
On 03/05/2018 08:26 AM, Kirill A. Shutemov wrote:
> +void free_encrypt_page(struct page *page, int keyid, unsigned int order)
> +{
> + int i;
> + void *v;
> +
> + for (i = 0; i < (1 << order); i++) {
> + v = kmap_atomic_keyid(page, keyid + i);
> + /* See comment in prep_encrypt_page() */
> + clflush_cache_range(v, PAGE_SIZE);
> + kunmap_atomic(v);
> + }
> +}

Have you measured how slow this is?

It's an optimization, but can we find a way to only do this dance when
we *actually* change the keyid?  Right now, we're doing mapping at alloc
and free, clflushing at free and zeroing at alloc.  Let's say somebody does:

ptr = malloc(PAGE_SIZE);
*ptr = foo;
free(ptr);

ptr = malloc(PAGE_SIZE);
*ptr = bar;
free(ptr);

And let's say ptr is in encrypted memory and that we actually munmap()
at free().  We can theoretically skip the clflush, right?
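
The trade-off in question can be illustrated with a minimal user-space sketch. It assumes a lazy scheme in which free does nothing and the flush is deferred to the next allocation, where it is issued only if the KeyID differs. The names `fake_page`, `fake_free_eager` and `fake_alloc_lazy` are hypothetical and exist only for this illustration; nothing here claims that the deferred flush is safe on real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct fake_page {
	int last_keyid;		/* KeyID the page was last mapped with */
	unsigned int flushes;	/* number of cache flushes issued so far */
};

/* Eager scheme, as in the quoted patch: every free flushes the page. */
void fake_free_eager(struct fake_page *page)
{
	assert(page != NULL);
	page->flushes++;
}

/*
 * Lazy scheme (an assumption, not the patch): free does nothing, and the
 * flush is issued at allocation time only when the KeyID changes.
 */
void fake_alloc_lazy(struct fake_page *page, int keyid)
{
	assert(page != NULL);
	if (page->last_keyid != keyid) {
		page->flushes++;
		page->last_keyid = keyid;
	}
}
```

With the malloc/free sequence above, where both allocations use the same KeyID, the lazy variant issues no flush at all, while the eager variant issues one per free.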


[PATCH v11 3/7] iomap: introduce io{read|write}64_{lo_hi|hi_lo}

2018-03-05 Thread Logan Gunthorpe
In order to provide non-atomic functions for io{read|write}64 that will
use readq and writeq when appropriate, we define a number of variants
of these functions in the generic iomap that will do non-atomic
operations on pio but atomic operations on mmio.

These functions are only defined if readq and writeq are defined. If
they are not, then the wrappers that always use non-atomic operations
from include/linux/io-64-nonatomic*.h will be used.

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Andy Shevchenko 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Arnd Bergmann 
Cc: Suresh Warrier 
Cc: Nicholas Piggin 
---
 arch/powerpc/include/asm/io.h |   2 +
 include/asm-generic/iomap.h   |  26 +++--
 lib/iomap.c   | 132 ++
 3 files changed, 154 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index af074923d598..4cc420cfaa78 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -788,8 +788,10 @@ extern void __iounmap_at(void *ea, unsigned long size);
 
 #define mmio_read16be(addr)		readw_be(addr)
 #define mmio_read32be(addr)		readl_be(addr)
+#define mmio_read64be(addr)		readq_be(addr)
 #define mmio_write16be(val, addr)	writew_be(val, addr)
 #define mmio_write32be(val, addr)	writel_be(val, addr)
+#define mmio_write64be(val, addr)	writeq_be(val, addr)
 #define mmio_insb(addr, dst, count)	readsb(addr, dst, count)
 #define mmio_insw(addr, dst, count)	readsw(addr, dst, count)
 #define mmio_insl(addr, dst, count)	readsl(addr, dst, count)
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 5b63b94ef6b5..5a4af0199b32 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -31,9 +31,16 @@ extern unsigned int ioread16(void __iomem *);
 extern unsigned int ioread16be(void __iomem *);
 extern unsigned int ioread32(void __iomem *);
 extern unsigned int ioread32be(void __iomem *);
-#ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+
+#ifdef readq
+#define ioread64_lo_hi ioread64_lo_hi
+#define ioread64_hi_lo ioread64_hi_lo
+#define ioread64be_lo_hi ioread64be_lo_hi
+#define ioread64be_hi_lo ioread64be_hi_lo
+extern u64 ioread64_lo_hi(void __iomem *addr);
+extern u64 ioread64_hi_lo(void __iomem *addr);
+extern u64 ioread64be_lo_hi(void __iomem *addr);
+extern u64 ioread64be_hi_lo(void __iomem *addr);
 #endif
 
 extern void iowrite8(u8, void __iomem *);
@@ -41,9 +48,16 @@ extern void iowrite16(u16, void __iomem *);
 extern void iowrite16be(u16, void __iomem *);
 extern void iowrite32(u32, void __iomem *);
 extern void iowrite32be(u32, void __iomem *);
-#ifdef CONFIG_64BIT
-extern void iowrite64(u64, void __iomem *);
-extern void iowrite64be(u64, void __iomem *);
+
+#ifdef writeq
+#define iowrite64_lo_hi iowrite64_lo_hi
+#define iowrite64_hi_lo iowrite64_hi_lo
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+extern void iowrite64_lo_hi(u64 val, void __iomem *addr);
+extern void iowrite64_hi_lo(u64 val, void __iomem *addr);
+extern void iowrite64be_lo_hi(u64 val, void __iomem *addr);
+extern void iowrite64be_hi_lo(u64 val, void __iomem *addr);
 #endif
 
 /*
diff --git a/lib/iomap.c b/lib/iomap.c
index 541d926da95e..d324b6c013af 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -67,6 +67,7 @@ static void bad_io_access(unsigned long port, const char 
*access)
 #ifndef mmio_read16be
 #define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
 #define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
+#define mmio_read64be(addr) be64_to_cpu(__raw_readq(addr))
 #endif
 
 unsigned int ioread8(void __iomem *addr)
@@ -100,6 +101,80 @@ EXPORT_SYMBOL(ioread16be);
 EXPORT_SYMBOL(ioread32);
 EXPORT_SYMBOL(ioread32be);
 
+#ifdef readq
+static u64 pio_read64_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = inl(port);
+   hi = inl(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = inl(port + sizeof(u32));
+   lo = inl(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = pio_read32be(port + sizeof(u32));
+   hi = pio_read32be(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = pio_read32be(port);
+   lo = pio_read32be(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+u64 ioread64_lo_hi(void __iomem *addr)
+{
+   IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
+   return 0xffffffffffffffffULL;
+}
+

[PATCH v11 2/7] powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}

2018-03-05 Thread Logan Gunthorpe
These functions will be introduced into the generic iomap.c so
they can deal with PIO accesses in hi-lo/lo-hi variants. Thus,
the powerpc version of iomap.c will need to provide the same
functions even though, in this arch, they are identical to the
regular io{read|write}64 functions.

Signed-off-by: Logan Gunthorpe 
Tested-by: Horia Geantă 
Reviewed-by: Andy Shevchenko 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
---
 arch/powerpc/kernel/iomap.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
index aab456ed2a00..5ac84efc6ede 100644
--- a/arch/powerpc/kernel/iomap.c
+++ b/arch/powerpc/kernel/iomap.c
@@ -45,12 +45,32 @@ u64 ioread64(void __iomem *addr)
 {
return readq(addr);
 }
+u64 ioread64_lo_hi(void __iomem *addr)
+{
+   return readq(addr);
+}
+u64 ioread64_hi_lo(void __iomem *addr)
+{
+   return readq(addr);
+}
 u64 ioread64be(void __iomem *addr)
 {
return readq_be(addr);
 }
+u64 ioread64be_lo_hi(void __iomem *addr)
+{
+   return readq_be(addr);
+}
+u64 ioread64be_hi_lo(void __iomem *addr)
+{
+   return readq_be(addr);
+}
 EXPORT_SYMBOL(ioread64);
+EXPORT_SYMBOL(ioread64_lo_hi);
+EXPORT_SYMBOL(ioread64_hi_lo);
 EXPORT_SYMBOL(ioread64be);
+EXPORT_SYMBOL(ioread64be_lo_hi);
+EXPORT_SYMBOL(ioread64be_hi_lo);
 #endif /* __powerpc64__ */
 
 void iowrite8(u8 val, void __iomem *addr)
@@ -83,12 +103,32 @@ void iowrite64(u64 val, void __iomem *addr)
 {
writeq(val, addr);
 }
+void iowrite64_lo_hi(u64 val, void __iomem *addr)
+{
+   writeq(val, addr);
+}
+void iowrite64_hi_lo(u64 val, void __iomem *addr)
+{
+   writeq(val, addr);
+}
 void iowrite64be(u64 val, void __iomem *addr)
 {
writeq_be(val, addr);
 }
+void iowrite64be_lo_hi(u64 val, void __iomem *addr)
+{
+   writeq_be(val, addr);
+}
+void iowrite64be_hi_lo(u64 val, void __iomem *addr)
+{
+   writeq_be(val, addr);
+}
 EXPORT_SYMBOL(iowrite64);
+EXPORT_SYMBOL(iowrite64_lo_hi);
+EXPORT_SYMBOL(iowrite64_hi_lo);
 EXPORT_SYMBOL(iowrite64be);
+EXPORT_SYMBOL(iowrite64be_lo_hi);
+EXPORT_SYMBOL(iowrite64be_hi_lo);
 #endif /* __powerpc64__ */
 
 /*
-- 
2.11.0



[PATCH v11 4/7] io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros

2018-03-05 Thread Logan Gunthorpe
This patch adds generic io{read|write}64[be]{_lo_hi|_hi_lo} macros if
they are not already defined by the architecture. (As they are provided
by the generic iomap library).

The patch also points io{read|write}64[be] to the variant specified by the
header name.

This is because new drivers are encouraged to use ioreadXX, et al instead
of readX[1], et al -- and mixing ioreadXX with readq is pretty ugly.

[1] LDD3: section 9.4.2

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Andy Shevchenko 
Cc: Christoph Hellwig 
Cc: Arnd Bergmann 
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
---
 include/linux/io-64-nonatomic-hi-lo.h | 64 +++
 include/linux/io-64-nonatomic-lo-hi.h | 64 +++
 2 files changed, 128 insertions(+)

diff --git a/include/linux/io-64-nonatomic-hi-lo.h 
b/include/linux/io-64-nonatomic-hi-lo.h
index 862d786a904f..ae21b72cce85 100644
--- a/include/linux/io-64-nonatomic-hi-lo.h
+++ b/include/linux/io-64-nonatomic-hi-lo.h
@@ -55,4 +55,68 @@ static inline void hi_lo_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed hi_lo_writeq_relaxed
 #endif
 
+#ifndef ioread64_hi_lo
+#define ioread64_hi_lo ioread64_hi_lo
+static inline u64 ioread64_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32(addr + sizeof(u32));
+   low = ioread32(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_hi_lo
+#define iowrite64_hi_lo iowrite64_hi_lo
+static inline void iowrite64_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32(val >> 32, addr + sizeof(u32));
+   iowrite32(val, addr);
+}
+#endif
+
+#ifndef ioread64be_hi_lo
+#define ioread64be_hi_lo ioread64be_hi_lo
+static inline u64 ioread64be_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32be(addr);
+   low = ioread32be(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_hi_lo
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+static inline void iowrite64be_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32be(val >> 32, addr);
+   iowrite32be(val, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64
+#define ioread64_is_nonatomic
+#define ioread64 ioread64_hi_lo
+#endif
+
+#ifndef iowrite64
+#define iowrite64_is_nonatomic
+#define iowrite64 iowrite64_hi_lo
+#endif
+
+#ifndef ioread64be
+#define ioread64be_is_nonatomic
+#define ioread64be ioread64be_hi_lo
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be_is_nonatomic
+#define iowrite64be iowrite64be_hi_lo
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_HI_LO_H_ */
diff --git a/include/linux/io-64-nonatomic-lo-hi.h 
b/include/linux/io-64-nonatomic-lo-hi.h
index d042e7bb5adb..faaa842dbdb9 100644
--- a/include/linux/io-64-nonatomic-lo-hi.h
+++ b/include/linux/io-64-nonatomic-lo-hi.h
@@ -55,4 +55,68 @@ static inline void lo_hi_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed lo_hi_writeq_relaxed
 #endif
 
+#ifndef ioread64_lo_hi
+#define ioread64_lo_hi ioread64_lo_hi
+static inline u64 ioread64_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32(addr);
+   high = ioread32(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_lo_hi
+#define iowrite64_lo_hi iowrite64_lo_hi
+static inline void iowrite64_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32(val, addr);
+   iowrite32(val >> 32, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64be_lo_hi
+#define ioread64be_lo_hi ioread64be_lo_hi
+static inline u64 ioread64be_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32be(addr + sizeof(u32));
+   high = ioread32be(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_lo_hi
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+static inline void iowrite64be_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32be(val, addr + sizeof(u32));
+   iowrite32be(val >> 32, addr);
+}
+#endif
+
+#ifndef ioread64
+#define ioread64_is_nonatomic
+#define ioread64 ioread64_lo_hi
+#endif
+
+#ifndef iowrite64
+#define iowrite64_is_nonatomic
+#define iowrite64 iowrite64_lo_hi
+#endif
+
+#ifndef ioread64be
+#define ioread64be_is_nonatomic
+#define ioread64be ioread64be_lo_hi
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be_is_nonatomic
+#define iowrite64be iowrite64be_lo_hi
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_LO_HI_H_ */
-- 
2.11.0
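
For readers unfamiliar with the lo-hi/hi-lo distinction, here is a minimal user-space sketch of the non-atomic little-endian read variants. It assumes a fake device made of two 32-bit words plus an access log; `fake_dev`, `fake_read32`, `fake_read64_lo_hi` and `fake_read64_hi_lo` are hypothetical names and not kernel API. Both variants assemble the same 64-bit value and differ only in which half is read first, which matters for hardware that latches or clears state on the first access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device: two 32-bit words plus an access log. */
struct fake_dev {
	uint32_t word[2];	/* word[0] = low half, word[1] = high half */
	int order[2];		/* word indices in the order they were read */
	int nreads;
};

/* Read one 32-bit half and record which half was touched. */
uint32_t fake_read32(struct fake_dev *dev, size_t idx)
{
	assert(idx < 2 && dev->nreads < 2);
	dev->order[dev->nreads++] = (int)idx;
	return dev->word[idx];
}

/* Low half first, then high half. */
uint64_t fake_read64_lo_hi(struct fake_dev *dev)
{
	uint32_t low, high;

	dev->nreads = 0;
	low = fake_read32(dev, 0);
	high = fake_read32(dev, 1);

	return low + ((uint64_t)high << 32);
}

/* High half first, then low half. */
uint64_t fake_read64_hi_lo(struct fake_dev *dev)
{
	uint32_t low, high;

	dev->nreads = 0;
	high = fake_read32(dev, 1);
	low = fake_read32(dev, 0);

	return low + ((uint64_t)high << 32);
}
```

The access log is only there to make the ordering observable in a test; a real driver would pick the header matching the order its hardware requires.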



Re: [PATCH v12 08/11] mm: Clear arch specific VM flags on protection change

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> +/* Arch-specific flags to clear when updating VM flags on protection change 
> */
> +#ifndef VM_ARCH_CLEAR
> +# define VM_ARCH_CLEAR   VM_NONE
> +#endif
> +#define VM_FLAGS_CLEAR   (ARCH_VM_PKEY_FLAGS | VM_ARCH_CLEAR)

Shouldn't this be defining

# define VM_ARCH_CLEAR  ARCH_VM_PKEY_FLAGS

on x86?


[PATCH v2 6/6] PCI: hv: fix 2 hang issues in hv_compose_msi_msg()

2018-03-05 Thread Dexuan Cui
1. With the patch "x86/vector/msi: Switch to global reservation mode"
(4900be8360), the recent v4.15 and newer kernels always hang for 1-vCPU
Hyper-V VM with SR-IOV. This is because when we reach hv_compose_msi_msg()
by request_irq()  -> request_threaded_irq() -> __setup_irq()->irq_startup()
 -> __irq_startup() -> irq_domain_activate_irq() -> ... ->
msi_domain_activate() -> ... -> hv_compose_msi_msg(), local irq is
disabled in __setup_irq().

Fix this by polling the channel.

2. If the host is ejecting the VF device before we reach
hv_compose_msi_msg(), in a UP VM, we can hang in hv_compose_msi_msg()
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue also happens to old kernels like
v4.14, v4.13, etc.

Fix this by polling the channel for the PCI_EJECT message and
hpdev->state, and by checking the PCI vendor ID.

Note: the above issues can also happen in an SMP VM, if
"hbus->hdev->channel->target_cpu == smp_processor_id()" is true.

Signed-off-by: Dexuan Cui 
Tested-by: Adrian Suhov 
Tested-by: Chris Valean 
Cc: sta...@vger.kernel.org
Cc: Stephen Hemminger 
Cc: K. Y. Srinivasan 
Cc: Vitaly Kuznetsov 
Cc: Jack Morgenstein 
---
 drivers/pci/host/pci-hyperv.c | 58 ++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index d3aa6736a9bb..114624dfbd97 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -521,6 +521,8 @@ struct hv_pci_compl {
s32 completion_status;
 };
 
+static void hv_pci_onchannelcallback(void *context);
+
 /**
  * hv_pci_generic_compl() - Invoked for a completion packet
  * @context:   Set up by the sender of the packet.
@@ -665,6 +667,31 @@ static void _hv_pcifront_read_config(struct hv_pci_dev 
*hpdev, int where,
}
 }
 
+static u16 hv_pcifront_get_vendor_id(struct hv_pci_dev *hpdev)
+{
+   u16 ret;
+   unsigned long flags;
+   void __iomem *addr = hpdev->hbus->cfg_addr + CFG_PAGE_OFFSET +
+PCI_VENDOR_ID;
+
+   spin_lock_irqsave(&hpdev->hbus->config_lock, flags);
+
+   /* Choose the function to be read. (See comment above) */
+   writel(hpdev->desc.win_slot.slot, hpdev->hbus->cfg_addr);
+   /* Make sure the function was chosen before we start reading. */
+   mb();
+   /* Read from that function's config space. */
+   ret = readw(addr);
+   /*
+* mb() is not required here, because the spin_unlock_irqrestore()
+* is a barrier.
+*/
+
+   spin_unlock_irqrestore(&hpdev->hbus->config_lock, flags);
+
+   return ret;
+}
+
 /**
  * _hv_pcifront_write_config() - Internal PCI config write
  * @hpdev: The PCI driver's representation of the device
@@ -1107,8 +1134,37 @@ static void hv_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
 * Since this function is called with IRQ locks held, can't
 * do normal wait for completion; instead poll.
 */
-   while (!try_wait_for_completion(&comp.comp_pkt.host_event))
+   while (!try_wait_for_completion(&comp.comp_pkt.host_event)) {
+   /* 0xFFFF means an invalid PCI VENDOR ID. */
+   if (hv_pcifront_get_vendor_id(hpdev) == 0xFFFF) {
+   dev_err_once(&hbus->hdev->device,
+"the device has gone\n");
+   goto free_int_desc;
+   }
+
+   /*
+* When the higher level interrupt code calls us with
+* interrupt disabled, we must poll the channel by calling
+* the channel callback directly when channel->target_cpu is
+* the current CPU. When the higher level interrupt code
+* calls us with interrupt enabled, let's add the
+* local_bh_disable()/enable() to avoid race.
+*/
+   local_bh_disable();
+
+   if (hbus->hdev->channel->target_cpu == smp_processor_id())
+   hv_pci_onchannelcallback(hbus);
+
+   local_bh_enable();
+
+   if (hpdev->state == hv_pcichild_ejecting) {
+   dev_err_once(&hbus->hdev->device,
+"the device is being ejected\n");
+   goto free_int_desc;
+   }
+
udelay(100);
+   }
 
if (comp.comp_pkt.completion_status < 0) {
dev_err(&hbus->hdev->device,
-- 
2.7.4


[PATCH v2 5/6] PCI: hv: hv_pci_devices_present(): only queue a new work when necessary

2018-03-05 Thread Dexuan Cui
If there is a pending work, we just need to add the new dr into
the dr_list.

This is suggested by Michael Kelley.

Signed-off-by: Dexuan Cui 
Cc: Vitaly Kuznetsov 
Cc: Jack Morgenstein 
Cc: sta...@vger.kernel.org
Cc: Stephen Hemminger 
Cc: K. Y. Srinivasan 
Cc: Michael Kelley (EOSG) 
---
 drivers/pci/host/pci-hyperv.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 3a385212f666..d3aa6736a9bb 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1733,6 +1733,7 @@ static void hv_pci_devices_present(struct 
hv_pcibus_device *hbus,
struct hv_dr_state *dr;
struct hv_dr_work *dr_wrk;
unsigned long flags;
+   bool pending_dr;
 
dr_wrk = kzalloc(sizeof(*dr_wrk), GFP_NOWAIT);
if (!dr_wrk)
@@ -1756,11 +1757,23 @@ static void hv_pci_devices_present(struct 
hv_pcibus_device *hbus,
}
 
spin_lock_irqsave(&hbus->device_list_lock, flags);
+
+   /*
+* If pending_dr is true, we have already queued a work,
+* which will see the new dr. Otherwise, we need to
+* queue a new work.
+*/
+   pending_dr = !list_empty(&hbus->dr_list);
list_add_tail(&dr->list_entry, &hbus->dr_list);
-   spin_unlock_irqrestore(&hbus->device_list_lock, flags);
 
-   get_hvpcibus(hbus);
-   queue_work(hbus->wq, &dr_wrk->wrk);
+   if (pending_dr) {
+   kfree(dr_wrk);
+   } else {
+   get_hvpcibus(hbus);
+   queue_work(hbus->wq, &dr_wrk->wrk);
+   }
+
+   spin_unlock_irqrestore(&hbus->device_list_lock, flags);
 }
 
 /**
-- 
2.7.4
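
The "only queue a new work when necessary" pattern can be sketched in plain user-space C. This is a single-threaded illustration under stated assumptions: `dr_node`, `dr_queue`, `dr_enqueue` and `dr_drain` are hypothetical names, and the spinlock that protects the list in the driver is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node standing in for one "dr" entry. */
struct dr_node {
	struct dr_node *next;
};

/* Hypothetical queue: a singly linked list plus a scheduling counter. */
struct dr_queue {
	struct dr_node *head;
	struct dr_node *tail;
	int works_queued;	/* how many times a worker was scheduled */
};

/*
 * Append a node. A worker is scheduled only if the list was empty
 * beforehand; otherwise the already scheduled worker will see the new
 * node. Returns true if a new worker had to be scheduled.
 */
bool dr_enqueue(struct dr_queue *queue, struct dr_node *node)
{
	bool pending = (queue->head != NULL);

	assert(node != NULL);
	node->next = NULL;
	if (queue->tail)
		queue->tail->next = node;
	else
		queue->head = node;
	queue->tail = node;

	if (!pending)
		queue->works_queued++;

	return !pending;
}

/* Worker side: consume everything on the list, return the node count. */
int dr_drain(struct dr_queue *queue)
{
	int count = 0;

	while (queue->head) {
		queue->head = queue->head->next;
		count++;
	}
	queue->tail = NULL;

	return count;
}
```

Two enqueues before the worker runs therefore schedule the worker once, and the single run drains both entries.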


Re: [PATCH] Fix partial warnings of checkpatch.pl for ipx_route.c

2018-03-05 Thread Eric Dumazet
On Mon, 2018-03-05 at 20:19 +0100, Horatiu Vultur wrote:
> Fix partial warnings of checkpatch.pl for ipx_route.c
> 
> Signed-off-by: Horatiu Vultur 
> ---
>  drivers/staging/ipx/ipx_route.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 

Please take a look at drivers/staging/ipx/TODO




Re: [PATCH v5 0/5] Add coupled regulators mechanism

2018-03-05 Thread Fabio Estevam
Hi Maciej,

On Mon, Mar 5, 2018 at 12:57 PM, Fabio Estevam <feste...@gmail.com> wrote:

> kernelci.org also shows other imx6 boards that cannot boot with
> today's linux-next.

Here are the complete logs in case they help:

https://storage.kernelci.org/next/master/next-20180305/arm/imx_v6_v7_defconfig/lab-free-electrons/boot-imx6q-nitrogen6x.txt

https://storage.kernelci.org/next/master/next-20180305/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6dl-wandboard_solo.txt


Re: [PATCH 2/7] genalloc: selftest

2018-03-05 Thread J Freyensee



+
+/*
+ * In case of failure of any of these tests, memory corruption is almost
+ * guaranteed; allowing the boot to continue means risking to corrupt
+ * also any filesystem/block device accessed write mode.
+ * Therefore, BUG_ON() is used, when testing.
+ */
+
+


I like the explanation; good background info on why something is 
implemented the way it is :-).


Reviewed-by: Jay Freyensee 



Re: [PATCH AUTOSEL for 4.9 005/219] kretprobes: Ensure probe location is at function entry

2018-03-05 Thread Sasha Levin
On Mon, Mar 05, 2018 at 12:32:57PM +0530, Naveen N. Rao wrote:
>Hi Sasha,
>
>Sasha Levin wrote:
>>From: "Naveen N. Rao" 
>>
>>[ Upstream commit 90ec5e89e393c76e19afc845d8f88a5dc8315919 ]
>>
>
>Sorry if this is obvious, but why was this patch picked up for 
>-stable?  I don't see the upstream commit tagging -stable, so curious 
>why this was done.
>
>I don't think this patch should be pushed to -stable since this is not 
>really a bug fix. There are also other dependencies for this change 
>(see commit a64e3f35a45f4a, for instance), including how userspace 
>(perf) builds out the retprobe argument. As such, please drop this 
>from -stable (for 3.18, 4.4 and 4.9).

Hi Naveen,

It's an automatic selection process that attempts to find commits that
should be in stable but weren't tagged as such.

I'll drop this patch, thanks!

-- 

Thanks,
Sasha

Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()

2018-03-05 Thread Kees Cook
On Mon, Mar 5, 2018 at 11:40 AM, Dave Hansen
 wrote:
> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>> @@ -128,6 +134,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>
>>   do_audit_syscall_entry(regs, arch);
>>
>> + erase_kstack();
>>   return ret ?: regs->orig_ax;
>>  }
>
> This seems like an odd place to be clearing the stack.  Why was it done here?

Perhaps the commit log could be improved, but the idea is that the
audit work (ptrace, seccomp, etc), is happening before the syscall
code starts running, and it has therefore written to the stack (that
used to be cleared on last exit). This retains the clear stack state
even in the face of ptrace-ish work happening before the syscall
proper starts.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH AUTOSEL for 4.9 002/219] spi/bcm63xx: make spi subsystem aware of message size limits

2018-03-05 Thread Sasha Levin
On Mon, Mar 05, 2018 at 10:23:10AM +, Mark Brown wrote:
>On Sat, Mar 03, 2018 at 10:27:56PM +, Sasha Levin wrote:
>> From: Jonas Gorski 
>>
>> [ Upstream commit 0135c03df914f0481c61f097c78d37cece84f330 ]
>
>Why are there so many more patches for v4.9 than for more recent
>kernels?

The v4.11..v4.14 range got processed, and all those commits are now
being pushed into 4.9 and older.

>> The bcm63xx SPI controller does not allow manual control of the CS
>> lines and will toggle it automatically before and after sending data,
>> so we are limited to messages that fit in the FIFO buffer. Since the CS
>> lines aren't available as GPIOs either, we will need to make slave
>> drivers aware of this limitation so they can handle them accordingly.
>
>This seems really aggressive for stable...

Why so?

-- 

Thanks,
Sasha

Re: [PATCH 0/29] arm meltdown fix backporting review for lts 4.9

2018-03-05 Thread Mark Brown
On Fri, Mar 02, 2018 at 05:54:15PM +0100, Greg KH wrote:
> On Fri, Mar 02, 2018 at 05:14:50PM +0800, Alex Shi wrote:
> > On 03/01/2018 11:24 PM, Greg KH wrote:

> > > And why are you making this patchset up?  What is wrong with the patches
> > > in the android-common tree for this?

> > We believe the LTS is the base kernel for android/lsk, so the fixing
> > patches should get it first and then merge to other tree.

> But you know that android-common is already fine here, the needed
> patches are all integrated into there, so no additional work is needed
> for android devices.  So what devices do you expect to use this 4.9
> backport?

See below...

> What is "lsk"?

The Linaro Stable Kernel, it's LTS plus some feature backports.

> But really, I don't see this need as all ARM devices that I know of that
> are stuck on 4.9.y are already using the android-common tree.  Same for
> 4.4.y.  Do you know of any that are not, and that can not just use
> 4.14.y instead?

There's way more to ARM than just Android systems, assuming that getting
things into the Android kernel is enough is like assuming that x86 is
covered since the distros have their own backports - it covers a lot of
users but not everyone.  Off the top of my head there's things like
routers, NASs, cameras, IoT, radio systems, industrial appliances, set
top boxes and these days even servers.  Most of these segments are just
as conservative about taking new kernel versions on shipping product as
the phone vendors are, the practices that make people reluctant to take
bigger updates in production are general engineering practices common
across industry.

I mostly talk to chip vendors so I can't off the top of my head name
specific end products with particular kernel versions.  What I can tell
you is that many of the chip vendors care deeply about LTS because their
customers demand it - off the top of my head at least Atmel, ST and TI
ship vanilla LTS kernels with no Android at all into large market
segments.  Some of these chips couldn't usefully run Android so there's
just no Android support, some also have Android available as an
alternative.  Some of them even have very complete upstream support
available with barely any vendor patch required at all (none in some
applications).

Things that are functioning well will inevitably be less visible -
that's good, that means there's less of a pain point there but it
doesn't mean there's not still a support need.



Re: [PATCH v8 0/8] livepatch: Atomic replace feature

2018-03-05 Thread Miroslav Benes
On Mon, 5 Mar 2018, Evgenii Shatokhin wrote:

> Hi,

Hi,
 
> > The atomic replace allows to create cumulative patches. They
> > are useful when you maintain many livepatches and want to remove
> > one that is lower on the stack. In addition it is very useful when
> > more patches touch the same function and there are dependencies
> > between them.
> 
> I have experimented with this updated patchset a bit.
> It looks like replace operation fails if there is a loaded but disabled patch.
> 
> Suppose there are 2 binary patches changing the same kernel functions. Load
> patch1, then disable it (echo 0 > /sys/kernel/livepatch/patch1/enabled), then
> try load patch2 with atomic replace. I get "Device or resource busy" error at
> this point.
> 
> It seems, __klp_enable_patch() returns -EBUSY for some reason. I haven't dug
> deeper into that code yet.

Yes, it is "enforce stacking" check in __klp_enable_patch():

/* enforce stacking: only the first disabled patch can be enabled */
if (patch->list.prev != &klp_patches &&
!list_prev_entry(patch, list)->enabled)
return -EBUSY;

So not connected to this patch set. We've had the behaviour for quite some 
time.

Miroslav
 
> A workaround is simple: just unload all disabled patches before trying to load
> patch2 with replace. Still, the behavior is quite strange.
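
The stacking rule behind that -EBUSY can be reproduced with a minimal user-space sketch. It assumes the patch stack is a plain array ordered by load time; `fake_patch` and `fake_enable_patch` are hypothetical names and not the livepatch API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded livepatch. */
struct fake_patch {
	bool enabled;
};

/*
 * Enable the patch at position idx of a stack ordered by load time.
 * Only the first disabled patch may be enabled: if the patch directly
 * below is disabled, the request fails with -EBUSY.
 */
int fake_enable_patch(struct fake_patch *stack, size_t n, size_t idx)
{
	assert(idx < n);

	if (idx > 0 && !stack[idx - 1].enabled)
		return -EBUSY;

	stack[idx].enabled = true;
	return 0;
}
```

With patch1 loaded but disabled at index 0, enabling patch2 at index 1 fails, which matches the reported scenario; once patch1 is enabled (or, in the real system, unloaded), the same call succeeds.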


Re: [RFC/RFT][PATCH 6/7] sched: idle: Predict idle duration before stopping the tick

2018-03-05 Thread Rafael J. Wysocki
On Mon, Mar 5, 2018 at 1:42 PM, Peter Zijlstra  wrote:
> On Mon, Mar 05, 2018 at 01:07:07PM +0100, Rafael J. Wysocki wrote:
>> On Mon, Mar 5, 2018 at 12:50 PM, Rafael J. Wysocki  wrote:
>> > On Mon, Mar 5, 2018 at 12:45 PM, Peter Zijlstra  
>> > wrote:
>
>> >> So I think this is entirely wrong, I would much rather see something
>> >> like:
>> >>
>> >> tick_nohz_idle_go_idle(next_state->nohz);
>> >>
>> >> Where the selected state itself has the nohz property or not.
>> >
>> > Can you elaborate here, I'm not following?
>> >
>> >> We can always insert an extra state at whatever the right boundary point
>> >> is for nohz if it doesn't line up with an existing point.
>>
>> OK, I guess I know what you mean: to add a state flag meaning "stop
>> the tick if this state is selected".
>
> Yes, that.
>
>> That could work, but I see problems, like having to go through all of
>> the already defined states and deciding what to do with them.
>
> Shouldn't be too hard, upon registering a cpuidle driver to the cpuidle
> core, the core could go through the provided states and flag all those <
> TICK_USEC as not stopping, all those > TICK_USEC as stopping and
> splitting the state we'd select for TICK_NSEC sleeps, stopping it for <
> and disabling it for >.
>

Well, on Intel everything below C8 has target residencies below 1 ms. :-)

I think that we want C6 to be "nohz" too, though, at least in some cases.

And what about C3 if C6 is disabled?
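
For illustration only, the registration-time flagging suggested in the quoted text could look roughly like the sketch below. It assumes each state carries a target residency in microseconds and omits the "split the boundary state" step entirely; `fake_idle_state` and `fake_flag_states` are hypothetical names and not cpuidle API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpuidle state. */
struct fake_idle_state {
	unsigned int target_residency_us;
	bool stop_tick;		/* "stop the tick if this state is selected" */
};

/*
 * Flag every state whose target residency exceeds one tick period as
 * tick-stopping; all other states keep the tick running.
 */
void fake_flag_states(struct fake_idle_state *states, size_t n,
		      unsigned int tick_us)
{
	size_t i;

	assert(states != NULL || n == 0);

	for (i = 0; i < n; i++)
		states[i].stop_tick = states[i].target_residency_us > tick_us;
}
```

With a 1000 us tick, a purely threshold-based rule like this leaves every state with a target residency below 1 ms unflagged, which is the concern raised above.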


Re: [PATCH 01/12] lightnvm: simplify geometry structure.

2018-03-05 Thread Matias Bjørling

On 03/02/2018 04:21 PM, Javier González wrote:

Currently, the device geometry is stored redundantly in the nvm_id and
nvm_geo structures at a device level. Moreover, when instantiating
targets on a specific number of LUNs, these structures are replicated
and manually modified to fit the instance channel and LUN partitioning.

Instead, create a generic geometry around nvm_geo, which can be used by
(i) the underlying device to describe the geometry of the whole device,
and (ii) instances to describe their geometry independently.

Signed-off-by: Javier González 
---
  drivers/lightnvm/core.c  |  70 +++-
  drivers/lightnvm/pblk-core.c |  16 +-
  drivers/lightnvm/pblk-gc.c   |   2 +-
  drivers/lightnvm/pblk-init.c | 113 +++--
  drivers/lightnvm/pblk-read.c |   2 +-
  drivers/lightnvm/pblk-recovery.c |  14 +-
  drivers/lightnvm/pblk-rl.c   |   2 +-
  drivers/lightnvm/pblk-sysfs.c|  35 ++--
  drivers/lightnvm/pblk-write.c|   2 +-
  drivers/lightnvm/pblk.h  |  83 --
  drivers/nvme/host/lightnvm.c | 339 +++
  include/linux/lightnvm.h | 198 +++
  12 files changed, 451 insertions(+), 425 deletions(-)

diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index 19c46ebb1b91..9a417d9cdf0c 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -155,7 +155,7 @@ static struct nvm_tgt_dev *nvm_create_tgt_dev(struct 
nvm_dev *dev,
int blun = lun_begin % dev->geo.nr_luns;
int lunid = 0;
int lun_balanced = 1;
-   int prev_nr_luns;
+   int sec_per_lun, prev_nr_luns;
int i, j;
  
  	nr_chnls = (nr_chnls_mod == 0) ? nr_chnls : nr_chnls + 1;

@@ -215,18 +215,23 @@ static struct nvm_tgt_dev *nvm_create_tgt_dev(struct 
nvm_dev *dev,
if (!tgt_dev)
goto err_ch;
  
+	/* Inherit device geometry from parent */

memcpy(&tgt_dev->geo, &dev->geo, sizeof(struct nvm_geo));
+
/* Target device only owns a portion of the physical device */
tgt_dev->geo.nr_chnls = nr_chnls;
-   tgt_dev->geo.all_luns = nr_luns;
tgt_dev->geo.nr_luns = (lun_balanced) ? prev_nr_luns : -1;
+   tgt_dev->geo.all_luns = nr_luns;
+   tgt_dev->geo.all_chunks = nr_luns * dev->geo.nr_chks;
+
tgt_dev->geo.op = op;
-   tgt_dev->total_secs = nr_luns * tgt_dev->geo.sec_per_lun;
+
+   sec_per_lun = dev->geo.clba * dev->geo.nr_chks;
+   tgt_dev->geo.total_secs = nr_luns * sec_per_lun;
+
tgt_dev->q = dev->q;
tgt_dev->map = dev_map;
tgt_dev->luns = luns;
-   memcpy(&tgt_dev->identity, &dev->identity, sizeof(struct nvm_id));
-
tgt_dev->parent = dev;
  
  	return tgt_dev;

@@ -296,8 +301,6 @@ static int __nvm_config_simple(struct nvm_dev *dev,
  static int __nvm_config_extended(struct nvm_dev *dev,
 struct nvm_ioctl_create_extended *e)
  {
-   struct nvm_geo *geo = &dev->geo;
-
if (e->lun_begin == 0x && e->lun_end == 0x) {
e->lun_begin = 0;
e->lun_end = dev->geo.all_luns - 1;
@@ -311,7 +314,7 @@ static int __nvm_config_extended(struct nvm_dev *dev,
return -EINVAL;
}
  
-	return nvm_config_check_luns(geo, e->lun_begin, e->lun_end);

+   return nvm_config_check_luns(&dev->geo, e->lun_begin, e->lun_end);
  }
  
  static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)

@@ -406,7 +409,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
tqueue->queuedata = targetdata;
  
  	blk_queue_max_hw_sectors(tqueue,

-   (dev->geo.sec_size >> 9) * NVM_MAX_VLBA);
+   (dev->geo.csecs >> 9) * NVM_MAX_VLBA);
  
  	set_capacity(tdisk, tt->capacity(targetdata));

add_disk(tdisk);
@@ -841,40 +844,9 @@ EXPORT_SYMBOL(nvm_get_tgt_bb_tbl);
  
  static int nvm_core_init(struct nvm_dev *dev)

  {
-   struct nvm_id *id = &dev->identity;
struct nvm_geo *geo = &dev->geo;
int ret;
  
-	memcpy(&geo->ppaf, &id->ppaf, sizeof(struct nvm_addr_format));

-
-   if (id->mtype != 0) {
-   pr_err("nvm: memory type not supported\n");
-   return -EINVAL;
-   }
-
-   /* Whole device values */
-   geo->nr_chnls = id->num_ch;
-   geo->nr_luns = id->num_lun;
-
-   /* Generic device geometry values */
-   geo->ws_min = id->ws_min;
-   geo->ws_opt = id->ws_opt;
-   geo->ws_seq = id->ws_seq;
-   geo->ws_per_chk = id->ws_per_chk;
-   geo->nr_chks = id->num_chk;
-   geo->mccap = id->mccap;
-
-   geo->sec_per_chk = id->clba;
-   geo->sec_per_lun = geo->sec_per_chk * geo->nr_chks;
-   geo->all_luns = geo->nr_luns * geo->nr_chnls;
-
-   /* 1.2 spec device geometry values */
-   geo->plane_mode = 1 << geo->ws_seq;
-   geo->nr_planes = geo->ws_opt / geo->ws_min;
-   geo->sec_per_pg = geo->ws_min;
-   

Re: Would you help to tell why async printk solution was not taken to upstream kernel ?

2018-03-05 Thread Petr Mladek
On Mon 2018-03-05 14:56:59, Qixuan.Wu wrote:
> Hi Steve,  
> 
> On Sun, 04 Mar 2018 23:43:23 +0800
> Steven Rostedt  wrote:
> 
> > Yes, people keep bringing up this scenario.
> > It would require a single burst of printks to all CPUs. And then no
> > more printks after that. The last one will end up printing the entire
> > buffer out the slow console. The thing is, this is a bounded time, and
> > no printk will print more than one full buffer worth.
> 
> > If this is a worry, then set the timeouts for the lockup detection to
> > be longer than the time it takes to print one full buffer with the
> > slowest console.
> 
> Thanks for your information and suggestion. We will consider backporting
> the code depending on the workload or, for now, disabling the ttyS0
> console just for printk.

Please share the log if you still see soft/hard lockups with the 4
commits (console waiter logic). It would help to improve the solution.

We need some justification to make the printk code more complicated.
Also many possible solutions might improve some scenarios and make
worse some others. Therefore we need data to make decisions.

Best Regards,
Petr
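
As a rough worked example of the bound quoted above (one full buffer through the slowest console), the arithmetic could be sketched like this. The helper names are hypothetical, and the buffer size, baud rate and 10 bits per character (8N1 framing) used in the note below are assumptions for illustration, not values taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case number of seconds needed to push a full log buffer through
 * a serial console, rounded up. bits_per_char is the on-wire cost of one
 * byte (10 for 8N1: start bit + 8 data bits + stop bit).
 */
static uint64_t flush_seconds(uint64_t buf_bytes, uint64_t baud,
			      uint64_t bits_per_char)
{
	assert(baud > 0 && bits_per_char > 0);
	uint64_t bits = buf_bytes * bits_per_char;

	return (bits + baud - 1) / baud;
}

/* Nonzero if the lockup timeout is longer than the worst-case flush. */
static int timeout_is_safe(uint64_t timeout_s, uint64_t buf_bytes,
			   uint64_t baud)
{
	return timeout_s > flush_seconds(buf_bytes, baud, 10);
}
```

Under these assumptions, a 128 KiB buffer at 115200 baud needs about 12 seconds to drain and a 1 MiB buffer about 92 seconds, so a lockup timeout would have to be above those figures to follow the suggestion.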


Re: [RESEND PATCH v6 03/14] iommu/rockchip: Request irqs in rk_iommu_probe()

2018-03-05 Thread Tomasz Figa
Hi Jeffy,

On Thu, Mar 1, 2018 at 7:18 PM, Jeffy Chen  wrote:
> Move request_irq to the end of rk_iommu_probe().
>
> Suggested-by: Robin Murphy 
> Signed-off-by: Jeffy Chen 
> ---
>
> Changes in v6: None
> Changes in v5: None
> Changes in v4: None
> Changes in v3:
> Loop platform_get_irq() as Robin suggested.
>
> Changes in v2: None
>
>  drivers/iommu/rockchip-iommu.c | 38 +-
>  1 file changed, 9 insertions(+), 29 deletions(-)
>

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz


Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Javier González
> On 5 Mar 2018, at 14.38, Matias Bjørling  wrote:
> 
> On 03/01/2018 08:29 PM, Javier González wrote:
>>> On 1 Mar 2018, at 19.49, Matias Bjørling  wrote:
>>> 
>>> On 03/01/2018 04:59 PM, Javier González wrote:
 Refactor init and exit sequences to eliminate dependencies among init
 modules and improve readability.
 Signed-off-by: Javier González 
 ---
  drivers/lightnvm/pblk-init.c | 415 
 +--
  1 file changed, 206 insertions(+), 209 deletions(-)
 diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
 index 25fc70ca07f7..87c390667dd6 100644
 --- a/drivers/lightnvm/pblk-init.c
 +++ b/drivers/lightnvm/pblk-init.c
 @@ -103,7 +103,40 @@ static void pblk_l2p_free(struct pblk *pblk)
vfree(pblk->trans_map);
  }
  -static int pblk_l2p_init(struct pblk *pblk)
 +static int pblk_l2p_recover(struct pblk *pblk, bool factory_init)
 +{
 +  struct pblk_line *line = NULL;
 +
 +  if (factory_init) {
 +  pblk_setup_uuid(pblk);
 +  } else {
 +  line = pblk_recov_l2p(pblk);
 +  if (IS_ERR(line)) {
 +  pr_err("pblk: could not recover l2p table\n");
 +  return -EFAULT;
 +  }
 +  }
 +
 +#ifdef CONFIG_NVM_DEBUG
 +  pr_info("pblk init: L2P CRC: %x\n", pblk_l2p_crc(pblk));
 +#endif
 +
 +  /* Free full lines directly as GC has not been started yet */
 +  pblk_gc_free_full_lines(pblk);
 +
 +  if (!line) {
 +  /* Configure next line for user data */
 +  line = pblk_line_get_first_data(pblk);
 +  if (!line) {
 +  pr_err("pblk: line list corrupted\n");
 +  return -EFAULT;
 +  }
 +  }
 +
 +  return 0;
 +}
 +
 +static int pblk_l2p_init(struct pblk *pblk, bool factory_init)
  {
sector_t i;
struct ppa_addr ppa;
 @@ -119,7 +152,7 @@ static int pblk_l2p_init(struct pblk *pblk)
for (i = 0; i < pblk->rl.nr_secs; i++)
pblk_trans_map_set(pblk, i, ppa);
  - return 0;
 +  return pblk_l2p_recover(pblk, factory_init);
  }
static void pblk_rwb_free(struct pblk *pblk)
 @@ -159,7 +192,13 @@ static int pblk_set_ppaf(struct pblk *pblk)
struct nvm_tgt_dev *dev = pblk->dev;
struct nvm_geo *geo = &dev->geo;
struct nvm_addr_format ppaf = geo->ppaf;
 -  int power_len;
 +  int mod, power_len;
 +
 +  div_u64_rem(geo->sec_per_chk, pblk->min_write_pgs, &mod);
 +  if (mod) {
 +  pr_err("pblk: bad configuration of sectors/pages\n");
 +  return -EINVAL;
 +  }
/* Re-calculate channel and lun format to adapt to 
 configuration */
power_len = get_count_order(geo->nr_chnls);
 @@ -252,12 +291,39 @@ static int pblk_core_init(struct pblk *pblk)
  {
struct nvm_tgt_dev *dev = pblk->dev;
struct nvm_geo *geo = &dev->geo;
 +  int max_write_ppas;
 +
 +  atomic64_set(&pblk->user_wa, 0);
 +  atomic64_set(&pblk->pad_wa, 0);
 +  atomic64_set(&pblk->gc_wa, 0);
 +  pblk->user_rst_wa = 0;
 +  pblk->pad_rst_wa = 0;
 +  pblk->gc_rst_wa = 0;
 +
 +  atomic64_set(&pblk->nr_flush, 0);
 +  pblk->nr_flush_rst = 0;
pblk->pgs_in_buffer = NVM_MEM_PAGE_WRITE * geo->sec_per_pg *
geo->nr_planes * geo->all_luns;
  + pblk->min_write_pgs = geo->sec_per_pl * (geo->sec_size / PAGE_SIZE);
 +  max_write_ppas = pblk->min_write_pgs * geo->all_luns;
 +  pblk->max_write_pgs = min_t(int, max_write_ppas, NVM_MAX_VLBA);
 +  pblk_set_sec_per_write(pblk, pblk->min_write_pgs);
 +
 +  if (pblk->max_write_pgs > PBLK_MAX_REQ_ADDRS) {
 +  pr_err("pblk: vector list too big(%u > %u)\n",
 +  pblk->max_write_pgs, PBLK_MAX_REQ_ADDRS);
 +  return -EINVAL;
 +  }
 +
 +  pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) * sizeof(atomic64_t),
 +  GFP_KERNEL);
 +  if (!pblk->pad_dist)
 +  return -ENOMEM;
 +
if (pblk_init_global_caches(pblk))
 -  return -ENOMEM;
 +  goto fail_free_pad_dist;
/* Internal bios can be at most the sectors signaled by the 
 device. */
pblk->page_bio_pool = mempool_create_page_pool(NVM_MAX_VLBA, 0);
 @@ -307,10 +373,8 @@ static int pblk_core_init(struct pblk *pblk)
if (pblk_set_ppaf(pblk))
goto free_r_end_wq;
  - if (pblk_rwb_init(pblk))
 -  goto free_r_end_wq;
 -
INIT_LIST_HEAD(&pblk->compl_list);
 +
return 0;
free_r_end_wq:
 @@ -333,6 +397,8 @@ static int pblk_core_init(struct pblk *pblk)

Re: [PATCH 8/9] drm/xen-front: Implement GEM operations

2018-03-05 Thread Oleksandr Andrushchenko

On 03/05/2018 11:32 AM, Daniel Vetter wrote:

On Wed, Feb 21, 2018 at 10:03:41AM +0200, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 

Implement GEM handling depending on driver mode of operation:
depending on the requirements for the para-virtualized environment, namely
requirements dictated by the accompanying DRM/(v)GPU drivers running in both
host and guest environments, number of operating modes of para-virtualized
display driver are supported:
  - display buffers can be allocated by either frontend driver or backend
  - display buffers can be allocated to be contiguous in memory or not

Note! Frontend driver itself has no dependency on contiguous memory for
its operation.

1. Buffers allocated by the frontend driver.

The below modes of operation are configured at compile-time via
frontend driver's kernel configuration.

1.1. Front driver configured to use GEM CMA helpers
  This use-case is useful when used with accompanying DRM/vGPU driver in
  guest domain which was designed to only work with contiguous buffers,
  e.g. DRM driver based on GEM CMA helpers: such drivers can only import
  contiguous PRIME buffers, thus requiring frontend driver to provide
  such. In order to implement this mode of operation para-virtualized
  frontend driver can be configured to use GEM CMA helpers.

1.2. Front driver doesn't use GEM CMA
  If accompanying drivers can cope with non-contiguous memory then, to
  lower pressure on CMA subsystem of the kernel, driver can allocate
  buffers from system memory.

Note! If used with accompanying DRM/(v)GPU drivers this mode of operation
may require IOMMU support on the platform, so accompanying DRM/vGPU
hardware can still reach display buffer memory while importing PRIME
buffers from the frontend driver.

2. Buffers allocated by the backend

This mode of operation is run-time configured via guest domain configuration
through XenStore entries.

For systems which do not provide IOMMU support, but having specific
requirements for display buffers it is possible to allocate such buffers
at backend side and share those with the frontend.
For example, if host domain is 1:1 mapped and has DRM/GPU hardware expecting
physically contiguous memory, this allows implementing zero-copying
use-cases.

Note! Configuration options 1.1 (contiguous display buffers) and 2 (backend
allocated buffers) are not supported at the same time.

Signed-off-by: Oleksandr Andrushchenko 

Some suggestions below for some larger cleanup work.
-Daniel


---
  drivers/gpu/drm/xen/Kconfig |  13 +
  drivers/gpu/drm/xen/Makefile|   6 +
  drivers/gpu/drm/xen/xen_drm_front.h |  74 ++
  drivers/gpu/drm/xen/xen_drm_front_drv.c |  80 ++-
  drivers/gpu/drm/xen/xen_drm_front_drv.h |   1 +
  drivers/gpu/drm/xen/xen_drm_front_gem.c | 360 
  drivers/gpu/drm/xen/xen_drm_front_gem.h |  46 
  drivers/gpu/drm/xen/xen_drm_front_gem_cma.c |  93 +++
  8 files changed, 667 insertions(+), 6 deletions(-)
  create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.c
  create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.h
  create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem_cma.c

diff --git a/drivers/gpu/drm/xen/Kconfig b/drivers/gpu/drm/xen/Kconfig
index 4cca160782ab..4f4abc91f3b6 100644
--- a/drivers/gpu/drm/xen/Kconfig
+++ b/drivers/gpu/drm/xen/Kconfig
@@ -15,3 +15,16 @@ config DRM_XEN_FRONTEND
help
  Choose this option if you want to enable a para-virtualized
  frontend DRM/KMS driver for Xen guest OSes.
+
+config DRM_XEN_FRONTEND_CMA
+   bool "Use DRM CMA to allocate dumb buffers"
+   depends on DRM_XEN_FRONTEND
+   select DRM_KMS_CMA_HELPER
+   select DRM_GEM_CMA_HELPER
+   help
+ Use DRM CMA helpers to allocate display buffers.
+ This is useful for the use-cases when guest driver needs to
+ share or export buffers to other drivers which only expect
+ contiguous buffers.
+ Note: in this mode driver cannot use buffers allocated
+ by the backend.
diff --git a/drivers/gpu/drm/xen/Makefile b/drivers/gpu/drm/xen/Makefile
index 4fcb0da1a9c5..12376ec78fbc 100644
--- a/drivers/gpu/drm/xen/Makefile
+++ b/drivers/gpu/drm/xen/Makefile
@@ -8,4 +8,10 @@ drm_xen_front-objs := xen_drm_front.o \
  xen_drm_front_shbuf.o \
  xen_drm_front_cfg.o
  
+ifeq ($(CONFIG_DRM_XEN_FRONTEND_CMA),y)

+   drm_xen_front-objs += xen_drm_front_gem_cma.o
+else
+   drm_xen_front-objs += xen_drm_front_gem.o
+endif
+
  obj-$(CONFIG_DRM_XEN_FRONTEND) += drm_xen_front.o
diff --git a/drivers/gpu/drm/xen/xen_drm_front.h 
b/drivers/gpu/drm/xen/xen_drm_front.h
index 9ed5bfb248d0..c6f52c892434 100644
--- a/drivers/gpu/drm/xen/xen_drm_front.h
+++ b/drivers/gpu/drm/xen/xen_drm_front.h
@@ -34,6 +34,80 @@
  
  

Re: [RFC/RFT][PATCH 6/7] sched: idle: Predict idle duration before stopping the tick

2018-03-05 Thread Peter Zijlstra
On Mon, Mar 05, 2018 at 02:37:25PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 05, 2018 at 08:19:15AM -0500, Rik van Riel wrote:

> > > Also, I think that at this point you've introduced a problem; by not
> > > disabling the tick unconditionally, we'll have extra wakeups due to
> > > the (now still running) tick, which will bias the estimation, as per
> > > reflect(), downwards.
> > > 
> > > We should effectively discard tick wakeups when we could have
> > > entered nohz but didn't, accumulating the idle period in reflect and
> > > only commit once we get a !tick wakeup.
> > 
> > How much of a problem would that actually be?
> > 
> > Don't all but the very deepest C-states have
> > target residencies that are orders of magnitude
> > smaller than the tick period?
> > 
> > In other words, if our sleeps end up getting
> > "cut short" to 600us, we will still select C6,
> > and it will not result in picking C3 by mistake.
> > 
> > This only seems to affect C7 states and deeper.
> 
> On modern Intel, what about other platforms? This is something that
> should work across the board.

Look at this for example:

arch/arm64/boot/dts/hisilicon/hi3660.dtsi:  min-residency-us = <20000>;

That's 20ms right there..

But on average, considering ARM64 defaults to HZ=250, most of them are
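
To make the numbers above concrete, a small sketch of the comparison follows. The helper names are hypothetical; only HZ=250 and the 20 ms figure come from the message above.

```c
#include <assert.h>

/* Tick period in microseconds for a given HZ. */
static unsigned int tick_period_us(unsigned int hz)
{
	assert(hz > 0);
	return 1000000u / hz;
}

/*
 * Number of whole tick periods that fit inside a minimum residency.
 * Anything above zero means a still-running tick would wake the CPU
 * before the state has paid off.
 */
static unsigned int ticks_within(unsigned int residency_us, unsigned int hz)
{
	return residency_us / tick_period_us(hz);
}
```

With HZ=250 the tick period is 4000 us, so a 20 ms minimum residency spans five tick periods, whereas a 600 us sleep with HZ=1000 fits inside a single one.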


Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Matias Bjørling

On 03/05/2018 02:45 PM, Javier González wrote:

On 5 Mar 2018, at 14.38, Matias Bjørling  wrote:

On 03/01/2018 08:29 PM, Javier González wrote:

On 1 Mar 2018, at 19.49, Matias Bjørling  wrote:

On 03/01/2018 04:59 PM, Javier González wrote:

Refactor init and exit sequences to eliminate dependencies among init
modules and improve readability.
Signed-off-by: Javier González 
---
  drivers/lightnvm/pblk-init.c | 415 +--
  1 file changed, 206 insertions(+), 209 deletions(-)
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 25fc70ca07f7..87c390667dd6 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -103,7 +103,40 @@ static void pblk_l2p_free(struct pblk *pblk)
vfree(pblk->trans_map);
  }
  -static int pblk_l2p_init(struct pblk *pblk)
+static int pblk_l2p_recover(struct pblk *pblk, bool factory_init)
+{
+   struct pblk_line *line = NULL;
+
+   if (factory_init) {
+   pblk_setup_uuid(pblk);
+   } else {
+   line = pblk_recov_l2p(pblk);
+   if (IS_ERR(line)) {
+   pr_err("pblk: could not recover l2p table\n");
+   return -EFAULT;
+   }
+   }
+
+#ifdef CONFIG_NVM_DEBUG
+   pr_info("pblk init: L2P CRC: %x\n", pblk_l2p_crc(pblk));
+#endif
+
+   /* Free full lines directly as GC has not been started yet */
+   pblk_gc_free_full_lines(pblk);
+
+   if (!line) {
+   /* Configure next line for user data */
+   line = pblk_line_get_first_data(pblk);
+   if (!line) {
+   pr_err("pblk: line list corrupted\n");
+   return -EFAULT;
+   }
+   }
+
+   return 0;
+}
+
+static int pblk_l2p_init(struct pblk *pblk, bool factory_init)
  {
sector_t i;
struct ppa_addr ppa;
@@ -119,7 +152,7 @@ static int pblk_l2p_init(struct pblk *pblk)
for (i = 0; i < pblk->rl.nr_secs; i++)
pblk_trans_map_set(pblk, i, ppa);
  - return 0;
+   return pblk_l2p_recover(pblk, factory_init);
  }
static void pblk_rwb_free(struct pblk *pblk)
@@ -159,7 +192,13 @@ static int pblk_set_ppaf(struct pblk *pblk)
struct nvm_tgt_dev *dev = pblk->dev;
struct nvm_geo *geo = &dev->geo;
struct nvm_addr_format ppaf = geo->ppaf;
-   int power_len;
+   int mod, power_len;
+
+   div_u64_rem(geo->sec_per_chk, pblk->min_write_pgs, &mod);
+   if (mod) {
+   pr_err("pblk: bad configuration of sectors/pages\n");
+   return -EINVAL;
+   }
/* Re-calculate channel and lun format to adapt to configuration */
power_len = get_count_order(geo->nr_chnls);
@@ -252,12 +291,39 @@ static int pblk_core_init(struct pblk *pblk)
  {
struct nvm_tgt_dev *dev = pblk->dev;
struct nvm_geo *geo = &dev->geo;
+   int max_write_ppas;
+
+   atomic64_set(&pblk->user_wa, 0);
+   atomic64_set(&pblk->pad_wa, 0);
+   atomic64_set(&pblk->gc_wa, 0);
+   pblk->user_rst_wa = 0;
+   pblk->pad_rst_wa = 0;
+   pblk->gc_rst_wa = 0;
+
+   atomic64_set(&pblk->nr_flush, 0);
+   pblk->nr_flush_rst = 0;
pblk->pgs_in_buffer = NVM_MEM_PAGE_WRITE * geo->sec_per_pg *
geo->nr_planes * geo->all_luns;
  + pblk->min_write_pgs = geo->sec_per_pl * (geo->sec_size / PAGE_SIZE);
+   max_write_ppas = pblk->min_write_pgs * geo->all_luns;
+   pblk->max_write_pgs = min_t(int, max_write_ppas, NVM_MAX_VLBA);
+   pblk_set_sec_per_write(pblk, pblk->min_write_pgs);
+
+   if (pblk->max_write_pgs > PBLK_MAX_REQ_ADDRS) {
+   pr_err("pblk: vector list too big(%u > %u)\n",
+   pblk->max_write_pgs, PBLK_MAX_REQ_ADDRS);
+   return -EINVAL;
+   }
+
+   pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) * sizeof(atomic64_t),
+   GFP_KERNEL);
+   if (!pblk->pad_dist)
+   return -ENOMEM;
+
if (pblk_init_global_caches(pblk))
-   return -ENOMEM;
+   goto fail_free_pad_dist;
/* Internal bios can be at most the sectors signaled by the device. */
pblk->page_bio_pool = mempool_create_page_pool(NVM_MAX_VLBA, 0);
@@ -307,10 +373,8 @@ static int pblk_core_init(struct pblk *pblk)
if (pblk_set_ppaf(pblk))
goto free_r_end_wq;
  - if (pblk_rwb_init(pblk))
-   goto free_r_end_wq;
-
INIT_LIST_HEAD(&pblk->compl_list);
+
return 0;
free_r_end_wq:
@@ -333,6 +397,8 @@ static int pblk_core_init(struct pblk *pblk)
mempool_destroy(pblk->page_bio_pool);
  free_global_caches:
pblk_free_global_caches(pblk);
+fail_free_pad_dist:
+   kfree(pblk->pad_dist);
return -ENOMEM;
  }
  @@ -354,14 +420,8 @@ static void 

Re: [PATCH] ARM: dts: rockchip: Add dp83867 CLK_OUT muxing

2018-03-05 Thread Heiko Stuebner
Hi Daniel,

Am Montag, 5. März 2018, 13:45:11 CET schrieb Daniel Schultz:
> The CLK_O_SEL default is synchronous to XI input clock, which is 25 MHz.
> Set CLK_O_SEL to channel A transmit clock so we have 125 MHz on CLK_OUT.
> 
> Signed-off-by: Daniel Schultz 
> ---
> 
> The binding will be added with the next merge of net-next:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=9708fb630d19ee51ae3aeb3a533e3010da0e8570

I did find the commit, but no related change of the dp83867 dt binding
document [0], including a review by dt-maintainers.

While your property does not look overly complicated, the binding
should be updated nonetheless.


Heiko

[0] 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/log/Documentation/devicetree/bindings/net/ti,dp83867.txt?id=9708fb630d19ee51ae3aeb3a533e3010da0e8570

>  arch/arm/boot/dts/rk3288-phycore-som.dtsi | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/boot/dts/rk3288-phycore-som.dtsi 
> b/arch/arm/boot/dts/rk3288-phycore-som.dtsi
> index bdd80aa..e60535d 100644
> --- a/arch/arm/boot/dts/rk3288-phycore-som.dtsi
> +++ b/arch/arm/boot/dts/rk3288-phycore-som.dtsi
> @@ -141,6 +141,7 @@
>   ti,tx-internal-delay = ;
>   ti,fifo-depth = ;
>   enet-phy-lane-no-swap;
> + ti,clk-output-sel = ;
>   };
>   };
>  };
> 




[PATCH 28/28] perf mmap: Discard legacy interfaces for mmap read forward

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Kan Liang 

Discards legacy interfaces perf_evlist__mmap_read_forward(),
perf_evlist__mmap_read() and perf_evlist__mmap_consume().

No tools use them.

Signed-off-by: Kan Liang 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: 
http://lkml.kernel.org/r/1519945751-37786-14-git-send-email-kan.li...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evlist.c | 25 +
 tools/perf/util/evlist.h |  4 
 tools/perf/util/mmap.c   | 21 +
 3 files changed, 2 insertions(+), 48 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7b7d535396f7..41a4666f1519 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -702,29 +702,6 @@ static int perf_evlist__resume(struct perf_evlist *evlist)
return perf_evlist__set_paused(evlist, false);
 }
 
-union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, 
int idx)
-{
-   struct perf_mmap *md = &evlist->mmap[idx];
-
-   /*
-* Check messup is required for forward overwritable ring buffer:
-* memory pointed by md->prev can be overwritten in this case.
-* No need for read-write ring buffer: kernel stop outputting when
-* it hit md->prev (perf_mmap__consume()).
-*/
-   return perf_mmap__read_forward(md);
-}
-
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
-{
-   return perf_evlist__mmap_read_forward(evlist, idx);
-}
-
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
-{
-   perf_mmap__consume(&evlist->mmap[idx], false);
-}
-
 static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 {
int i;
@@ -761,7 +738,7 @@ static struct perf_mmap *perf_evlist__alloc_mmap(struct 
perf_evlist *evlist)
map[i].fd = -1;
/*
 * When the perf_mmap() call is made we grab one refcount, plus
-* one extra to let perf_evlist__mmap_consume() get the last
+* one extra to let perf_mmap__consume() get the last
 * events after all real references (perf_mmap__get()) are
 * dropped.
 *
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 336b838e6957..6c41b2f78713 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -129,10 +129,6 @@ struct perf_sample_id *perf_evlist__id2sid(struct 
perf_evlist *evlist, u64 id);
 
 void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist, enum 
bkw_mmap_state state);
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
-
-union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist,
-int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
 
 int perf_evlist__open(struct perf_evlist *evlist);
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 91531a7c8fbf..4f27c464ce0b 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -63,25 +63,6 @@ static union perf_event *perf_mmap__read(struct perf_mmap 
*map,
return event;
 }
 
-/*
- * legacy interface for mmap read.
- * Don't use it. Use perf_mmap__read_event().
- */
-union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
-{
-   u64 head;
-
-   /*
-* Check if event was unmapped due to a POLLHUP/POLLERR.
-*/
-   if (!refcount_read(&map->refcnt))
-   return NULL;
-
-   head = perf_mmap__read_head(map);
-
-   return perf_mmap__read(map, &map->prev, head);
-}
-
 /*
  * Read event from ring buffer one by one.
  * Return one event for each call.
@@ -191,7 +172,7 @@ void perf_mmap__munmap(struct perf_mmap *map)
 int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd)
 {
/*
-* The last one will be done at perf_evlist__mmap_consume(), so that we
+* The last one will be done at perf_mmap__consume(), so that we
 * make sure we don't prevent tools from consuming every last event in
 * the ring buffer.
 *
-- 
2.14.3



[PATCH 4/6] bus: fsl-mc: remove dma ops setup from driver

2018-03-05 Thread Nipun Gupta
The DMA setup for fsl-mc devices is now done from the device_add()
function, so there is no need to call it in the mc bus driver.

Signed-off-by: Nipun Gupta 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 1b333c4..c9a239a 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -616,6 +616,7 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
mc_dev->icid = parent_mc_dev->icid;
mc_dev->dma_mask = FSL_MC_DEFAULT_DMA_MASK;
mc_dev->dev.dma_mask = &mc_dev->dma_mask;
+   mc_dev->dev.coherent_dma_mask = mc_dev->dma_mask;
dev_set_msi_domain(&mc_dev->dev,
   dev_get_msi_domain(&parent_mc_dev->dev));
}
@@ -633,10 +634,6 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
goto error_cleanup_dev;
}
 
-   /* Objects are coherent, unless 'no shareability' flag set. */
-   if (!(obj_desc->flags & FSL_MC_OBJ_FLAG_NO_MEM_SHAREABILITY))
-   arch_setup_dma_ops(&mc_dev->dev, 0, 0, NULL, true);
-
/*
 * The device-specific probe callback will get invoked by device_add()
 */
-- 
1.9.1



[PATCH 22/28] perf test: Switch to new perf_mmap__read_event() interface for tp fields

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Kan Liang 

The perf test 'syscalls:sys_enter_openat event fields' still uses the
legacy interface.

No functional change.

Committer notes:

Testing it:

  # perf test sys_enter_openat
  15: syscalls:sys_enter_openat event fields: Ok
  #

Signed-off-by: Kan Liang 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: 
http://lkml.kernel.org/r/1519945751-37786-8-git-send-email-kan.li...@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/openat-syscall-tp-fields.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/openat-syscall-tp-fields.c 
b/tools/perf/tests/openat-syscall-tp-fields.c
index 43519267b93b..620b21023f72 100644
--- a/tools/perf/tests/openat-syscall-tp-fields.c
+++ b/tools/perf/tests/openat-syscall-tp-fields.c
@@ -86,8 +86,14 @@ int test__syscall_openat_tp_fields(struct test *test 
__maybe_unused, int subtest
 
for (i = 0; i < evlist->nr_mmaps; i++) {
union perf_event *event;
+   struct perf_mmap *md;
+   u64 end, start;
 
-   while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+   md = &evlist->mmap[i];
+   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   continue;
+
+   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
const u32 type = event->header.type;
int tp_flags;
struct perf_sample sample;
@@ -95,7 +101,7 @@ int test__syscall_openat_tp_fields(struct test *test 
__maybe_unused, int subtest
++nr_events;
 
if (type != PERF_RECORD_SAMPLE) {
-   perf_evlist__mmap_consume(evlist, i);
+   perf_mmap__consume(md, false);
continue;
}
 
@@ -115,6 +121,7 @@ int test__syscall_openat_tp_fields(struct test *test 
__maybe_unused, int subtest
 
goto out_ok;
}
+   perf_mmap__read_done(md);
}
 
if (nr_events == before)
-- 
2.14.3
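
The conversion in this series follows one pattern: snapshot the readable range once, pull events one by one, consume each, then mark the read as done. A self-contained toy version of that loop might look like the sketch below. Everything in it (struct toy_ring and the toy_* helpers) is hypothetical and only mimics the shape of the calls in the diff; it is not perf's struct perf_mmap or its API, and it ignores wraparound, refcounting and overwrite mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy buffer; NOT perf's struct perf_mmap. */
struct toy_ring {
	const uint32_t *events;	/* event type codes, at least 'head' entries */
	uint64_t head;		/* producer position: one past the last event */
	uint64_t tail;		/* consumer position already released */
};

/* Snapshot the readable range; returns -1 if there is nothing to read. */
static int toy_read_init(struct toy_ring *r, uint64_t *start, uint64_t *end)
{
	*start = r->tail;
	*end = r->head;
	return *start == *end ? -1 : 0;
}

/* Return the next event in [*start, end) and advance *start, or NULL. */
static const uint32_t *toy_read_event(struct toy_ring *r, uint64_t *start,
				      uint64_t end)
{
	if (*start >= end)
		return NULL;
	return &r->events[(*start)++];
}

/* Release everything read so far back to the producer. */
static void toy_consume(struct toy_ring *r, uint64_t start)
{
	r->tail = start;
}

/* Finish the read pass: the whole snapshot counts as consumed. */
static void toy_read_done(struct toy_ring *r, uint64_t end)
{
	r->tail = end;
}

/* Count events of one type using the init/read/consume/done sequence. */
static unsigned int toy_count_type(struct toy_ring *r, uint32_t type)
{
	uint64_t start, end;
	const uint32_t *ev;
	unsigned int n = 0;

	if (toy_read_init(r, &start, &end) < 0)
		return 0;

	while ((ev = toy_read_event(r, &start, end)) != NULL) {
		if (*ev == type)
			n++;
		toy_consume(r, start);
	}
	toy_read_done(r, end);
	return n;
}
```

The point of the shape is that the end of the range is fixed before the loop starts, so events produced while the loop runs are left for the next pass instead of extending the current one.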



[PATCH 27/28] perf test: Switch to new perf_mmap__read_event() interface for task-exit

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Kan Liang 

The perf test 'task-exit' still uses the legacy interface.

No functional change.

Committer notes:

Testing it:

  # perf test exit
  21: Number of exit events of a simple workload: Ok
  #

Signed-off-by: Kan Liang 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: 
http://lkml.kernel.org/r/1519945751-37786-13-git-send-email-kan.li...@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/task-exit.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index 01b62b81751b..02b0888b72a3 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -47,6 +47,8 @@ int test__task_exit(struct test *test __maybe_unused, int 
subtest __maybe_unused
char sbuf[STRERR_BUFSIZE];
struct cpu_map *cpus;
struct thread_map *threads;
+   struct perf_mmap *md;
+   u64 end, start;
 
signal(SIGCHLD, sig_handler);
 
@@ -110,13 +112,19 @@ int test__task_exit(struct test *test __maybe_unused, int 
subtest __maybe_unused
perf_evlist__start_workload(evlist);
 
 retry:
-   while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) {
+   md = &evlist->mmap[0];
+   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   goto out_init;
+
+   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
if (event->header.type == PERF_RECORD_EXIT)
nr_exit++;
 
-   perf_evlist__mmap_consume(evlist, 0);
+   perf_mmap__consume(md, false);
}
+   perf_mmap__read_done(md);
 
+out_init:
if (!exited || !nr_exit) {
perf_evlist__poll(evlist, -1);
goto retry;
-- 
2.14.3



[PATCH 08/28] perf tests: Rename trace+probe_libc_inet_pton to record+probe_libc_inet_pton

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Jiri Olsa 

Because the test is no longer using perf trace but perf record instead.

Signed-off-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: David Ahern 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20180301165215.6780-2-jo...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 .../{trace+probe_libc_inet_pton.sh => record+probe_libc_inet_pton.sh} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename tools/perf/tests/shell/{trace+probe_libc_inet_pton.sh => 
record+probe_libc_inet_pton.sh} (100%)

diff --git a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh 
b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
similarity index 100%
rename from tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
rename to tools/perf/tests/shell/record+probe_libc_inet_pton.sh
-- 
2.14.3



[PATCH 26/28] perf test: Switch to new perf_mmap__read_event() interface for switch-tracking

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Kan Liang 

The perf test 'switch-tracking' still uses the legacy interface.

No functional change.

Committer testing:

  # perf test switch
  32: Track with sched_switch   : Ok
  #

Signed-off-by: Kan Liang 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: 
http://lkml.kernel.org/r/1519945751-37786-12-git-send-email-kan.li...@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/switch-tracking.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/switch-tracking.c 
b/tools/perf/tests/switch-tracking.c
index 33e00295a972..10c4dcdc2324 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -258,16 +258,23 @@ static int process_events(struct perf_evlist *evlist,
unsigned pos, cnt = 0;
LIST_HEAD(events);
struct event_node *events_array, *node;
+   struct perf_mmap *md;
+   u64 end, start;
int i, ret;
 
for (i = 0; i < evlist->nr_mmaps; i++) {
-   while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+   md = &evlist->mmap[i];
+   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   continue;
+
+   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
cnt += 1;
ret = add_event(evlist, &events, event);
-   perf_evlist__mmap_consume(evlist, i);
+perf_mmap__consume(md, false);
if (ret < 0)
goto out_free_nodes;
}
+   perf_mmap__read_done(md);
}
 
events_array = calloc(cnt, sizeof(struct event_node));
-- 
2.14.3
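
For readers following the series, the call order the patch switches to (init, read each event, consume, done) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_ring and every demo_* function are hypothetical stand-ins, not the real struct perf_mmap or perf_mmap__*() helpers, and the "ring" is just an array of ints.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a perf mmap ring buffer (not struct perf_mmap). */
struct demo_ring {
	const int *events;  /* pretend event records */
	size_t nr_events;   /* records currently available */
	size_t consumed;    /* records the reader has already released */
};

/* Snapshot the readable window; return -1 when there is nothing to read. */
static int demo_read_init(struct demo_ring *r, uint64_t *start, uint64_t *end)
{
	*start = r->consumed;
	*end = r->nr_events;
	return (*start == *end) ? -1 : 0;
}

/* Return the next record in [*start, end) and advance *start, or NULL. */
static const int *demo_read_event(struct demo_ring *r, uint64_t *start,
				  uint64_t end)
{
	if (*start >= end)
		return NULL;
	return &r->events[(*start)++];
}

/* Release one record back to the (pretend) producer. */
static void demo_consume(struct demo_ring *r)
{
	r->consumed++;
}

/* Nothing to publish in this toy model; kept to show the call order. */
static void demo_read_done(struct demo_ring *r)
{
	(void)r;
}

/* Count records across all rings using init / read / consume / done. */
unsigned int demo_count_events(struct demo_ring *rings, size_t nr_rings)
{
	unsigned int cnt = 0;

	for (size_t i = 0; i < nr_rings; i++) {
		struct demo_ring *md = &rings[i];
		uint64_t start, end;

		if (demo_read_init(md, &start, &end) < 0)
			continue;	/* empty ring: skip it, as the patch does */

		while (demo_read_event(md, &start, end) != NULL) {
			cnt++;
			demo_consume(md);
		}
		demo_read_done(md);
	}
	return cnt;
}
```

The point of the sketch is only the shape of the loop: the window is captured once per ring, the read cursor is passed by pointer, and the "done" call happens once after the inner loop rather than per event.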



[PATCH 18/28] perf test: Switch to new perf_mmap__read_event() interface for bpf

2018-03-05 Thread Arnaldo Carvalho de Melo
From: Kan Liang 

The perf test 'bpf' still uses the legacy interface.

No functional change.

Committer notes:

Tested with:

  # perf test bpf
  39: BPF filter:
  39.1: Basic BPF filtering : Ok
  39.2: BPF pinning : Ok
  39.3: BPF prologue generation : Ok
  39.4: BPF relocation checker  : Ok
  #

Signed-off-by: Kan Liang 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: 
http://lkml.kernel.org/r/1519945751-37786-4-git-send-email-kan.li...@linux.intel.com
[ Changed bool parameters from 0 to 'false', as per Jiri comment ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/bpf.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index e8399beca62b..09c9c9f9e827 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -176,13 +176,20 @@ static int do_test(struct bpf_object *obj, int 
(*func)(void),
 
for (i = 0; i < evlist->nr_mmaps; i++) {
union perf_event *event;
+   struct perf_mmap *md;
+   u64 end, start;
 
-   while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+   md = &evlist->mmap[i];
+   if (perf_mmap__read_init(md, false, &start, &end) < 0)
+   continue;
+
+   while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
const u32 type = event->header.type;
 
if (type == PERF_RECORD_SAMPLE)
count ++;
}
+   perf_mmap__read_done(md);
}
 
if (count != expect) {
-- 
2.14.3



[PATCH v2 2/5] tpm: migrate tpm2_shutdown() to use struct tpm_buf

2018-03-05 Thread Jarkko Sakkinen
In order to make struct tpm_buf the first class object for constructing TPM
commands, migrate tpm2_shutdown() to use it. In addition, remove the klog
entry when tpm_transmit_cmd() fails because tpm_transmit_cmd() already
prints an error message.

Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm2-cmd.c | 44 
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index 89a5397b18d2..abe6ef4a7a0b 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -27,10 +27,6 @@ enum tpm2_session_attributes {
TPM2_SA_CONTINUE_SESSION= BIT(0),
 };
 
-struct tpm2_startup_in {
-   __be16  startup_type;
-} __packed;
-
 struct tpm2_get_tpm_pt_in {
__be32  cap_id;
__be32  property_id;
@@ -55,7 +51,6 @@ struct tpm2_get_random_out {
 } __packed;
 
 union tpm2_cmd_params {
-   struct  tpm2_startup_in startup_in;
struct  tpm2_get_tpm_pt_in  get_tpm_pt_in;
struct  tpm2_get_tpm_pt_out get_tpm_pt_out;
struct  tpm2_get_random_in  getrandom_in;
@@ -410,11 +405,8 @@ void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 
handle,
int rc;
 
rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_FLUSH_CONTEXT);
-   if (rc) {
-   dev_warn(&chip->dev, "0x%08x was not flushed, out of memory\n",
-handle);
+   if (rc)
return;
-   }
 
tpm_buf_append_u32(&buf, handle);
 
@@ -760,40 +752,28 @@ ssize_t tpm2_get_tpm_pt(struct tpm_chip *chip, u32 
property_id,  u32 *value,
 }
 EXPORT_SYMBOL_GPL(tpm2_get_tpm_pt);
 
-#define TPM2_SHUTDOWN_IN_SIZE \
-   (sizeof(struct tpm_input_header) + \
-sizeof(struct tpm2_startup_in))
-
-static const struct tpm_input_header tpm2_shutdown_header = {
-   .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS),
-   .length = cpu_to_be32(TPM2_SHUTDOWN_IN_SIZE),
-   .ordinal = cpu_to_be32(TPM2_CC_SHUTDOWN)
-};
-
 /**
  * tpm2_shutdown() - send shutdown command to the TPM chip
  *
+ * In places where shutdown command is sent there's no much we can do except
+ * print the error code on a system failure.
+ *
  * @chip:  TPM chip to use.
  * @shutdown_type: shutdown type. The value is either
  * TPM_SU_CLEAR or TPM_SU_STATE.
  */
 void tpm2_shutdown(struct tpm_chip *chip, u16 shutdown_type)
 {
-   struct tpm2_cmd cmd;
+   struct tpm_buf buf;
int rc;
 
-   cmd.header.in = tpm2_shutdown_header;
-   cmd.params.startup_in.startup_type = cpu_to_be16(shutdown_type);
-
-   rc = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd), 0, 0,
- "stopping the TPM");
-
-   /* In places where shutdown command is sent there's no much we can do
-* except print the error code on a system failure.
-*/
-   if (rc < 0 && rc != -EPIPE)
-   dev_warn(&chip->dev, "transmit returned %d while stopping the TPM",
-rc);
+   rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_SHUTDOWN);
+   if (rc)
+   return;
+   tpm_buf_append_u16(&buf, shutdown_type);
+   tpm_transmit_cmd(chip, NULL, buf.data, PAGE_SIZE, 0, 0,
+"stopping the TPM");
+   tpm_buf_destroy(&buf);
 }
 
 /*
-- 
2.15.1
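
As a rough illustration of what "constructing a command with a buffer object" means here, the following userspace sketch builds a command as a 10-byte big-endian header (tag, length, ordinal) followed by appended parameters, with the length field updated on every append. The names demo_buf, demo_buf_reset(), demo_buf_append() and the 64-byte capacity are assumptions made for this example; this is not the kernel's struct tpm_buf, which is backed by a page.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_SIZE 64  /* hypothetical capacity, not the kernel's */
#define DEMO_HDR_SIZE 10  /* tag (2) + length (4) + ordinal (4) */

struct demo_buf {
	uint8_t data[DEMO_BUF_SIZE];
};

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)(v & 0xff);
}

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a fresh header; the length starts out as just the header size. */
void demo_buf_reset(struct demo_buf *b, uint16_t tag, uint32_t ordinal)
{
	memset(b->data, 0, sizeof(b->data));
	put_be16(b->data, tag);
	put_be32(b->data + 2, DEMO_HDR_SIZE);
	put_be32(b->data + 6, ordinal);
}

/* Current command length, read back from the header itself. */
uint32_t demo_buf_length(const struct demo_buf *b)
{
	return get_be32(b->data + 2);
}

/* Append raw bytes and bump the length field; -1 if they do not fit. */
int demo_buf_append(struct demo_buf *b, const void *src, uint32_t len)
{
	uint32_t cur = demo_buf_length(b);

	if (len > DEMO_BUF_SIZE - cur)
		return -1;
	memcpy(b->data + cur, src, len);
	put_be32(b->data + 2, cur + len);
	return 0;
}

/* Append a 16-bit parameter in big-endian order. */
int demo_buf_append_u16(struct demo_buf *b, uint16_t v)
{
	uint8_t be[2];

	put_be16(be, v);
	return demo_buf_append(b, be, sizeof(be));
}
```

With this shape the caller never computes a command size by hand, which is what lets the patch drop the TPM2_SHUTDOWN_IN_SIZE macro and the static header.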



[PATCH v2 5/5] tpm: migrate tpm2_get_random() to use struct tpm_buf

2018-03-05 Thread Jarkko Sakkinen
In order to make struct tpm_buf the first class object for constructing
TPM commands, migrate tpm2_get_random() to use it. In addition, remove
the remaining references to struct tpm2_cmd. All of them use it to acquire
the length of the response, which can be achieved by using
tpm_buf_length().

Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm.h  | 19 -
 drivers/char/tpm/tpm2-cmd.c | 93 +
 2 files changed, 45 insertions(+), 67 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index cccd5994a0e1..29c0717437bc 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -412,23 +412,24 @@ struct tpm_buf {
u8 *data;
 };
 
-static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal)
+static inline void tpm_buf_reset(struct tpm_buf *buf, u16 tag, u32 ordinal)
 {
struct tpm_input_header *head;
+   head = (struct tpm_input_header *)buf->data;
+   head->tag = cpu_to_be16(tag);
+   head->length = cpu_to_be32(sizeof(*head));
+   head->ordinal = cpu_to_be32(ordinal);
+}
 
+static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal)
+{
buf->data_page = alloc_page(GFP_HIGHUSER);
if (!buf->data_page)
return -ENOMEM;
 
buf->flags = 0;
buf->data = kmap(buf->data_page);
-
-   head = (struct tpm_input_header *) buf->data;
-
-   head->tag = cpu_to_be16(tag);
-   head->length = cpu_to_be32(sizeof(*head));
-   head->ordinal = cpu_to_be32(ordinal);
-
+   tpm_buf_reset(buf, tag, ordinal);
return 0;
 }
 
@@ -557,7 +558,7 @@ static inline u32 tpm2_rc_value(u32 rc)
 int tpm2_pcr_read(struct tpm_chip *chip, int pcr_idx, u8 *res_buf);
 int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, u32 count,
struct tpm2_digest *digests);
-int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max);
+int tpm2_get_random(struct tpm_chip *chip, u8 *dest, size_t max);
 void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 handle,
unsigned int flags);
 int tpm2_seal_trusted(struct tpm_chip *chip,
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index e02f7d46e9ac..36a58af6f5c7 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -27,25 +27,6 @@ enum tpm2_session_attributes {
TPM2_SA_CONTINUE_SESSION= BIT(0),
 };
 
-struct tpm2_get_random_in {
-   __be16  size;
-} __packed;
-
-struct tpm2_get_random_out {
-   __be16  size;
-   u8  buffer[TPM_MAX_RNG_DATA];
-} __packed;
-
-union tpm2_cmd_params {
-   struct  tpm2_get_random_in  getrandom_in;
-   struct  tpm2_get_random_out getrandom_out;
-};
-
-struct tpm2_cmd {
-   tpm_cmd_header  header;
-   union tpm2_cmd_params   params;
-} __packed;
-
 struct tpm2_hash {
unsigned int crypto_id;
unsigned int tpm_id;
@@ -298,66 +279,66 @@ int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, 
u32 count,
 }
 
 
-#define TPM2_GETRANDOM_IN_SIZE \
-   (sizeof(struct tpm_input_header) + \
-sizeof(struct tpm2_get_random_in))
-
-static const struct tpm_input_header tpm2_getrandom_header = {
-   .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS),
-   .length = cpu_to_be32(TPM2_GETRANDOM_IN_SIZE),
-   .ordinal = cpu_to_be32(TPM2_CC_GET_RANDOM)
-};
+struct tpm2_get_random_out {
+   __be16 size;
+   u8 buffer[TPM_MAX_RNG_DATA];
+} __packed;
 
 /**
  * tpm2_get_random() - get random bytes from the TPM RNG
  *
  * @chip: TPM chip to use
- * @out: destination buffer for the random bytes
+ * @dest: destination buffer for the random bytes
  * @max: the max number of bytes to write to @out
  *
  * Return:
- *Size of the output buffer, or -EIO on error.
+ * size of the output buffer when the operation is successful.
+ * A negative number for system errors (errno).
  */
-int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max)
+int tpm2_get_random(struct tpm_chip *chip, u8 *dest, size_t max)
 {
-   struct tpm2_cmd cmd;
-   u32 recd, rlength;
-   u32 num_bytes;
+   struct tpm2_get_random_out *out;
+   struct tpm_buf buf;
+   u32 recd;
+   u32 num_bytes = max;
int err;
int total = 0;
int retries = 5;
-   u8 *dest = out;
+   u8 *dest_ptr = dest;
 
-   num_bytes = min_t(u32, max, sizeof(cmd.params.getrandom_out.buffer));
-
-   if (!out || !num_bytes ||
-   max > sizeof(cmd.params.getrandom_out.buffer))
+   if (!num_bytes || max > TPM_MAX_RNG_DATA)
return -EINVAL;
 
-   do {
-   cmd.header.in = tpm2_getrandom_header;
-   cmd.params.getrandom_in.size = cpu_to_be16(num_bytes);
+   err = tpm_buf_init(&buf, 0, 0);
+   if (err)
+   return err;
 
-   err = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd),
+   do {
+   

[PATCH v2 3/5] tpm: migrate tpm2_probe() to use struct tpm_buf

2018-03-05 Thread Jarkko Sakkinen
In order to make struct tpm_buf the first class object for constructing TPM
commands, migrate tpm2_probe() to use it.

Signed-off-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm2-cmd.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index abe6ef4a7a0b..890d83c5c78b 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -851,22 +851,25 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
  */
 int tpm2_probe(struct tpm_chip *chip)
 {
-   struct tpm2_cmd cmd;
+   struct tpm_output_header *out;
+   struct tpm_buf buf;
int rc;
 
-   cmd.header.in = tpm2_get_tpm_pt_header;
-   cmd.params.get_tpm_pt_in.cap_id = cpu_to_be32(TPM2_CAP_TPM_PROPERTIES);
-   cmd.params.get_tpm_pt_in.property_id = cpu_to_be32(0x100);
-   cmd.params.get_tpm_pt_in.property_cnt = cpu_to_be32(1);
-
-   rc = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd), 0, 0, NULL);
-   if (rc <  0)
+   rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_GET_CAPABILITY);
+   if (rc)
return rc;
-
-   if (be16_to_cpu(cmd.header.out.tag) == TPM2_ST_NO_SESSIONS)
+   tpm_buf_append_u32(&buf, TPM2_CAP_TPM_PROPERTIES);
+   tpm_buf_append_u32(&buf, TPM_PT_TOTAL_COMMANDS);
+   tpm_buf_append_u32(&buf, 1);
+   rc = tpm_transmit_cmd(chip, NULL, buf.data, PAGE_SIZE, 0, 0, NULL);
+   if (rc <  0)
+   goto out;
+   out = (struct tpm_output_header *)buf.data;
+   if (be16_to_cpu(out->tag) == TPM2_ST_NO_SESSIONS)
chip->flags |= TPM_CHIP_FLAG_TPM2;
-
-   return 0;
+out:
+   tpm_buf_destroy(&buf);
+   return rc;
 }
 EXPORT_SYMBOL_GPL(tpm2_probe);
 
-- 
2.15.1



Re: [PATCH 1/6] Docs: dt: add fsl-mc iommu-parent device-tree binding

2018-03-05 Thread Robin Murphy

On 05/03/18 14:29, Nipun Gupta wrote:

The existing IOMMU bindings cannot be used to specify the relationship
between fsl-mc devices and IOMMUs. This patch adds a binding for
mapping fsl-mc devices to IOMMUs, using a new iommu-parent property.


Given that allowing "msi-parent" for #msi-cells > 1 is merely a 
backward-compatibility bodge full of hard-coded assumptions, why would 
we want to knowingly introduce a similarly unpleasant equivalent for 
IOMMUs? What's wrong with "iommu-map"?



Signed-off-by: Nipun Gupta 
---
  .../devicetree/bindings/misc/fsl,qoriq-mc.txt  | 31 ++
  1 file changed, 31 insertions(+)

diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt 
b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
index 6611a7c..011c7d6 100644
--- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
+++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
@@ -9,6 +9,24 @@ blocks that can be used to create functional hardware 
objects/devices
  such as network interfaces, crypto accelerator instances, L2 switches,
  etc.
  
+For an overview of the DPAA2 architecture and fsl-mc bus see:

+drivers/staging/fsl-mc/README.txt
+
+As described in the above overview, all DPAA2 objects in a DPRC share the
+same hardware "isolation context" and a 10-bit value called an ICID
+(isolation context id) is expressed by the hardware to identify
+the requester.


IOW, precisely the case for which "{msi,iommu}-map" exist. Yes, I know 
they're currently documented under bindings/pci, but they're not really 
intended to be absolutely PCI-specific.


Robin.


+The generic 'iommus' property cannot be used to describe the relationship
+between fsl-mc and IOMMUs, so an iommu-parent property is used to define
+the same.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+For arm-smmu binding, see:
+Documentation/devicetree/bindings/iommu/arm,smmu.txt.
+
  Required properties:
  
  - compatible

@@ -88,14 +106,27 @@ Sub-nodes:
Value type: 
Definition: Specifies the phandle to the PHY device node 
associated
with the this dpmac.
+Optional properties:
+
+- iommu-parent: Maps the devices on fsl-mc bus to an IOMMU.
+  The property specifies the IOMMU behind which the devices on
+  fsl-mc bus are residing.
  
  Example:
  
+smmu: iommu@500 {

+   compatible = "arm,mmu-500";
+   #iommu-cells = <1>;
+   stream-match-mask = <0x7C00>;
+   ...
+};
+
  fsl_mc: fsl-mc@80c00 {
  compatible = "fsl,qoriq-mc";
  reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */
<0x 0x0834 0 0x4>; /* MC control reg */
  msi-parent = <&its>;
+iommu-parent = <&smmu>;
  #address-cells = <3>;
  #size-cells = <1>;
  



[PATCH] mtdchar: fix usage of mtd_ooblayout_ecc()

2018-03-05 Thread OuYang ZhiZhong
Section was not properly computed. The value of OOB region definition is
always ECC section 0 information in the OOB area, but we want to get all
the ECC bytes information, so we should call
mtd_ooblayout_ecc(mtd, section++, &oobregion) until it returns -ERANGE.

This is fixed by using i instead of section.

Signed-off-by: OuYang ZhiZhong 
---
 drivers/mtd/mtdchar.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index de8c902..0cc929e 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -468,7 +468,7 @@ static int shrink_ecclayout(struct mtd_info *mtd,
struct nand_ecclayout_user *to)
 {
struct mtd_oob_region oobregion;
-   int i, section = 0, ret;
+   int i, ret;
 
if (!mtd || !to)
return -EINVAL;
@@ -479,7 +479,7 @@ static int shrink_ecclayout(struct mtd_info *mtd,
for (i = 0; i < MTD_MAX_ECCPOS_ENTRIES;) {
u32 eccpos;
 
-   ret = mtd_ooblayout_ecc(mtd, section, &oobregion);
+   ret = mtd_ooblayout_ecc(mtd, i, &oobregion);
if (ret < 0) {
if (ret != -ERANGE)
return ret;
@@ -515,7 +515,7 @@ static int shrink_ecclayout(struct mtd_info *mtd,
 static int get_oobinfo(struct mtd_info *mtd, struct nand_oobinfo *to)
 {
struct mtd_oob_region oobregion;
-   int i, section = 0, ret;
+   int i, ret;
 
if (!mtd || !to)
return -EINVAL;
@@ -526,7 +526,7 @@ static int get_oobinfo(struct mtd_info *mtd, struct 
nand_oobinfo *to)
for (i = 0; i < ARRAY_SIZE(to->eccpos);) {
u32 eccpos;
 
-   ret = mtd_ooblayout_ecc(mtd, section, &oobregion);
+   ret = mtd_ooblayout_ecc(mtd, i, &oobregion);
if (ret < 0) {
if (ret != -ERANGE)
return ret;
-- 
1.7.9.5
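
The commit message describes walking the ECC sections one at a time until the layout callback reports -ERANGE. A minimal userspace sketch of that loop is below, offered only as an illustration: demo_ooblayout_ecc() and its two-section layout are invented for this example and are not the MTD API, and the sketch keeps a separate section counter rather than reproducing the exact indexing used in the diff above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical region descriptor, loosely modelled on an OOB region. */
struct demo_oob_region {
	unsigned int offset;
	unsigned int length;
};

/* Hypothetical layout callback: two ECC sections, then -ERANGE. */
static int demo_ooblayout_ecc(int section, struct demo_oob_region *region)
{
	static const struct demo_oob_region layout[] = {
		{ 2, 3 },	/* section 0: bytes 2..4 */
		{ 10, 4 },	/* section 1: bytes 10..13 */
	};

	if (section < 0 ||
	    (size_t)section >= sizeof(layout) / sizeof(layout[0]))
		return -ERANGE;
	*region = layout[section];
	return 0;
}

/*
 * Collect every ECC byte position into eccpos[].
 * Returns the number of positions stored, or a negative errno.
 */
int demo_collect_eccpos(unsigned int *eccpos, size_t max)
{
	size_t n = 0;

	for (int section = 0; ; section++) {
		struct demo_oob_region region;
		int ret = demo_ooblayout_ecc(section, &region);

		if (ret == -ERANGE)
			break;		/* no more sections: normal end */
		if (ret < 0)
			return ret;	/* any other error is fatal */

		for (unsigned int k = 0; k < region.length; k++) {
			if (n == max)
				return -ENOSPC;
			eccpos[n++] = region.offset + k;
		}
	}
	return (int)n;
}
```

The distinction the sketch tries to make visible is that -ERANGE is the loop's terminating condition, while every other negative return is propagated to the caller.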



Re: [PATCH 1/6] tpm: sort objects in the Makefile

2018-03-05 Thread Jason Gunthorpe
On Mon, Mar 05, 2018 at 10:20:12PM +0200, Tomas Winkler wrote:
> Make the tpm Makefile a bit more in order by putting
> objects in one column and group together tpm2 modules
> 
> Prefer tpm-objs += instead of tpm-y += notation.
> 
> Signed-off-by: Tomas Winkler 
>  drivers/char/tpm/Makefile | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
> index acd758381c58..2fc0e9a73cd6 100644
> +++ b/drivers/char/tpm/Makefile
> @@ -3,9 +3,17 @@
>  # Makefile for the kernel tpm device drivers.
>  #
>  obj-$(CONFIG_TCG_TPM) += tpm.o
> -tpm-y := tpm-interface.o tpm-dev.o tpm-sysfs.o tpm-chip.o tpm2-cmd.o \
> -  tpm-dev-common.o tpmrm-dev.o tpm1_eventlog.o tpm2_eventlog.o \
> - tpm2-space.o
> +tpm-objs := tpm-interface.o
> +tpm-objs += tpm-dev.o
> +tpm-objs += tpm-chip.o
> +tpm-objs += tpm-dev-common.o
> +tpm-objs += tpmrm-dev.o
> +tpm-objs += tpm-sysfs.o
> +tpm-objs += tpm1_eventlog.o
> +tpm-objs += tpm2-cmd.o
> +tpm-objs += tpm2-space.o
> +tpm-objs += tpm2_eventlog.o

If you are going to do this then sort the list please

Seems weird to me though, there are not that many examples of this
pattern in the kernel.

What is wrong with:

tpm-objs := \
 tpm-interface.o \
 tpm-dev.o \
 [..]

?
 
Jason


Re: [PATCH v2 1/1] HID: Logitech K290: Add driver for the Logitech K290 USB keyboard

2018-03-05 Thread kbuild test robot
Hi Florent,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on hid/for-next]
[also build test ERROR on v4.16-rc4 next-20180305]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Florent-Flament/Logitech-K290-Add-driver-for-the-Logitech-K290-USB-keyboard/20180305-153311
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid.git for-next
config: tile-allyesconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=tile 

All errors (new ones prefixed by >>):

>> drivers/hid/hid-logitech-k290.c:92:3: error: 'struct hid_driver' has no 
>> member named 'resume'; did you mean 'remove'?
 .resume = k290_resume,
  ^~
  remove
>> drivers/hid/hid-logitech-k290.c:92:12: error: initialization from 
>> incompatible pointer type [-Werror=incompatible-pointer-types]
 .resume = k290_resume,
   ^~~
   drivers/hid/hid-logitech-k290.c:92:12: note: (near initialization for 
'k290_driver.feature_mapping')
>> drivers/hid/hid-logitech-k290.c:93:3: error: 'struct hid_driver' has no 
>> member named 'reset_resume'
 .reset_resume = k290_resume,
  ^~~~
   drivers/hid/hid-logitech-k290.c:93:18: error: initialization from 
incompatible pointer type [-Werror=incompatible-pointer-types]
 .reset_resume = k290_resume,
 ^~~
   drivers/hid/hid-logitech-k290.c:93:18: note: (near initialization for 
'k290_driver.bus_add_driver')
   cc1: some warnings being treated as errors

vim +92 drivers/hid/hid-logitech-k290.c

87  
88  static struct hid_driver k290_driver = {
89  .name = "hid-logitech-k290",
90  .id_table = k290_devices,
91  .input_configured = k290_input_configured,
  > 92  .resume = k290_resume,
  > 93  .reset_resume = k290_resume,
94  };
95  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
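
One likely reading of the errors above is that the resume callbacks in struct hid_driver are only declared when power management is configured, so a driver that sets them unconditionally fails to build on a config without it. A common way to handle that is to guard both the callback and its initializers. The sketch below is a self-contained illustration with hypothetical names (demo_hid_driver, DEMO_CONFIG_PM); it is not the real HID header or the K290 driver.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a driver ops structure. */
struct demo_hid_driver {
	const char *name;
#ifdef DEMO_CONFIG_PM
	/* These members exist only when power management is built in. */
	int (*resume)(void *hdev);
	int (*reset_resume)(void *hdev);
#endif
};

#ifdef DEMO_CONFIG_PM
/* Only compiled when the members it is assigned to exist. */
static int demo_k290_resume(void *hdev)
{
	(void)hdev;
	return 0;
}
#endif

const struct demo_hid_driver demo_k290_driver = {
	.name = "demo-k290",
#ifdef DEMO_CONFIG_PM
	.resume = demo_k290_resume,
	.reset_resume = demo_k290_resume,
#endif
};
```

Building this with and without -DDEMO_CONFIG_PM should succeed in both cases, which is the property the guarded initializers are meant to provide.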


Re: [PATCH v12 02/11] mm, swap: Add infrastructure for saving page metadata on swap

2018-03-05 Thread Khalid Aziz

On 03/05/2018 02:04 PM, Dave Hansen wrote:

On 03/05/2018 12:28 PM, Khalid Aziz wrote:

Do you have a way to tell that data is not being thrown away?  Like if
the ADI metadata is different for two different cachelines within a
single page?


Yes, since access to tagged data is made using pointers with ADI tag
embedded in the top bits, any mismatch between what app thinks the ADI
tags should be and what is stored in the RAM for corresponding page will
result in exception. If ADI data gets thrown away, we will get an ADI
tag mismatch exception. If ADI tags for two different ADI blocks on a
page are different when app expected them to be the same, we will see an
exception on access to the block with wrong ADI data.


So, when an app has two different ADI tags on two parts of a page, the
page gets swapped, and the ADI block size is under PAGE_SIZE, the app
will get an ADI exception after swap-in through no fault of its own?



Only if the kernel fails to re-establish ADI tags on the swapped in page 
which is why I added infrastructure to save the ADI tags for a page 
before it is swapped out and then re-establish those tags when the page 
is swapped back in. Kernel needs to save as many ADI tags as may 
exist on each page, not just one tag per page. On sparc M7 8K pages, 
there are 128 ADI tags for the page, so kernel will store and restore 
128 ADI tags for each page on swap-out and swap-in. If kernel restores 
only one ADI tag for the page on swap in, app will get an exception and 
it will be kernel's fault.


--
Khalid
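
To make the numbers in this reply concrete: 128 tags on an 8K page implies one tag per 64-byte ADI block. The small sketch below works through that arithmetic. The helper names are hypothetical, and the 4-bits-per-tag figure used in the usage note is an assumption for illustration, not something stated in this thread.

```c
#include <stddef.h>

/* Number of ADI blocks (one tag each) that cover a page. */
size_t demo_tags_per_page(size_t page_size, size_t adi_block_size)
{
	return page_size / adi_block_size;
}

/*
 * Bytes needed to save all tags of one page when each tag takes
 * bits_per_tag bits and tags are packed back to back.
 */
size_t demo_tag_bytes_per_page(size_t page_size, size_t adi_block_size,
			       unsigned int bits_per_tag)
{
	size_t bits = demo_tags_per_page(page_size, adi_block_size) *
		      bits_per_tag;

	return (bits + 7) / 8;	/* round up to whole bytes */
}
```

For example, demo_tags_per_page(8192, 64) gives 128, and under the assumed 4 bits per tag demo_tag_bytes_per_page(8192, 64, 4) gives 64 bytes of tag storage per swapped page.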


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Khalid Aziz

On 03/05/2018 12:22 PM, Dave Hansen wrote:

On 02/21/2018 09:15 AM, Khalid Aziz wrote:

+#define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
+static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
+{
+   if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_ADI))
+   return 0;
+   if (prot & PROT_ADI) {
+   if (!adi_capable())
+   return 0;
+
+   if (addr) {
+   struct vm_area_struct *vma;
+
+   vma = find_vma(current->mm, addr);
+   if (vma) {
+   /* ADI can not be enabled on PFN
+* mapped pages
+*/
+   if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
+   return 0;


You don't hold mmap_sem here.  How can this work?



Are you suggesting that vma returned by find_vma() could be split or 
merged underneath me if I do not hold mmap_sem and thus make the flag 
check invalid? If so, that is a good point.


Thanks,
Khalid


Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls

2018-03-05 Thread Alexander Popov
On 05.03.2018 23:25, Peter Zijlstra wrote:
> On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
>> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
   Documentation/x86/x86_64/mm.txt  |   2 +
   arch/Kconfig |  27 ++
   arch/x86/Kconfig |   1 +
   arch/x86/entry/entry_32.S|  88 +++
   arch/x86/entry/entry_64.S| 108 
 +++
   arch/x86/entry/entry_64_compat.S |  11 
>>>
>>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>>> of this into C or whether you just inherited the assembly from the
>>> original code?
>>>
>>
>> This came up previously 
>> http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
>> there were concerns about trusting C to do the right thing as well as
>> speed.
> 
> And therefore the answer to this obvious question should've been part of
> the Changelog :-)
> 
> Dave is last in a long line of people asking this same question.

Yes, actually the changelog in the cover letter contains that:

  After some experiments, kept the asm implementation of erase_kstack(),
  because it gives a full control over the stack for clearing it neatly
  and doesn't offend KASAN.

Moreover, later erase_kstack() on x86_64 became different from one on x86_32.

Best regards,
Alexander


RE: [PATCH 1/6] tpm: sort objects in the Makefile

2018-03-05 Thread Winkler, Tomas
> On Mon, Mar 05, 2018 at 10:20:12PM +0200, Tomas Winkler wrote:
> > Make the tpm Makefile a bit more in order by putting objects in one
> > column and group together tpm2 modules
> >
> > Prefer tpm-objs += instead of tpm-y += notation.
> >
> > Signed-off-by: Tomas Winkler 
> > drivers/char/tpm/Makefile | 14 +++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
> > index acd758381c58..2fc0e9a73cd6 100644
> > +++ b/drivers/char/tpm/Makefile
> > @@ -3,9 +3,17 @@
> >  # Makefile for the kernel tpm device drivers.
> >  #
> >  obj-$(CONFIG_TCG_TPM) += tpm.o
> > -tpm-y := tpm-interface.o tpm-dev.o tpm-sysfs.o tpm-chip.o tpm2-cmd.o \
> > -tpm-dev-common.o tpmrm-dev.o tpm1_eventlog.o tpm2_eventlog.o
> \
> > - tpm2-space.o
> > +tpm-objs := tpm-interface.o
> > +tpm-objs += tpm-dev.o
> > +tpm-objs += tpm-chip.o
> > +tpm-objs += tpm-dev-common.o
> > +tpm-objs += tpmrm-dev.o
> > +tpm-objs += tpm-sysfs.o
> > +tpm-objs += tpm1_eventlog.o
> > +tpm-objs += tpm2-cmd.o
> > +tpm-objs += tpm2-space.o
> > +tpm-objs += tpm2_eventlog.o
> 
> If you are going to do this then sort the list please

I've sorted them so that in the future we can probably compile the tpm1
parts out; you probably mean sorting alphabetically.
> 
> Seems weird to me though, there are not that many examples of this pattern
> in the kernel.

#find -name Makefile | xargs grep -e 'objs +=' | awk -F: '{print $1}' | uniq | wc -l
74

Just a personal taste maybe. 

> What is wrong with:
> 
> tpm-objs := \
>tpm-interface.o \
>tpm-dev.o \
>[..]
> 
For me it's less error-prone without backslashes, but in the end it's the
same; again, just personal taste.

Thanks
Tomas




Re: [PATCH 2/2] docs: add Co-Developed-by docs

2018-03-05 Thread Tobin C. Harding
On Mon, Mar 05, 2018 at 04:11:35AM -0800, Matthew Wilcox wrote:
> On Mon, Mar 05, 2018 at 02:58:21PM +1100, Tobin C. Harding wrote:
> > -12) When to use Acked-by: and Cc:
> > --
> > +12) When to use Acked-by: and Cc:, and Co-Developed-by:
> > +---
> 
> +12) When to use Acked-by:, Cc:, and Co-Developed-by:

thanks, sloppy work by me :)


Tobin


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> +tag_storage_desc_t *alloc_tag_store(struct mm_struct *mm,
> + struct vm_area_struct *vma,
> + unsigned long addr)
...
> + tags = kzalloc(size, GFP_NOWAIT|__GFP_NOWARN);
> + if (tags == NULL) {
> + tag_desc->tag_users = 0;
> + tag_desc = NULL;
> + goto out;
> + }
> + tag_desc->start = addr;
> + tag_desc->tags = tags;
> + tag_desc->end = end_addr;
> +
> +out:
> + spin_unlock_irqrestore(&mm->context.tag_lock, flags);
> + return tag_desc;
> +}

OK, sorry, I missed this.  I do see that you now have per-ADI-block tag
storage and it is not per-page.

How big can this storage get, btw?  Superficially it seems like it might
be able to be gigantic for a large, sparse VMA.


[PATCH 11/36] fs: update documentation for __poll_t

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 Documentation/filesystems/Locking | 2 +-
 Documentation/filesystems/vfs.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/Locking 
b/Documentation/filesystems/Locking
index 75d2d57e2c44..220bba28f72b 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -439,7 +439,7 @@ prototypes:
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
int (*iterate) (struct file *, struct dir_context *);
-   unsigned int (*poll) (struct file *, struct poll_table_struct *);
+   __poll_t (*poll) (struct file *, struct poll_table_struct *);
long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
diff --git a/Documentation/filesystems/vfs.txt 
b/Documentation/filesystems/vfs.txt
index 5fd325df59e2..f608180ad59d 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -856,7 +856,7 @@ struct file_operations {
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
int (*iterate) (struct file *, struct dir_context *);
-   unsigned int (*poll) (struct file *, struct poll_table_struct *);
+   __poll_t (*poll) (struct file *, struct poll_table_struct *);
long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
-- 
2.14.2



[PATCH 02/36] aio: remove an outdated comment in aio_complete

2018-03-05 Thread Christoph Hellwig
These days we don't treat sync iocbs special in the aio completion code as
they never use it.  Remove the old comment, and move the BUG_ON for a sync
iocb to the top of the function.

Signed-off-by: Christoph Hellwig 
Acked-by: Jeff Moyer 
---
 fs/aio.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 03d59593912d..41fc8ce6bc7f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1088,6 +1088,8 @@ static void aio_complete(struct kiocb *kiocb, long res, 
long res2)
unsigned tail, pos, head;
unsigned long   flags;
 
+   BUG_ON(is_sync_kiocb(kiocb));
+
if (kiocb->ki_flags & IOCB_WRITE) {
struct file *file = kiocb->ki_filp;
 
@@ -1100,15 +1102,6 @@ static void aio_complete(struct kiocb *kiocb, long res, 
long res2)
file_end_write(file);
}
 
-   /*
-* Special case handling for sync iocbs:
-*  - events go directly into the iocb for fast handling
-*  - the sync task with the iocb in its stack holds the single iocb
-*ref, no other paths have a way to get another ref
-*  - the sync task helpfully left a reference to itself in the iocb
-*/
-   BUG_ON(is_sync_kiocb(kiocb));
-
if (iocb->ki_list.next) {
unsigned long flags;
 
-- 
2.14.2



Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 03/05/2018 01:14 PM, Khalid Aziz wrote:
> On 03/05/2018 12:22 PM, Dave Hansen wrote:
>> On 02/21/2018 09:15 AM, Khalid Aziz wrote:
>>> +#define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
>>> +static inline int sparc_validate_prot(unsigned long prot, unsigned
>>> long addr)
>>> +{
>>> +    if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM |
>>> PROT_ADI))
>>> +    return 0;
>>> +    if (prot & PROT_ADI) {
>>> +    if (!adi_capable())
>>> +    return 0;
>>> +
>>> +    if (addr) {
>>> +    struct vm_area_struct *vma;
>>> +
>>> +    vma = find_vma(current->mm, addr);
>>> +    if (vma) {
>>> +    /* ADI can not be enabled on PFN
>>> + * mapped pages
>>> + */
>>> +    if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
>>> +    return 0;
>>
>> You don't hold mmap_sem here.  How can this work?
>>
> Are you suggesting that vma returned by find_vma() could be split or
> merged underneath me if I do not hold mmap_sem and thus make the flag
> check invalid? If so, that is a good point.

Um, yes.  You can't walk the vma tree without holding mmap_sem.


[PATCH 36/36] random: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
The big change is that random_read_wait and random_write_wait are merged
into a single waitqueue that uses keyed wakeups.  Because wait_event_*
doesn't know about that, this will lead to occasional spurious wakeups
in _random_read and add_hwgenerator_randomness, but wait_event_* is
designed to handle these and we are not in a hot path there.

Signed-off-by: Christoph Hellwig 
---
 drivers/char/random.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index e5b3d3ba4660..840d80b64431 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -401,8 +401,7 @@ static struct poolinfo {
 /*
  * Static global variables
  */
-static DECLARE_WAIT_QUEUE_HEAD(random_read_wait);
-static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
+static DECLARE_WAIT_QUEUE_HEAD(random_wait);
 static struct fasync_struct *fasync;
 
 static DEFINE_SPINLOCK(random_ready_list_lock);
@@ -710,7 +709,7 @@ static void credit_entropy_bits(struct entropy_store *r, 
int nbits)
 
/* should we wake readers? */
if (entropy_bits >= random_read_wakeup_bits) {
-   wake_up_interruptible(&random_read_wait);
+   wake_up_interruptible_poll(&random_wait, POLLIN);
kill_fasync(&fasync, SIGIO, POLL_IN);
}
/* If the input pool is getting full, send some
@@ -1293,7 +1292,7 @@ static size_t account(struct entropy_store *r, size_t 
nbytes, int min,
trace_debit_entropy(r->name, 8 * ibytes);
if (ibytes &&
(r->entropy_count >> ENTROPY_SHIFT) < random_write_wakeup_bits) {
-   wake_up_interruptible(&random_write_wait);
+   wake_up_interruptible_poll(&random_wait, POLLOUT);
kill_fasync(&fasync, SIGIO, POLL_OUT);
}
 
@@ -1748,7 +1747,7 @@ _random_read(int nonblock, char __user *buf, size_t 
nbytes)
if (nonblock)
return -EAGAIN;
 
-   wait_event_interruptible(random_read_wait,
+   wait_event_interruptible(random_wait,
ENTROPY_BITS(&input_pool) >=
random_read_wakeup_bits);
if (signal_pending(current))
@@ -1784,14 +1783,17 @@ urandom_read(struct file *file, char __user *buf, 
size_t nbytes, loff_t *ppos)
return ret;
 }
 
+static struct wait_queue_head *
+random_get_poll_head(struct file *file, __poll_t events)
+{
+   return &random_wait;
+}
+
 static __poll_t
-random_poll(struct file *file, poll_table * wait)
+random_poll_mask(struct file *file, __poll_t events)
 {
-   __poll_t mask;
+   __poll_t mask = 0;
 
-   poll_wait(file, &random_read_wait, wait);
-   poll_wait(file, &random_write_wait, wait);
-   mask = 0;
if (ENTROPY_BITS(&input_pool) >= random_read_wakeup_bits)
mask |= EPOLLIN | EPOLLRDNORM;
if (ENTROPY_BITS(&input_pool) < random_write_wakeup_bits)
@@ -1890,7 +1892,8 @@ static int random_fasync(int fd, struct file *filp, int 
on)
 const struct file_operations random_fops = {
.read  = random_read,
.write = random_write,
-   .poll  = random_poll,
+   .get_poll_head  = random_get_poll_head,
+   .poll_mask  = random_poll_mask,
.unlocked_ioctl = random_ioctl,
.fasync = random_fasync,
.llseek = noop_llseek,
@@ -2223,7 +2226,7 @@ void add_hwgenerator_randomness(const char *buffer, 
size_t count,
 * We'll be woken up again once below random_write_wakeup_thresh,
 * or when the calling thread is about to terminate.
 */
-   wait_event_interruptible(random_write_wait, kthread_should_stop() ||
+   wait_event_interruptible(random_wait, kthread_should_stop() ||
ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits);
mix_pool_bytes(poolp, buffer, count);
credit_entropy_bits(poolp, entropy);
-- 
2.14.2
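
The commit message above says that unkeyed wait_event_* sleepers on the merged queue will see occasional spurious wakeups. A minimal single-threaded model of that behaviour follows. It is a sketch under stated assumptions, not the kernel implementation: struct waiter, struct wait_queue, wq_add, wq_wake_keyed, EV_IN and EV_OUT are all hypothetical names, and sleeping is modelled by counters instead of real blocking. Keyed waiters (the poll side) are skipped when the wakeup key does not match their events; unkeyed waiters (the wait_event side) are always woken and simply re-check their condition.

```c
#include <assert.h>
#include <stddef.h>

#define EV_IN  0x1u   /* hypothetical stand-in for POLLIN */
#define EV_OUT 0x4u   /* hypothetical stand-in for POLLOUT */

struct waiter {
	unsigned int key;       /* events of interest; 0 means unkeyed */
	int (*cond)(void);      /* re-checked after each wakeup, may be NULL */
	unsigned int wakeups;   /* number of times this waiter was woken */
	unsigned int spurious;  /* wakeups where cond() was still false */
};

struct wait_queue {
	struct waiter *w[8];
	size_t nr;
};

int entropy_bits;                       /* toy pool state */
enum { READ_WAKEUP_BITS = 64 };

/* Condition a blocking reader would pass to a wait_event-style loop. */
int reader_cond(void)
{
	return entropy_bits >= READ_WAKEUP_BITS;
}

void wq_add(struct wait_queue *q, struct waiter *w)
{
	assert(q->nr < sizeof(q->w) / sizeof(q->w[0]));
	q->w[q->nr++] = w;
}

/* Keyed wakeup on the single merged queue. */
void wq_wake_keyed(struct wait_queue *q, unsigned int key)
{
	for (size_t i = 0; i < q->nr; i++) {
		struct waiter *w = q->w[i];

		if (w->key && !(w->key & key))
			continue;       /* keyed waiter, key mismatch: not woken */

		w->wakeups++;
		if (w->cond && !w->cond())
			w->spurious++;  /* woken, condition false, sleeps again */
	}
}
```

With one waiter keyed on EV_IN, one keyed on EV_OUT, and one unkeyed reader using reader_cond, a wakeup with key EV_OUT while entropy_bits is low reaches the EV_OUT waiter and the unkeyed reader only, and the reader records it as spurious. That extra wakeup is the cost the commit message accepts for having a single queue.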



[PATCH 25/36] net/sctp: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 include/net/sctp/sctp.h | 3 +--
 net/sctp/ipv6.c | 2 +-
 net/sctp/protocol.c | 2 +-
 net/sctp/socket.c   | 4 +---
 4 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index f7ae6b0a21d0..37abd5ba4a3f 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -107,8 +107,7 @@ int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
 int sctp_inet_listen(struct socket *sock, int backlog);
 void sctp_write_space(struct sock *sk);
 void sctp_data_ready(struct sock *sk);
-__poll_t sctp_poll(struct file *file, struct socket *sock,
-   poll_table *wait);
+__poll_t sctp_poll_mask(struct socket *sock, __poll_t events);
 void sctp_sock_rfree(struct sk_buff *skb);
 void sctp_copy_sock(struct sock *newsk, struct sock *sk,
struct sctp_association *asoc);
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index e35d4f73d2df..6b0b8fc5b75a 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -976,7 +976,7 @@ static const struct proto_ops inet6_seqpacket_ops = {
.socketpair= sock_no_socketpair,
.accept= inet_accept,
.getname   = sctp_getname,
-   .poll  = sctp_poll,
+   .poll_mask = sctp_poll_mask,
.ioctl = inet6_ioctl,
.listen= sctp_inet_listen,
.shutdown  = inet_shutdown,
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 91813e686c67..20c544890e80 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1024,7 +1024,7 @@ static const struct proto_ops inet_seqpacket_ops = {
.socketpair= sock_no_socketpair,
.accept= inet_accept,
.getname   = inet_getname,  /* Semantics are different.  */
-   .poll  = sctp_poll,
+   .poll_mask = sctp_poll_mask,
.ioctl = inet_ioctl,
.listen= sctp_inet_listen,
.shutdown  = inet_shutdown, /* Looks harmless.  */
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index bf271f8c2dc9..097454740929 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7587,14 +7587,12 @@ int sctp_inet_listen(struct socket *sock, int backlog)
  * here, again, by modeling the current TCP/UDP code.  We don't have
  * a good way to test with it yet.
  */
-__poll_t sctp_poll(struct file *file, struct socket *sock, poll_table *wait)
+__poll_t sctp_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
struct sctp_sock *sp = sctp_sk(sk);
__poll_t mask;
 
-   poll_wait(file, sk_sleep(sk), wait);
-
sock_rps_record_flow(sk);
 
/* A TCP-style listening socket becomes readable when the accept queue
-- 
2.14.2



[PATCH 34/36] eventfd: switch to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 fs/eventfd.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 012f5bd46dfa..d70b4907f978 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -101,14 +101,20 @@ static int eventfd_release(struct inode *inode, struct 
file *file)
return 0;
 }
 
-static __poll_t eventfd_poll(struct file *file, poll_table *wait)
+static struct wait_queue_head *
+eventfd_get_poll_head(struct file *file, __poll_t events)
+{
+   struct eventfd_ctx *ctx = file->private_data;
+
+   return &ctx->wqh;
+}
+
+static __poll_t eventfd_poll_mask(struct file *file, __poll_t eventmask)
 {
struct eventfd_ctx *ctx = file->private_data;
__poll_t events = 0;
u64 count;
 
-   poll_wait(file, &ctx->wqh, wait);
-
/*
 * All writes to ctx->count occur within ctx->wqh.lock.  This read
 * can be done outside ctx->wqh.lock because we know that poll_wait
@@ -305,7 +311,8 @@ static const struct file_operations eventfd_fops = {
.show_fdinfo= eventfd_show_fdinfo,
 #endif
.release= eventfd_release,
-   .poll   = eventfd_poll,
+   .get_poll_head  = eventfd_get_poll_head,
+   .poll_mask  = eventfd_poll_mask,
.read   = eventfd_read,
.write  = eventfd_write,
.llseek = noop_llseek,
-- 
2.14.2



[PATCH 33/36] pipe: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 fs/pipe.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 7b1954caf388..81937590ea0a 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -509,19 +509,22 @@ static long pipe_ioctl(struct file *filp, unsigned int 
cmd, unsigned long arg)
}
 }
 
-/* No kernel lock held - fine */
-static __poll_t
-pipe_poll(struct file *filp, poll_table *wait)
+static struct wait_queue_head *
+pipe_get_poll_head(struct file *filp, __poll_t events)
 {
-   __poll_t mask;
struct pipe_inode_info *pipe = filp->private_data;
-   int nrbufs;
 
-   poll_wait(filp, &pipe->wait, wait);
+   return &pipe->wait;
+}
+
+/* No kernel lock held - fine */
+static __poll_t pipe_poll_mask(struct file *filp, __poll_t events)
+{
+   struct pipe_inode_info *pipe = filp->private_data;
+   int nrbufs = pipe->nrbufs;
+   __poll_t mask = 0;
 
/* Reading only -- no need for acquiring the semaphore.  */
-   nrbufs = pipe->nrbufs;
-   mask = 0;
if (filp->f_mode & FMODE_READ) {
mask = (nrbufs > 0) ? EPOLLIN | EPOLLRDNORM : 0;
if (!pipe->writers && filp->f_version != pipe->w_counter)
@@ -1015,7 +1018,8 @@ const struct file_operations pipefifo_fops = {
.llseek = no_llseek,
.read_iter  = pipe_read,
.write_iter = pipe_write,
-   .poll   = pipe_poll,
+   .get_poll_head  = pipe_get_poll_head,
+   .poll_mask  = pipe_poll_mask,
.unlocked_ioctl = pipe_ioctl,
.release= pipe_release,
.fasync = pipe_fasync,
-- 
2.14.2



[PATCH 31/36] net/rxrpc: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/rxrpc/af_rxrpc.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 0c9c18aa7c77..d2440d5c3ce8 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -729,15 +729,11 @@ static int rxrpc_getsockopt(struct socket *sock, int 
level, int optname,
 /*
  * permit an RxRPC socket to be polled
  */
-static __poll_t rxrpc_poll(struct file *file, struct socket *sock,
-  poll_table *wait)
+static __poll_t rxrpc_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
struct rxrpc_sock *rx = rxrpc_sk(sk);
-   __poll_t mask;
-
-   sock_poll_wait(file, sk_sleep(sk), wait);
-   mask = 0;
+   __poll_t mask = 0;
 
/* the socket is readable if there are any messages waiting on the Rx
 * queue */
@@ -940,7 +936,7 @@ static const struct proto_ops rxrpc_rpc_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname= sock_no_getname,
-   .poll   = rxrpc_poll,
+   .poll_mask  = rxrpc_poll_mask,
.ioctl  = sock_no_ioctl,
.listen = rxrpc_listen,
.shutdown   = rxrpc_shutdown,
-- 
2.14.2



[PATCH 35/36] timerfd: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 fs/timerfd.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/timerfd.c b/fs/timerfd.c
index cdad49da3ff7..d84a2bee4f82 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -226,21 +226,20 @@ static int timerfd_release(struct inode *inode, struct 
file *file)
kfree_rcu(ctx, rcu);
return 0;
 }
-
-static __poll_t timerfd_poll(struct file *file, poll_table *wait)
+   
+static struct wait_queue_head *timerfd_get_poll_head(struct file *file,
+   __poll_t eventmask)
 {
struct timerfd_ctx *ctx = file->private_data;
-   __poll_t events = 0;
-   unsigned long flags;
 
-   poll_wait(file, &ctx->wqh, wait);
+   return &ctx->wqh;
+}
 
-   spin_lock_irqsave(&ctx->wqh.lock, flags);
-   if (ctx->ticks)
-   events |= EPOLLIN;
-   spin_unlock_irqrestore(&ctx->wqh.lock, flags);
+static __poll_t timerfd_poll_mask(struct file *file, __poll_t eventmask)
+{
+   struct timerfd_ctx *ctx = file->private_data;
 
-   return events;
+   return ctx->ticks ? EPOLLIN : 0;
 }
 
 static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
@@ -364,7 +363,8 @@ static long timerfd_ioctl(struct file *file, unsigned int 
cmd, unsigned long arg
 
 static const struct file_operations timerfd_fops = {
.release= timerfd_release,
-   .poll   = timerfd_poll,
+   .get_poll_head  = timerfd_get_poll_head,
+   .poll_mask  = timerfd_poll_mask,
.read   = timerfd_read,
.llseek = noop_llseek,
.show_fdinfo= timerfd_show,
-- 
2.14.2



[PATCH 32/36] crypto: af_alg: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 crypto/af_alg.c | 13 +++--
 crypto/algif_aead.c |  4 ++--
 crypto/algif_skcipher.c |  4 ++--
 include/crypto/if_alg.h |  3 +--
 4 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 50d75de539f5..330aef1cd08b 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1060,19 +1060,12 @@ void af_alg_async_cb(struct crypto_async_request *_req, 
int err)
 }
 EXPORT_SYMBOL_GPL(af_alg_async_cb);
 
-/**
- * af_alg_poll - poll system call handler
- */
-__poll_t af_alg_poll(struct file *file, struct socket *sock,
-poll_table *wait)
+__poll_t af_alg_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
struct af_alg_ctx *ctx = ask->private;
-   __poll_t mask;
-
-   sock_poll_wait(file, sk_sleep(sk), wait);
-   mask = 0;
+   __poll_t mask = 0;
 
if (!ctx->more || ctx->used)
mask |= EPOLLIN | EPOLLRDNORM;
@@ -1082,7 +1075,7 @@ __poll_t af_alg_poll(struct file *file, struct socket 
*sock,
 
return mask;
 }
-EXPORT_SYMBOL_GPL(af_alg_poll);
+EXPORT_SYMBOL_GPL(af_alg_poll_mask);
 
 /**
  * af_alg_alloc_areq - allocate struct af_alg_async_req
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 4b07edd5a9ff..330cf9f2b767 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -375,7 +375,7 @@ static struct proto_ops algif_aead_ops = {
.sendmsg=   aead_sendmsg,
.sendpage   =   af_alg_sendpage,
.recvmsg=   aead_recvmsg,
-   .poll   =   af_alg_poll,
+   .poll_mask  =   af_alg_poll_mask,
 };
 
 static int aead_check_key(struct socket *sock)
@@ -471,7 +471,7 @@ static struct proto_ops algif_aead_ops_nokey = {
.sendmsg=   aead_sendmsg_nokey,
.sendpage   =   aead_sendpage_nokey,
.recvmsg=   aead_recvmsg_nokey,
-   .poll   =   af_alg_poll,
+   .poll_mask  =   af_alg_poll_mask,
 };
 
 static void *aead_bind(const char *name, u32 type, u32 mask)
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index c4e885df4564..15cf3c5222e0 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -205,7 +205,7 @@ static struct proto_ops algif_skcipher_ops = {
.sendmsg=   skcipher_sendmsg,
.sendpage   =   af_alg_sendpage,
.recvmsg=   skcipher_recvmsg,
-   .poll   =   af_alg_poll,
+   .poll_mask  =   af_alg_poll_mask,
 };
 
 static int skcipher_check_key(struct socket *sock)
@@ -301,7 +301,7 @@ static struct proto_ops algif_skcipher_ops_nokey = {
.sendmsg=   skcipher_sendmsg_nokey,
.sendpage   =   skcipher_sendpage_nokey,
.recvmsg=   skcipher_recvmsg_nokey,
-   .poll   =   af_alg_poll,
+   .poll_mask  =   af_alg_poll_mask,
 };
 
 static void *skcipher_bind(const char *name, u32 type, u32 mask)
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 482461d8931d..cc414db9da0a 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -245,8 +245,7 @@ ssize_t af_alg_sendpage(struct socket *sock, struct page 
*page,
int offset, size_t size, int flags);
 void af_alg_free_resources(struct af_alg_async_req *areq);
 void af_alg_async_cb(struct crypto_async_request *_req, int err);
-__poll_t af_alg_poll(struct file *file, struct socket *sock,
-poll_table *wait);
+__poll_t af_alg_poll_mask(struct socket *sock, __poll_t events);
 struct af_alg_async_req *af_alg_alloc_areq(struct sock *sk,
   unsigned int areqlen);
 int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
-- 
2.14.2



Re: [BUG] Kernel crash on Allwinner H3 due to sound core changes

2018-03-05 Thread Jernej Škrabec
Hi,

On Friday, 2 March 2018 at 13:40:50 CET, Mark Brown wrote:
> On Thu, Mar 01, 2018 at 11:23:57PM +0100, Jernej Škrabec wrote:
> > I removed parts of the code from the sun4i codec driver and interestingly
> > it doesn't crash if I remove following lines:
> > 
> > ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
> > if (ret) {
> > 
> > dev_err(&pdev->dev, "Failed to register against DMAEngine\n");
> > goto err_assert_reset;
> > 
> > }
> > 
> > Is it possible that NULL pointer causes troubles somewhere down the line?
> 
> Shouldn't be, that's just the configuration which is optional and not
> what we're crashing trying to register, we can mostly configure things
> by querying the capabilities of the DMA controller via the dmaengine API
> these days.  You're removing all the DMA support there so cutting out a
> huge segment of the initialization of both this driver and the machine
> driver.  Other sunxi devices seem to be starting happily in -next so
> there's something system dependent here...

I enabled memory debugging, and it seems there is an issue caused by
loading the sun4i-codec driver that is somehow connected to
snd_dmaengine_pcm_unregister().

Here is the relevant dmesg: https://pastebin.com/raw/80K9GPnB

Does this tell you anything?

Best regards,
Jernej






Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 03/05/2018 01:14 PM, Khalid Aziz wrote:
> Are you suggesting that vma returned by find_vma() could be split or
> merged underneath me if I do not hold mmap_sem and thus make the flag
> check invalid? If so, that is a good point.

This part does make me think that this code hasn't been tested very
thoroughly.  Could you describe the testing that you have done?  For MPX
and protection keys, I added something to tools/testing/selftests/x86,
for instance.


[PATCH 30/36] net/iucv: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 include/net/iucv/af_iucv.h | 2 --
 net/iucv/af_iucv.c | 7 ++-
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/net/iucv/af_iucv.h b/include/net/iucv/af_iucv.h
index f4c21b5a1242..b0eaeb02d46d 100644
--- a/include/net/iucv/af_iucv.h
+++ b/include/net/iucv/af_iucv.h
@@ -153,8 +153,6 @@ struct iucv_sock_list {
atomic_t  autobind_name;
 };
 
-__poll_t iucv_sock_poll(struct file *file, struct socket *sock,
-   poll_table *wait);
 void iucv_sock_link(struct iucv_sock_list *l, struct sock *s);
 void iucv_sock_unlink(struct iucv_sock_list *l, struct sock *s);
 void iucv_accept_enqueue(struct sock *parent, struct sock *sk);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 1e8cc7bcbca3..539a312dc481 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1489,14 +1489,11 @@ static inline __poll_t iucv_accept_poll(struct sock 
*parent)
return 0;
 }
 
-__poll_t iucv_sock_poll(struct file *file, struct socket *sock,
-   poll_table *wait)
+static __poll_t iucv_sock_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
__poll_t mask = 0;
 
-   sock_poll_wait(file, sk_sleep(sk), wait);
-
if (sk->sk_state == IUCV_LISTEN)
return iucv_accept_poll(sk);
 
@@ -2389,7 +2386,7 @@ static const struct proto_ops iucv_sock_ops = {
.getname= iucv_sock_getname,
.sendmsg= iucv_sock_sendmsg,
.recvmsg= iucv_sock_recvmsg,
-   .poll   = iucv_sock_poll,
+   .poll_mask  = iucv_sock_poll_mask,
.ioctl  = sock_no_ioctl,
.mmap   = sock_no_mmap,
.socketpair = sock_no_socketpair,
-- 
2.14.2



[PATCH 24/36] net/tipc: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/tipc/socket.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index b0323ec7971e..1ea1666e8e95 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -694,10 +694,9 @@ static int tipc_getname(struct socket *sock, struct 
sockaddr *uaddr,
 }
 
 /**
- * tipc_poll - read and possibly block on pollmask
+ * tipc_poll - read pollmask
  * @file: file structure associated with the socket
  * @sock: socket for which to calculate the poll bits
- * @wait: ???
  *
  * Returns pollmask value
  *
@@ -711,15 +710,12 @@ static int tipc_getname(struct socket *sock, struct 
sockaddr *uaddr,
  * imply that the operation will succeed, merely that it should be performed
  * and will not block.
  */
-static __poll_t tipc_poll(struct file *file, struct socket *sock,
- poll_table *wait)
+static __poll_t tipc_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
struct tipc_sock *tsk = tipc_sk(sk);
__poll_t revents = 0;
 
-   sock_poll_wait(file, sk_sleep(sk), wait);
-
if (sk->sk_shutdown & RCV_SHUTDOWN)
revents |= EPOLLRDHUP | EPOLLIN | EPOLLRDNORM;
if (sk->sk_shutdown == SHUTDOWN_MASK)
@@ -3019,7 +3015,7 @@ static const struct proto_ops msg_ops = {
.socketpair = tipc_socketpair,
.accept = sock_no_accept,
.getname= tipc_getname,
-   .poll   = tipc_poll,
+   .poll_mask  = tipc_poll_mask,
.ioctl  = tipc_ioctl,
.listen = sock_no_listen,
.shutdown   = tipc_shutdown,
@@ -3040,7 +3036,7 @@ static const struct proto_ops packet_ops = {
.socketpair = tipc_socketpair,
.accept = tipc_accept,
.getname= tipc_getname,
-   .poll   = tipc_poll,
+   .poll_mask  = tipc_poll_mask,
.ioctl  = tipc_ioctl,
.listen = tipc_listen,
.shutdown   = tipc_shutdown,
@@ -3061,7 +3057,7 @@ static const struct proto_ops stream_ops = {
.socketpair = tipc_socketpair,
.accept = tipc_accept,
.getname= tipc_getname,
-   .poll   = tipc_poll,
+   .poll_mask  = tipc_poll_mask,
.ioctl  = tipc_ioctl,
.listen = tipc_listen,
.shutdown   = tipc_shutdown,
-- 
2.14.2



[PATCH 26/36] net/bluetooth: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 include/net/bluetooth/bluetooth.h | 2 +-
 net/bluetooth/af_bluetooth.c  | 7 ++-
 net/bluetooth/l2cap_sock.c| 2 +-
 net/bluetooth/rfcomm/sock.c   | 2 +-
 net/bluetooth/sco.c   | 2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/net/bluetooth/bluetooth.h 
b/include/net/bluetooth/bluetooth.h
index ec9d6bc65855..53ce8176c313 100644
--- a/include/net/bluetooth/bluetooth.h
+++ b/include/net/bluetooth/bluetooth.h
@@ -271,7 +271,7 @@ int  bt_sock_recvmsg(struct socket *sock, struct msghdr 
*msg, size_t len,
 int flags);
 int  bt_sock_stream_recvmsg(struct socket *sock, struct msghdr *msg,
size_t len, int flags);
-__poll_t bt_sock_poll(struct file *file, struct socket *sock, poll_table 
*wait);
+__poll_t bt_sock_poll_mask(struct socket *sock, __poll_t events);
 int  bt_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int  bt_sock_wait_state(struct sock *sk, int state, unsigned long timeo);
 int  bt_sock_wait_ready(struct sock *sk, unsigned long flags);
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 84d92a077834..80033a7e1de2 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -437,16 +437,13 @@ static inline __poll_t bt_accept_poll(struct sock *parent)
return 0;
 }
 
-__poll_t bt_sock_poll(struct file *file, struct socket *sock,
- poll_table *wait)
+__poll_t bt_sock_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
__poll_t mask = 0;
 
BT_DBG("sock %p, sk %p", sock, sk);
 
-   poll_wait(file, sk_sleep(sk), wait);
-
if (sk->sk_state == BT_LISTEN)
return bt_accept_poll(sk);
 
@@ -478,7 +475,7 @@ __poll_t bt_sock_poll(struct file *file, struct socket 
*sock,
 
return mask;
 }
-EXPORT_SYMBOL(bt_sock_poll);
+EXPORT_SYMBOL(bt_sock_poll_mask);
 
 int bt_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 {
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 67a8642f57ea..d20b33daa80f 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1654,7 +1654,7 @@ static const struct proto_ops l2cap_sock_ops = {
.getname= l2cap_sock_getname,
.sendmsg= l2cap_sock_sendmsg,
.recvmsg= l2cap_sock_recvmsg,
-   .poll   = bt_sock_poll,
+   .poll_mask  = bt_sock_poll_mask,
.ioctl  = bt_sock_ioctl,
.mmap   = sock_no_mmap,
.socketpair = sock_no_socketpair,
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 1aaccf637479..b4dc96481d92 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -1049,7 +1049,7 @@ static const struct proto_ops rfcomm_sock_ops = {
.setsockopt = rfcomm_sock_setsockopt,
.getsockopt = rfcomm_sock_getsockopt,
.ioctl  = rfcomm_sock_ioctl,
-   .poll   = bt_sock_poll,
+   .poll_mask  = bt_sock_poll_mask,
.socketpair = sock_no_socketpair,
.mmap   = sock_no_mmap
 };
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 08df57665e1f..b2bf5c767b3e 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -1198,7 +1198,7 @@ static const struct proto_ops sco_sock_ops = {
.getname= sco_sock_getname,
.sendmsg= sco_sock_sendmsg,
.recvmsg= sco_sock_recvmsg,
-   .poll   = bt_sock_poll,
+   .poll_mask  = bt_sock_poll_mask,
.ioctl  = bt_sock_ioctl,
.mmap   = sock_no_mmap,
.socketpair = sock_no_socketpair,
-- 
2.14.2



[PATCH 27/36] net/caif: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/caif/caif_socket.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index a6fb1b3bcad9..c7991867d622 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -934,15 +934,11 @@ static int caif_release(struct socket *sock)
 }
 
 /* Copied from af_unix.c:unix_poll(), added CAIF tx_flow handling */
-static __poll_t caif_poll(struct file *file,
- struct socket *sock, poll_table *wait)
+static __poll_t caif_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
-   __poll_t mask;
struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
-
-   sock_poll_wait(file, sk_sleep(sk), wait);
-   mask = 0;
+   __poll_t mask = 0;
 
/* exceptional events? */
if (sk->sk_err)
@@ -976,7 +972,7 @@ static const struct proto_ops caif_seqpacket_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname = sock_no_getname,
-   .poll = caif_poll,
+   .poll_mask = caif_poll_mask,
.ioctl = sock_no_ioctl,
.listen = sock_no_listen,
.shutdown = sock_no_shutdown,
@@ -997,7 +993,7 @@ static const struct proto_ops caif_stream_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname = sock_no_getname,
-   .poll = caif_poll,
+   .poll_mask = caif_poll_mask,
.ioctl = sock_no_ioctl,
.listen = sock_no_listen,
.shutdown = sock_no_shutdown,
-- 
2.14.2



[PATCH 28/36] net/nfc: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/nfc/llcp_sock.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 376040092142..b6010750e634 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -549,16 +549,13 @@ static inline __poll_t llcp_accept_poll(struct sock 
*parent)
return 0;
 }
 
-static __poll_t llcp_sock_poll(struct file *file, struct socket *sock,
-  poll_table *wait)
+static __poll_t llcp_sock_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
__poll_t mask = 0;
 
pr_debug("%p\n", sk);
 
-   sock_poll_wait(file, sk_sleep(sk), wait);
-
if (sk->sk_state == LLCP_LISTEN)
return llcp_accept_poll(sk);
 
@@ -900,7 +897,7 @@ static const struct proto_ops llcp_sock_ops = {
.socketpair = sock_no_socketpair,
.accept = llcp_sock_accept,
.getname= llcp_sock_getname,
-   .poll   = llcp_sock_poll,
+   .poll_mask  = llcp_sock_poll_mask,
.ioctl  = sock_no_ioctl,
.listen = llcp_sock_listen,
.shutdown   = sock_no_shutdown,
@@ -920,7 +917,7 @@ static const struct proto_ops llcp_rawsock_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname= llcp_sock_getname,
-   .poll   = llcp_sock_poll,
+   .poll_mask  = llcp_sock_poll_mask,
.ioctl  = sock_no_ioctl,
.listen = sock_no_listen,
.shutdown   = sock_no_shutdown,
-- 
2.14.2



[PATCH 29/36] net/phonet: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/phonet/socket.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 28d981512f5f..70ac4539d5b7 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -341,15 +341,12 @@ static int pn_socket_getname(struct socket *sock, struct 
sockaddr *addr,
return 0;
 }
 
-static __poll_t pn_socket_poll(struct file *file, struct socket *sock,
-   poll_table *wait)
+static __poll_t pn_socket_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
struct pep_sock *pn = pep_sk(sk);
__poll_t mask = 0;
 
-   poll_wait(file, sk_sleep(sk), wait);
-
if (sk->sk_state == TCP_CLOSE)
return EPOLLERR;
if (!skb_queue_empty(&sk->sk_receive_queue))
@@ -474,7 +471,7 @@ const struct proto_ops phonet_stream_ops = {
.socketpair = sock_no_socketpair,
.accept = pn_socket_accept,
.getname= pn_socket_getname,
-   .poll   = pn_socket_poll,
+   .poll_mask  = pn_socket_poll_mask,
.ioctl  = pn_socket_ioctl,
.listen = pn_socket_listen,
.shutdown   = sock_no_shutdown,
-- 
2.14.2



[PATCH 23/36] net/vmw_vsock: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/vmw_vsock/af_vsock.c | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index e0fc84daed94..b9210329bda8 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -850,18 +850,11 @@ static int vsock_shutdown(struct socket *sock, int mode)
return err;
 }
 
-static __poll_t vsock_poll(struct file *file, struct socket *sock,
-  poll_table *wait)
+static __poll_t vsock_poll_mask(struct socket *sock, __poll_t events)
 {
-   struct sock *sk;
-   __poll_t mask;
-   struct vsock_sock *vsk;
-
-   sk = sock->sk;
-   vsk = vsock_sk(sk);
-
-   poll_wait(file, sk_sleep(sk), wait);
-   mask = 0;
+   struct sock *sk = sock->sk;
+   struct vsock_sock *vsk = vsock_sk(sk);
+   __poll_t mask = 0;
 
if (sk->sk_err)
/* Signify that there has been an error on this socket. */
@@ -1091,7 +1084,7 @@ static const struct proto_ops vsock_dgram_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname = vsock_getname,
-   .poll = vsock_poll,
+   .poll_mask = vsock_poll_mask,
.ioctl = sock_no_ioctl,
.listen = sock_no_listen,
.shutdown = vsock_shutdown,
@@ -1849,7 +1842,7 @@ static const struct proto_ops vsock_stream_ops = {
.socketpair = sock_no_socketpair,
.accept = vsock_accept,
.getname = vsock_getname,
-   .poll = vsock_poll,
+   .poll_mask = vsock_poll_mask,
.ioctl = sock_no_ioctl,
.listen = vsock_listen,
.shutdown = vsock_shutdown,
-- 
2.14.2



[PATCH 22/36] net/atm: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/atm/common.c | 11 +++
 net/atm/common.h |  2 +-
 net/atm/pvc.c|  2 +-
 net/atm/svc.c|  2 +-
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index fc78a0508ae1..1f2af59935db 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -648,16 +648,11 @@ int vcc_sendmsg(struct socket *sock, struct msghdr *m, 
size_t size)
return error;
 }
 
-__poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait)
+__poll_t vcc_poll_mask(struct socket *sock, __poll_t events)
 {
struct sock *sk = sock->sk;
-   struct atm_vcc *vcc;
-   __poll_t mask;
-
-   sock_poll_wait(file, sk_sleep(sk), wait);
-   mask = 0;
-
-   vcc = ATM_SD(sock);
+   struct atm_vcc *vcc = ATM_SD(sock);
+   __poll_t mask = 0;
 
/* exceptional events */
if (sk->sk_err)
diff --git a/net/atm/common.h b/net/atm/common.h
index 5850649068bb..526796ad230f 100644
--- a/net/atm/common.h
+++ b/net/atm/common.h
@@ -17,7 +17,7 @@ int vcc_connect(struct socket *sock, int itf, short vpi, int 
vci);
 int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
int flags);
 int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len);
-__poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait);
+__poll_t vcc_poll_mask(struct socket *sock, __poll_t events);
 int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int vcc_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int vcc_setsockopt(struct socket *sock, int level, int optname,
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index e1140b3bdcaa..930651c5e77c 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -114,7 +114,7 @@ static const struct proto_ops pvc_proto_ops = {
.socketpair =   sock_no_socketpair,
.accept =   sock_no_accept,
.getname =  pvc_getname,
-   .poll = vcc_poll,
+   .poll_mask =vcc_poll_mask,
.ioctl =vcc_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl = vcc_compat_ioctl,
diff --git a/net/atm/svc.c b/net/atm/svc.c
index c458adcbc177..ad0e6ffb9cfe 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -637,7 +637,7 @@ static const struct proto_ops svc_proto_ops = {
.socketpair =   sock_no_socketpair,
.accept =   svc_accept,
.getname =  svc_getname,
-   .poll = vcc_poll,
+   .poll_mask =vcc_poll_mask,
.ioctl =svc_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl = svc_compat_ioctl,
-- 
2.14.2



[PATCH 21/36] net/dccp: convert to ->poll_mask

2018-03-05 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 net/dccp/dccp.h  |  3 +--
 net/dccp/ipv4.c  |  2 +-
 net/dccp/ipv6.c  |  2 +-
 net/dccp/proto.c | 13 ++---
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index f91e3816806b..0ea2ee56ac1b 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -316,8 +316,7 @@ int dccp_recvmsg(struct sock *sk, struct msghdr *msg, 
size_t len, int nonblock,
 int flags, int *addr_len);
 void dccp_shutdown(struct sock *sk, int how);
 int inet_dccp_listen(struct socket *sock, int backlog);
-__poll_t dccp_poll(struct file *file, struct socket *sock,
-  poll_table *wait);
+__poll_t dccp_poll_mask(struct socket *sock, __poll_t events);
 int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len);
 void dccp_req_err(struct sock *sk, u64 seq);
 
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index e65fcb45c3f6..e8476f319efd 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -983,7 +983,7 @@ static const struct proto_ops inet_dccp_ops = {
.accept= inet_accept,
.getname   = inet_getname,
/* FIXME: work on tcp_poll to rename it to inet_csk_poll */
-   .poll  = dccp_poll,
+   .poll_mask = dccp_poll_mask,
.ioctl = inet_ioctl,
/* FIXME: work on inet_listen to rename it to sock_common_listen */
.listen= inet_dccp_listen,
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 5df7857fc0f3..f0aac8e4b888 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1069,7 +1069,7 @@ static const struct proto_ops inet6_dccp_ops = {
.socketpair= sock_no_socketpair,
.accept= inet_accept,
.getname   = inet6_getname,
-   .poll  = dccp_poll,
+   .poll_mask = dccp_poll_mask,
.ioctl = inet6_ioctl,
.listen= inet_dccp_listen,
.shutdown  = inet_shutdown,
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 15bdc002d90c..26816032a7c2 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -314,20 +314,11 @@ int dccp_disconnect(struct sock *sk, int flags)
 
 EXPORT_SYMBOL_GPL(dccp_disconnect);
 
-/*
- * Wait for a DCCP event.
- *
- * Note that we don't need to lock the socket, as the upper poll layers
- * take care of normal races (between the test and the event) and we don't
- * go look at any of the socket buffers directly.
- */
-__poll_t dccp_poll(struct file *file, struct socket *sock,
-  poll_table *wait)
+__poll_t dccp_poll_mask(struct socket *sock, __poll_t events)
 {
__poll_t mask;
struct sock *sk = sock->sk;
 
-   sock_poll_wait(file, sk_sleep(sk), wait);
if (sk->sk_state == DCCP_LISTEN)
return inet_csk_listen_poll(sk);
 
@@ -369,7 +360,7 @@ __poll_t dccp_poll(struct file *file, struct socket *sock,
return mask;
 }
 
-EXPORT_SYMBOL_GPL(dccp_poll);
+EXPORT_SYMBOL_GPL(dccp_poll_mask);
 
 int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 {
-- 
2.14.2
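
As a rough illustration of what the conversion above changes, here is a
minimal userspace sketch in C. It is not kernel code. In the old ->poll
style the protocol method registers the waiter itself (the sock_poll_wait()
call that the patch removes from dccp_poll). In the ->poll_mask style the
method only computes the event mask, and the generic caller is assumed to do
the registration. All toy_* names, the TOY_* flags and struct toy_sock are
hypothetical stand-ins made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __poll_t and the EPOLL* bits. */
typedef unsigned int toy_poll_t;

#define TOY_POLLIN  0x1u
#define TOY_POLLOUT 0x4u
#define TOY_POLLHUP 0x10u

/* Hypothetical socket state; "waiters" counts wait-queue registrations. */
struct toy_sock {
    int rx_queued;  /* bytes ready to read */
    int tx_space;   /* bytes of free send space */
    int closed;     /* nonzero once the peer has gone away */
    int waiters;    /* how many times a waiter was registered */
};

/* Shared helper: derive the event mask from the socket state only. */
static toy_poll_t toy_compute_mask(const struct toy_sock *sk)
{
    toy_poll_t mask = 0;

    if (sk->closed)
        mask |= TOY_POLLHUP;
    if (sk->rx_queued > 0)
        mask |= TOY_POLLIN;
    if (sk->tx_space > 0)
        mask |= TOY_POLLOUT;
    return mask;
}

/* Old style: the method registers the waiter and then computes the mask. */
static toy_poll_t toy_poll(struct toy_sock *sk)
{
    assert(sk != NULL);
    sk->waiters++;  /* stands in for sock_poll_wait() */
    return toy_compute_mask(sk);
}

/* New style: no registration here, the method only reports the mask. */
static toy_poll_t toy_poll_mask(const struct toy_sock *sk, toy_poll_t events)
{
    (void)events;   /* requested events are only a hint in this sketch */
    return toy_compute_mask(sk);
}

/* Generic caller for the new style: registers once, then asks for the mask. */
static toy_poll_t toy_generic_poll(struct toy_sock *sk, toy_poll_t events)
{
    assert(sk != NULL);
    sk->waiters++;  /* registration moved out of the protocol method */
    return toy_poll_mask(sk, events);
}
```

Both paths report the same mask for the same state. The only difference in
this model is where the waiter gets registered, which is why the converted
method can drop its file and poll_table arguments.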



[PATCH 17/36] net: remove sock_no_poll

2018-03-05 Thread Christoph Hellwig
Now that sock_poll handles a NULL ->poll or ->poll_mask there is no need
for a stub.

Signed-off-by: Christoph Hellwig 
---
 crypto/af_alg.c | 1 -
 crypto/algif_hash.c | 2 --
 crypto/algif_rng.c  | 1 -
 drivers/isdn/mISDN/socket.c | 1 -
 drivers/net/ppp/pptp.c  | 1 -
 include/net/sock.h  | 2 --
 net/bluetooth/bnep/sock.c   | 1 -
 net/bluetooth/cmtp/sock.c   | 1 -
 net/bluetooth/hidp/sock.c   | 1 -
 net/core/sock.c | 6 --
 10 files changed, 17 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index c49766b03165..50d75de539f5 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -347,7 +347,6 @@ static const struct proto_ops alg_proto_ops = {
.sendpage   =   sock_no_sendpage,
.sendmsg=   sock_no_sendmsg,
.recvmsg=   sock_no_recvmsg,
-   .poll   =   sock_no_poll,
 
.bind   =   alg_bind,
.release=   af_alg_release,
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 6c9b1927a520..bfcf595fd8f9 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -288,7 +288,6 @@ static struct proto_ops algif_hash_ops = {
.mmap   =   sock_no_mmap,
.bind   =   sock_no_bind,
.setsockopt =   sock_no_setsockopt,
-   .poll   =   sock_no_poll,
 
.release=   af_alg_release,
.sendmsg=   hash_sendmsg,
@@ -396,7 +395,6 @@ static struct proto_ops algif_hash_ops_nokey = {
.mmap   =   sock_no_mmap,
.bind   =   sock_no_bind,
.setsockopt =   sock_no_setsockopt,
-   .poll   =   sock_no_poll,
 
.release=   af_alg_release,
.sendmsg=   hash_sendmsg_nokey,
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 150c2b6480ed..22df3799a17b 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -106,7 +106,6 @@ static struct proto_ops algif_rng_ops = {
.bind   =   sock_no_bind,
.accept =   sock_no_accept,
.setsockopt =   sock_no_setsockopt,
-   .poll   =   sock_no_poll,
.sendmsg=   sock_no_sendmsg,
.sendpage   =   sock_no_sendpage,
 
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index c5603d1a07d6..c84270e16bdd 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -746,7 +746,6 @@ static const struct proto_ops base_sock_ops = {
.getname= sock_no_getname,
.sendmsg= sock_no_sendmsg,
.recvmsg= sock_no_recvmsg,
-   .poll   = sock_no_poll,
.listen = sock_no_listen,
.shutdown   = sock_no_shutdown,
.setsockopt = sock_no_setsockopt,
diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 6dde9a0cfe76..87f892f1d0fe 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -627,7 +627,6 @@ static const struct proto_ops pptp_ops = {
.socketpair = sock_no_socketpair,
.accept = sock_no_accept,
.getname= pptp_getname,
-   .poll   = sock_no_poll,
.listen = sock_no_listen,
.shutdown   = sock_no_shutdown,
.setsockopt = sock_no_setsockopt,
diff --git a/include/net/sock.h b/include/net/sock.h
index 169c92afcafa..d9249fe65859 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1585,8 +1585,6 @@ int sock_no_connect(struct socket *, struct sockaddr *, int, int);
 int sock_no_socketpair(struct socket *, struct socket *);
 int sock_no_accept(struct socket *, struct socket *, int, bool);
 int sock_no_getname(struct socket *, struct sockaddr *, int *, int);
-__poll_t sock_no_poll(struct file *, struct socket *,
- struct poll_table_struct *);
 int sock_no_ioctl(struct socket *, unsigned int, unsigned long);
 int sock_no_listen(struct socket *, int);
 int sock_no_shutdown(struct socket *, int);
diff --git a/net/bluetooth/bnep/sock.c b/net/bluetooth/bnep/sock.c
index b5116fa9835e..00deacdcb51c 100644
--- a/net/bluetooth/bnep/sock.c
+++ b/net/bluetooth/bnep/sock.c
@@ -175,7 +175,6 @@ static const struct proto_ops bnep_sock_ops = {
.getname= sock_no_getname,
.sendmsg= sock_no_sendmsg,
.recvmsg= sock_no_recvmsg,
-   .poll   = sock_no_poll,
.listen = sock_no_listen,
.shutdown   = sock_no_shutdown,
.setsockopt = sock_no_setsockopt,
diff --git a/net/bluetooth/cmtp/sock.c b/net/bluetooth/cmtp/sock.c
index ce86a7bae844..e08f28fadd65 100644
--- a/net/bluetooth/cmtp/sock.c
+++ b/net/bluetooth/cmtp/sock.c
@@ -178,7 +178,6 @@ static const struct proto_ops cmtp_sock_ops = {
.getname= sock_no_getname,
.sendmsg= sock_no_sendmsg,
.recvmsg= 
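
The commit message of [PATCH 17/36] above says a stub is unnecessary once
the caller copes with missing methods. A minimal userspace sketch of that
idea in C follows. It is not the kernel's sock_poll. The names struct
toy_ops, toy_dispatch_poll and always_readable are hypothetical, and the
sketch assumes that a do-nothing stub would simply report no events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __poll_t. */
typedef unsigned int toy_poll_t;

#define TOY_POLLIN 0x1u

/* Hypothetical ops table; either method may be left NULL. */
struct toy_ops {
    toy_poll_t (*poll)(void *ctx);                         /* legacy method */
    toy_poll_t (*poll_mask)(void *ctx, toy_poll_t events); /* new method */
};

/* Dispatcher: tolerate missing methods instead of requiring a stub. */
static toy_poll_t toy_dispatch_poll(const struct toy_ops *ops, void *ctx,
                                    toy_poll_t events)
{
    assert(ops != NULL);
    if (ops->poll)
        return ops->poll(ctx);
    if (ops->poll_mask)
        return ops->poll_mask(ctx, events);
    return 0;   /* neither method set: report no events */
}

/* Example method for the new style: always reports readable. */
static toy_poll_t always_readable(void *ctx, toy_poll_t events)
{
    (void)ctx;
    (void)events;
    return TOY_POLLIN;
}

/* An ops table with no poll support at all, and one using poll_mask. */
static const struct toy_ops ops_without_poll = { NULL, NULL };
static const struct toy_ops ops_with_mask = { NULL, always_readable };
```

With the NULL check in the dispatcher, an ops table that has nothing to
report can leave both members unset, which is what makes each one-line
removal in the diff safe in this model.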

Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls

2018-03-05 Thread Kees Cook
On Mon, Mar 5, 2018 at 1:21 PM, Alexander Popov  wrote:
> On 05.03.2018 23:25, Peter Zijlstra wrote:
>> On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
>>> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>>>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>>>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>>>>   arch/Kconfig |  27 ++
>>>>>   arch/x86/Kconfig |   1 +
>>>>>   arch/x86/entry/entry_32.S|  88 +++
>>>>>   arch/x86/entry/entry_64.S| 108 +++
>>>>>   arch/x86/entry/entry_64_compat.S |  11
>>>>
>>>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>>>> of this into C or whether you just inherited the assembly from the
>>>> original code?
>>>>
>>>
>>> This came up previously 
>>> http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
>>> there were concerns about trusting C to do the right thing as well as
>>> speed.
>>
>> And therefore the answer to this obvious question should've been part of
>> the Changelog :-)
>>
>> Dave is last in a long line of people asking this same question.
>
> Yes, actually the changelog in the cover letter contains that:
>
>   After some experiments, kept the asm implementation of erase_kstack(),
>   because it gives a full control over the stack for clearing it neatly
>   and doesn't offend KASAN.
>
> Moreover, later erase_kstack() on x86_64 became different from one on x86_32.

Maybe explicitly mention the C experiments in future change log?

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 07/34] x86/entry/32: Restore segments before int registers

2018-03-05 Thread Joerg Roedel
On Mon, Mar 05, 2018 at 12:50:33PM -0800, Linus Torvalds wrote:
> On Mon, Mar 5, 2018 at 12:38 PM, Brian Gerst  wrote:
> >
> > There already is a test: single_step_syscall.c
> 
> Ahh, good. So presumably Joerg actually did check it, just didn't even notice ;)

Yeah, sort of. I ran the test, but it didn't catch the failure case in
previous versions which was return to user with kernel-cr3 :)

I could probably add some debug instrumentation to check for that in my
future testing, as there is no NX protection in the user address-range
for the kernel-cr3.


Joerg


