Re: [PATCH] USB: pl2303: add ids for Hewlett-Packard HP POS pole displays

2018-12-18 Thread Johan Hovold
On Thu, Dec 13, 2018 at 06:01:47AM -0500, Scott Chen wrote:
> Add device ids to pl2303 for the HP POS pole displays:
> LM920:   03f0:026b
> TD620:   03f0:0956
> LD960TA: 03f0:4439
> LD220TA: 03f0:4349
> LM940:   03f0:5039
> 
> Signed-off-by: Scott Chen 

Applied for -next, thanks.

Johan


Re: [PATCH v2] lzo: fix ip overrun during compress.

2018-12-18 Thread Yueyi Li
Hi Markus & Kees,

On 2018/12/17 0:56, Markus F.X.J. Oberhumer wrote:
> Yueyi,
>
> if ASLR does indeed exclude the last page (like it should), how do
> you get the invalid (0xf000, 4096) mapping then?
Regarding the following code, it seems ASLR is aligned to ARM64_MEMSTART_ALIGN,
so I don't think it will exclude the top 4K of the address space.

```
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
 extern u16 memstart_offset_seed;
 u64 range = linear_region_size -
(bootloader_memory_limit - memblock_start_of_DRAM());

 /*
  * If the size of the linear region exceeds, by a sufficient
  * margin, the size of the region that the available physical
  * memory spans, randomize the linear region as well.
  */
 if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
 range = range / ARM64_MEMSTART_ALIGN + 1;
 memstart_addr -= ARM64_MEMSTART_ALIGN *
  ((range * memstart_offset_seed) >> 16);
 }
}
```
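
The arithmetic in that snippet can be isolated in a small userspace sketch. This is only an illustration: EXAMPLE_MEMSTART_ALIGN is a hypothetical value (the real ARM64_MEMSTART_ALIGN depends on the kernel configuration) and example_memstart_shift() is not a kernel function. It shows that the amount subtracted from memstart_addr is always a whole multiple of the alignment.

```c
#include <stdint.h>

/*
 * Hypothetical alignment, for illustration only: the real
 * ARM64_MEMSTART_ALIGN depends on the kernel configuration.
 */
#define EXAMPLE_MEMSTART_ALIGN (UINT64_C(1) << 30)

/*
 * Mirror of the arithmetic in the quoted snippet: return how far
 * memstart_addr would be moved down for a given range and 16-bit seed.
 * Returns 0 when the randomization branch would not be taken.
 */
static uint64_t example_memstart_shift(uint64_t range, uint16_t seed)
{
	if (seed == 0 || range < EXAMPLE_MEMSTART_ALIGN)
		return 0;

	range = range / EXAMPLE_MEMSTART_ALIGN + 1;
	return EXAMPLE_MEMSTART_ALIGN * ((range * seed) >> 16);
}
```

For example, with a range of four alignment units and a seed of 0x8000, the shift is two alignment units.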

Thanks,
Yueyi



[PATCH V4 2/3] misc/pvpanic : add pci interface for pvpanic

2018-12-18 Thread Peng Hao
Support pvpanic as a PCI device in the guest kernel.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic.c | 72 --
 1 file changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/pvpanic.c b/drivers/misc/pvpanic.c
index f84ed30..c30bf62 100644
--- a/drivers/misc/pvpanic.c
+++ b/drivers/misc/pvpanic.c
@@ -13,9 +13,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
+#define PCI_VENDOR_ID_REDHAT 0x1b36
+#define PCI_DEVICE_ID_REDHAT_PVPANIC 0x0101
 static void __iomem *base;
 
 #define PVPANIC_PANICKED	(1 << 0)
@@ -172,12 +175,76 @@ static int pvpanic_mmio_remove(struct platform_device *pdev)
.remove = pvpanic_mmio_remove,
 };
 
+#ifdef CONFIG_PCI
+static const struct pci_device_id pvpanic_pci_id_tbl[]  = {
+   { PCI_DEVICE(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_PVPANIC),},
+   {}
+};
+
+static int pvpanic_pci_probe(struct pci_dev *pdev,
+const struct pci_device_id *ent)
+{
+   int ret;
+
+   ret = pcim_enable_device(pdev);
+   if (ret < 0)
+   return ret;
+
+   ret = pcim_iomap_regions(pdev, 1 << 0, pci_name(pdev));
+   if (ret)
+   return ret;
+
+   base = pcim_iomap_table(pdev)[0];
+
+   atomic_notifier_chain_register(&panic_notifier_list,
+  &pvpanic_panic_nb);
+   return 0;
+}
+
+static void pvpanic_pci_remove(struct pci_dev *pdev)
+{
+   atomic_notifier_chain_unregister(&panic_notifier_list,
+&pvpanic_panic_nb);
+}
+
+static struct pci_driver pvpanic_pci_driver = {
+   .name = "pvpanic-pci",
+   .id_table = pvpanic_pci_id_tbl,
+   .probe =pvpanic_pci_probe,
+   .remove =   pvpanic_pci_remove,
+};
+
+static int pvpanic_register_pci_driver(void)
+{
+   return pci_register_driver(&pvpanic_pci_driver);
+}
+
+static void pvpanic_unregister_pci_driver(void)
+{
+   pci_unregister_driver(&pvpanic_pci_driver);
+}
+#else
+static int pvpanic_register_pci_driver(void)
+{
+   return 0;
+}
+
+static void pvpanic_unregister_pci_driver(void) {}
+#endif
+
 static int __init pvpanic_mmio_init(void)
 {
+   int r1, r2;
+
if (acpi_disabled)
-   return platform_driver_register(&pvpanic_mmio_driver);
+   r1 = platform_driver_register(&pvpanic_mmio_driver);
+   else
+   r1 = pvpanic_register_acpi_driver();
+   r2 = pvpanic_register_pci_driver();
+   if (r1 && r2) /* all drivers register failed */
+   return 1;
else
-   return pvpanic_register_acpi_driver();
+   return 0;
 }
 
 static void __exit pvpanic_mmio_exit(void)
@@ -186,6 +253,7 @@ static void __exit pvpanic_mmio_exit(void)
platform_driver_unregister(&pvpanic_mmio_driver);
else
pvpanic_unregister_acpi_driver();
+   pvpanic_unregister_pci_driver();
 }
 
 module_init(pvpanic_mmio_init);
-- 
1.8.3.1



[PATCH V4 1/3] misc/pvpanic: return 0 for empty body register function

2018-12-18 Thread Peng Hao
Return 0, as is normal, from the empty-body register function.

Signed-off-by: Peng Hao 
---
The QEMU community requires an additional PCI device to emulate the pvpanic
device, so that some architectures do not have to occupy the precious memory
space below 4G.

 drivers/misc/pvpanic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/pvpanic.c b/drivers/misc/pvpanic.c
index 3150dc2..f84ed30 100644
--- a/drivers/misc/pvpanic.c
+++ b/drivers/misc/pvpanic.c
@@ -125,7 +125,7 @@ static void pvpanic_unregister_acpi_driver(void)
 #else
 static int pvpanic_register_acpi_driver(void)
 {
-   return -ENODEV;
+   return 0;
 }
 
 static void pvpanic_unregister_acpi_driver(void) {}
-- 
1.8.3.1



[PATCH V4 3/3] misc/pvpanic : add pci dependency in Kconfig

2018-12-18 Thread Peng Hao
Add PCI dependency for pvpanic in Kconfig.

Signed-off-by: Peng Hao 
---
 drivers/misc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f417b06..5ff8ca4 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -515,7 +515,7 @@ config MISC_RTSX
 
 config PVPANIC
tristate "pvpanic device support"
-   depends on HAS_IOMEM && (ACPI || OF)
+   depends on HAS_IOMEM && (ACPI || OF || PCI)
help
  This driver provides support for the pvpanic device.  pvpanic is
  a paravirtualized device provided by QEMU; it lets a virtual machine
-- 
1.8.3.1



[PATCH] mm: do not report isolation failures for CMA pages

2018-12-18 Thread Michal Hocko
From: Michal Hocko 

Heiko has complained that his log is swamped by warnings from has_unmovable_pages
[   20.536664] page dumped because: has_unmovable_pages
[   20.536792] page:03d081ff4080 count:1 mapcount:0 
mapping:8ff88600 index:0x0 compound_mapcount: 0
[   20.536794] flags: 0x3fffe010200(slab|head)
[   20.536795] raw: 03fffe010200 0100 0200 
8ff88600
[   20.536796] raw:  00200041 0001 

[   20.536797] page dumped because: has_unmovable_pages
[   20.536814] page:03d0823b count:1 mapcount:0 
mapping: index:0x0
[   20.536815] flags: 0x7fffe00()
[   20.536817] raw: 07fffe00 0100 0200 

[   20.536818] raw:   0001 


which are not triggered by memory hotplug but rather by the CMA allocator.
The original idea behind dumping the page state for all call paths was
that these messages would be helpful in debugging failures. From the above
it seems that this is not the case for the CMA path because we are
lacking much more context. E.g. the second reported page might be a CMA
allocated page. It is still interesting to see a slab page in the CMA
area but it is hard to tell whether this is a bug from the above output
alone.

Address this issue by dumping the page state only on request. Both
start_isolate_page_range and has_unmovable_pages already have an
argument to ignore hwpoison pages so make this argument more generic and
turn it into flags and allow callers to combine non-default modes into a
mask. While we are at it, reporting the failure for the has_unmovable_pages
call from is_pageblock_removable_nolock (sysfs removable file) is
questionable, so drop it from there as well.
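
As a rough illustration of the flags-as-mask idea, the following userspace sketch uses the two flag values defined in the header change of this patch. The helper names are hypothetical and exist only for this example; they are not part of the patch.

```c
#include <stdbool.h>

/* Flag values as defined in the page-isolation.h hunk of this patch. */
#define SKIP_HWPOISON	0x1
#define REPORT_FAILURE	0x2

/* Hypothetical helper: does the caller want hwpoison pages ignored? */
static bool example_skips_hwpoison(int flags)
{
	return (flags & SKIP_HWPOISON) != 0;
}

/* Hypothetical helper: does the caller want the page state dumped? */
static bool example_reports_failure(int flags)
{
	return (flags & REPORT_FAILURE) != 0;
}
```

A caller that wants both behaviours would pass SKIP_HWPOISON | REPORT_FAILURE, while a caller that passes only SKIP_HWPOISON gets no report.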

Reported-by: Heiko Carstens 
Signed-off-by: Michal Hocko 
---

Hi,
this is triggered by [1]. I think it should go as a separate patch
rather than folded into [2] because it gives more context for future
reference but I will not insist of course.

Implementation wise I went with the simplest patch but if there is a
strong feeling that we need a dedicated enum then I will do that. The
API is quite low level so I didn't feel an urge to do that myself.

[1] http://lkml.kernel.org/r/20181217155922.GC3560@osiris
[2] http://lkml.kernel.org/r/20181116083020.20260-6-mho...@kernel.org

 include/linux/page-isolation.h | 11 +--
 mm/memory_hotplug.c|  5 +++--
 mm/page_alloc.c| 11 +--
 mm/page_isolation.c| 10 --
 4 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ae347cbc36d..4eb26d278046 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype)
 }
 #endif
 
+#define SKIP_HWPOISON  0x1
+#define REPORT_FAILURE 0x2
+
 bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-int migratetype, bool skip_hwpoisoned_pages);
+int migratetype, int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
int migratetype, int *num_movable);
@@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page *page,
  * For isolating all pages in the range finally, the caller have to
  * free all pages in the range. test_page_isolated() can be used for
  * test it.
+ *
+ * The following flags are allowed (they can be combined in a bit mask)
+ * SKIP_HWPOISON - ignore hwpoison pages
+ * REPORT_FAILURE - report details about the failure to isolate the range
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-unsigned migratetype, bool skip_hwpoisoned_pages);
+unsigned migratetype, int flags);
 
 /*
  * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c82193db4be6..8537429d33a6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page *page)
if (!zone_spans_pfn(zone, pfn))
return false;
 
-   return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
+   return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, SKIP_HWPOISON);
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */
@@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
/* set above range as isolated */
ret = start_isolate_page_range(start_pfn, end_pfn,
-  MIGRATE_MOVABLE, true);
+  MIGRATE_MOVABLE,
+  SKIP_HWPOISON | 

Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

2018-12-18 Thread Miroslav Benes
On Mon, 17 Dec 2018, Josh Poimboeuf wrote:

> On Mon, Dec 17, 2018 at 04:06:18PM -0800, Andi Kleen wrote:
> > On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote:
> > > On Mon, 17 Dec 2018 15:31:26 -0600
> > > Josh Poimboeuf  wrote:
> > > 
> > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> > > > >   
> > > > > > > > Yes LTO causes them to be treated like static functions.
> > > > > > > 
> > > > > > > I guess noclone is unlikely to be really needed here because these
> > > > > > > functions are unlikely to be cloned.
> > > > > > > 
> > > > > > > So as a workaround it could be removed.
> > > > > > > 
> > > > > > > But note we have other noclone functions in the tree (like in KVM)
> > > > > > > which actually need it.  
> > > > > > 
> > > > > > > How about we just use the __used attribute then?  It seems to have the
> > > > > > > same result of preventing IPA optimizations (without the weird side
> > > > > > > effect of missing frame pointers).  
> > > > > 
> > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> > > > > 
> > > > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > > > getting that memory-order-consume fixed -- we can revisit all that.  
> > > > 
> > > > Ok, then if there are no objections I'll just send a revert of:
> > > > 
> > > > >   dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")
> 
> Sorry for suggesting this prematurely, my email client stopped syncing
> and I missed your later replies to Peter about this.
> 
> > > Should it be reverted, or just remove the noclone, and keep the
> > > noinline?
> > 
> > It should not be touched for now, until it is properly debugged.
> > 
> > IMHO Josh's explanation doesn't make much sense and there
> > was a lot of handwaving 
> > 
> > And just fixing one case isn't good enough because there are other
> > noclone functions in the tree.
> > 
> > If the problem is the plugin, the plugin needs to be fixed.
> > 
> > If the problem is gcc we need a gcc test case and bug, with 
> > some analysis, and then based on that select the proper workaround.
> 
> The plugin is only used for older versions of GCC.  Newer versions have
> the same functionality builtin with -fsanitize-coverage=trace-pc.
> 
> So the problem is GCC.  We're using a function attribute which at least
> one GCC developer doesn't recommend.  If you want to keep the LTO support
> then '__used' seems like a much better choice.

Martin added to CC.

Martin, the thread starts here 
http://lkml.kernel.org/r/CAK8P3a2K1K21ePBFbApaTKPCk+=bqj0lywok1mdfb1s9zwj...@mail.gmail.com

Can you explain the background of noclone vs. used attributes, please? 
We discussed it yesterday and I understood that maybe we should not rely 
on noclone that much. However it is used in the kernel. Should we avoid 
it in general and replace it with something else (used)? 

It definitely makes sense in our livepatching samples which Josh mentioned 
previously in the thread.

Thanks,
Miroslav


Re: linux-next: Tree for Dec 17 (regulator/mcp16502.c)

2018-12-18 Thread Andrei.Stefanescu
Hi Randy,

Thank you for the email. The error should be fixed with this patch [1].

Best regards,
Andrei

[1] - 
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-December/621292.html

On 17.12.2018 18:03, Randy Dunlap wrote:
> On 12/17/18 3:09 AM, Stephen Rothwell wrote:
>> Hi all,
>>
>> Changes since 20181214:
>>
> on i386:
> # CONFIG_SUSPEND is not set
> CONFIG_PM=y
>
>CC  drivers/regulator/mcp16502.o
> In file included from ../include/linux/device.h:23:0,
>   from ../include/linux/gpio/driver.h:5,
>   from ../include/asm-generic/gpio.h:13,
>   from ../include/linux/gpio.h:62,
>   from ../drivers/regulator/mcp16502.c:11:
> ../drivers/regulator/mcp16502.c:527:32: error: 'mcp16502_suspend_noirq' 
> undeclared here (not in a function)
>SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mcp16502_suspend_noirq,
>  ^
> ../include/linux/pm.h:342:19: note: in definition of macro 
> 'SET_NOIRQ_SYSTEM_SLEEP_PM_OPS'
>.suspend_noirq = suspend_fn, \
> ^
> ../drivers/regulator/mcp16502.c:528:10: error: 'mcp16502_resume_noirq' 
> undeclared here (not in a function)
>mcp16502_resume_noirq)
>^
> ../include/linux/pm.h:343:18: note: in definition of macro 
> 'SET_NOIRQ_SYSTEM_SLEEP_PM_OPS'
>.resume_noirq = resume_fn, \
>^
>
>
>


Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-18 Thread Jan Kara
On Mon 17-12-18 10:34:43, Matthew Wilcox wrote:
> On Mon, Dec 17, 2018 at 01:11:50PM -0500, Jerome Glisse wrote:
> > On Mon, Dec 17, 2018 at 08:58:19AM +1100, Dave Chinner wrote:
> > > Sure, that's a possibility, but that doesn't close off any race
> > > conditions because there can be DMA into the page in progress while
> > > the page is being bounced, right? AFAICT this ext3+DIF/DIX case is
> > > different in that there is no 3rd-party access to the page while it
> > > is under IO (ext3 arbitrates all access to its metadata), and so
> > > nothing can actually race for modification of the page between
> > > submission and bouncing at the block layer.
> > > 
> > > In this case, the moment the page is unlocked, anyone else can map
> > > it and start (R)DMA on it, and that can happen before the bio is
> > > bounced by the block layer. So AFAICT, block layer bouncing doesn't
> > > solve the problem of racing writeback and DMA direct to the page we
> > > are doing IO on. Yes, it reduces the race window substantially, but
> > > it doesn't get rid of it.
> > 
> > So the event flow is:
> > - userspace creates an object that matches a range of virtual addresses
> >   against a given kernel sub-system (let's say infiniband) and
> >   let's assume that the range is an mmap() of a regular file
> > - the device driver does GUP on the range (let's assume it is a write
> >   GUP) so if the page is not already mapped with write permission
> >   in the page table then a page fault is triggered and page_mkwrite
> >   happens
> > - Once GUP returns the page to the device driver and once the
> >   device driver has updated the hardware state to allow access
> >   to this page then from that point on hardware can write to the
> >   page at _any_ time, it is fully disconnected from any fs event
> >   like write back, it fully ignores things like page_mkclean
> > 
> > This is how it is today, we allowed people to push upstream such
> > users of GUP. This is a fact we have to live with, we can not stop
> > hardware access to the page, we can not force the hardware to follow
> > page_mkclean and force a page_mkwrite once write back ends. This is
> > the situation we are inheriting (and I am personally not happy with
> > that).
> > 
> > From my point of view we are left with 2 choices:
> > [C1] break all drivers that do not abide by the page_mkclean and
> >  page_mkwrite
> > [C2] mitigate as much as possible the issue
> > 
> > For [C2] the idea is to keep track of GUP per page so we know if we
> > can expect the page to be written to at any time. Here is the event
> > flow:
> > - driver GUPs the page and programs the hardware, page is marked as
> >   GUPed
> > ...
> > - write back kicks in on the dirty page, locks the page and everything
> >   as usual, sees it is GUPed and informs the block layer to
> >   use a bounce page
> 
> No.  The solution John, Dan & I have been looking at is to take the
> dirty page off the LRU while it is pinned by GUP.  It will never be
> found for writeback.
> 
> That's not the end of the story though.  Other parts of the kernel (eg
> msync) also need to be taught to stay away from pages which are pinned
> by GUP.  But the idea is that no page gets written back to storage while
> it's pinned by GUP.  Only when the last GUP ends is the page returned
> to the list of dirty pages.

We've been through this in:

https://lore.kernel.org/lkml/20180709194740.rymbt2fzohbdm...@quack2.suse.cz/

back in July. You cannot just skip pages for fsync(2). So as I wrote above -
memory cleaning writeback can skip pinned pages. Data integrity writeback
must be able to write pinned pages. And bouncing is one reasonable way
to do that.
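
The rule in the paragraph above can be written down as a tiny decision function. This is only a sketch of the policy as stated here, not kernel code: all names are hypothetical, and how a page is recognised as pinned is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical names for the two kinds of writeback discussed above. */
enum example_wb_kind {
	EXAMPLE_WB_MEMORY_CLEANING,	/* background cleaning, may skip pages */
	EXAMPLE_WB_DATA_INTEGRITY,	/* e.g. fsync(2), must write the page */
};

/* Hypothetical outcomes of the decision. */
enum example_wb_action {
	EXAMPLE_WB_WRITE_DIRECT,	/* write the page itself */
	EXAMPLE_WB_SKIP,		/* leave the page dirty for now */
	EXAMPLE_WB_WRITE_BOUNCED,	/* write a bounced copy of the page */
};

static enum example_wb_action
example_wb_decide(enum example_wb_kind kind, bool page_is_pinned)
{
	if (!page_is_pinned)
		return EXAMPLE_WB_WRITE_DIRECT;
	if (kind == EXAMPLE_WB_MEMORY_CLEANING)
		return EXAMPLE_WB_SKIP;
	return EXAMPLE_WB_WRITE_BOUNCED;
}
```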

This writeback decision is pretty much independent of the mechanism by
which we are going to identify pinned pages, whether that's going to be
a separate counter in struct page, using page->_mapcount, or a separately
allocated data structure as you now promote.

I currently like the _mapcount suggestion from Jerome the most but I'm not
really attached to any solution as long as it performs reasonably and
someone can make it work :) as I don't have time to implement it at
least till January.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [patch] futex: Cure exit race

2018-12-18 Thread Thomas Gleixner
On Wed, 12 Dec 2018, Peter Zijlstra wrote:
> On Mon, Dec 10, 2018 at 06:43:51PM +0100, Thomas Gleixner wrote:
> @@ -806,6 +806,8 @@ void __noreturn do_exit(long code)
>* task into the wait for ever nirwana as well.
>*/
>   tsk->flags |= PF_EXITPIDONE;
> + smp_mb();
> + wake_up_bit(&tsk->flags, 3 /* PF_EXITPIDONE */);

Using ilog2(PF_EXITPIDONE) spares that horrible inline comment and more
importantly selects the right bit. 0x04 is bit 2 
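
For illustration, a userspace stand-in for ilog2() shows the bit index for the flag value mentioned here. example_ilog2() and EXAMPLE_PF_EXITPIDONE are names made up for this sketch; the kernel's ilog2() is a macro with a different implementation.

```c
/* Flag value as quoted in this mail (0x04). */
#define EXAMPLE_PF_EXITPIDONE 0x04u

/*
 * Userspace stand-in for the kernel's ilog2(): index of the highest
 * set bit. The argument must be non-zero.
 */
static unsigned int example_ilog2(unsigned int v)
{
	unsigned int bit = 0;

	while (v >>= 1)
		bit++;
	return bit;
}
```

Here example_ilog2(EXAMPLE_PF_EXITPIDONE) evaluates to 2, not 3.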

> @@ -1187,10 +1236,15 @@ static int attach_to_pi_owner(u32 uval, union futex_key *key,
>* set, we know that the task has finished the
>* cleanup:
>*/
>   int ret = handle_exit_race(uaddr, uval, p);
>  
>   raw_spin_unlock_irq(&p->pi_lock);
> - put_task_struct(p);
> +
> + if (ret == -EAGAIN)
> + *pe = p;

Hmm, no. We really want to split the return value for that. EAGAIN is also
returned for other reasons.

Plus requeue_pi() needs the same treatment. I'm staring into it, but all I
came up with so far is horribly ugly.

Thanks,

tglx


Re: [PATCH v2 1/1] xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront

2018-12-18 Thread Roger Pau Monné
On Tue, Dec 18, 2018 at 08:55:38AM +0800, Dongli Zhang wrote:
> The xenstore 'ring-page-order' is used globally for each blkback queue and
> therefore should be read from xenstore only once. However, it is obtained
> in read_per_ring_refs() which might be called multiple times during the
> initialization of each blkback queue.
> 
> If the blkfront is malicious and the 'ring-page-order' is set in different
> value by blkfront every time before blkback reads it, this may end up at
> the "WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));" in
> xen_blkif_disconnect() when frontend is destroyed.
> 
> This patch reworks connect_ring() to read xenstore 'ring-page-order' only
> once.
> 
> Signed-off-by: Dongli Zhang 
> ---
> Changed since v1:
>   * change the order of xenstore read in read_per_ring_refs (suggested by Roger Pau Monne)
>   * use xenbus_read_unsigned() in connect_ring() (suggested by Roger Pau Monne)
> 
>  drivers/block/xen-blkback/xenbus.c | 70 
> ++
>  1 file changed, 40 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index a4bc74e..7178f0f 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -926,7 +926,7 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
>   int err, i, j;
>   struct xen_blkif *blkif = ring->blkif;
>   struct xenbus_device *dev = blkif->be->dev;
> - unsigned int ring_page_order, nr_grefs, evtchn;
> + unsigned int nr_grefs, evtchn;
>  
>   err = xenbus_scanf(XBT_NIL, dir, "event-channel", "%u",
> &evtchn);
> @@ -936,43 +936,38 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
>   return err;
>   }
>  
> - err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
> -   &ring_page_order);
> - if (err != 1) {
> - err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u", &ring_ref[0]);
> - if (err != 1) {
> + nr_grefs = blkif->nr_ring_pages;
> + WARN_ON(!nr_grefs);
> +
> + for (i = 0; i < nr_grefs; i++) {
> + char ring_ref_name[RINGREF_NAME_LEN];
> +
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> + err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
> +"%u", &ring_ref[i]);
> +
> + if (err != 1 && (i || (!i && nr_grefs > 1))) {

AFAICT the above condition can be simplified as "err != 1 &&
nr_grefs".
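
A simplification like this can be sanity-checked by brute force in userspace. The sketch below uses hypothetical helper names and leaves out the common "err != 1" part. It compares the condition from the patch against "nr_grefs > 1" rather than plain "nr_grefs", because the two differ when nr_grefs == 1: in that case the condition from the patch is false.

```c
#include <stdbool.h>

/* The condition from the patch, without the common "err != 1" part. */
static bool example_cond_patch(unsigned int i, unsigned int nr_grefs)
{
	return i || (!i && nr_grefs > 1);
}

/*
 * Returns true if, for every nr_grefs in 1..max_grefs and every loop
 * index 0 <= i < nr_grefs, the condition from the patch gives the same
 * result as "nr_grefs > 1".
 */
static bool example_matches_gt1(unsigned int max_grefs)
{
	for (unsigned int n = 1; n <= max_grefs; n++)
		for (unsigned int i = 0; i < n; i++)
			if (example_cond_patch(i, n) != (n > 1))
				return false;
	return true;
}
```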

>   err = -EINVAL;

There's no point in setting err here...

> - xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> + xenbus_dev_fatal(dev, err, "reading %s/%s",
> +  dir, ring_ref_name);
>   return err;

...since you can just return -EINVAL (same applies to the other
instance below).

The rest LGTM, Thanks.


Re: [PATCH v2 3/3] sched/fair: fix unnecessary increase of balance interval

2018-12-18 Thread Vincent Guittot
Hi Valentin,

On Mon, 17 Dec 2018 at 17:56, Valentin Schneider
 wrote:
>
> Hi Vincent,
>
> About time I had a look at this one...
>
> On 14/12/2018 16:01, Vincent Guittot wrote:
> > In case of active balance, we increase the balance interval to cover
> > pinned tasks cases not covered by all_pinned logic.
>
> AFAIUI the balance increase is there to have plenty of time to
> stop the task before going through another load_balance().
>
> Seeing as there is a cpus_allowed check that leads to
>
> env.flags |= LBF_ALL_PINNED;
> goto out_one_pinned;
>
> in the active balancing part of load_balance(), the condition you're
> changing should never be hit when we have pinned tasks. So you may want
> to rephrase that bit.

In this asym packing case, it has nothing to do with pinned tasks, and
that's the root cause of the problem:
the active balance triggered by asym packing is wrongly assumed to be
an active balance due to pinned task(s), and the load balance interval
is increased without any reason.
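
To get a feeling for how quickly the interval grows, the doubling rule from the load_balance() hunk quoted further down can be run in isolation. This is only a userspace sketch: the function name is hypothetical and the min/max values passed in are whatever the caller picks, not values taken from a real sched_domain.

```c
/*
 * Apply the "double while below max_interval" backoff once per active
 * balance, starting from min_interval, and return the resulting interval.
 */
static unsigned long example_interval_after(unsigned long min_interval,
					    unsigned long max_interval,
					    unsigned int active_balances)
{
	unsigned long interval = min_interval;

	while (active_balances--)
		if (interval < max_interval)
			interval *= 2;
	return interval;
}
```

For example, starting from 8 with a maximum of 512, three consecutive active balances give 64 and six or more give 512.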

>
> > Nevertheless, the active
> > migration triggered by asym packing should be treated as the normal
> > unbalanced case and reset the interval to default value otherwise active
> > migration for asym_packing can be easily delayed for hundreds of ms
> > because of this all_pinned detection mechanism.
>
> Mmm so it's not exactly clear why we need this change. If we double the
> interval because of a pinned task we wanted to active balance, well it's
> just regular pinned task issues and we can't do much about it.

As explained above, it's not a pinned task case

>
> The only scenario I can think of is if you had a workload where you wanted
> to do an active balance in several successive load_balance(), in which case
> you would keep increasing the interval even though you do migrate a task
> every time (which would harm every subsequent active_balance).
>
> In that case every active_balance "user" (pressured CPU, misfit) is
> affected, so maybe what we want is something like this:
>
> -8<-
> @@ -9136,13 +9149,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> sd->balance_interval = sd->min_interval;
> } else {
> /*
> -* If we've begun active balancing, start to back off. This
> -* case may not be covered by the all_pinned logic if there
> -* is only 1 task on the busy runqueue (because we don't call
> -* detach_tasks).
> +* If we've begun active balancing, start to back off.
> +* Don't increase too much in case we have more active balances
> +* coming up.
>  */
> -   if (sd->balance_interval < sd->max_interval)
> -   sd->balance_interval *= 2;
> +   sd->balance_interval = 2 * sd->min_interval;
> }
>
> goto out;
> ->8-
>
> Maybe we want a larger threshold - truth be told, it all depends on how
> long the cpu stopper can take and if that delay increase is still relevant
> nowadays.

Hmm, the increase of the balance interval is not linked to the cpu stopper;
its purpose is to increase the load balance interval when we know that there
is no possible load balance to perform.

Regards,
Vincent
>
> >
> > Signed-off-by: Vincent Guittot 
> > ---
> >  kernel/sched/fair.c | 27 +++
> >  1 file changed, 15 insertions(+), 12 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9591e7a..487287e 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8857,21 +8857,24 @@ static struct rq *find_busiest_queue(struct lb_env 
> > *env,
> >   */
> >  #define MAX_PINNED_INTERVAL  512
> >
> > +static inline bool
> > +asym_active_balance(struct lb_env *env)
> > +{
> > + /*
> > +  * ASYM_PACKING needs to force migrate tasks from busy but
> > +  * lower priority CPUs in order to pack all tasks in the
> > +  * highest priority CPUs.
> > +  */
> > + return env->idle != CPU_NOT_IDLE && (env->sd->flags & SD_ASYM_PACKING) &&
> > +sched_asym_prefer(env->dst_cpu, env->src_cpu);
> > +}
> > +
> >  static int need_active_balance(struct lb_env *env)
> >  {
> >   struct sched_domain *sd = env->sd;
> >
> > - if (env->idle != CPU_NOT_IDLE) {
> > -
> > - /*
> > -  * ASYM_PACKING needs to force migrate tasks from busy but
> > -  * lower priority CPUs in order to pack all tasks in the
> > -  * highest priority CPUs.
> > -  */
> > - if ((sd->flags & SD_ASYM_PACKING) &&
> > - sched_asym_prefer(env->dst_cpu, env->src_cpu))
> > - return 1;
> > - }
> > + if (asym_active_balance(env))
> > + return 1;
> >
> >   /*
> >* The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
> > @@ -9150,7 +9153,7 @@ static int load_balance(int this_cpu, struct rq 
> > *this_rq,
> 

Re: [GIT PULL] cpupower update for Linux 4.21-rc1

2018-12-18 Thread Rafael J. Wysocki
Hi Shuah,

On Mon, Dec 17, 2018 at 6:42 PM shuah  wrote:
>
> Hi Rafael,
>
> Please pull the following cpupower update for Linux 4.21-rc1.
>
> This cpupower update for Linux 4.21 adds support for auto-completion for
> the cpupower tool, from Abhishek Goel.
>
> diff is attached.

Pulled, thank you!


Re: Question: pause mode disabled for marvell 88e151x phy

2018-12-18 Thread Yunsheng Lin
On 2018/12/17 22:36, Russell King - ARM Linux wrote:
> On Mon, Dec 17, 2018 at 05:42:20PM +0800, Yunsheng Lin wrote:
>> On 2018/12/15 18:37, Russell King - ARM Linux wrote:
>>> On Sat, Dec 15, 2018 at 04:07:42PM +0800, Yunsheng Lin wrote:
>>>> There seems to be some problem with pause subsequent negotiation.
>>>> We reverted the above patch and tried to reproduce the above problem
>>>> by triggering another negotiation by reconnection of the cable, using
>>>> ethtool -a cmd shows both still have tx and rx pause enable.
>>>
>>> That's where the problem is - as far as the network device and Linux
>>> is concerned, pause was successfully negotiated.  However, as the
>>> advertisement register has ended up with the pause mode bits cleared,
>>> Linux doesn't realise that what we conveyed to the partner was an
>>> advertisement containing no pause mode bits.
>>>
>>> ethtool doesn't read the PHY advertisement register when displaying
>>> what we advertised, it returns what's in phydev->advertising - it
>>> gives you the cached value not the this-is-what-the-hardware-is-doing
>>> value.
>>>
>>>> 1. Does all the 88e151x supporting SGMII-to-Copper have the above problem?
>>>
>>> Unknown.
>>>
>>>> 2. If not, can we use revision id field in phydev->phy_id to only disable
>>>>    the pause support for specific 88e151x phy? We can not find some useful
>>>>    revision info in datasheet, and by printing the phy id when phy init, we
>>>>    are able to find that the phy we are using has a phy id as 0x1d10dd1,
>>>>    which has revision id as 0x1.
>>>
>>> 0x01d10dd1 doesn't look to be a Marvell part - Marvell parts generally
>>> start with 0x0141.  Is your 0x1d1 a typo?  My device is 0x01410dd1.
>>
>> Sorry, 0x1d1 is a typo.
>> My device is also 0x01410dd1.
>>
>>>
>>>> 3. Does this problem only happen marvel 88e1512 phy with some specific partner
>>>>    phy? We are unable to reproduce this problem, so any suggestion to reproduce
>>>>    this would be very helpful to us too.
>>>
>>> I don't think you've proven that you do not have a problem (see below
>>> for how to do this.)
>>>
>>>> 4. Also the commit disables the pause support completely, if using revision id
>>>>    can not avoid this problem, can we only disable pause support when negotiation
>>>>    by only clearing pause support in phydev->advertising, but not phydev->supported?
>>>
>>> No comment at present.
>>>
>>>
>>> I think you first need to ensure that your observations are correct.
>>> You are basing your assumptions on ethtool -a's output, which is
>>> definitely wrong as I've mentioned above.
>>>
>>> You need to read directly from the hardware using mii-diag -v ethN
>>> and manually decode the advertisement register (register 4) checking
>>> bits 11 and 10 (the pause mode bits).  My observation is that Linux
>>> can set these bits, but then both bits clear during the negotiation
>>> process.
>>
>> Thanks for the info.
>>
>> Using arm64 with Marvell 88e1512 phy connected to a X86 with intel phy,
>> the 88e1512 phy's advertisement register did change after negotiation:
>>
>> arm64 with Marvell 88e1512 phy:
>>  MII PHY #1 transceiver registers:
>>3100 796d 0141 0dd1 05e1 cde1 000d 2001
>>4006 0200 3800   0003  3000
>>3060 af08   0020   
>>  0040     
>>
>> X86 with intel phy:
>>1140 796d 0154 03b1 0de1 c5e1 000d 2001
>>6801 0600 7800     3000
>> 000a 840a 1075  000c ff08 3048
>> 816c 1ac6 0003 210a 1f55  c064
>>
>> But ethtool -a on both arm64 and X86 shows that tx and rx pause are
>> both enabled.
> 
> I'll say this again, ignore ethtool when it comes to this problem.
> ethtool uses cached information to compute the pause settings.
> 
>> And in include/linux/mii.h, we have:
>> /**
>>  * mii_resolve_flowctrl_fdx
>>  * @lcladv: value of MII ADVERTISE register
>>  * @rmtadv: value of MII LPA register
>>  *
>>  * Resolve full duplex flow control as per IEEE 802.3-2005 table 28B-3
>>  */
>> static inline u8 mii_resolve_flowctrl_fdx(u16 lcladv, u16 rmtadv)
>> {
>>  u8 cap = 0;
>>
>>  if (lcladv & rmtadv & ADVERTISE_PAUSE_CAP) {
>>  cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
>>  } else if (lcladv & rmtadv & ADVERTISE_PAUSE_ASYM) {
>>  if (lcladv & ADVERTISE_PAUSE_CAP)
>>  cap = FLOW_CTRL_RX;
>>  else if (rmtadv & ADVERTISE_PAUSE_CAP)
>>  cap = FLOW_CTRL_TX;
>>  }
>>
>>  return cap;
>> }
> 
> Not used by the marvell PHY driver.  It uses this code instead:
> 
> if (phydev->duplex == DUPLEX_FULL) {
> phydev->pause = lpa & LPA_PAUSE_CAP ? 1 : 0;
> phydev->asym_pause = lpa & LPA_PAUSE_ASYM ? 1 : 0;
> }
> 
> and then it's up to the network driver to decide what to do with
> phydev->pause and phydev->asym_pause.

Thanks, I see.
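
For reference, a small user-space sketch that decodes the pause bits (bits
10 and 11) of registers 4 and 5 from the two dumps above and runs the
mii_resolve_flowctrl_fdx() logic quoted earlier on them. The helper names
pause_bits(), resolve_fdx() and check_dumps() are made up for this
illustration, and the bit and flag values are written out by hand to match
the kernel's mii.h definitions. As noted above, the Marvell PHY driver does
not take this path, so this only shows what the register values themselves
say.

```c
#include <assert.h>
#include <stdint.h>

/* Values written out by hand to match the kernel's mii.h definitions. */
#define ADVERTISE_PAUSE_CAP  0x0400 /* bit 10: symmetric pause */
#define ADVERTISE_PAUSE_ASYM 0x0800 /* bit 11: asymmetric pause */
#define FLOW_CTRL_TX         0x01
#define FLOW_CTRL_RX         0x02

/* Hypothetical helper: returns bits 11:10 of an MII register as 0..3. */
unsigned int pause_bits(uint16_t reg)
{
	return (reg >> 10) & 0x3;
}

/* User-space copy of the mii_resolve_flowctrl_fdx() logic quoted above. */
uint8_t resolve_fdx(uint16_t lcladv, uint16_t rmtadv)
{
	uint8_t cap = 0;

	if (lcladv & rmtadv & ADVERTISE_PAUSE_CAP) {
		cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
	} else if (lcladv & rmtadv & ADVERTISE_PAUSE_ASYM) {
		if (lcladv & ADVERTISE_PAUSE_CAP)
			cap = FLOW_CTRL_RX;
		else if (rmtadv & ADVERTISE_PAUSE_CAP)
			cap = FLOW_CTRL_TX;
	}

	return cap;
}

/* Registers 4 (advertisement) and 5 (link partner) from the dumps above. */
void check_dumps(void)
{
	/* arm64 side: reg 4 = 0x05e1, reg 5 = 0xcde1 */
	assert(pause_bits(0x05e1) == 1);	/* only bit 10 set */
	assert(pause_bits(0xcde1) == 3);	/* partner sets bits 10 and 11 */

	/* x86 side: reg 4 = 0x0de1, reg 5 = 0xc5e1 */
	assert(pause_bits(0x0de1) == 3);
	assert(pause_bits(0xc5e1) == 1);

	/* Both ends resolve to symmetric flow control. */
	assert(resolve_fdx(0x05e1, 0xcde1) == (FLOW_CTRL_TX | FLOW_CTRL_RX));
	assert(resolve_fdx(0x0de1, 0xc5e1) == (FLOW_CTRL_TX | FLOW_CTRL_RX));
}
```

In these dumps each side's register 5 mirrors the other side's register 4:
the arm64 end advertises bit 10 only, the x86 end advertises both bits.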

> 
> 
>> As the comment has 

Re: [PATCH v3 3/3] PCI: imx6: Add support for i.MX8MQ

2018-12-18 Thread Leonard Crestez
On Mon, 2018-12-17 at 20:07 -0800, Andrey Smirnov wrote:
> Add code needed to support i.MX8MQ variant.

>  static void imx6_pcie_init_phy(struct imx6_pcie *imx6_pcie)
>  {
> +
> +
Remove empty lines?

> + unsigned int mask, val, offset;
> +
> + mask = IMX6Q_GPR12_DEVICE_TYPE;
> + val  = FIELD_PREP(IMX6Q_GPR12_DEVICE_TYPE, PCI_EXP_TYPE_ROOT_PORT);

... snip ...

> - regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12,
> - IMX6Q_GPR12_DEVICE_TYPE, PCI_EXP_TYPE_ROOT_PORT << 12);
> + regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12, mask, val);

Maybe setting port type should be a separate function from init_phy?


Re: [PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers

2018-12-18 Thread Daniel Vetter
On Tue, Dec 18, 2018 at 7:47 AM Laurent Pinchart
 wrote:
>
> Hi Morimoto-san,
>
> Thank you for the patch.
>
> On Tuesday, 18 December 2018 08:00:24 EET Kuninori Morimoto wrote:
> > From: Kuninori Morimoto 
> >
> > This patch updates license to use SPDX-License-Identifier
> > instead of verbose license text.
> >
> > Signed-off-by: Kuninori Morimoto 
>
> Reviewed-by: Laurent Pinchart 
>
> > ---
> > A few weeks have passed and nothing has happened, so I am re-posting this patch.
> > I added Andrew on Cc
>
> The driver seems to be lacking a maintainer :-S

bridge drivers all have a fallback maintainer, but none of them are
cc'ed. It's maintained in drm-misc, so you could just push the patch
too :-) Especially since you're listed:

DRM DRIVERS FOR BRIDGE CHIPS
M:Archit Taneja 
M:Andrzej Hajda 
R:Laurent Pinchart 
S:Maintained
T:git git://anongit.freedesktop.org/drm/drm-misc
F:drivers/gpu/drm/bridge/


Cheers, Daniel

>
> >  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 +
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> > b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c index
> > 8f9c8a6..2228689 100644
> > --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> > +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> > @@ -1,12 +1,9 @@
> > +// SPDX-License-Identifier: GPL-2.0
> >  /*
> >   * dw-hdmi-i2s-audio.c
> >   *
> >   * Copyright (c) 2017 Renesas Solutions Corp.
> >   * Kuninori Morimoto 
> > - *
> > - * This program is free software; you can redistribute it and/or modify
> > - * it under the terms of the GNU General Public License version 2 as
> > - * published by the Free Software Foundation.
> >   */
> >  #include 
>
>
> --
> Regards,
>
> Laurent Pinchart
>
>
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH v4 2/9] arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range

2018-12-18 Thread Russell King - ARM Linux
On Tue, Dec 18, 2018 at 01:52:09AM +0530, Souptick Joarder wrote:
> Convert to use vm_insert_range() to map range of kernel
> memory to user vma.
> 
> Signed-off-by: Souptick Joarder 
> ---
>  arch/arm/mm/dma-mapping.c | 21 +++--
>  1 file changed, 7 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 661fe48..7cbcde5 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -1582,31 +1582,24 @@ static int __arm_iommu_mmap_attrs(struct device *dev, 
> struct vm_area_struct *vma
>   void *cpu_addr, dma_addr_t dma_addr, size_t size,
>   unsigned long attrs)
>  {
> - unsigned long uaddr = vma->vm_start;
> - unsigned long usize = vma->vm_end - vma->vm_start;
> + unsigned long page_count = vma_pages(vma);
>   struct page **pages = __iommu_get_pages(cpu_addr, attrs);
>   unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>   unsigned long off = vma->vm_pgoff;
> + int err;
>  
>   if (!pages)
>   return -ENXIO;
>  
> - if (off >= nr_pages || (usize >> PAGE_SHIFT) > nr_pages - off)
> + if (off >= nr_pages)
>   return -ENXIO;

Are you sure you can make this change?

You are restricting the offset to be within 0..nr_pages which ensures
that the initial struct page that is passed to vm_insert_range() is
valid, but I think the removal of the following check is unsafe.

Your new vm_insert_range() function only checks page_count <=
vma_pages(vma), which always holds since page_count _is_ vma_pages(vma).
With the removal of the second condition, nothing catches the case where
(e.g.) off is nr_pages - 1 and page_count is 50, meaning that
vm_insert_range() will walk off the end of the page array.

Please take another look at this.

What about the other callsites of your new function - do they have
the same issue?
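
To make the concern concrete, here is a user-space sketch of the two
checks, assuming plain unsigned long page counts. range_fits_original()
mirrors the condition the patch removes and range_fits_patched() mirrors
what remains; both names are invented for this illustration and neither is
kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mirrors the original test: reject when the offset is outside the page
 * array, or when the VMA needs more pages than remain after the offset.
 */
bool range_fits_original(unsigned long off, unsigned long vma_pages,
			 unsigned long nr_pages)
{
	if (off >= nr_pages || vma_pages > nr_pages - off)
		return false;
	return true;
}

/* Mirrors the patched test: only the starting offset is validated. */
bool range_fits_patched(unsigned long off, unsigned long vma_pages,
			unsigned long nr_pages)
{
	(void)vma_pages;	/* no longer looked at */
	return off < nr_pages;
}

/* The example from the review: off = nr_pages - 1, page_count = 50. */
void check_review_example(void)
{
	unsigned long nr_pages = 10;

	assert(!range_fits_original(nr_pages - 1, 50, nr_pages));
	assert(range_fits_patched(nr_pages - 1, 50, nr_pages));
}
```

The subtraction nr_pages - off is only evaluated after off < nr_pages has
been established, so it cannot wrap.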

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] media: uvcvideo: Fix 'type' check leading to overflow

2018-12-18 Thread Laurent Pinchart
Hi Alistair,

Thank you for the patch.

On Monday, 17 December 2018 23:02:22 EET Alistair Strachan wrote:
> When initially testing the Camera Terminal Descriptor wTerminalType
> field (buffer[4]), no mask is used. Later in the function, the MSB is
> overloaded to store the descriptor subtype, and so a mask of 0x7fff
> is used to check the type.
> 
> If a descriptor is specially crafted to set this overloaded bit in the
> original wTerminalType field, the initial type check will fail (falling
> through, without adjusting the buffer size), but the later type checks
> will pass, assuming the buffer has been made suitably large, causing an
> overflow.
> 
> This problem could be resolved in a few different ways, but this fix
> applies the same initial type check as used by UVC_ENTITY_TYPE (once we
> have a 'term' structure.) Such crafted wTerminalType fields will then be
> treated as *generic* Input Terminals, not as CAMERA or
> MEDIA_TRANSPORT_INPUT, avoiding an overflow.
> 
> Originally reported here:
> https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8
> A similar (non-compiling) patch was provided at that time.
> 
> Reported-by: syzbot 
> Signed-off-by: Alistair Strachan 
> Cc: Laurent Pinchart 
> Cc: Mauro Carvalho Chehab 
> Cc: linux-me...@vger.kernel.org
> Cc: kernel-t...@android.com
> ---
>  drivers/media/usb/uvc/uvc_driver.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/usb/uvc/uvc_driver.c
> b/drivers/media/usb/uvc/uvc_driver.c index bc369a0934a3..279a967b8264
> 100644
> --- a/drivers/media/usb/uvc/uvc_driver.c
> +++ b/drivers/media/usb/uvc/uvc_driver.c
> @@ -1082,11 +1082,11 @@ static int uvc_parse_standard_control(struct
> uvc_device *dev, p = 0;
>   len = 8;
> 
> - if (type == UVC_ITT_CAMERA) {
> + if ((type & 0x7fff) == UVC_ITT_CAMERA) {
>   n = buflen >= 15 ? buffer[14] : 0;
>   len = 15;
> 
> - } else if (type == UVC_ITT_MEDIA_TRANSPORT_INPUT) {
> + } else if ((type & 0x7fff) == UVC_ITT_MEDIA_TRANSPORT_INPUT) {
>   n = buflen >= 9 ? buffer[8] : 0;
>   p = buflen >= 10 + n ? buffer[9+n] : 0;
>   len = 10;

How about rejecting invalid types instead ? Something along the lines of

diff --git a/drivers/media/usb/uvc/uvc_driver.c 
b/drivers/media/usb/uvc/uvc_driver.c
index b62cbd800111..33a22c016456 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1106,11 +1106,19 @@ static int uvc_parse_standard_control(struct uvc_device 
*dev,
return -EINVAL;
}
 
-   /* Make sure the terminal type MSB is not null, otherwise it
-* could be confused with a unit.
+   /*
+* Reject invalid terminal types that would cause issues:
+*
+* - The high byte must be non-zero, otherwise it would be
+*   confused with a unit.
+*
+* - Bit 15 must be 0, as we use it internally as a terminal
+*   direction flag.
+*
+* Other unknown types are accepted.
 */
type = get_unaligned_le16(&buffer[4]);
-   if ((type & 0xff00) == 0) {
+   if ((type & 0x7f00) == 0 || (type & 0x8000) != 0) {
uvc_trace(UVC_TRACE_DESCR, "device %d videocontrol "
"interface %d INPUT_TERMINAL %d has invalid "
"type 0x%04x, skipping\n", udev->devnum,

-- 
Regards,

Laurent Pinchart
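
A user-space sketch of the type checks discussed in this thread, for
illustration only. The function names are invented, and the value 0x0201
for UVC_ITT_CAMERA is written out by hand on the assumption that it matches
the UVC header; terminal_type_valid() simply inverts the rejection
condition from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UVC_ITT_CAMERA 0x0201	/* assumed to match the UVC header */

/* Initial check before either patch: no mask applied. */
bool is_camera_unmasked(uint16_t type)
{
	return type == UVC_ITT_CAMERA;
}

/* Later checks in the function: bit 15 is masked off. */
bool is_camera_masked(uint16_t type)
{
	return (type & 0x7fff) == UVC_ITT_CAMERA;
}

/*
 * Inverse of the rejection test in the suggested diff: the high byte
 * (without bit 15) must be non-zero and bit 15 must be clear.
 */
bool terminal_type_valid(uint16_t type)
{
	return !((type & 0x7f00) == 0 || (type & 0x8000) != 0);
}

void check_crafted_type(void)
{
	uint16_t crafted = 0x8201;	/* camera type with bit 15 set */

	/* The two checks disagree, which is the reported problem. */
	assert(!is_camera_unmasked(crafted));
	assert(is_camera_masked(crafted));

	/* Rejecting bit 15 up front removes the disagreement. */
	assert(!terminal_type_valid(crafted));
	assert(terminal_type_valid(UVC_ITT_CAMERA));
}
```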





Re: [Xen-devel] [PATCH v2 1/1] xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront

2018-12-18 Thread Roger Pau Monné
On Tue, Dec 18, 2018 at 10:33:00AM +0100, Roger Pau Monné wrote:
> On Tue, Dec 18, 2018 at 08:55:38AM +0800, Dongli Zhang wrote:
> > +   for (i = 0; i < nr_grefs; i++) {
> > +   char ring_ref_name[RINGREF_NAME_LEN];
> > +
> > +   snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> > +   err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
> > +  "%u", _ref[i]);
> > +
> > +   if (err != 1 && (i || (!i && nr_grefs > 1))) {
> 
> AFAICT the above condition can be simplified as "err != 1 &&
> nr_grefs".

Sorry, this should be "err != 1 && nr_grefs > 1", since it's not order
but rather the number of grefs.
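
A quick way to convince oneself is to enumerate the cases. The sketch
below is plain user-space C with invented function names; it assumes, as in
the loop above, that i always satisfies 0 <= i < nr_grefs.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in the patch. */
bool cond_original(int err, unsigned int i, unsigned int nr_grefs)
{
	return err != 1 && (i || (!i && nr_grefs > 1));
}

/* Simplified condition proposed in this reply. */
bool cond_simplified(int err, unsigned int nr_grefs)
{
	return err != 1 && nr_grefs > 1;
}

/*
 * Count disagreements over every i in [0, nr_grefs) for nr_grefs in
 * [1, max_grefs], with a few representative xenbus_scanf() results.
 */
unsigned int count_mismatches(unsigned int max_grefs)
{
	unsigned int mismatches = 0;

	for (unsigned int nr = 1; nr <= max_grefs; nr++)
		for (unsigned int i = 0; i < nr; i++)
			for (int err = -1; err <= 1; err++)
				if (cond_original(err, i, nr) !=
				    cond_simplified(err, nr))
					mismatches++;

	return mismatches;
}
```

The two forms agree because i > 0 already implies nr_grefs > 1 inside the
loop. The earlier "err != 1 && nr_grefs" form differs for nr_grefs == 1,
where the original condition is false.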

Roger.


Re: [PATCH v5 2/2] media: usb: pwc: Don't use coherent DMA buffers for ISO transfer

2018-12-18 Thread Tomasz Figa
On Tue, Dec 18, 2018 at 4:38 PM Christoph Hellwig  wrote:
>
> On Tue, Dec 18, 2018 at 04:22:43PM +0900, Tomasz Figa wrote:
> > It kind of limits the usability of this API, since it enforces
> > contiguous allocations even for big sizes even for devices behind
> > IOMMU (contrary to the case when DMA_ATTR_NON_CONSISTENT is not set),
> > but given that it's just a temporary solution for devices like these
> > USB cameras, I guess that's fine.
>
> The problem is that you can't have flexibility and simplicity at the
> same time.  Once you use kernel virtual address remapping you need to
> be prepared to have multiple segments.
>
> So as I said you can call dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT
> in a loop with a suitably small chunk size, then stuff the results into
> a scatterlist and map that again for the device to share it with, if you
> don't want a single contiguous region.  You just have to either deal with
> non-contiguous access from the kernel or use vmap and the right vmap
> cache flushing helpers.

The point is that you didn't have to do this small chunk loop without
DMA_ATTR_NON_CONSISTENT, so it's at least inconsistent now, and I'm not
sure why it would be better than just a loop of alloc_page().

>
> > Note that in V4L2 we use the DMA API extensively, so that we don't
> > need to embed any device-specific or integration-specific knowledge in
> > the framework. Right now we're using dma_alloc_attrs() with
> > driver-provided attrs [1], but current drivers never request
> > non-consistent memory. We're however thinking about making it possible
> > to allocate non-consistent memory. What would you suggest for this?
> >
> > [1] 
> > https://elixir.bootlin.com/linux/v4.20-rc7/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L139
>
> I would advise against new non-consistent users until this series
> goes through, mostly because dma_cache_sync is such an amazingly bad
> API.  Otherwise things will just work at the allocation side, you'll
> just need to transfer ownership between the CPU and the device(s)
> carefully using the dma_sync_* APIs.

Just to clarify, the actual code isn't very likely to surface any time
soon, so I assume it would be after this series lands.

We will however need an API that can transparently handle both cases
of contiguous (without IOMMU) and page-by-page allocations (with
IOMMU) behind the scenes, like the current dma_alloc_attrs() without
DMA_ATTR_NON_CONSISTENT.

Best regards,
Tomasz


Re: [RFT][PATCH 0/2] ACPI / PM: Avoid spurious wakeups from non-wakeup GPEs in suspend-to-idle

2018-12-18 Thread Rafael J. Wysocki
On Mon, Dec 17, 2018 at 5:50 PM Mika Westerberg
 wrote:
>
> On Mon, Dec 17, 2018 at 12:19:27PM +0100, Rafael J. Wysocki wrote:
> > Hi All,
>
> Hi Rafael,
>
> > It turns out that on some systems non-wakeup GPEs cause trouble (see the
> > changelog of patch [1/2]), so they had better be disabled during
> > suspend-to-idle (at least before entering the main loop of it).  This
> > is done in patch [1/2].
> >
> > Patch [2/2] is just a follow-up.
>
> I tested both patches on Lenovo Carbon X1 6th generation and one new
> Dell system. Both default to s2idle. I did not see any issues around
> s2idle with the two patches applied.
>
> Tested-by: Mika Westerberg 

Thank you!


Re: [PATCH] Fix mm->owner point to a tsk that has been free

2018-12-18 Thread Michal Hocko
On Tue 18-12-18 13:24:44, gchen.guo...@gmail.com wrote:
> From: guomin chen 
> 
> When mm->owner is modified by exit_mm, if the new owner directly calls
> unuse_mm to exit, it will cause a use-after-free, because unuse_mm()
> directly sets tsk->mm = NULL.
> 
>  Under normal circumstances, when do_exit exits, mm->owner will
>  be updated in exit_mm(). But when the kernel process calls
>  unuse_mm() and then exits, mm->owner cannot be updated, and it
>  will point to a task that has been released.
> 
> The current issue flow is as follows: (Process A,B,C use the same mm)
> Process C                Process A                 Process B
> qemu-system-x86_64:      kernel:vhost_net          kernel: vhost_net
> open /dev/vhost-net
>   VHOST_SET_OWNER        create kthread vhost-%d   create kthread vhost-%d
>   network init           use_mm()                  use_mm()
>    ...                   ...
>    Abnormal exited
>    ...
>   do_exit
>   exit_mm()
>   update mm->owner to A
>   exit_files()
>    close_files()
>    kthread_should_stop() unuse_mm()
>     Stop Process A       tsk->mm=NULL
>                          do_exit()
>                          can't update owner
>                          A exit completed          vhost-%d  rcv first package
>                                                    vhost-%d build rcv buffer for vq
>                                                    page fault
>                                                    access mm & mm->owner
>                                                    NOW,mm->owner still pointer A
>                                                    kernel UAF
>     stop Process B
> 
> Although I am hitting this issue on vhost_net, it affects all users of
> unuse_mm.

I am confused. How can we ever assign the owner to a kernel thread? We
skip those explicitly. It simply doesn't make any sense to have a kernel
thread as an owner.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers

2018-12-18 Thread Laurent Pinchart
Hi Daniel,

On Tuesday, 18 December 2018 11:37:14 EET Daniel Vetter wrote:
> On Tue, Dec 18, 2018 at 7:47 AM Laurent Pinchart wrote:
> > On Tuesday, 18 December 2018 08:00:24 EET Kuninori Morimoto wrote:
> >> From: Kuninori Morimoto 
> >> 
> >> This patch updates license to use SPDX-License-Identifier
> >> instead of verbose license text.
> >> 
> >> Signed-off-by: Kuninori Morimoto 
> > 
> > Reviewed-by: Laurent Pinchart 
> > 
> >> ---
> >> A few weeks have passed and nothing has happened, so I am re-posting this patch.
> >> I added Andrew on Cc
> > 
> > The driver seems to be lacking a maintainer :-S
> 
> bridge drivers all have a fallback maintainer, but none of them are
> cc'ed. It's maintained in drm-misc, so you could just push the patch
> too :-) Especially since you're listed:
> 
> DRM DRIVERS FOR BRIDGE CHIPS
> M:Archit Taneja 
> M:Andrzej Hajda 
> R:Laurent Pinchart 

Note the R, not the M :-)

> S:Maintained
> T:git git://anongit.freedesktop.org/drm/drm-misc
> F:drivers/gpu/drm/bridge/
> 
> 
> Cheers, Daniel
> 
> >>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 +
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >> 
> >> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> >> b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c index
> >> 8f9c8a6..2228689 100644
> >> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> >> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> >> @@ -1,12 +1,9 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >>  /*
> >>   * dw-hdmi-i2s-audio.c
> >>   *
> >>   * Copyright (c) 2017 Renesas Solutions Corp.
> >>   * Kuninori Morimoto 
> >> - *
> >> - * This program is free software; you can redistribute it and/or modify
> >> - * it under the terms of the GNU General Public License version 2 as
> >> - * published by the Free Software Foundation.
> >>   */
> >>  
> >>  #include 

-- 
Regards,

Laurent Pinchart





Re: [PATCH 06/14] mm, migrate: Immediately fail migration of a page with no migration handler

2018-12-18 Thread Mel Gorman
On Tue, Dec 18, 2018 at 10:06:31AM +0100, Vlastimil Babka wrote:
> On 12/15/18 12:03 AM, Mel Gorman wrote:
> > Pages with no migration handler use a fallback handler which sometimes
> > works and sometimes persistently fails such as blockdev pages. Migration
> > will retry a number of times on these persistent pages which is wasteful
> > during compaction. This patch will fail migration immediately unless the
> > caller is in MIGRATE_SYNC mode which indicates the caller is willing to
> > wait while being persistent.
> 
> Right.
> 
> > This is not expected to help THP allocation success rates but it does
> > reduce latencies slightly.
> > 
> > 1-socket thpfioscale
> >                                     4.20.0-rc6             4.20.0-rc6
> >                                noreserved-v1r4          failfast-v1r4
> > Amean     fault-both-1         0.00 (   0.00%)        0.00 *   0.00%*
> > Amean     fault-both-3      2276.15 (   0.00%)     3867.54 * -69.92%*
> 
> This is rather weird.
> 

Fault latency is extremely variable and there can be very large outliers
that skew the mean (the full report includes quartiles but it makes for an
excessive changelog). It can be down to luck about how often the migrate
scanner advances and how often it gets reset. For this series, it'll
not be unusual to see jitter in the latencies for individual patches
that will not get nailed down reliably until later in the series. The
alternative is massive patches that do multiple things which will look
nice in changelogs and be horrible to review.

> > Amean     fault-both-5      4992.20 (   0.00%)     5313.20 (  -6.43%)
> > Amean     fault-both-7      7373.30 (   0.00%)     7039.11 (   4.53%)
> > Amean     fault-both-12    11911.52 (   0.00%)    11328.29 (   4.90%)
> > Amean     fault-both-18    17209.42 (   0.00%)    16455.34 (   4.38%)
> > Amean     fault-both-24    20943.71 (   0.00%)    20448.94 (   2.36%)
> > Amean     fault-both-30    22703.00 (   0.00%)    21655.07 (   4.62%)
> > Amean     fault-both-32    22461.41 (   0.00%)    21415.35 (   4.66%)
> > 
> > The 2-socket results are not materially different. Scan rates are
> > similar as expected.
> > 
> > Signed-off-by: Mel Gorman 
> 
> Acked-by: Vlastimil Babka 
> 

Thanks.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 08/14] mm, compaction: Use the page allocator bulk-free helper for lists of pages

2018-12-18 Thread Vlastimil Babka
On 12/15/18 12:03 AM, Mel Gorman wrote:
> release_pages() is a simpler version of free_unref_page_list() but it
> tracks the highest PFN for caching the restart point of the compaction
> free scanner. This patch optionally tracks the highest PFN in the core
> helper and converts compaction to use it.
> 
> Signed-off-by: Mel Gorman 

Acked-by: Vlastimil Babka 

Nit below:

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2961,18 +2961,26 @@ void free_unref_page(struct page *page)
>  /*
>   * Free a list of 0-order pages
>   */
> -void free_unref_page_list(struct list_head *list)
> +void __free_page_list(struct list_head *list, bool dropref,
> + unsigned long *highest_pfn)
>  {
>   struct page *page, *next;
>   unsigned long flags, pfn;
>   int batch_count = 0;
>  
> + if (highest_pfn)
> + *highest_pfn = 0;
> +
>   /* Prepare pages for freeing */
>   list_for_each_entry_safe(page, next, list, lru) {
> + if (dropref)
> + WARN_ON_ONCE(!put_page_testzero(page));

That will warn just once, but then page will remain with elevated count
and free_unref_page_prepare() will warn either immediately or later
depending on DEBUG_VM, for each page.
Also IIRC it's legal for basically anyone to do get_page_unless_zero()
and later put_page(), and this would now cause warning. Maybe just test
for put_page_testzero() result without warning, and continue? Hm but
then we should still do a list_del() and that becomes racy after
dropping our ref...

>   pfn = page_to_pfn(page);
>   if (!free_unref_page_prepare(page, pfn))
>   list_del(&page->lru);
>   set_page_private(page, pfn);
> + if (highest_pfn && pfn > *highest_pfn)
> + *highest_pfn = pfn;
>   }
>  
>   local_irq_save(flags);
> 



Re: [PATCH v2 3/3] drm/i915: Move to new PM core fields

2018-12-18 Thread Rafael J. Wysocki
On Mon, Dec 17, 2018 at 3:22 PM Vincent Guittot
 wrote:
>
> On Fri, 14 Dec 2018 at 15:36, Ulf Hansson  wrote:
> >
> > On Fri, 14 Dec 2018 at 15:22, Vincent Guittot
> >  wrote:
> > >
> > > > With jiffies having been replaced by raw ns in PM core accounting, the
> > > > i915 driver is updated to use this new time infrastructure.
> > >
> > > Signed-off-by: Vincent Guittot 
> > > ---
> > >  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> > >  drivers/gpu/drm/i915/i915_pmu.h |  4 ++--
> > >  2 files changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > > b/drivers/gpu/drm/i915/i915_pmu.c
> > > index d6c8f8f..cf6437d 100644
> > > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > > @@ -493,14 +493,14 @@ static u64 get_rc6(struct drm_i915_private *i915)
> > >  */
> > > if (kdev->power.runtime_status == RPM_SUSPENDED) {
> > > if 
> > > (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> > > -   i915->pmu.suspended_jiffies_last =
> > > - 
> > > kdev->power.suspended_jiffies;
> > > +   i915->pmu.suspended_time_last =
> > > +   kdev->power.suspended_time;
> > >
> >
> > > Huh, so patch 2 introduces a compiler error because of removing the
> > old fields. We can't have that.
>
> I agree
> The patch was mainly to raise discussion

OK, so patch [1/3] from this series should be applicable regardless, right?


Re: [PATCH v4 4/9] drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range

2018-12-18 Thread Russell King - ARM Linux
On Tue, Dec 18, 2018 at 01:53:34AM +0530, Souptick Joarder wrote:
> Convert to use vm_insert_range() to map range of kernel
> memory to user vma.
> 
> Signed-off-by: Souptick Joarder 
> Tested-by: Heiko Stuebner 
> Acked-by: Heiko Stuebner 
> ---
>  drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 19 ++-
>  1 file changed, 2 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
> b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> index a8db758..8279084 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> @@ -221,26 +221,11 @@ static int rockchip_drm_gem_object_mmap_iommu(struct 
> drm_gem_object *obj,
> struct vm_area_struct *vma)
>  {
>   struct rockchip_gem_object *rk_obj = to_rockchip_obj(obj);
> - unsigned int i, count = obj->size >> PAGE_SHIFT;
>   unsigned long user_count = vma_pages(vma);
> - unsigned long uaddr = vma->vm_start;
>   unsigned long offset = vma->vm_pgoff;
> - unsigned long end = user_count + offset;
> - int ret;
> -
> - if (user_count == 0)
> - return -ENXIO;
> - if (end > count)
> - return -ENXIO;
>  
> - for (i = offset; i < end; i++) {
> - ret = vm_insert_page(vma, uaddr, rk_obj->pages[i]);
> - if (ret)
> - return ret;
> - uaddr += PAGE_SIZE;
> - }
> -
> - return 0;
> + return vm_insert_range(vma, vma->vm_start, rk_obj->pages + offset,
> + user_count - offset);

This looks like a change in behaviour.

If user_count is zero, and offset is zero, then we pass into
vm_insert_range() a page_count of zero, and vm_insert_range() does
nothing and returns zero.

However, as we can see from the above code, the original behaviour
was to return -ENXIO in that case.

The other thing that I'm wondering is that if (eg) count is 8 (the
object is 8 pages), offset is 2, and the user requests mapping 6
pages (user_count = 6), then we call vm_insert_range() with a
pages of rk_obj->pages + 2, and a pages_count of 6 - 2 = 4. So we
end up inserting four pages.

The original code would calculate end = 6 + 2 = 8.  i would iterate
from 2 through 7 (i < end), inserting six pages.

(I hadn't spotted that second issue until I'd gone through the
calculations manually - which is worrying.)

I don't have patches 5 through 9 to look at, but I'm concerned that
similar issues also exist in those patches.

I'm concerned that this series seems to be introducing subtle bugs;
it seems to be unnecessarily difficult to use this function correctly.
I think your existing proposal for vm_insert_range() provides an
interface that is way too easy to get wrong, and, therefore, is the
wrong interface.

I think it would be way better to have vm_insert_range() take the
object page array without any offset adjustment, and the object
page_count again without any adjustment, and have vm_insert_range()
itself handle the offsetting and VMA size validation.  That would
then give a simple interface and probably give a further reduction
in code at each call site.
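
As an illustration of the numbers above and of the interface I am
suggesting, here is a user-space sketch that only computes page counts.
All three function names are invented, and pages_checked() is just one
possible shape for validation done inside the helper, not a proposal for
the final vm_insert_range() signature.

```c
#include <assert.h>
#include <errno.h>

/* Original loop: pages inserted, or -ENXIO from the removed checks. */
long pages_original(unsigned long count, unsigned long offset,
		    unsigned long user_count)
{
	unsigned long end = user_count + offset;

	if (user_count == 0)
		return -ENXIO;
	if (end > count)
		return -ENXIO;

	return (long)(end - offset);	/* i runs from offset to end - 1 */
}

/*
 * Patched call site: page_count handed to vm_insert_range(). Nothing is
 * validated, and the unsigned subtraction wraps if offset > user_count.
 */
unsigned long pages_patched(unsigned long offset, unsigned long user_count)
{
	return user_count - offset;
}

/*
 * Hypothetical helper-side validation: the caller passes the whole
 * object (count pages) and the helper applies the offset itself.
 */
long pages_checked(unsigned long count, unsigned long offset,
		   unsigned long user_count)
{
	if (user_count == 0 || offset >= count ||
	    user_count > count - offset)
		return -ENXIO;

	return (long)user_count;
}

void check_examples(void)
{
	/* count = 8, offset = 2, user_count = 6 */
	assert(pages_original(8, 2, 6) == 6);
	assert(pages_patched(2, 6) == 4);
	assert(pages_checked(8, 2, 6) == 6);

	/* user_count = 0, offset = 0 */
	assert(pages_original(8, 0, 0) == -ENXIO);
	assert(pages_patched(0, 0) == 0);
	assert(pages_checked(8, 0, 0) == -ENXIO);
}
```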

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 3/3] drm/i915: Move to new PM core fields

2018-12-18 Thread Vincent Guittot
On Tue, 18 Dec 2018 at 10:57, Rafael J. Wysocki  wrote:
>
> On Mon, Dec 17, 2018 at 3:22 PM Vincent Guittot
>  wrote:
> >
> > On Fri, 14 Dec 2018 at 15:36, Ulf Hansson  wrote:
> > >
> > > On Fri, 14 Dec 2018 at 15:22, Vincent Guittot
> > >  wrote:
> > > >
> > > > > With jiffies having been replaced by raw ns in PM core accounting, the
> > > > > i915 driver is updated to use this new time infrastructure.
> > > >
> > > > Signed-off-by: Vincent Guittot 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> > > >  drivers/gpu/drm/i915/i915_pmu.h |  4 ++--
> > > >  2 files changed, 8 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > > > b/drivers/gpu/drm/i915/i915_pmu.c
> > > > index d6c8f8f..cf6437d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > > > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > > > @@ -493,14 +493,14 @@ static u64 get_rc6(struct drm_i915_private *i915)
> > > >  */
> > > > if (kdev->power.runtime_status == RPM_SUSPENDED) {
> > > > if 
> > > > (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> > > > -   i915->pmu.suspended_jiffies_last =
> > > > - 
> > > > kdev->power.suspended_jiffies;
> > > > +   i915->pmu.suspended_time_last =
> > > > +   kdev->power.suspended_time;
> > > >
> > >
> > > > Huh, so patch 2 introduces a compiler error because of removing the
> > > old fields. We can't have that.
> >
> > I agree
> > The patch was mainly to raise discussion
>
> OK, so patch [1/3] from this series should be applicable regardless, right?

Yes


[PATCH v2 1/1] MAINTAINERS: update list of qcom drivers

2018-12-18 Thread Amit Kucheria
Several drivers didn't have a specific maintainer (other than the
subsystem maintainer). Add all drivers referring to qcom or msm to the
list of supported drivers.

Signed-off-by: Amit Kucheria 
---
 MAINTAINERS | 49 +
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3318f30903b2..8f8dfb6f9675 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1929,20 +1929,61 @@ M:  Andy Gross 
 M: David Brown 
 L: linux-arm-...@vger.kernel.org
 S: Maintained
+F: include/dt-bindings/*/qcom*
+F: include/linux/*/qcom*
 F: Documentation/devicetree/bindings/soc/qcom/
+F: Documentation/devicetree/bindings/*/qcom*
 F: arch/arm/boot/dts/qcom-*.dts
 F: arch/arm/boot/dts/qcom-*.dtsi
 F: arch/arm/mach-qcom/
-F: arch/arm64/boot/dts/qcom/*
-F: drivers/i2c/busses/i2c-qup.c
+F: arch/arm64/boot/dts/qcom/
+F: drivers/bluetooth/btqcomsmd.c
+F: drivers/bus/qcom*
+F: drivers/cpufreq/qcom*
 F: drivers/clk/qcom/
+F: drivers/clocksource/timer-qcom.c
+F: drivers/crypto/qcom-*
 F: drivers/dma/qcom/
+F: drivers/edac/qcom*
+F: drivers/extcon/extcon-qcom*
+F: drivers/firmware/qcom*
+F: drivers/hwspinlock/qcom_*
+F: drivers/iio/adc/qcom*
+F: drivers/iommu/qcom*
+F: drivers/iommu/msm*
+F: drivers/i2c/busses/i2c-qup.c
+F: drivers/i2c/busses/i2c-qcom-geni.c
+F: drivers/irqchip/qcom*
+F: drivers/mailbox/qcom-*
+F: drivers/media/platform/qcom/
+F: drivers/mfd/qcom*
+F: drivers/mfd/ssbi.c
+F: drivers/misc/qcom-*
+F: drivers/mmc/host/mmci_qcom*
+F: drivers/mmc/host/sdhci_msm.c
+F: drivers/mtd/*/qcom*
+F: drivers/pci/controller/dwc/pcie-qcom.c
+F: drivers/perf/qcom*
+F: drivers/pinctrl/qcom/
+F: drivers/phy/qualcomm/
+F: drivers/power/*/qcom*
+F: drivers/power/*/msm*
+F: drivers/regulator/qcom*
+F: drivers/reset/reset-qcom-*
+F: drivers/remoteproc/qcom*
+F: drivers/rpmsg/qcom*
+F: drivers/scsi/ufs/ufs-qcom.*
+F: drivers/slimbus/qcom*
 F: drivers/soc/qcom/
 F: drivers/spi/spi-qup.c
+F: drivers/spi/spi-geni-qcom.c
+F: drivers/spi/spi-qcom-qspi.c
+F: drivers/thermal/qcom/
 F: drivers/tty/serial/msm_serial.c
+F: drivers/tty/serial/qcom*
+F: drivers/usb/dwc3/dwc3-qcom.c
+F: drivers/watchdog/qcom*
 F: drivers/*/pm8???-*
-F: drivers/mfd/ssbi.c
-F: drivers/firmware/qcom_scm*
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux.git
 
 ARM/RADISYS ENP2611 MACHINE SUPPORT
-- 
2.17.1



[PATCH] r8a66597: Fix a possible concurrency use-after-free bug in r8a66597_endpoint_disable()

2018-12-18 Thread Jia-Ju Bai
The functions r8a66597_endpoint_disable() and r8a66597_urb_enqueue() may
execute concurrently.
Both functions access a possibly shared variable "hep->hcpriv".

This shared variable is freed by r8a66597_endpoint_disable() via the
call path:
r8a66597_endpoint_disable
  kfree(hep->hcpriv) (line 1995 in Linux-4.19)

This variable is read by r8a66597_urb_enqueue() via the call path:
r8a66597_urb_enqueue
  spin_lock_irqsave(&r8a66597->lock);
  init_pipe_info
enable_r8a66597_pipe
  pipe = hep->hcpriv (line 802 in Linux-4.19)

The read operation is protected by a spinlock, but the free operation
is not protected by this spinlock, thus a concurrency use-after-free bug
may occur.

To fix this bug, the spin-lock and spin-unlock function calls in
r8a66597_endpoint_disable() are moved to protect the free operation.

Signed-off-by: Jia-Ju Bai 
---
 drivers/usb/host/r8a66597-hcd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/r8a66597-hcd.c b/drivers/usb/host/r8a66597-hcd.c
index 984892dd72f5..1495ce14ad22 100644
--- a/drivers/usb/host/r8a66597-hcd.c
+++ b/drivers/usb/host/r8a66597-hcd.c
@@ -1991,13 +1991,14 @@ static void r8a66597_endpoint_disable(struct usb_hcd 
*hcd,
return;
pipenum = pipe->info.pipenum;
 
+   spin_lock_irqsave(&r8a66597->lock, flags);
if (pipenum == 0) {
kfree(hep->hcpriv);
hep->hcpriv = NULL;
+   spin_unlock_irqrestore(&r8a66597->lock, flags);
return;
}
 
-   spin_lock_irqsave(&r8a66597->lock, flags);
pipe_stop(r8a66597, pipe);
pipe_irq_disable(r8a66597, pipenum);
disable_irq_empty(r8a66597, pipenum);
-- 
2.17.0



Re: [PATCH 0/7] ARM: hacks for link-time optimization

2018-12-18 Thread Peter Zijlstra
On Tue, Dec 18, 2018 at 10:18:24AM +0100, Peter Zijlstra wrote:
> In particular turning an address-dependency into a control-dependency,
> which is something allowed by the C language, since it doesn't recognise
> these concepts as such.
> 
> The 'optimization' is allowed currently, but LTO will make it much more
> likely since it will have a much wider view of things. Esp. when combined
> with PGO.
> 
> Specifically; if you have something like:
> 
> int idx;
> struct object objs[2];
> 
> the statement:
> 
>   val = objs[idx & 1].ponies;
> 
> which you 'need' to be translated like:
> 
>   struct object *obj = objs;
>   obj += (idx & 1);
>   val = obj->ponies;
> 
> Such that the load of obj->ponies depends on the load of idx. However
> our dear compiler is allowed to make it:
> 
>   if (idx & 1)
> obj = &objs[1];
>   else
> obj = &objs[0];
> 
>   val = obj->ponies;
> 
> Because C doesn't recognise this as being different. However this is
> utterly broken, because in this translation we can speculate the load
> of obj->ponies such that it no longer depends on the load of idx, which
> breaks RCU.
> 
> Note that further 'optimization' is possible and the compiler could even
> make it:
> 
>   if (idx & 1)
> val = objs[1].ponies;
>   else
> val = objs[0].ponies;

A variant that is actually broken on x86 too (due to issuing the loads
in the 'wrong' order):

  val = objs[0].ponies;
  if (idx & 1)
val = objs[1].ponies;

Which is a translation that makes sense if we either marked
unlikely(idx & 1) or if PGO found the same.

> Now, granted, this is a fairly artificial example, but it does
> illustrate the exact problem.
> 
> The more the compiler can see of the complete program, the more likely
> it can make inferences like this, esp. when coupled with PGO.
> 
> Now, we're (usually) very careful to wrap things in READ_ONCE() and
> rcu_dereference() and the like, which makes it harder on the compiler
> (because 'volatile' is special), but nothing really stops it from doing
> this.
> 
> Paul has been trying to beat clue into the language people, but given
> he's been at it for 10 years now, and there's no resolution, I figure we
> ought to get compiler implementations to give us a knob.
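
To make the point about equivalence concrete, here is a small sketch of the three translations discussed in this thread, written as ordinary C functions. The function names are made up for illustration. In single-threaded code all three return the same value for every idx, which is why the language lets a compiler substitute one for another; the sketch says nothing about the memory-ordering difference, which cannot be observed from a single thread.

```c
struct object {
	int ponies;
};

/* Address-dependent form: the load address is computed from idx. */
static int load_by_address(const struct object *objs, int idx)
{
	const struct object *obj = objs;

	obj += (idx & 1);
	return obj->ponies;
}

/* Control-dependent form: idx only selects which branch is taken. */
static int load_by_branch(const struct object *objs, int idx)
{
	if (idx & 1)
		return objs[1].ponies;
	else
		return objs[0].ponies;
}

/* Variant that loads objs[0] first and conditionally overrides it. */
static int load_early(const struct object *objs, int idx)
{
	int val = objs[0].ponies;

	if (idx & 1)
		val = objs[1].ponies;
	return val;
}
```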


Re: [PATCH v1 1/1] MAINTAINERS: update list of qcom drivers

2018-12-18 Thread Amit Kucheria
On Tue, Dec 18, 2018 at 1:12 PM Kalle Valo  wrote:
>
> Amit Kucheria  writes:
>
> > Several drivers didn't have a specific maintainer (other than the
> > subsystem maintainer). Switch to using the 'qcom' and 'msm' regex
> > patterns to capture all of them and add exceptions to the couple of
> > drivers that contain 'msm' but are not related to qcom hardware.
> >
> > Thanks to Marc for the idea to use the N regex.
> >
> > Signed-off-by: Amit Kucheria 
> > ---
> >  MAINTAINERS | 14 --
> >  1 file changed, 4 insertions(+), 10 deletions(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3318f30903b2..c9376030f77a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1929,20 +1929,14 @@ M:Andy Gross 
> >  M:   David Brown 
> >  L:   linux-arm-...@vger.kernel.org
> >  S:   Maintained
> > -F:   Documentation/devicetree/bindings/soc/qcom/
> > -F:   arch/arm/boot/dts/qcom-*.dts
> > -F:   arch/arm/boot/dts/qcom-*.dtsi
> > -F:   arch/arm/mach-qcom/
> > -F:   arch/arm64/boot/dts/qcom/*
> > +N:   qcom
> > +N:   msm
>
> IMHO this is pretty fragile in the long term. For example only due to
> historical reasons qualcomm wireless drivers currently under ath
> directory but who knows if at some point we switch using qcom (or
> qualcomm) directory. Also the wireless drivers might easily have
> filenames containing strings like "msm" or "qcom" (which I assume would
> match with "N" rules above).

I've now sent a v2 of the patch that tries to list all the drivers
explicitly. Let's see which one is better liked. :)

Regards,
Amit


Re: [PATCH] printk: Add caller information to printk() output.

2018-12-18 Thread Petr Mladek
On Tue 2018-12-18 17:55:24, Sergey Senozhatsky wrote:
> On (12/18/18 06:05), Tetsuo Handa wrote:
> > +#ifdef CONFIG_PRINTK_CALLER
> > +static size_t print_caller(u32 id, char *buf)
> > +{
> > +   char from[12];
> > +
> > +   snprintf(from, sizeof(from), "%c%u",
> > +id & 0x80000000 ? 'C' : 'T', id & ~0x80000000);
> > +   return sprintf(buf, "[%6s]", from);
> > +}
> 
> A nitpick:
> 
> s/from/caller/g   :)

Great catch!

> 
> > + Selecting this option causes "thread id" (if in task context) or
> > + "processor id" (if not in task context) of the printk() messages
> > + to be added.
> 
> Would the following wording be a bit simpler?
> 
>   "Selecting this option causes printk() to add a caller "thread id" (if
>in task context) or a caller "processor id" (if not in task context)
>to every message."

It sounds good to me.

I have updated the patch in printk.git, for-4.22 branch.

Best Regards,
Petr

PS: I think that I have rushed the patch probably too much.
I did too much nitpicking in the past and am trying to find
a better balance now.
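
For readers who want to see what the prefix looks like, the helper quoted above can be tried in userspace. This is a sketch under assumptions: the top bit of the id (0x80000000) is taken to mark a non-task context, uint32_t replaces u32, and the local buffer is named caller as suggested in the nitpick. The output buffer must hold at least 14 bytes.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace copy of the quoted helper. 'C' is printed when the assumed
 * top-bit flag is set (processor id), 'T' otherwise (thread id).
 */
static size_t print_caller(uint32_t id, char *buf)
{
	char caller[12];	/* 1 letter + up to 10 digits + NUL */

	snprintf(caller, sizeof(caller), "%c%u",
		 id & 0x80000000u ? 'C' : 'T',
		 (unsigned int)(id & ~0x80000000u));
	/* Right-align in a field of at least 6 characters inside brackets. */
	return (size_t)sprintf(buf, "[%6s]", caller);
}
```

With these assumptions, an id of 1 formats as "[    T1]" and an id of 0x80000002 as "[    C2]".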


Re: [PATCH v2 3/3] drm/i915: Move to new PM core fields

2018-12-18 Thread Rafael J. Wysocki
On Tue, Dec 18, 2018 at 10:58 AM Vincent Guittot
 wrote:
>
> On Tue, 18 Dec 2018 at 10:57, Rafael J. Wysocki  wrote:
> >
> > On Mon, Dec 17, 2018 at 3:22 PM Vincent Guittot
> >  wrote:
> > >
> > > On Fri, 14 Dec 2018 at 15:36, Ulf Hansson  wrote:
> > > >
> > > > On Fri, 14 Dec 2018 at 15:22, Vincent Guittot
> > > >  wrote:
> > > > >
> > > > > > With jiffies being replaced by raw ns in PM core accounting, the i915 driver
> > > > > > is updated to use this new time infrastructure.
> > > > >
> > > > > Signed-off-by: Vincent Guittot 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> > > > >  drivers/gpu/drm/i915/i915_pmu.h |  4 ++--
> > > > >  2 files changed, 8 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > > > > b/drivers/gpu/drm/i915/i915_pmu.c
> > > > > index d6c8f8f..cf6437d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > > > > @@ -493,14 +493,14 @@ static u64 get_rc6(struct drm_i915_private 
> > > > > *i915)
> > > > >  */
> > > > > if (kdev->power.runtime_status == RPM_SUSPENDED) {
> > > > > if 
> > > > > (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> > > > > -   i915->pmu.suspended_jiffies_last =
> > > > > - 
> > > > > kdev->power.suspended_jiffies;
> > > > > +   i915->pmu.suspended_time_last =
> > > > > +   kdev->power.suspended_time;
> > > > >
> > > >
> > > > Huh, so patch 2 introduces a compiler error because of removing the
> > > > old fields. We can't have that.
> > >
> > > I agree
> > > The patch was mainly to raise discussion
> >
> > OK, so patch [1/3] from this series should be applicable regardless, right?
>
> Yes

OK, I'll queue it up, then.

Next time you do something like that, please mark patches for
discussion in a series as [RFC] so it is all clear.


Re: [PATCH 07/18] mfd: rc5t583: Make it explicitly non-modular

2018-12-18 Thread Laxman Dewangan




On Tuesday 18 December 2018 02:01 AM, Paul Gortmaker wrote:

The Kconfig currently controlling compilation of this code is:

drivers/mfd/Kconfig:config MFD_RC5T583
drivers/mfd/Kconfig:bool "Ricoh RC5T583 Power Management system device"

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Lee Jones 
Cc: Laxman Dewangan 
Signed-off-by: Paul Gortmaker 
Acked-by: Linus Walleij 


Acked-by: Laxman Dewangan 


Re: [PATCH 12/18] mfd: tps80031: Make it explicitly non-modular

2018-12-18 Thread Laxman Dewangan




On Tuesday 18 December 2018 02:01 AM, Paul Gortmaker wrote:

The Kconfig currently controlling compilation of this code is:

drivers/mfd/Kconfig:config MFD_TPS80031
drivers/mfd/Kconfig:bool "TI TPS80031/TPS80032 Power Management chips"

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

We explicitly disallow a driver unbind, since that doesn't have a
sensible use case anyway, and it allows us to drop the ".remove"
code for non-modular drivers.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

We don't replace module.h with init.h since the file already has that.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Lee Jones 
Cc: Laxman Dewangan 
Signed-off-by: Paul Gortmaker 
Acked-by: Linus Walleij 



Acked-by: Laxman Dewangan 


Re: [PATCH v2 3/3] drm/i915: Move to new PM core fields

2018-12-18 Thread Vincent Guittot
On Tue, 18 Dec 2018 at 11:03, Rafael J. Wysocki  wrote:
>
> On Tue, Dec 18, 2018 at 10:58 AM Vincent Guittot
>  wrote:
> >
> > On Tue, 18 Dec 2018 at 10:57, Rafael J. Wysocki  wrote:
> > >
> > > On Mon, Dec 17, 2018 at 3:22 PM Vincent Guittot
> > >  wrote:
> > > >
> > > > On Fri, 14 Dec 2018 at 15:36, Ulf Hansson  
> > > > wrote:
> > > > >
> > > > > On Fri, 14 Dec 2018 at 15:22, Vincent Guittot
> > > > >  wrote:
> > > > > >
> > > > > > > With jiffies being replaced by raw ns in PM core accounting, the i915 driver
> > > > > > > is updated to use this new time infrastructure.
> > > > > >
> > > > > > Signed-off-by: Vincent Guittot 
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> > > > > >  drivers/gpu/drm/i915/i915_pmu.h |  4 ++--
> > > > > >  2 files changed, 8 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > > > > > b/drivers/gpu/drm/i915/i915_pmu.c
> > > > > > index d6c8f8f..cf6437d 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > > > > > @@ -493,14 +493,14 @@ static u64 get_rc6(struct drm_i915_private 
> > > > > > *i915)
> > > > > >  */
> > > > > > if (kdev->power.runtime_status == RPM_SUSPENDED) {
> > > > > > if 
> > > > > > (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> > > > > > -   i915->pmu.suspended_jiffies_last =
> > > > > > - 
> > > > > > kdev->power.suspended_jiffies;
> > > > > > +   i915->pmu.suspended_time_last =
> > > > > > +   kdev->power.suspended_time;
> > > > > >
> > > > >
> > > > > Huh, so patch 2 introduces a compiler error because of removing the
> > > > > old fields. We can't have that.
> > > >
> > > > I agree
> > > > The patch was mainly to raise discussion
> > >
> > > OK, so patch [1/3] from this series should be applicable regardless, 
> > > right?
> >
> > Yes
>
> OK, I'll queue it up, then.

Thanks

>
> Next time you do something like that, please mark patches for
> discussion in a series as [RFC] so it is all clear.

ok. will do for the next version of the last 2 patches


Re: Question: pause mode disabled for marvell 88e151x phy

2018-12-18 Thread Russell King - ARM Linux
On Tue, Dec 18, 2018 at 05:34:27PM +0800, Yunsheng Lin wrote:
> On 2018/12/17 22:36, Russell King - ARM Linux wrote:
> > As I've previously stated, the behaviour I've seen is _both_ pause bits
> > clear:
> > 
> > If I set bit 10 (pause), and read back to confirm:
> > 
> >   MII PHY #0 transceiver registers:
> >1000 796d 0141 0dd1 05e1 c5e1 000d 2001
> >
> >4806 0200 3800   0003  3000
> >3060 af48  7c40 0020   
> >  0040     .
> > 
> > Now if I trigger a renegotiation of any kind, and read-back the registers:
> > 
> >  MII PHY #0 transceiver registers:
> >1000 7949 0141 0dd1 01e1  0004 2001
> >
> > 0200    0003  3000
> >3060 8000  0040 0020   
> >  0040     .
> > ...
> >  MII PHY #0 transceiver registers:
> >1000 796d 0141 0dd1 01e1 c5e1 000d 2001
> >
> >4806 0200 3800   0003  3000
> >3060 af48  7c40 0020   
> >  0040     .
> > 
> > See that register 4 now has the pause bit cleared.
> 
> I wonder when the pause bit was cleared: is it during the negotiation
> process or after negotiation? Does the partner phy see the pause bit before
> the pause bit is cleared?

As was stated in the original commit:

While these bits may be correctly conveyed to the link partner on the
first negotiation, a subsequent negotiation (eg, due to negotiation
restart by the link partner, or reconnection of the cable) will result
in the link partner seeing these bits as zero, while the kernel
believes that it has advertised pause modes.

To state in a different way:

On the _first_ negotiation, the pause bits as seen by the link partner
will be what we wrote into the register.  However, during that
negotiation the bits in the register clear _after_ the base page has
been sent to the link partner.

A subsequent negotiation will send the now-clear pause bits to the
link partner.

I hope that's now clear.

> Also, it seems the phy driver always sets the pause bit back before
> beginning a negotiation. But I am not sure it makes a difference here.

There is a world of difference between asking the kernel to do a
renegotiation, and unplugging/replugging the cable, or asking the
link partner to do a renegotiation.

In the former case, yes, the kernel will reprogram the advertisement
register.  In the latter two cases, the kernel does not and this is
where the problem occurs.
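
For anyone comparing the register dumps quoted earlier in this mail, the check comes down to one bit in register 4 (the advertisement register). The sketch below is illustrative only: advertises_pause() is a made-up helper, and the two masks are defined locally with the values 0x0400 (bit 10, pause) and 0x0800 (bit 11, asymmetric pause).

```c
#include <stdint.h>

/* Pause bits of the autonegotiation advertisement register (register 4). */
#define ADVERTISE_PAUSE_CAP	0x0400	/* bit 10: symmetric pause */
#define ADVERTISE_PAUSE_ASYM	0x0800	/* bit 11: asymmetric pause */

/* Return nonzero if the advertisement value has the pause bit set. */
static int advertises_pause(uint16_t adv)
{
	return (adv & ADVERTISE_PAUSE_CAP) != 0;
}
```

Applied to the dumps, 0x05e1 (read back right after setting the bit) reports pause, while 0x01e1 (read after a renegotiation) does not.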

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v1 1/1] MAINTAINERS: update list of qcom drivers

2018-12-18 Thread Kalle Valo
Marc Gonzalez  writes:

> On 18/12/2018 08:42, Kalle Valo wrote:
>
>> Amit Kucheria wrote:
>> 
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -1929,20 +1929,14 @@ M:  Andy Gross 
>>>  M: David Brown 
>>>  L: linux-arm-...@vger.kernel.org
>>>  S: Maintained
>>> -F: Documentation/devicetree/bindings/soc/qcom/
>>> -F: arch/arm/boot/dts/qcom-*.dts
>>> -F: arch/arm/boot/dts/qcom-*.dtsi
>>> -F: arch/arm/mach-qcom/
>>> -F: arch/arm64/boot/dts/qcom/*
>>> +N: qcom
>>> +N: msm
>> 
>> IMHO this is pretty fragile in the long term. For example only due to
>> historical reasons qualcomm wireless drivers currently under ath
>> directory but who knows if at some point we switch using qcom (or
>> qualcomm) directory.
>
> I am failing to follow your logic.
>
> (IIUC, you are talking about drivers/net/wireless/ath/ath10k)

Yeah, my example was just about ath10k and wil6210 as they go through my
tree. But it can apply to any other driver and subsystem as well:
bluetooth, future drivers and what ever works with Qualcomm hardware.

> The fact that the "qcom" or "msm" nomenclature is not used for this driver now
> just means that an explicit F entry is required. The fact that it could be
> renamed in the future just means that the entry would need to be updated or
> folded into a more generic matching pattern. What am I missing?

Not sure, but maybe you are missing the point that keeping the MAINTAINERS
file up-to-date is hard and having uncommon rules like Amit and you
propose makes it even harder. Yeah, it should be simple but in practise
it's not, people easily forget to update it.

>> Also the wireless drivers might easily have filenames containing
>> strings like "msm" or "qcom" (which I assume would match with "N"
>> rules above).
>
> Any driver (not just wireless) might match "msm" or "qcom". These could be
> excluded with an X directive (as the proposed patch does, in fact).

Nobody will remember, or even know (for example I saw Amit's patch by
accident), that when adding files with string "qcom" or "msm" in path
you also need to add an exclusion to "ARM/QUALCOMM SUPPORT". That won't
work so errors are likely. It's a much safer approach to use F: rules
just like Joe proposed, that way the risk of people submitting patches
to wrong lists is reduced.

-- 
Kalle Valo
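
The concern about name-based matching can be sketched in a few lines. This is an approximation, not how get_maintainer.pl is implemented: N: entries are regular expressions, and for the literal patterns "qcom" and "msm" a plain substring search behaves the same way. The function name and the file paths used to exercise it are hypothetical.

```c
#include <string.h>

/*
 * Approximate the proposed "N: qcom" and "N: msm" entries with a
 * substring search over the file path.
 */
static int matches_n_patterns(const char *path)
{
	return strstr(path, "qcom") != NULL || strstr(path, "msm") != NULL;
}
```

A path such as arch/arm/mach-qcom/board.c matches as intended, a made-up drivers/example/msmtool.c matches whether or not it has anything to do with Qualcomm hardware, and drivers/net/wireless/ath/ath10k/core.c does not match at all.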


Re: [PATCH] dmaengine: ti: omap-dma: Configure LCH_TYPE for OMAP1

2018-12-18 Thread Peter Ujfalusi



On 17/12/2018 21.16, Aaro Koskinen wrote:
> On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
>> Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
>> DMAengine always uses a 64 byte burst, but udc wants a smaller burst
>> setting.  Does this matter?
> 
> Looking at OMAP1 docs, it seems it supports only 16 bytes. Then checking
> DMAengine code, I don't think these CSDP bit values are valid
> for OMAP1:
> 
> CSDP_SRC_BURST_1= 0 << 7,
> CSDP_SRC_BURST_16   = 1 << 7,
> CSDP_SRC_BURST_32   = 2 << 7,
> CSDP_SRC_BURST_64   = 3 << 7,
> 
> From TI SPRU674 document, pages 50-51:
> 
>   0   single access (no burst)
>   1   single access (no burst)
>   2   burst 4

In omap1510 it is 4 x data_type
In omap1610/1710 it is 4 x data_type (only data_type == 32bit is supported)
From omap2420+ 32 bytes (8x32bit/4x64bit)

So for OMAP1 we need to have different handling of the burst:
only enable if data_type is 32bit.

>   3   reserved (do not use this setting)
> 
> So if CSDP_SRC_BURST_64 (3) gets programmed OMAP1, I wonder what is the
> end result, no burst or burst 4...
> 
> A.
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
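
The rule proposed in this mail for OMAP1 could look roughly like the sketch below. It is only an illustration: the enum repeats the values quoted from the DMAengine code, while OMAP1_CSDP_SRC_BURST_4 and omap1_src_burst() are hypothetical names that do not exist in the driver. The value 2 in the burst field is taken from the SPRU674 table quoted above (burst 4).

```c
/* CSDP source burst field values as quoted from the DMAengine code. */
enum {
	CSDP_SRC_BURST_1	= 0 << 7,
	CSDP_SRC_BURST_16	= 1 << 7,
	CSDP_SRC_BURST_32	= 2 << 7,
	CSDP_SRC_BURST_64	= 3 << 7,
};

/* Hypothetical name: on OMAP1, field value 2 means "burst 4". */
#define OMAP1_CSDP_SRC_BURST_4	(2 << 7)

/*
 * Hypothetical helper: enable burst on OMAP1 only when the data type
 * is 32 bits wide, otherwise fall back to single access.
 */
static unsigned int omap1_src_burst(unsigned int data_type_bits)
{
	return data_type_bits == 32 ? OMAP1_CSDP_SRC_BURST_4 : 0;
}
```

Note that CSDP_SRC_BURST_64 corresponds to field value 3, which the quoted table lists as reserved on OMAP1.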


Re: [PATCH] printk: Add caller information to printk() output.

2018-12-18 Thread Sergey Senozhatsky
On (12/18/18 11:01), Petr Mladek wrote:
> I have updated the patch in printk.git, for-4.22 branch.

Thanks.

> PS: I think that I have rushed the patch probably too much.
> I did too much nitpicking in the past and am trying to find
> a better balance now.

It's all good.

-ss


[PATCH v5 2/8] lib/test_bitmap.c: Add for_each_set_clump8 test cases

2018-12-18 Thread William Breathitt Gray
The introduction of the for_each_set_clump8 macro warrants test cases to
verify the implementation. This patch adds test case checks for whether
an out-of-bounds clump index is returned, a zero clump is returned, or
the returned clump value differs from the expected clump value.

Cc: Andy Shevchenko 
Cc: Andrew Morton 
Cc: Rasmus Villemoes 
Signed-off-by: William Breathitt Gray 
---
 lib/test_bitmap.c | 67 +++
 1 file changed, 67 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6cd7d0740005..8a8dbe513ab4 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -88,6 +88,36 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned char *const clump_exp,
+   const unsigned long *const clump)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %zu, got %zu\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / 8];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %zu with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -104,6 +134,7 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
 #define expect_eq_bitmap(...)  __expect_eq(bitmap, ##__VA_ARGS__)
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
+#define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -361,6 +392,41 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned char clump_exp[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x28,   /* 3 bits set across 4-bit boundary */
+   0x28,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static void __init test_for_each_set_clump8(void)
+{
+   unsigned int start;
+   unsigned long clump;
+#define CLUMP_BITMAP_NUMBITS 64
+   DECLARE_BITMAP(bits, CLUMP_BITMAP_NUMBITS);
+#define CLUMP_SIZE 8
+   const size_t size = DIV_ROUND_UP(CLUMP_BITMAP_NUMBITS, CLUMP_SIZE);
+
+   /* set bitmap to test case */
+   bitmap_zero(bits, CLUMP_BITMAP_NUMBITS);
+   bitmap_set(bits, 0, 1); /* 0x01 */
+   bitmap_set(bits, 8, 1); /* 0x02 */
+   bitmap_set(bits, 27, 3);/* 0x28 */
+   bitmap_set(bits, 35, 3);/* 0x28 */
+   bitmap_set(bits, 40, 4);/* 0x0F */
+   bitmap_set(bits, 48, 8);/* 0xFF */
+   bitmap_set(bits, 56, 1);/* 0x05 - part 1 */
+   bitmap_set(bits, 58, 1);/* 0x05 - part 2 */
+
+   for_each_set_clump8(start, clump, bits, size)
+   expect_eq_clump8(start, size, clump_exp, &clump);
+}
+
 static int __init test_bitmap_init(void)
 {
test_zero_clear();
@@ -369,6 +435,7 @@ static int __init test_bitmap_init(void)
test_bitmap_arr32();
test_bitmap_parselist();
test_mem_optimisations();
+   test_for_each_set_clump8();
 
if (failed_tests == 0)
pr_info("all %u tests passed\n", total_tests);
-- 
2.20.1



[PATCH V3 0/2] Replace all open encodings for NUMA_NO_NODE

2018-12-18 Thread Anshuman Khandual
Changes in V3:

- Dropped all references to NUMA_NO_NODE as per Lubomir Rintel
- Split the patch into two creating a new one specifically for tools
- Folded Stephen's linux-next build fix into the second patch

Changes in V2: (https://patchwork.kernel.org/patch/10698089/)

- Added inclusion of 'numa.h' header at various places per Andrew
- Updated 'dev_to_node' to use NUMA_NO_NODE instead per Vinod

Changes in V1: (https://lkml.org/lkml/2018/11/23/485)

- Dropped OCFS2 changes per Joseph
- Dropped media/video drivers changes per Hans

RFC - https://patchwork.kernel.org/patch/10678035/

Build tested this with multiple cross compiler options like alpha, sparc,
arm64, x86, powerpc, powerpc64le etc with their default config, which might
not have compile-tested all driver-related changes. I will appreciate
folks giving this a test in their respective build environments.

All these places for replacement were found by running the following grep
patterns on the entire kernel code. Please let me know if this might have
missed some instances. This might also have replaced some false positives.
I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"
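
As an illustration of the kind of replacement these patterns are meant to find, here is a small standalone sketch. NUMA_NO_NODE is defined locally as (-1) so the fragment compiles outside the kernel tree, and resolve_node() is a hypothetical function, not one touched by this series.

```c
/* Defined locally for this sketch; kernel code gets it from <linux/numa.h>. */
#define NUMA_NO_NODE	(-1)

/*
 * Hypothetical helper: pick a fallback node when the caller did not
 * request a specific one. The open-coded form would have been
 * "if (nid == -1)".
 */
static int resolve_node(int nid, int fallback_nid)
{
	if (nid == NUMA_NO_NODE)
		return fallback_nid;
	return nid;
}
```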

NOTE: I can still split the first patch into multiple ones - one for each
subsystem as suggested by Lubomir if that would be better.

Anshuman Khandual (1):
  mm: Replace all open encodings for NUMA_NO_NODE

Stephen Rothwell (1):
  Tools: Replace open encodings for NUMA_NO_NODE

 arch/alpha/include/asm/topology.h |  3 ++-
 arch/ia64/kernel/numa.c   |  2 +-
 arch/ia64/mm/discontig.c  |  6 +++---
 arch/powerpc/include/asm/pci-bridge.h |  3 ++-
 arch/powerpc/kernel/paca.c|  3 ++-
 arch/powerpc/kernel/pci-common.c  |  3 ++-
 arch/powerpc/mm/numa.c| 14 +++---
 arch/powerpc/platforms/powernv/memtrace.c |  5 +++--
 arch/sparc/kernel/pci_fire.c  |  3 ++-
 arch/sparc/kernel/pci_schizo.c|  3 ++-
 arch/sparc/kernel/psycho_common.c |  3 ++-
 arch/sparc/kernel/sbus.c  |  3 ++-
 arch/sparc/mm/init_64.c   |  6 +++---
 arch/x86/include/asm/pci.h|  3 ++-
 arch/x86/kernel/apic/x2apic_uv_x.c|  7 ---
 arch/x86/kernel/smpboot.c |  3 ++-
 drivers/block/mtip32xx/mtip32xx.c |  5 +++--
 drivers/dma/dmaengine.c   |  4 +++-
 drivers/infiniband/hw/hfi1/affinity.c |  3 ++-
 drivers/infiniband/hw/hfi1/init.c |  3 ++-
 drivers/iommu/dmar.c  |  5 +++--
 drivers/iommu/intel-iommu.c   |  3 ++-
 drivers/misc/sgi-xp/xpc_uv.c  |  3 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  5 +++--
 include/linux/device.h|  2 +-
 init/init_task.c  |  3 ++-
 kernel/kthread.c  |  3 ++-
 kernel/sched/fair.c   | 15 ---
 lib/cpumask.c |  3 ++-
 mm/huge_memory.c  | 13 +++--
 mm/hugetlb.c  |  3 ++-
 mm/ksm.c  |  2 +-
 mm/memory.c   |  7 ---
 mm/memory_hotplug.c   | 12 ++--
 mm/mempolicy.c|  2 +-
 mm/page_alloc.c   |  4 ++--
 mm/page_ext.c |  2 +-
 net/core/pktgen.c |  3 ++-
 net/qrtr/qrtr.c   |  3 ++-
 tools/include/linux/numa.h| 16 
 tools/perf/bench/numa.c   |  6 +++---
 41 files changed, 123 insertions(+), 77 deletions(-)
 create mode 100644 tools/include/linux/numa.h

-- 
2.7.4



Re: [PATCH 6/7] HID: logitech-hidpp: support the G700 over wireless

2018-12-18 Thread Benjamin Tissoires
Hi Harry,

[now that the series has been reverted and re-inserted, I am starting
to have a look at this again]

On Fri, Sep 7, 2018 at 7:33 PM Harry Cutts  wrote:
>
> Hi Benjamin,
>
> On Fri, 7 Sep 2018 at 03:35, Benjamin Tissoires
>  wrote:
> >
> > The G700 is using a non unifying receiver, so it's easy to add its support
> > in hid-logitech-hidpp now.
> > [snip]
> > @@ -3671,6 +3671,9 @@ static const struct hid_device_id hidpp_devices[] = {
> > { /* Solar Keyboard Logitech K750 */
> >   LDJ_DEVICE(0x4002),
> >   .driver_data = HIDPP_QUIRK_CLASS_K750 },
> > +   { /* G700 over Wireless */
> > > + HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G700_RECEIVER),
> > + .driver_data = HIDPP_QUIRK_RECEIVER | HIDPP_QUIRK_UNIFYING },
>
> As someone who's new to the codebase, it seems rather confusing to me
> that HIDPP_QUIRK_UNIFYING would be present here for a device that
> doesn't use a Unifying receiver. Am I misunderstanding, or should we
> consider renaming the quirk or adding some clarifying comment?
> (Similarly for the G900 in the next patch.)

The initial reason is that the gaming receivers are Unifying receivers that
are not normally allowed to connect to more than one device at a time.
The reason is that those receivers are using a higher frequency and
need all of the bandwidth to operate (so they can't multiplex the
devices).

But as I re-read the series, it is clear that HIDPP_QUIRK_RECEIVER and
HIDPP_QUIRK_UNIFYING are never used separately, so it doesn't make
sense to have 2 quirks. We should probably rename UNIFYING into
RECEIVER for consistency and we will be good.

Cheers,
Benjamin

>
> >
> > { LDJ_DEVICE(HID_ANY_ID) },
> >
> > --
> > 2.14.3
> >
>
> Thanks,
>
> Harry Cutts
> Chrome OS Touch/Input team


[PATCH v5 1/8] bitops: Introduce the for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
This macro iterates for each 8-bit group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to the
bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value8 and bitmap_set_value8 functions are introduced to
respectively get and set an 8-bit value in a bitmap memory region.

Suggested-by: Andy Shevchenko 
Suggested-by: Rasmus Villemoes 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Signed-off-by: William Breathitt Gray 
---
 include/asm-generic/bitops/find.h | 14 +++
 include/linux/bitops.h|  5 +++
 lib/find_bit.c| 63 +++
 3 files changed, 82 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 8a1ee10014de..457b93e6f5c9 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -80,4 +80,18 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
 
 #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
+unsigned int bitmap_get_value8(const unsigned long *const bitmap,
+  const unsigned int start);
+
+void bitmap_set_value8(unsigned long *const bitmap,
+  const unsigned long *const value,
+  const unsigned int start);
+
+unsigned int find_next_clump8(unsigned long *const clump,
+ const unsigned long *const addr,
+ unsigned int offset, const unsigned int size);
+
+#define find_first_clump8(clump, bits, size) \
+   find_next_clump8((clump), (bits), 0, (size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 705f7c442691..850af60d36b9 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w);
 (bit) < (size);\
 (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
 
+#define for_each_set_clump8(start, clump, bits, size) \
+   for ((start) = find_first_clump8(&(clump), (bits), (size)); \
+(start) < (size); \
+(start) = find_next_clump8(&(clump), (bits), (start) + 8, (size)))
+
 static inline int get_bitmask_order(unsigned int count)
 {
int order;
diff --git a/lib/find_bit.c b/lib/find_bit.c
index ee3df93ba69a..2e56d2b907bc 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -218,3 +218,66 @@ EXPORT_SYMBOL(find_next_bit_le);
 #endif
 
 #endif /* __BIG_ENDIAN */
+
+/**
+ * bitmap_get_value8 - get an 8-bit value within a memory region
+ * @bitmap: address to the bitmap memory region
+ * @start: bit offset of the 8-bit value
+ *
+ * Returns the 8-bit value located at the @start bit offset within the @bitmap
+ * memory region.
+ */
+unsigned int bitmap_get_value8(const unsigned long *const bitmap,
+  const unsigned int start)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned int offset = start % BITS_PER_LONG;
+
+   return (bitmap[index] >> offset) & 0xFF;
+}
+EXPORT_SYMBOL(bitmap_get_value8);
+
+/**
+ * bitmap_set_value8 - set an 8-bit value within a memory region
+ * @bitmap: address to the bitmap memory region
+ * @value: the 8-bit value
+ * @start: bit offset of the 8-bit value
+ */
+void bitmap_set_value8(unsigned long *const bitmap,
+  const unsigned long *const value,
+  const unsigned int start)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned int offset = start % BITS_PER_LONG;
+   const unsigned long mask = GENMASK(7, offset);
+
+   bitmap[index] &= ~mask;
+   bitmap[index] |= (*value << offset) & mask;
+}
+EXPORT_SYMBOL(bitmap_set_value8);
+
+/**
+ * find_next_clump8 - find next 8-bit clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @offset: bit offset at which to start searching
+ * @size: bitmap size in number of bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+unsigned int find_next_clump8(unsigned long *const clump,
+ const unsigned long *const addr,
+ unsigned int offset, const unsigned int size)
+{
+   for (; offset < size; offset += 8) {
+   *clump = bitmap_get_value8(addr, offset);
+   if (!*clump)
+   continue;
+
+   return offset;
+   }
+
+   return size;
+}
+EXPORT_SYMBOL(find_next_clump8);
-- 
2.20.1
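
For experimenting with the get/set helpers outside the kernel, a userspace sketch along the following lines may be useful. It is not a copy of the patch: the names are shortened, the value is passed by value rather than by pointer, and the mask is built as 0xFF shifted to the bit offset, which is an assumption of this sketch. It also assumes that start is a multiple of 8, so a clump never straddles two words.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Read the 8-bit value located at bit offset 'start' (multiple of 8). */
static unsigned int get_value8(const unsigned long *bitmap, unsigned int start)
{
	const size_t index = start / BITS_PER_ULONG;
	const unsigned int offset = start % BITS_PER_ULONG;

	return (bitmap[index] >> offset) & 0xFF;
}

/* Overwrite the 8-bit value located at bit offset 'start' (multiple of 8). */
static void set_value8(unsigned long *bitmap, unsigned long value,
		       unsigned int start)
{
	const size_t index = start / BITS_PER_ULONG;
	const unsigned int offset = start % BITS_PER_ULONG;
	const unsigned long mask = 0xFFUL << offset;

	bitmap[index] &= ~mask;				/* clear the old clump */
	bitmap[index] |= (value << offset) & mask;	/* store the new one */
}
```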



[PATCH v5 0/8] Introduce the for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Changes in v5:
  - Restrict the function of the for_each_set_clump macro to handle only
8-bit clumps (i.e. for_each_set_clump8)
  - Introduce the bitmap_get_value8 and bitmap_set_value8 functions to
get and set 8-bit values respectively
  - Reimplement the for_each_set_clump macro to return the bit offset
and clump value; this simplifies the arguments list and aligns the
macro use to similar macros such as find_next_bit et al.

While adding GPIO get_multiple/set_multiple callback support for various
drivers, I noticed a pattern of looping manifesting that would be useful
standardized as a macro.

This patchset introduces the for_each_set_clump8 macro and utilizes it
in several GPIO drivers. The for_each_set_clump8 macro facilitates a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

The for_each_set_clump8 macro has four parameters:

* start: set to the bit offset of the current clump
* clump: set to the current clump value
* bits: bitmap to search within
* size: bitmap size in number of bits
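
The iteration described above can be modelled in userspace on a single 32-bit value. The sketch below is an illustration rather than the kernel implementation: next_clump8() and collect_clumps() are hypothetical names, the bitmap is a plain uint32_t, and the search advances by 8 bits after each hit.

```c
#include <stdint.h>

/*
 * Find the next nonzero 8-bit group at or after 'offset' (a multiple
 * of 8). Stores the group in *clump and returns its bit offset, or
 * returns 32 when no further group has a set bit.
 */
static unsigned int next_clump8(uint8_t *clump, uint32_t word,
				unsigned int offset)
{
	for (; offset < 32; offset += 8) {
		*clump = (word >> offset) & 0xFF;
		if (*clump)
			return offset;
	}
	return 32;
}

/* Record every nonzero clump with its offset; returns how many were found. */
static unsigned int collect_clumps(uint32_t word, unsigned int *offsets,
				   uint8_t *values)
{
	unsigned int n = 0, start;
	uint8_t clump;

	for (start = next_clump8(&clump, word, 0); start < 32;
	     start = next_clump8(&clump, word, start + 8)) {
		offsets[n] = start;
		values[n] = clump;
		n++;
	}
	return n;
}
```

For a value such as 0xBE00FF33 this visits three groups, at bit offsets 0, 8 and 24, and skips the all-zero group at offset 16.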

In this version of the patchset, the for_each_set_clump macro has been
reimplemented and simplified based on the suggestions provided by Rasmus
Villemoes and Andy Shevchenko in the version 4 submission.

In particular, the function of the for_each_set_clump macro has been
restricted to handle only 8-bit clumps; the drivers that use the
for_each_set_clump macro only handle 8-bit ports so a generic
for_each_set_clump implementation is not necessary. Thus, a solution for
odd-sized clumps (e.g. 3-bit, 7-bit, etc.) mismatching word boundaries
can be postponed until a driver appears that actually requires a generic
for_each_set_clump implementation.

In addition, the bitmap_get_value8 and bitmap_set_value8 functions are
introduced to get and set 8-bit values respectively. Their use is based
on the behavior suggested in the previous patchset version review.
Similarly, the implementation of the find_next_clump function has been
simplified in order for the function to match the syntax and use of the
find_next_bit function.
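
To make the get/set behavior concrete, here is a small user-space
sketch of what such 8-bit accessors could look like. The function names
and signatures are assumptions made for this example (the value is
passed by value here for simplicity) and may differ from the actual
bitmap_get_value8 and bitmap_set_value8 in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the 8 bits stored at bit offset 'start' (a multiple of 8). */
static unsigned long get_value8_sketch(const unsigned long *bitmap,
				       size_t start)
{
	assert(start % 8 == 0);
	return (bitmap[start / BITS_PER_LONG] >> (start % BITS_PER_LONG)) & 0xFFUL;
}

/* Overwrite the 8 bits at 'start' with the low byte of 'value'. */
static void set_value8_sketch(unsigned long *bitmap, unsigned long value,
			      size_t start)
{
	const size_t index = start / BITS_PER_LONG;
	const size_t shift = start % BITS_PER_LONG;

	assert(start % 8 == 0);
	bitmap[index] &= ~(0xFFUL << shift);		/* clear the old clump */
	bitmap[index] |= (value & 0xFFUL) << shift;	/* store the new one */
}
```

Because start is always a multiple of 8, a clump never straddles a word
boundary, which is why no second word has to be touched.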

William Breathitt Gray (8):
  bitops: Introduce the for_each_set_clump8 macro
  lib/test_bitmap.c: Add for_each_set_clump8 test cases
  gpio: 104-dio-48e: Utilize for_each_set_clump8 macro
  gpio: 104-idi-48: Utilize for_each_set_clump8 macro
  gpio: gpio-mm: Utilize for_each_set_clump8 macro
  gpio: ws16c48: Utilize for_each_set_clump8 macro
  gpio: pci-idio-16: Utilize for_each_set_clump8 macro
  gpio: pcie-idio-24: Utilize for_each_set_clump8 macro

 drivers/gpio/gpio-104-dio-48e.c   |  71 ++-
 drivers/gpio/gpio-104-idi-48.c    |  36 ++
 drivers/gpio/gpio-gpio-mm.c       |  71 ++-
 drivers/gpio/gpio-pci-idio-16.c   |  73 +++-
 drivers/gpio/gpio-pcie-idio-24.c  | 109 +++---
 drivers/gpio/gpio-ws16c48.c       |  71 ++-
 include/asm-generic/bitops/find.h |  14
 include/linux/bitops.h            |   5 ++
 lib/find_bit.c                    |  63 +
 lib/test_bitmap.c                 |  67 ++
 10 files changed, 281 insertions(+), 299 deletions(-)

-- 
2.20.1



Re: [PATCH 0/9] RZ/G2E basic device tree

2018-12-18 Thread Simon Horman
On Fri, Dec 14, 2018 at 09:10:04AM +, Fabrizio Castro wrote:
> Dear All,
> 
> this series adds a basic dtsi for the RZ/G2E (a.k.a. r8a774c0).

Series applied for v4.22. Thanks!


[PATCH v5 5/8] gpio: gpio-mm: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-gpio-mm.c | 71 +++--
 1 file changed, 20 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c
index 8c150fd68d9d..5647abe72376 100644
--- a/drivers/gpio/gpio-gpio-mm.c
+++ b/drivers/gpio/gpio-gpio-mm.c
@@ -172,46 +172,25 @@ static int gpiomm_gpio_get(struct gpio_chip *chip, unsigned int offset)
return !!(port_state & mask);
 }
 
+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
 static int gpiomm_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
-   size_t i;
-   static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned int offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(gpiommgpio->base + ports[i]);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = gpiommgpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
@@ -242,37 +221,27 @@ static void gpiomm_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int out_port;
+   unsigned int offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = gpiommgpio->base + ports[index];
 
-   port = i / gpio_reg_size;
-   out_port = (port > 2) ? port + 1 : port;
-   bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
spin_lock_irqsave(>lock, flags);
 
/* update output state data and set device gpio register */
-   gpiommgpio->out_state[port] &= ~mask[BIT_WORD(i)];
-   gpiommgpio->out_state[port] |= bitmask;
-   outb(gpiommgpio->out_state[port], gpiommgpio->base + out_port);
+   gpiommgpio->out_state[index] &= ~gpio_mask;
+   gpiommgpio->out_state[index] |= bitmask;
+   outb(gpiommgpio->out_state[index], port_addr);
 
spin_unlock_irqrestore(>lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.20.1
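
The patch above replaces the open-coded port arithmetic with a lookup
in the ports[] table, indexed by the clump's bit offset divided by 8.
The following user-space sketch compares the two mappings. The helper
names out_port_old and out_port_new are made up for this illustration;
only the table contents and the old expression are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Register offsets of the six 8-bit ports; offset 3 is skipped. */
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };

/* Old driver arithmetic: ports above index 2 sit one register higher. */
static size_t out_port_old(size_t port)
{
	return (port > 2) ? port + 1 : port;
}

/* New scheme: table lookup indexed by the clump's bit offset / 8. */
static size_t out_port_new(size_t bit_offset)
{
	const size_t index = bit_offset / 8;

	assert(index < sizeof(ports) / sizeof(ports[0]));
	return ports[index];
}
```

For every port index i from 0 to 5, out_port_old(i) and
out_port_new(i * 8) give the same register offset, so the lookup keeps
the previous addressing while sharing one table between the get and set
paths.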



[PATCH v5 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-104-dio-48e.c | 71 ++---
 1 file changed, 20 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c
index 92c8f944bf64..b68c39f8aa23 100644
--- a/drivers/gpio/gpio-104-dio-48e.c
+++ b/drivers/gpio/gpio-104-dio-48e.c
@@ -183,46 +183,25 @@ static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset)
return !!(port_state & mask);
 }
 
+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
 static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   size_t i;
-   static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned int offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(dio48egpio->base + ports[i]);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = dio48egpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
@@ -252,37 +231,27 @@ static void dio48e_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int out_port;
+   unsigned int offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = dio48egpio->base + ports[index];
 
-   port = i / gpio_reg_size;
-   out_port = (port > 2) ? port + 1 : port;
-   bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
raw_spin_lock_irqsave(>lock, flags);
 
/* update output state data and set device gpio register */
-   dio48egpio->out_state[port] &= ~mask[BIT_WORD(i)];
-   dio48egpio->out_state[port] |= bitmask;
-   outb(dio48egpio->out_state[port], dio48egpio->base + out_port);
+   dio48egpio->out_state[index] &= ~gpio_mask;
+   dio48egpio->out_state[index] |= bitmask;
+   outb(dio48egpio->out_state[index], port_addr);
 
raw_spin_unlock_irqrestore(>lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.20.1



[PATCH v5 6/8] gpio: ws16c48: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-ws16c48.c | 71 ++---
 1 file changed, 19 insertions(+), 52 deletions(-)

diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c
index 5cf3697bfb15..b4c544d5da18 100644
--- a/drivers/gpio/gpio-ws16c48.c
+++ b/drivers/gpio/gpio-ws16c48.c
@@ -134,42 +134,19 @@ static int ws16c48_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
-   const unsigned int gpio_reg_size = 8;
-   size_t i;
-   const size_t num_ports = chip->ngpio / gpio_reg_size;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned int offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < num_ports; i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+   port_addr = ws16c48gpio->base + offset / 8;
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(ws16c48gpio->base + i);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
@@ -203,39 +180,29 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int iomask;
+   unsigned int offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
-
-   port = i / gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+   index = offset / 8;
+   port_addr = ws16c48gpio->base + index;
 
/* mask out GPIO configured for input */
-   iomask = mask[BIT_WORD(i)] & ~ws16c48gpio->io_state[port];
-   bitmask = iomask & bits[BIT_WORD(i)];
+   gpio_mask &= ~ws16c48gpio->io_state[index];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
raw_spin_lock_irqsave(>lock, flags);
 
/* update output state data and set device gpio register */
-   ws16c48gpio->out_state[port] &= ~iomask;
-   ws16c48gpio->out_state[port] |= bitmask;
-   outb(ws16c48gpio->out_state[port], ws16c48gpio->base + port);
+   ws16c48gpio->out_state[index] &= ~gpio_mask;
+   ws16c48gpio->out_state[index] |= bitmask;
+   outb(ws16c48gpio->out_state[index], port_addr);
 
raw_spin_unlock_irqrestore(>lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.20.1



[PATCH V3 1/2] mm: Replace all open encodings for NUMA_NO_NODE

2018-12-18 Thread Anshuman Khandual
At present there are multiple places where invalid node number is encoded
as -1. Even though implicitly understood it is always better to have macros
in there. Replace these open encodings for an invalid node number with the
global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like
'invalid node' from various places redirecting them to a common definition.
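
As a small illustration of the replacement pattern, the sketch below
shows a check against the named macro instead of a bare -1. The helper
pick_node is hypothetical and exists only for this example; the macro
value matches the definition added in tools/include/linux/numa.h in
patch 2/2.

```c
#include <assert.h>

#define NUMA_NO_NODE	(-1)

/*
 * Return the preferred node, or fall back to the local node when the
 * caller did not specify one. Hypothetical helper for illustration.
 */
static int pick_node(int preferred, int local)
{
	assert(local != NUMA_NO_NODE);

	/* previously open-coded as: if (preferred == -1) */
	if (preferred == NUMA_NO_NODE)
		return local;
	return preferred;
}
```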

Reviewed-by: David Hildenbrand 
Acked-by: Jeff Kirsher [ixgbe]
Acked-by: Jens Axboe   [mtip32xx]
Acked-by: Vinod Koul  [dmaengine.c]
Acked-by: Michael Ellerman [powerpc]
Acked-by: Doug Ledford [drivers/infiniband]
Signed-off-by: Anshuman Khandual 
---
 arch/alpha/include/asm/topology.h             |  3 ++-
 arch/ia64/kernel/numa.c                       |  2 +-
 arch/ia64/mm/discontig.c                      |  6 +++---
 arch/powerpc/include/asm/pci-bridge.h         |  3 ++-
 arch/powerpc/kernel/paca.c                    |  3 ++-
 arch/powerpc/kernel/pci-common.c              |  3 ++-
 arch/powerpc/mm/numa.c                        | 14 +++---
 arch/powerpc/platforms/powernv/memtrace.c     |  5 +++--
 arch/sparc/kernel/pci_fire.c                  |  3 ++-
 arch/sparc/kernel/pci_schizo.c                |  3 ++-
 arch/sparc/kernel/psycho_common.c             |  3 ++-
 arch/sparc/kernel/sbus.c                      |  3 ++-
 arch/sparc/mm/init_64.c                       |  6 +++---
 arch/x86/include/asm/pci.h                    |  3 ++-
 arch/x86/kernel/apic/x2apic_uv_x.c            |  7 ---
 arch/x86/kernel/smpboot.c                     |  3 ++-
 drivers/block/mtip32xx/mtip32xx.c             |  5 +++--
 drivers/dma/dmaengine.c                       |  4 +++-
 drivers/infiniband/hw/hfi1/affinity.c         |  3 ++-
 drivers/infiniband/hw/hfi1/init.c             |  3 ++-
 drivers/iommu/dmar.c                          |  5 +++--
 drivers/iommu/intel-iommu.c                   |  3 ++-
 drivers/misc/sgi-xp/xpc_uv.c                  |  3 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  5 +++--
 include/linux/device.h                        |  2 +-
 init/init_task.c                              |  3 ++-
 kernel/kthread.c                              |  3 ++-
 kernel/sched/fair.c                           | 15 ---
 lib/cpumask.c                                 |  3 ++-
 mm/huge_memory.c                              | 13 +++--
 mm/hugetlb.c                                  |  3 ++-
 mm/ksm.c                                      |  2 +-
 mm/memory.c                                   |  7 ---
 mm/memory_hotplug.c                           | 12 ++--
 mm/mempolicy.c                                |  2 +-
 mm/page_alloc.c                               |  4 ++--
 mm/page_ext.c                                 |  2 +-
 net/core/pktgen.c                             |  3 ++-
 net/qrtr/qrtr.c                               |  3 ++-
 39 files changed, 104 insertions(+), 74 deletions(-)

diff --git a/arch/alpha/include/asm/topology.h b/arch/alpha/include/asm/topology.h
index e6e13a8..5a77a40 100644
--- a/arch/alpha/include/asm/topology.h
+++ b/arch/alpha/include/asm/topology.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #ifdef CONFIG_NUMA
@@ -29,7 +30,7 @@ static const struct cpumask *cpumask_of_node(int node)
 {
int cpu;
 
-   if (node == -1)
+   if (node == NUMA_NO_NODE)
return cpu_all_mask;
 
cpumask_clear(_to_cpumask_map[node]);
diff --git a/arch/ia64/kernel/numa.c b/arch/ia64/kernel/numa.c
index 92c3762..1315da6 100644
--- a/arch/ia64/kernel/numa.c
+++ b/arch/ia64/kernel/numa.c
@@ -74,7 +74,7 @@ void __init build_cpu_to_node_map(void)
cpumask_clear(_to_cpu_mask[node]);
 
for_each_possible_early_cpu(cpu) {
-   node = -1;
+   node = NUMA_NO_NODE;
for (i = 0; i < NR_CPUS; ++i)
if (cpu_physical_id(cpu) == node_cpuid[i].phys_id) {
node = node_cpuid[i].nid;
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 8a96578..f9c3675 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -227,7 +227,7 @@ void __init setup_per_cpu_areas(void)
 * CPUs are put into groups according to node.  Walk cpu_map
 * and create new groups at node boundaries.
 */
-   prev_node = -1;
+   prev_node = NUMA_NO_NODE;
ai->nr_groups = 0;
for (unit = 0; unit < nr_units; unit++) {
cpu = cpu_map[unit];
@@ -435,7 +435,7 @@ static void __init *memory_less_node_alloc(int nid, unsigned long pernodesize)
 {
void *ptr = NULL;
u8 best = 0xff;
-   int bestnode = -1, node, anynode = 0;
+   int bestnode = NUMA_NO_NODE, node, anynode = 0;
 
for_each_online_node(node) {
if (node_isset(node, memory_less_mask))
@@ -447,7 +447,7 @@ static void __init *memory_less_node_alloc(int 

[PATCH v5 4/8] gpio: 104-idi-48: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-104-idi-48.c | 36 +++---
 1 file changed, 7 insertions(+), 29 deletions(-)

diff --git a/drivers/gpio/gpio-104-idi-48.c b/drivers/gpio/gpio-104-idi-48.c
index 88dc6f2449f6..fdf1b8b64cc4 100644
--- a/drivers/gpio/gpio-104-idi-48.c
+++ b/drivers/gpio/gpio-104-idi-48.c
@@ -93,42 +93,20 @@ static int idi_48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct idi_48_gpio *const idi48gpio = gpiochip_get_data(chip);
-   size_t i;
+   unsigned int offset;
+   unsigned long gpio_mask;
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = idi48gpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(idi48gpio->base + ports[i]);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
-- 
2.20.1



[PATCH V3 2/2] Tools: Replace open encodings for NUMA_NO_NODE

2018-12-18 Thread Anshuman Khandual
From: Stephen Rothwell 

This replaces all open encodings in tools with NUMA_NO_NODE.
Also linux/numa.h is now needed for the perf build.

Signed-off-by: Anshuman Khandual 
Signed-off-by: Stephen Rothwell 
---
 tools/include/linux/numa.h | 16 
 tools/perf/bench/numa.c|  6 +++---
 2 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 tools/include/linux/numa.h

diff --git a/tools/include/linux/numa.h b/tools/include/linux/numa.h
new file mode 100644
index 000..110b0e5
--- /dev/null
+++ b/tools/include/linux/numa.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_NUMA_H
+#define _LINUX_NUMA_H
+
+
+#ifdef CONFIG_NODES_SHIFT
+#define NODES_SHIFT CONFIG_NODES_SHIFT
+#else
+#define NODES_SHIFT 0
+#endif
+
+#define MAX_NUMNODES    (1 << NODES_SHIFT)
+
+#define NUMA_NO_NODE    (-1)
+
+#endif /* _LINUX_NUMA_H */
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 4419551..e0ad5f1 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -298,7 +298,7 @@ static cpu_set_t bind_to_node(int target_node)
 
CPU_ZERO();
 
-   if (target_node == -1) {
+   if (target_node == NUMA_NO_NODE) {
for (cpu = 0; cpu < g->p.nr_cpus; cpu++)
CPU_SET(cpu, );
} else {
@@ -339,7 +339,7 @@ static void bind_to_memnode(int node)
unsigned long nodemask;
int ret;
 
-   if (node == -1)
+   if (node == NUMA_NO_NODE)
return;
 
BUG_ON(g->p.nr_nodes > (int)sizeof(nodemask)*8);
@@ -1363,7 +1363,7 @@ static void init_thread_data(void)
int cpu;
 
/* Allow all nodes by default: */
-   td->bind_node = -1;
+   td->bind_node = NUMA_NO_NODE;
 
/* Allow all CPUs by default: */
CPU_ZERO(>bind_cpumask);
-- 
2.7.4



[PATCH v5 8/8] gpio: pcie-idio-24: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-pcie-idio-24.c | 109 ---
 1 file changed, 40 insertions(+), 69 deletions(-)

diff --git a/drivers/gpio/gpio-pcie-idio-24.c b/drivers/gpio/gpio-pcie-idio-24.c
index 52f1647a46fd..b1686b052633 100644
--- a/drivers/gpio/gpio-pcie-idio-24.c
+++ b/drivers/gpio/gpio-pcie-idio-24.c
@@ -198,52 +198,34 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
-   size_t i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
-   unsigned long port_state;
+   unsigned int offset;
+   unsigned long gpio_mask;
void __iomem *ports[] = {
>reg->out0_7, >reg->out8_15,
>reg->out16_23, >reg->in0_7,
>reg->in8_15, >reg->in16_23,
};
+   size_t index;
+   unsigned long port_state;
const unsigned long out_mode_mask = BIT(1);
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports) + 1; i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
 
/* read bits from current gpio port (port 6 is TTL GPIO) */
-   if (i < 6)
-   port_state = ioread8(ports[i]);
+   if (index < 6)
+   port_state = ioread8(ports[index]);
else if (ioread8(>reg->ctl) & out_mode_mask)
port_state = ioread8(>reg->ttl_out0_7);
else
port_state = ioread8(>reg->ttl_in0_7);
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   port_state &= gpio_mask;
+
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
@@ -294,59 +276,48 @@ static void idio_24_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
-   size_t i;
-   unsigned long bits_offset;
+   unsigned int offset;
unsigned long gpio_mask;
-   const unsigned int gpio_reg_size = 8;
-   const unsigned long port_mask = GENMASK(gpio_reg_size, 0);
-   unsigned long flags;
-   unsigned int out_state;
void __iomem *ports[] = {
>reg->out0_7, >reg->out8_15,
>reg->out16_23
};
+   size_t index;
+   unsigned int bitmask;
+   unsigned long flags;
+   unsigned int out_state;
const unsigned long out_mode_mask = BIT(1);
-   const unsigned int ttl_offset = 48;
-   const size_t ttl_i = BIT_WORD(ttl_offset);
-   const unsigned int word_offset = ttl_offset % BITS_PER_LONG;
-   const unsigned long ttl_mask = (mask[ttl_i] >> word_offset) & port_mask;
-   const unsigned long ttl_bits = (bits[ttl_i] >> word_offset) & ttl_mask;
-
-   /* set bits are processed a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* check if any set bits for current port */
-   gpio_mask = (*mask >> bits_offset) & port_mask;
-   if (!gpio_mask) {
-   /* no set bits for this port so move on to next port */
-   continue;
-   }
 
-   raw_spin_lock_irqsave(>lock, flags);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
 
-   /* process output lines */
-   out_state = ioread8(ports[i]) & ~gpio_mask;
-   out_state 

[PATCH v5 7/8] gpio: pci-idio-16: Utilize for_each_set_clump8 macro

2018-12-18 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-pci-idio-16.c | 73 -
 1 file changed, 26 insertions(+), 47 deletions(-)

diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c
index 6b7349783223..4eb89f8b5f9b 100644
--- a/drivers/gpio/gpio-pci-idio-16.c
+++ b/drivers/gpio/gpio-pci-idio-16.c
@@ -108,45 +108,23 @@ static int idio_16_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
-   size_t i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
-   unsigned long port_state;
+   unsigned int offset;
+   unsigned long gpio_mask;
void __iomem *ports[] = {
>reg->out0_7, >reg->out8_15,
>reg->in0_7, >reg->in8_15,
};
+   void __iomem *port_addr;
+   unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = ports[offset / 8];
+   port_state = ioread8(port_addr) & gpio_mask;
 
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = ioread8(ports[i]);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, _state, offset);
}
 
return 0;
@@ -186,30 +164,31 @@ static void idio_16_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
+   unsigned int offset;
+   unsigned long gpio_mask;
+   void __iomem *ports[] = {
+   >reg->out0_7, >reg->out8_15,
+   };
+   size_t index;
+   void __iomem *port_addr;
+   unsigned int bitmask;
unsigned long flags;
unsigned int out_state;
 
-   raw_spin_lock_irqsave(>lock, flags);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = ports[index];
 
-   /* process output lines 0-7 */
-   if (*mask & 0xFF) {
-   out_state = ioread8(>reg->out0_7) & ~*mask;
-   out_state |= *mask & *bits;
-   iowrite8(out_state, >reg->out0_7);
-   }
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
+
+   raw_spin_lock_irqsave(>lock, flags);
 
-   /* shift to next output line word */
-   *mask >>= 8;
+   out_state = ioread8(port_addr) & ~gpio_mask;
+   out_state |= bitmask;
+   iowrite8(out_state, port_addr);
 
-   /* process output lines 8-15 */
-   if (*mask & 0xFF) {
-   *bits >>= 8;
-   out_state = ioread8(>reg->out8_15) & ~*mask;
-   out_state |= *mask & *bits;
-   iowrite8(out_state, >reg->out8_15);
+   raw_spin_unlock_irqrestore(>lock, flags);
}
-
-   raw_spin_unlock_irqrestore(>lock, flags);
 }
 
 static void idio_16_irq_ack(struct irq_data *data)
-- 
2.20.1



Re: [PATCH 4/6] psi: introduce state_mask to represent stalled psi states

2018-12-18 Thread Peter Zijlstra
On Mon, Dec 17, 2018 at 05:14:53PM -0800, Suren Baghdasaryan wrote:
> On Mon, Dec 17, 2018 at 7:55 AM Peter Zijlstra  wrote:
> > > + if (state_mask & (1 << s))
> >
> > We have the BIT() macro, but I'm honestly not sure that will improve
> > things.
> 
> I was mimicking the rest of the code in psi.c that uses this kind of
> bit masking. Can change if you think that would be better.

Yeah, I really don't know.. keep it as is I suppose.
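
For reference, the two spellings discussed above test the same bit. The
sketch below is a user-space illustration only: the function names are
made up, and BIT() is redefined locally the way the kernel defines it
in include/linux/bitops.h at the time of this thread.

```c
#include <assert.h>
#include <limits.h>

#define BIT(nr)	(1UL << (nr))

/* Open-coded test, as used in the patch under review. */
static int state_set_open_coded(unsigned int state_mask, unsigned int s)
{
	assert(s < sizeof(int) * CHAR_BIT - 1);
	return (state_mask & (1 << s)) != 0;
}

/* The same test written with the BIT() macro. */
static int state_set_with_bit(unsigned int state_mask, unsigned int s)
{
	assert(s < sizeof(unsigned long) * CHAR_BIT);
	return (state_mask & BIT(s)) != 0;
}
```

One practical difference is the operand type: BIT() shifts an unsigned
long, while the open-coded form shifts a signed int. That does not
matter for the small state indices used here.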


Re: [PATCH] mm: do not report isolation failures for CMA pages

2018-12-18 Thread Oscar Salvador
On Tue, Dec 18, 2018 at 10:28:02AM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> Heiko has complained that his log is swamped by warnings from has_unmovable_pages
> [   20.536664] page dumped because: has_unmovable_pages
> [   20.536792] page:03d081ff4080 count:1 mapcount:0 mapping:8ff88600 index:0x0 compound_mapcount: 0
> [   20.536794] flags: 0x3fffe010200(slab|head)
> [   20.536795] raw: 03fffe010200 0100 0200 8ff88600
> [   20.536796] raw:  00200041 0001
> [   20.536797] page dumped because: has_unmovable_pages
> [   20.536814] page:03d0823b count:1 mapcount:0 mapping: index:0x0
> [   20.536815] flags: 0x7fffe00()
> [   20.536817] raw: 07fffe00 0100 0200
> [   20.536818] raw:   0001
> 
> which are not triggered by the memory hotplug but rather CMA allocator.
> The original idea behind dumping the page state for all call paths was
> that these messages will be helpful debugging failures. From the above
> it seems that this is not the case for the CMA path because we are
> lacking much more context. E.g the second reported page might be a CMA
> allocated page. It is still interesting to see a slab page in the CMA
> area but it is hard to tell whether this is bug from the above output
> alone.
> 
> Address this issue by dumping the page state only on request. Both
> start_isolate_page_range and has_unmovable_pages already have an
> argument to ignore hwpoison pages so make this argument more generic and
> turn it into flags and allow callers to combine non-default modes into a
> mask. While we are at it, has_unmovable_pages call from 
> is_pageblock_removable_nolock
> (sysfs removable file) is questionable to report the failure so drop it
> from there as well.
> 
> Reported-by: Heiko Carstens 
> Signed-off-by: Michal Hocko 

Looks good to me, and it makes sense to not spam other users.

Just one thing:

AFAICS alloc_contig_range() can also be called from hugetlb code.
Do we want to specify that in the changelog too?
And possibly change the patch title to:

"Only report isolation failures from memhotplug code" ?

Although is_pageblock_removable_nolock will not report the failures
now, so I am not sure.

Reviewed-by: Oscar Salvador 

-- 
Oscar Salvador
SUSE L3
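
A rough sketch of the "bool argument becomes a flags mask" idea from the changelog, in plain userspace C. The flag and function names here are assumptions chosen for illustration; the quoted changelog does not spell out the real identifiers.

```c
#include <stdbool.h>

/* Hypothetical flag names; not taken from the quoted patch. */
#define ISOL_SKIP_HWPOISON	(1u << 0)	/* what the old bool argument expressed */
#define ISOL_REPORT_FAILURE	(1u << 1)	/* dump page state when isolation fails */

/* Only callers that asked for it get the page dump. */
bool should_dump_page(unsigned int flags)
{
	return (flags & ISOL_REPORT_FAILURE) != 0;
}

/* The former bool is now just one bit of the mask. */
bool skips_hwpoison(unsigned int flags)
{
	return (flags & ISOL_SKIP_HWPOISON) != 0;
}
```

Under this sketch a memory hotplug caller would pass both bits, while a CMA or sysfs caller would leave the reporting bit clear and stay quiet.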


Re: [RFC PATCH 2/2] PCI/portdrv Hisilicon PCIe transport layer Port PMU driver.

2018-12-18 Thread Jonathan Cameron
On Mon, 17 Dec 2018 12:19:15 -0600
Bjorn Helgaas  wrote:

> [+cc Logan for incidental Switchtec question below]
> 
> On Mon, Dec 17, 2018 at 11:09:06AM +, Jonathan Cameron wrote:
> > On Fri, 14 Dec 2018 17:55:05 -0600
> > Bjorn Helgaas  wrote:  
> > > On Fri, Dec 14, 2018 at 09:10:55PM +0800, Jonathan Cameron wrote:  
> > > > The Hip08 SoCs contain relatively detailed performance units for the
> > > > PCIe Transport Layer at each port.
> > > > 
> > > > The support here is a subset of what will come, but is intended to
> > > > provide some initial basic functionality.
> > > > 
> > > > Note that there is a _lot_ more functionality in this hardware unit
> > > > so this is the first RFC of several.
> > > > 
> > > > RFC questions:
> > > > 
> > > > 1. There is no standard for PCIe PMUs.  However, there are things that
> > > >are elements of the PCIe protocol so any similar PMU is likely to
> > > >support them.  Do we want to have a go at some consistent naming?
> > > 
> > > Is this a perf question, i.e., are you asking about the event names
> > > from "perf list"?  If so, I have no idea :)  But you're right that the
> > > events on the PCIe side are mostly defined by the PCIe spec and I
> > > agree it would make a lot of sense to use common names for those
> > > things.  
> > 
> > Exactly that.  It's nice if users know what they are getting rather than
> > having lots of subtly different variants of posted write tlp counters etc.
> > Ideally this would be defined by the PCI SIG in a similar fashion to
> > architectural counters on Arm, but they aren't yet...
> > 
> > I'm far from an expert on PCIe but I can pull in some of our PCI hardware
> > people and see if I can sketch out some docs for what a hypothetical
> > magic PMU could count, then working out a naming scheme to cover that.
> > +CC Alex and Eric. 
> > 
> > We have things like bandwidth estimation and credits statistics to
> > potentially consider as well.  Right now I can only find limited precedent
> > for handling that sort of a thing in PMUs in general (there are similar
> > things in Intel's graphics drivers).
> >   
> > > > 2. We are using an ACPI DSDT description to find what is basically a
> > > >platform device that is associated with a PCIe device. Is this an
> > > >acceptable thing to do?
> > > 
> > > If the PCIe device itself, e.g., a Root Port, consumes address space,
> > > it should have a BAR that describes it.  
> > 
> > I agree that would be the ideal / right way. Here we have an oddity
> > because the port is in the host SoC. It's not remotely compliant
> > with any standard.  We are looking at this for future hardware but I
> > don't think there is much we can do with this one.
> > 
> > Looking forwards, there isn't a huge amount of flexibility in port
> > definitions if we do want to put this in Bar space as only 2 Bars
> > available (type 1 config space header)  I have no idea if these are
> > in heavy use or not on existing hardware.  Looks like the Switchtec
> > parts use one of them.  I know from other activities that people get
> > very defensive of their BAR Space, particularly when looking to
> > retrofit in new versions of a long standing device.  
> 
> I *think* drivers/pci/switch/switchtec.c is for a type 0 endpoint that
> happens to be built into the switch, e.g., the type 1 bridge is
> function 0 and the type 0 management endpoint is function 1.  Logan,
> can you confirm/deny?

To bring the threads back together, Logan confirmed this.

> 
> The only use of BARs for type 1 devices that I'm aware of is portdrv
> (where a BAR may be used for MSI-X tables).  I'm sure there's other
> hardware out there that implements BARs, but I'm not aware of Linux
> drivers that use them.
> 
> This could be partly because Linux makes it very difficult for drivers
> to bind to these devices because portdrv is typically in the way.
> This is part of why I think we should rework portdrv so it's part of
> the core instead of binding as a driver.

That's certainly making sense to me as well.  Are we going to have issues
with backwards compatibility though?  It'll be fiddly to expose the existing
portdrv interface if this is reworked.  Probably possible though.

> 
> If the 2 BARs in type 1 devices is limiting, maybe multifunction
> devices as in the Switchtec parts is a possibility?


For some of the devices I'm looking at eating up functions is also
controversial :) They have an embedded switch and a 'lot' of functions.

People don't like spilling into additional bus numbers as that space is
also limited, so if you eat one of the 256 ARI functions they get annoyed.

One somewhat nasty trick might be to define a config space header
that tells you 'where' to find the PMU for a given port.  Pretty
much what MSI-X does, but with the addition of a function number.

Downstream ports are trickier unless we assume their PMUs are in their
own BAR space or that any reference to another function is one on the
upstream port.
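
To make the MSI-X comparison concrete, here is a small sketch of what decoding such a locator could look like. The first register follows the MSI-X Table Offset/BIR layout (BIR in bits 2:0, 8-byte aligned offset in bits 31:3). The second register carrying a function number is purely hypothetical, as are all names below; nothing like it exists in any spec.

```c
#include <stdint.h>

#define LOC_BIR_MASK	0x00000007u	/* same split as MSI-X Table Offset/BIR */
#define LOC_OFFSET_MASK	0xfffffff8u
#define LOC_FUNC_MASK	0x000000ffu	/* hypothetical: function holding the PMU */

struct pmu_locator {
	uint8_t function;	/* which function's BAR to look in */
	uint8_t bir;		/* BAR indicator */
	uint32_t offset;	/* byte offset of the PMU block inside that BAR */
};

/* Decode the two (hypothetical) capability dwords into a locator. */
struct pmu_locator pmu_locator_decode(uint32_t offset_bir, uint32_t func_reg)
{
	struct pmu_locator loc;

	loc.function = (uint8_t)(func_reg & LOC_FUNC_MASK);
	loc.bir = (uint8_t)(offset_bir & LOC_BIR_MASK);
	loc.offset = offset_bir & LOC_OFFSET_MASK;
	return loc;
}
```

For a downstream port the function field would, under the assumption above, always name a function on the upstream port.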


Re: [PATCH] clk: imx: add CLK_GET_RATE_NOCACHE flag for i.MX8M composite clock

2018-12-18 Thread Fabio Estevam
Hi Anson,

On Tue, Dec 18, 2018 at 12:56 AM Anson Huang  wrote:
>
> On i.MX8M, some of the bus clocks' rate could be changed in TF-A,

Do you mean ATF (ARM Trusted Firmware) instead?


Re: [PATCH V4 1/3] misc/pvpanic: return 0 for empty body register function

2018-12-18 Thread Greg KH
On Tue, Dec 18, 2018 at 05:46:41PM +0800, Peng Hao wrote:
> Return 0 for empty body register function normally.
> 
> Signed-off-by: Peng Hao 
> ---
> QEMU community requires additional PCI devices to simulate PVPANIC devices
> so that some architectures can not occupy precious less than 4G of memory 
> space.

What is this for below the --- line?

And again, you did not specify what changed from the previous versions.
I can not take these unless you provide it.

thanks,

greg k-h


Re: [PATCH v2] f2fs: fix sbi->extent_list corruption issue

2018-12-18 Thread Chao Yu
On 2018/12/14 22:25, Jaegeuk Kim wrote:
> On 12/14, Sahitya Tummala wrote:
>> On Wed, Dec 12, 2018 at 11:36:08AM +0800, Chao Yu wrote:
>>> On 2018/12/12 11:17, Sahitya Tummala wrote:
 On Fri, Dec 07, 2018 at 05:47:31PM +0800, Chao Yu wrote:
> On 2018/12/1 4:33, Jaegeuk Kim wrote:
>> On 11/29, Sahitya Tummala wrote:
>>>
>>> On Tue, Nov 27, 2018 at 09:42:39AM +0800, Chao Yu wrote:
 On 2018/11/27 8:30, Jaegeuk Kim wrote:
> On 11/26, Sahitya Tummala wrote:
>> When there is a failure in f2fs_fill_super() after/during
>> the recovery of fsync'd nodes, it frees the current sbi and
>> retries again. This time the mount is successful, but the files
>> that got recovered before retry, still holds the extent tree,
>> whose extent nodes list is corrupted since sbi and sbi->extent_list
>> is freed up. The list_del corruption issue is observed when the
>> file system is getting unmounted and when those recovered files' extent
>> node is being freed up in the below context.
>>
>> list_del corruption. prev->next should be fff1e1ef5480, but was 
>> (null)
>> <...>
>> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
>> task: fff1f46f2280 task.stack: ff8008068000
>> lr : __list_del_entry_valid+0x94/0xb4
>> pc : __list_del_entry_valid+0x94/0xb4
>> <...>
>> Call trace:
>> __list_del_entry_valid+0x94/0xb4
>> __release_extent_node+0xb0/0x114
>> __free_extent_tree+0x58/0x7c
>> f2fs_shrink_extent_tree+0xdc/0x3b0
>> f2fs_leave_shrinker+0x28/0x7c
>> f2fs_put_super+0xfc/0x1e0
>> generic_shutdown_super+0x70/0xf4
>> kill_block_super+0x2c/0x5c
>> kill_f2fs_super+0x44/0x50
>> deactivate_locked_super+0x60/0x8c
>> deactivate_super+0x68/0x74
>> cleanup_mnt+0x40/0x78
>> __cleanup_mnt+0x1c/0x28
>> task_work_run+0x48/0xd0
>> do_notify_resume+0x678/0xe98
>> work_pending+0x8/0x14
>>
>> Fix this by cleaning up inodes, extent tree and nodes of those
>> recovered files before freeing up sbi and before next retry.
>>
>> Signed-off-by: Sahitya Tummala 
>> ---
>> v2:
>> -call evict_inodes() and f2fs_shrink_extent_tree() to cleanup inodes
>>
>>  fs/f2fs/f2fs.h |  1 +
>>  fs/f2fs/shrinker.c |  2 +-
>>  fs/f2fs/super.c| 13 -
>>  3 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 1e03197..aaee63b 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -3407,6 +3407,7 @@ struct rb_entry 
>> *f2fs_lookup_rb_tree_ret(struct rb_root_cached *root,
>>  bool f2fs_check_rb_tree_consistence(struct f2fs_sb_info *sbi,
>>  struct rb_root_cached 
>> *root);
>>  unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int 
>> nr_shrink);
>> +unsigned long __count_extent_cache(struct f2fs_sb_info *sbi);
>>  bool f2fs_init_extent_tree(struct inode *inode, struct f2fs_extent 
>> *i_ext);
>>  void f2fs_drop_extent_tree(struct inode *inode);
>>  unsigned int f2fs_destroy_extent_node(struct inode *inode);
>> diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
>> index 9e13db9..7e3c13b 100644
>> --- a/fs/f2fs/shrinker.c
>> +++ b/fs/f2fs/shrinker.c
>> @@ -30,7 +30,7 @@ static unsigned long __count_free_nids(struct 
>> f2fs_sb_info *sbi)
>>  return count > 0 ? count : 0;
>>  }
>>  
>> -static unsigned long __count_extent_cache(struct f2fs_sb_info *sbi)
>> +unsigned long __count_extent_cache(struct f2fs_sb_info *sbi)
>>  {
>>  return atomic_read(&sbi->total_zombie_tree) +
>>  atomic_read(&sbi->total_ext_node);
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index af58b2c..769e7b1 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -3016,6 +3016,16 @@ static void f2fs_tuning_parameters(struct 
>> f2fs_sb_info *sbi)
>>  sbi->readdir_ra = 1;
>>  }
>>  
>> +static void f2fs_cleanup_inodes(struct f2fs_sb_info *sbi)
>> +{
>> +struct super_block *sb = sbi->sb;
>> +
>> +sync_filesystem(sb);
>
> This writes another checkpoint, which would not be what this retrial 
> intended.

 Actually, checkpoint will not be triggered due to SBI_POR_DOING flag 
 check
 as below:

 int f2fs_sync_fs(struct super_block *sb, int sync)
 {
 ...
if 

Re: [PATCH v3 6/6] irqchip: sifive-plic: Implement irq_set_affinity() for SMP host

2018-12-18 Thread Anup Patel
On Tue, Dec 18, 2018 at 12:02 AM Christoph Hellwig  wrote:
>
> On Fri, Nov 30, 2018 at 01:32:07PM +0530, Anup Patel wrote:
> > This patch provides irq_set_affinity() implementation for PLIC driver.
> > It also updates irq_enable() such that PLIC interrupts are only enabled
> > for one of CPUs specified in IRQ affinity mask.
>
> But normally our affinity masks are that - masks of CPUs that can take
> it.  It seems a bit odd to then just pick the first one, as this means
> with default all-CPU masks we'll have all interrupts handled by the
> first CPU only.

Yes, the affinity mask lists the CPUs which can take the IRQ, but there is
also the effective affinity mask, which represents the CPUs which will
actually receive the IRQ.

Some interrupt controllers (unlike the PLIC) support hardware IRQ balancing.
For such interrupt controllers, we inform the hardware of all CPUs that can
take the IRQ, but the interrupt controller will deliver the IRQ to only one
of those CPUs.

There are quite a few interrupt controllers which only allow an IRQ to
be taken by exactly one CPU. For such interrupt controllers, the
interrupt controller driver has to pick one CPU out of the CPUs which
can take the IRQ (for example GICv2, GICv3, etc.).
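
As an illustration of that last case, a userspace sketch of picking the single target CPU from the requested affinity and the online set. Plain 64-bit masks stand in for struct cpumask, and the function name and NO_CPU value are made up. The kernel helper cpumask_any_and() may return any CPU in the intersection; this sketch simply takes the lowest one.

```c
#include <stdint.h>

#define NO_CPU (-1)	/* hypothetical "no usable CPU" marker */

/* Return the lowest-numbered CPU set in both masks, or NO_CPU if none. */
int pick_target_cpu(uint64_t affinity, uint64_t online)
{
	uint64_t candidates = affinity & online;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NO_CPU;
}
```

The NO_CPU case corresponds to cpu >= nr_cpu_ids in the patch, where the caller should warn once and bail out rather than program the hardware with an invalid index.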

>
>
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -106,14 +106,42 @@ static void plic_irq_toggle(const struct cpumask 
> > *mask, int hwirq, int enable)
> >
> >  static void plic_irq_enable(struct irq_data *d)
> >  {
> > - plic_irq_toggle(irq_data_get_affinity_mask(d), d->hwirq, 1);
> > + unsigned int cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > +cpu_online_mask);
> > + WARN_ON(cpu >= nr_cpu_ids);
>
> I think this should be WARN_ON_ONCE and actually return instead of then
> proceeding using the invalid cpu index.

Sure, will update.

>
> > +#ifdef CONFIG_SMP
> static int plic_set_affinity(struct irq_data *d, const struct cpumask 
> *mask_val,
> > + bool force)
> > +{
> > + unsigned int cpu;
> > +
> > + if (!force)
> > + cpu = cpumask_any_and(mask_val, cpu_online_mask);
> > + else
> > + cpu = cpumask_first(mask_val);
>
> maybe swap the two branches around to avoid the inversion of the force
> flag?

Sure, will update.

Regards,
Anup


Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-18 Thread Jan Kara
On Mon 17-12-18 08:58:19, Dave Chinner wrote:
> On Fri, Dec 14, 2018 at 04:43:21PM +0100, Jan Kara wrote:
> > Hi!
> > 
> > On Thu 13-12-18 08:46:41, Dave Chinner wrote:
> > > On Wed, Dec 12, 2018 at 10:03:20AM -0500, Jerome Glisse wrote:
> > > > On Mon, Dec 10, 2018 at 11:28:46AM +0100, Jan Kara wrote:
> > > > > On Fri 07-12-18 21:24:46, Jerome Glisse wrote:
> > > > > So this approach doesn't look like a win to me over using counter in 
> > > > > struct
> > > > > page and I'd rather try looking into squeezing HMM public page usage 
> > > > > of
> > > > > struct page so that we can fit that gup counter there as well. I know 
> > > > > that
> > > > > it may be easier said than done...
> > > > 
> > > > > So I went back to the drawing board and first I would like to ascertain
> > > > that we all agree on what the objectives are:
> > > > 
> > > > [O1] Avoid write back from a page still being written by either a
> > > >  device or some direct I/O or any other existing user of GUP.
> > > >  This would avoid possible file system corruption.
> > > > 
> > > > > [O2] Avoid crash when set_page_dirty() is called on a page that is
> > > > >  considered clean by core mm (buffer heads have been removed and
> > > > >  with some file systems this turns into an ugly mess).
> > > 
> > > I think that's wrong. This isn't an "avoid a crash" case, this is a
> > > "prevent data and/or filesystem corruption" case. The primary goal
> > > we have here is removing our exposure to potential corruption, which
> > > has the secondary effect of avoiding the crash/panics that currently
> > > occur as a result of inconsistent page/filesystem state.
> > > 
> > > i.e. The goal is to have ->page_mkwrite() called on the clean page
> > > /before/ the file-backed page is marked dirty, and hence we don't
> > > expose ourselves to potential corruption or crashes that are a
> > > result of inappropriately calling set_page_dirty() on clean
> > > file-backed pages.
> > 
> > I agree that [O1] - i.e., avoid corrupting fs data - is more important and
> > [O2] is just one consequence of [O1].
> > 
> > > > For [O1] and [O2] i believe a solution with mapcount would work. So
> > > > no new struct, no fake vma, nothing like that. In GUP for file back
> > > > pages we increment both refcount and mapcount (we also need a special
> > > > put_user_page to decrement mapcount when GUP user are done with the
> > > > page).
> > > 
> > > I don't see how a mapcount can prevent anyone from calling
> > > set_page_dirty() inappropriately.
> > > 
> > > > Now for [O1] the write back have to call page_mkclean() to go through
> > > > all reverse mapping of the page and map read only. This means that
> > > > we can count the number of real mapping and see if the mapcount is
> > > > > bigger than that. If mapcount is bigger then the page is pinned and we need
> > > > to use a bounce page to do the writeback.
> > > 
> > > Doesn't work. Generally filesystems have already mapped the page
> > > into bios before they call clear_page_dirty_for_io(), so it's too
> > > late for the filesystem to bounce the page at that point.
> > 
> > Yes, for filesystem it is too late. But the plan we figured back in October
> > was to do the bouncing in the block layer. I.e., mark the bio (or just the
> > particular page) as needing bouncing and then use the existing page
> > bouncing mechanism in the block layer to do the bouncing for us. Ext3 (when
> > it was still a separate fs driver) has been using a mechanism like this to
> > make DIF/DIX work with its metadata.
> 
> Sure, that's a possibility, but that doesn't close off any race
> conditions because there can be DMA into the page in progress while
> the page is being bounced, right? AFAICT this ext3+DIF/DIX case is
> different in that there is no 3rd-party access to the page while it
> is under IO (ext3 arbitrates all access to it's metadata), and so
> nothing can actually race for modification of the page between
> submission and bouncing at the block layer.
>
> In this case, the moment the page is unlocked, anyone else can map
> it and start (R)DMA on it, and that can happen before the bio is
> bounced by the block layer. So AFAICT, block layer bouncing doesn't
> solve the problem of racing writeback and DMA direct to the page we
> are doing IO on. Yes, it reduces the race window substantially, but
> it doesn't get rid of it.

The scenario you describe here cannot happen exactly because of the
wait_for_stable_page() in ->page_mkwrite() you mention below. If someone
will try to GUP a page that is under writeback (has already PageWriteback
set), GUP will have to do a write fault because the page is writeprotected
in page tables and go into ->page_mkwrite() which will wait.

The problem rather is with someone mapping the page *before* writeback
starts, giving the page to HW. Then clear_page_dirty_for_io() writeprotects
the page in PTEs but the HW gives a damn about that. Then, after we add the
page to the bio but before the 

Re: [f2fs-dev] [PATCH] f2fs: fix missing unlock(sbi->gc_mutex)

2018-12-18 Thread Chao Yu
On 2018/12/18 10:15, Jaegeuk Kim wrote:
> This fixes missing unlock call.
> 
> Cc: 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/super.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index b79677639108..2689a2cb56cc 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1462,6 +1462,9 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info 
> *sbi)
>  
>   while (!f2fs_time_over(sbi, DISABLE_TIME)) {
>   err = f2fs_gc(sbi, true, false, NULL_SEGNO);
> +
> + /* f2fs_gc guarantees unlock gc_mutex */

How about:

while () {
mutex_lock(&sbi->gc_mutex);
err = f2fs_gc();
if (err) {
handle error cases..
}
}

Thanks,

> + mutex_lock(&sbi->gc_mutex);
>   if (err == -ENODATA)
>   break;
>   if (err && err != -EAGAIN) {
> 
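
The locking contract under discussion (the callee is entered with the mutex held and always drops it before returning) can be modelled in a few lines of userspace C. Everything below is a toy: the toy_* names, the misuse counter and the -1 error value are assumptions for illustration, not f2fs code.

```c
#include <stdbool.h>

/* Toy lock that records misuse instead of blocking. */
struct toy_mutex {
	bool held;
	int misuse;	/* double locks or unlocks without a lock */
};

void toy_lock(struct toy_mutex *m)
{
	if (m->held)
		m->misuse++;
	m->held = true;
}

void toy_unlock(struct toy_mutex *m)
{
	if (!m->held)
		m->misuse++;
	m->held = false;
}

/*
 * Stand-in for f2fs_gc(): must be entered with the lock held and always
 * releases it. Returns 0 while work remains, -1 once nothing is left.
 */
int toy_gc(struct toy_mutex *m, int *segments_left)
{
	int err = 0;

	if (*segments_left > 0)
		(*segments_left)--;
	else
		err = -1;
	toy_unlock(m);
	return err;
}

/* Loop shape from the suggestion: take the lock at the top of every pass. */
int toy_disable_checkpoint(struct toy_mutex *m, int segments)
{
	int passes = 0;

	for (;;) {
		toy_lock(m);
		if (toy_gc(m, &segments))
			break;
		passes++;
	}
	return passes;
}
```

With the lock taken at the top of the loop, every exit path leaves the mutex released, so no separate unlock is needed after the loop.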



[PATCH] perf/x86/intel: Avoid unnecessary reallocations of memory allocated in cpu hotplug prepare state

2018-12-18 Thread zhe.he
From: He Zhe 

The memory of shared_regs excl_cntrs and constraint_list in struct cpu_hw_events
is currently allocated in hotplug prepare state and freed in dying state. The
memory can actually be reused across multiple cpu pluggings.

Besides, in preempt-rt full mode, the freeing can happen in atomic context and
thus cause the following BUG.

BUG: scheduling while atomic: migration/4/44/0x0002
 snip 
Preemption disabled at:
[] cpu_stopper_thread+0x71/0x100
CPU: 4 PID: 44 Comm: migration/4 Not tainted 4.19.8-rt6-preempt-rt #1
Hardware name: Intel Corporation Broadwell Client platform/Basking Ridge, BIOS 
BDW-E1R1.86C.0100.R03.1411050121 11/05/2014
Call Trace:
 dump_stack+0x4f/0x6a
 ? cpu_stopper_thread+0x71/0x100
 __schedule_bug.cold.16+0x38/0x55
 __schedule+0x484/0x6c0
 schedule+0x3d/0xf0
 rt_spin_lock_slowlock_locked+0x11a/0x2a0
 rt_spin_lock_slowlock+0x57/0x90
 __rt_spin_lock+0x26/0x30
 __write_rt_lock+0x23/0x1a0
 ? intel_pmu_cpu_dying+0x67/0x70
 rt_write_lock+0x2a/0x30
 find_and_remove_object+0x1e/0x80
 delete_object_full+0x10/0x20
 kmemleak_free+0x32/0x50
 kfree+0x104/0x1f0
 intel_pmu_cpu_dying+0x67/0x70
 ? x86_pmu_starting_cpu+0x30/0x30
 x86_pmu_dying_cpu+0x1a/0x30
 cpuhp_invoke_callback+0x9c/0x770
 ? cpu_disable_common+0x241/0x250
 take_cpu_down+0x70/0xa0
 multi_cpu_stop+0x62/0xc0
 ? cpu_stop_queue_work+0x130/0x130
 cpu_stopper_thread+0x79/0x100
 smpboot_thread_fn+0x217/0x2e0
 kthread+0x121/0x140
 ? sort_range+0x30/0x30
 ? kthread_park+0x90/0x90
 ret_from_fork+0x35/0x40

This patch changes to allocate the memory only when it has not been allocated,
and fill it with all zero when it has already been allocated, and remove the
unnecessary freeings.
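
The allocate-once-then-reset pattern described here looks roughly like the following in userspace C. The struct layout and function name are simplified stand-ins, and calloc() plays the role of kzalloc_node().

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct intel_shared_regs. */
struct shared_regs {
	int core_id;
	unsigned long regs[4];
};

/*
 * Allocate *pregs on first use; on later calls zero the existing buffer
 * instead of reallocating. *pregs stays NULL if the allocation fails.
 */
void prepare_shared_regs(struct shared_regs **pregs)
{
	struct shared_regs *regs = *pregs;

	if (regs)
		memset(regs, 0, sizeof(*regs));
	else
		regs = *pregs = calloc(1, sizeof(*regs));
	if (regs)
		regs->core_id = -1;
}
```

Because the buffer survives across unplug and replug, nothing has to be freed in the dying callback, which is what keeps kfree() out of atomic context.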

Credit to Sebastian Andrzej Siewior for his suggestion.

Signed-off-by: He Zhe 
---
 arch/x86/events/core.c   |  2 +-
 arch/x86/events/intel/core.c | 45 
 arch/x86/events/perf_event.h |  5 ++---
 3 files changed, 23 insertions(+), 29 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 374a197..f07d1b1 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2010,7 +2010,7 @@ static struct cpu_hw_events *allocate_fake_cpuc(void)
 
/* only needed, if we have extra_regs */
if (x86_pmu.extra_regs) {
-   cpuc->shared_regs = allocate_shared_regs(cpu);
+   allocate_shared_regs(&cpuc->shared_regs, cpu);
if (!cpuc->shared_regs)
goto error;
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ecc3e34..a3c18de 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3398,13 +3398,16 @@ ssize_t intel_event_sysfs_show(char *page, u64 config)
return x86_event_sysfs_show(page, config, event);
 }
 
-struct intel_shared_regs *allocate_shared_regs(int cpu)
+void allocate_shared_regs(struct intel_shared_regs **pregs, int cpu)
 {
-   struct intel_shared_regs *regs;
+   struct intel_shared_regs *regs = *pregs;
int i;
 
-   regs = kzalloc_node(sizeof(struct intel_shared_regs),
-   GFP_KERNEL, cpu_to_node(cpu));
+   if (regs)
+   memset(regs, 0, sizeof(struct intel_shared_regs));
+   else
+   regs = *pregs = kzalloc_node(sizeof(struct intel_shared_regs),
+GFP_KERNEL, cpu_to_node(cpu));
if (regs) {
/*
 * initialize the locks to keep lockdep happy
@@ -3414,20 +3417,21 @@ struct intel_shared_regs *allocate_shared_regs(int cpu)
 
regs->core_id = -1;
}
-   return regs;
 }
 
-static struct intel_excl_cntrs *allocate_excl_cntrs(int cpu)
+static void allocate_excl_cntrs(struct intel_excl_cntrs **pc, int cpu)
 {
-   struct intel_excl_cntrs *c;
+   struct intel_excl_cntrs *c = *pc;
 
-   c = kzalloc_node(sizeof(struct intel_excl_cntrs),
-GFP_KERNEL, cpu_to_node(cpu));
+   if (c)
+   memset(c, 0, sizeof(struct intel_excl_cntrs));
+   else
+   c = *pc = kzalloc_node(sizeof(struct intel_excl_cntrs),
+  GFP_KERNEL, cpu_to_node(cpu));
if (c) {
raw_spin_lock_init(&c->lock);
c->core_id = -1;
}
-   return c;
 }
 
 static int intel_pmu_cpu_prepare(int cpu)
@@ -3435,7 +3439,7 @@ static int intel_pmu_cpu_prepare(int cpu)
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
 
if (x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
-   cpuc->shared_regs = allocate_shared_regs(cpu);
+   allocate_shared_regs(&cpuc->shared_regs, cpu);
if (!cpuc->shared_regs)
goto err;
}
@@ -3443,11 +3447,14 @@ static int intel_pmu_cpu_prepare(int cpu)
if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
size_t sz = X86_PMC_IDX_MAX * sizeof(struct 

Re: [PATCH] f2fs: fix block address for __check_sit_bitmap

2018-12-18 Thread Chao Yu
On 2018/12/18 17:32, sunqiuyang wrote:
> From: Qiuyang Sun 
> 
> Should use lstart (logical start address) instead of start (in dev) here.
> This fixes a bug in multi-device scenarios.

Good catch!

> 
> Signed-off-by: Qiuyang Sun 

Reviewed-by: Chao Yu 

Thanks,



Re: [PATCH v10 01/27] PM / Domains: Add generic data pointer to genpd_power_state struct

2018-12-18 Thread Daniel Lezcano
On 29/11/2018 18:46, Ulf Hansson wrote:
> Let's add a data pointer to the genpd_power_state struct, to allow a genpd
> backend driver to store per state specific data. In order to introduce the
> pointer, we also need to adopt how genpd frees the allocated data for the
> default genpd_power_state struct, that it may allocate at pm_genpd_init().
> 
> More precisely, let's use an internal genpd flag to understand when the
> states needs to be freed by genpd. When freeing the states data in
> genpd_remove(), let's also clear the corresponding genpd->states pointer
> and reset the genpd->state_count. In this way, a genpd backend driver
> becomes aware of when there is state specific data for it to free.
> 
> Cc: Lina Iyer 
> Co-developed-by: Lina Iyer 
> Signed-off-by: Ulf Hansson 
> ---
> 
> Changes in v10:
>   - Update the patch allow backend drivers to free the states specific
> data during genpd removal. Due to this added complexity, I decided to
> keep the patch separate, rather than fold it into the patch that makes
> use of the new void pointer, which was suggested by Rafael.
>   - Claim authorship of the patch as lots of changes has been done since
> the original pick up from Lina Iyer.
> 
> ---
>  drivers/base/power/domain.c | 8 ++--
>  include/linux/pm_domain.h   | 3 ++-
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 7f38a92b444a..e27b91d36a2a 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1620,7 +1620,7 @@ static int genpd_set_default_power_state(struct 
> generic_pm_domain *genpd)
>  
>   genpd->states = state;
>   genpd->state_count = 1;
> - genpd->free = state;
> + genpd->free_state = true;
>  
>   return 0;
>  }
> @@ -1736,7 +1736,11 @@ static int genpd_remove(struct generic_pm_domain 
> *genpd)
>   list_del(&genpd->gpd_list_node);
>   genpd_unlock(genpd);
>   cancel_work_sync(&genpd->power_off_work);
> - kfree(genpd->free);
> + if (genpd->free_state) {
> + kfree(genpd->states);
> + genpd->states = NULL;
> + genpd->state_count = 0;

Why these two initializations? After genpd_remove, this structure
shouldn't be used anymore, no ?

> + }

Instead of a flag, replacing the 'free' pointer with a 'free' callback
would keep the free path self-encapsulated in domain.c

genpd->free(genpd->states);

Patch 18/27 can fill this field with its specific free pointer.
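
Something along these lines, as a userspace sketch only. The struct and function names are simplified stand-ins, the callback takes the state count as an extra argument (an assumption beyond the one-liner above), and the global counter exists only so the effect can be observed.

```c
#include <stdlib.h>

struct power_state {
	void *data;	/* per-state data owned by the backend */
};

struct pm_domain {
	struct power_state *states;
	unsigned int state_count;
	/* Set by whoever allocated 'states'; NULL means nothing to free. */
	void (*free_states)(struct power_state *states, unsigned int count);
};

unsigned int freed_data_blobs;	/* demo-only bookkeeping */

/* Default: the core allocated a bare state array. */
void default_free_states(struct power_state *states, unsigned int count)
{
	(void)count;
	free(states);
}

/* Backend variant: also releases the per-state data. */
void backend_free_states(struct power_state *states, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		free(states[i].data);
		freed_data_blobs++;
	}
	free(states);
}

/* The remove path stays generic: it only invokes the callback. */
void domain_remove(struct pm_domain *pd)
{
	if (pd->free_states)
		pd->free_states(pd->states, pd->state_count);
}
```

With this shape the remove path needs no flag and no knowledge of who allocated the states.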



>   pr_debug("%s: removed %s\n", __func__, genpd->name);
>  
>   return 0;
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 3b5d7280e52e..f9e09bd4152c 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -69,6 +69,7 @@ struct genpd_power_state {
>   s64 residency_ns;
>   struct fwnode_handle *fwnode;
>   ktime_t idle_time;
> + void *data;
>  };
>  
>  struct genpd_lock_ops;
> @@ -110,7 +111,7 @@ struct generic_pm_domain {
>   struct genpd_power_state *states;
>   unsigned int state_count; /* number of states */
>   unsigned int state_idx; /* state that genpd will go to when off */
> - void *free; /* Free the state that was allocated for default */
> + bool free_state; /* Free the state that was allocated for default */
>   ktime_t on_time;
>   ktime_t accounting_time;
>   const struct genpd_lock_ops *lock_ops;
> 


-- 
  Linaro.org │ Open source software for ARM SoCs




Re: [PATCH] clk: imx: add CLK_GET_RATE_NOCACHE flag for i.MX8M composite clock

2018-12-18 Thread Lucas Stach
Am Dienstag, den 18.12.2018, 08:24 -0200 schrieb Fabio Estevam:
> Hi Anson,
> 
> On Tue, Dec 18, 2018 at 12:56 AM Anson Huang 
> wrote:
> > 
> > On i.MX8M, some of the bus clocks' rate could be changed in TF-A,
> 
> Do you mean ATF (ARM Trusted Firmware) instead?

TF-A is the name of the day for what was formerly known as ATF...

However, I don't think that it's correct to simply not cache the clock
settings. Normally the secure world firmware should not change any
clock settings at runtime, or it would run into all kinds of conflicts
with the clock driver. So there are probably some well known points in
time like a suspend or resume event when the firmware might change
clock settings, so we could instead use those to trigger an explicit
invalidate of the clock caches with much lower overhead.
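
A toy model of that alternative, to show the difference from never caching. All names here are hypothetical and unrelated to the real clk framework API; the fake hardware rate exists only to drive the example.

```c
#include <stdbool.h>

struct cached_clk {
	unsigned long (*read_hw)(void);	/* slow path: ask the hardware */
	unsigned long rate;		/* last value read */
	bool valid;			/* is 'rate' still trustworthy? */
	unsigned int hw_reads;		/* demo-only: count slow-path hits */
};

/* Demo-only fake hardware whose rate "firmware" may change. */
unsigned long fake_hw_rate = 800000000UL;

unsigned long fake_read_hw(void)
{
	return fake_hw_rate;
}

/* Normal lookups are served from the cache. */
unsigned long clk_rate(struct cached_clk *clk)
{
	if (!clk->valid) {
		clk->rate = clk->read_hw();
		clk->valid = true;
		clk->hw_reads++;
	}
	return clk->rate;
}

/* Called at the well-known points, e.g. on resume. */
void clk_invalidate(struct cached_clk *clk)
{
	clk->valid = false;
}
```

The hardware is then read once per invalidation instead of on every rate query, which is where the lower overhead comes from.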

Regards,
Lucas


Re: kernel BUG at fs/inode.c:LINE!

2018-12-18 Thread Tetsuo Handa
On 2018/12/17 19:08, Dmitry Vyukov wrote:
> On Mon, Dec 17, 2018 at 8:21 AM Al Viro  wrote:
>>>  slab_pre_alloc_hook mm/slab.h:423 [inline]
>>>  slab_alloc mm/slab.c:3365 [inline]
>>>  kmem_cache_alloc+0x2c4/0x730 mm/slab.c:3539
>>>  __d_alloc+0xc8/0xb90 fs/dcache.c:1599
>>> [ cut here ]
>>> kernel BUG at fs/inode.c:1566!
>>>  d_alloc_anon fs/dcache.c:1698 [inline]
>>>  d_make_root+0x43/0xc0 fs/dcache.c:1885
>>>  autofs_fill_super+0x6f1/0x1c30 fs/autofs/inode.c:273
>>
>> Huh?  BUG is in iput(), AFAICS, so the stack trace is rather misreported.
> 
> Yes, it's a known problem that kernel is generally incapable of
> producing parsable crash reports. I think Tetsuo is working on a
> solution, but it takes very large amount of discussions and months of
> time.

The solution, CONFIG_PRINTK_FROM option (which will be renamed to
CONFIG_PRINTK_CALLER option tomorrow), just arrived at
linux-next-20181218. We can try it...

But currently syzbot cannot boot using linux-next-20181218 due to
"Kernel panic - not syncing: Can't create rootfs" presumably due to
changes merged by
"Merge branches 'work.mount', 'work.misc', 'misc.misc' and 'work.iov_iter' into for-next".
I think that we need to examine this "Can't create rootfs" problem, for
the merge window is approaching...



Re: [PATCH -next] x86/xen: Fix read buffer overflow

2018-12-18 Thread YueHaibing
On 2018/12/18 16:31, Juergen Gross wrote:
> On 18/12/2018 09:19, YueHaibing wrote:
>> Fix smatch warning:
>>
>> arch/x86/xen/enlighten_pv.c:649 get_trap_addr() error:
>>  buffer overflow 'early_idt_handler_array' 32 <= 32
>>
>> Fixes: 42b3a4cb5609 ("x86/xen: Support early interrupts in xen pv guests")
>> Signed-off-by: YueHaibing 
>> ---
>>  arch/x86/xen/enlighten_pv.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
>> index 2f6787f..81f200d 100644
>> --- a/arch/x86/xen/enlighten_pv.c
>> +++ b/arch/x86/xen/enlighten_pv.c
>> @@ -646,7 +646,7 @@ static bool __ref get_trap_addr(void **addr, unsigned 
>> int ist)
>>  
>>  if (nr == ARRAY_SIZE(trap_array) &&
>>  *addr >= (void *)early_idt_handler_array[0] &&
>> -*addr < (void *)early_idt_handler_array[NUM_EXCEPTION_VECTORS]) {
>> +*addr < (void *)early_idt_handler_array[NUM_EXCEPTION_VECTORS - 1]) 
>> {
>>  nr = (*addr - (void *)early_idt_handler_array[0]) /
>>   EARLY_IDT_HANDLER_SIZE;
>>  *addr = (void *)xen_early_idt_handler_array[nr];
>>
> 
> No, this patch is wrong.
> 
> early_idt_handler_array is a 2-dimensional array:
> 
> const char
> early_idt_handler_array[NUM_EXCEPTION_VECTORS][EARLY_IDT_HANDLER_SIZE];
> 
> So above code doesn't do an out of bounds array access, but checks for
> *addr being in the array or outside of it (note the "<" used for the
> test).

Thank you for your explanation.
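
To spell the point out, a small userspace sketch of the same range test on a two-dimensional table. The sizes and names are stand-ins, not the kernel's values. The address one past the last row is only used as an exclusive upper bound and is never dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS	32	/* stand-in for NUM_EXCEPTION_VECTORS */
#define STUB_SIZE	9	/* stand-in for EARLY_IDT_HANDLER_SIZE */

const char stub_table[NUM_VECTORS][STUB_SIZE] = { { 0 } };

/* If addr points into stub_table, store its row in *nr and return true. */
bool stub_index(const void *addr, size_t *nr)
{
	uintptr_t start = (uintptr_t)&stub_table[0][0];
	/* One past the last row: a bound for comparison, never accessed. */
	uintptr_t end = (uintptr_t)(stub_table + NUM_VECTORS);
	uintptr_t a = (uintptr_t)addr;

	if (a < start || a >= end)
		return false;
	*nr = (a - start) / STUB_SIZE;
	return true;
}
```

Using NUM_VECTORS - 1 as the bound would wrongly reject every address in the last row.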

> 
> 
> Juergen
> 
> .
> 



Re: kernel BUG at fs/inode.c:LINE!

2018-12-18 Thread Ian Kent
On Mon, 2018-12-17 at 07:21 +, Al Viro wrote:
> On Sun, Dec 16, 2018 at 10:11:04PM -0800, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:d14b746c6c1c Add linux-next specific files for 20181214
> > git tree:   linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1370634740
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=1da6d2d18f803140
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5399ed0832693e29f392
> > compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=101032b340
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1653406340
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+5399ed0832693e29f...@syzkaller.appspotmail.com
> > 
> >  slab_pre_alloc_hook mm/slab.h:423 [inline]
> >  slab_alloc mm/slab.c:3365 [inline]
> >  kmem_cache_alloc+0x2c4/0x730 mm/slab.c:3539
> >  __d_alloc+0xc8/0xb90 fs/dcache.c:1599
> > [ cut here ]
> > kernel BUG at fs/inode.c:1566!
> >  d_alloc_anon fs/dcache.c:1698 [inline]
> >  d_make_root+0x43/0xc0 fs/dcache.c:1885
> >  autofs_fill_super+0x6f1/0x1c30 fs/autofs/inode.c:273
> 
> Huh?  BUG is in iput(), AFAICS, so the stack trace is rather misreported.
> iput() can be called by d_make_root(), provided that dentry allocation
> fails.  So the most straightforward interpretation would be that we
> had an allocation failure (injected?), followed by iput() of the inode
> passed to d_make_root().  Which happened to find I_CLEAR in ->i_state
> of that inode somehow, which should be impossible short of seriously
> buggered inode refcounting somewhere - the inode has just been returned
> by new_inode(), which clears i_state, and it would have to have passed
> clear_inode() (i.e. has been through inode eviction) since then...

Sorry Al, that's my bad.

See 
https://www.ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-possible-inode-leak-in-autofs_fill_super.patch

I think this will fix it, I'll forward it to Andrew if you agree:

autofs - fix handling of d_make_root() return in autofs_fill_super()

From: Ian Kent 

A previous change to handle a possible inode leak in autofs_fill_super()
added an iput() on d_make_root() failure but d_make_root() already puts
the passed in inode on failure.

Reported-by: syzbot+5399ed0832693e29f...@syzkaller.appspotmail.com
Signed-off-by: Ian Kent 
---
 fs/autofs/inode.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c
index 501833cc49a8..953f76b95172 100644
--- a/fs/autofs/inode.c
+++ b/fs/autofs/inode.c
@@ -271,7 +271,7 @@ int autofs_fill_super(struct super_block *s, void *data, 
int silent)
}
root = d_make_root(root_inode);
if (!root)
-   goto fail_iput;
+   goto fail_ino;
pipe = NULL;
 
root->d_fsdata = ino;
@@ -347,8 +347,6 @@ int autofs_fill_super(struct super_block *s, void *data, 
int silent)
 fail_dput:
dput(root);
goto fail_free;
-fail_iput:
-   iput(root_inode);
 fail_ino:
autofs_free_ino(ino);
 fail_free:
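
The double put described in the changelog can be modelled in user space. This is only a sketch under stated assumptions: fake_inode, fake_dentry, fake_iput() and fake_make_root() are hypothetical stand-ins, and the only behaviour taken from the thread is that the root-creation helper already drops the inode reference when it fails.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_inode { int refcount; };
struct fake_dentry { struct fake_inode *inode; };

static void fake_iput(struct fake_inode *inode)
{
	inode->refcount--;
}

/*
 * Takes over the caller's reference in both outcomes: on failure the
 * reference is dropped here, on success it is stored in the dentry.
 */
static struct fake_dentry *fake_make_root(struct fake_inode *inode,
					  struct fake_dentry *slot,
					  bool alloc_fails)
{
	if (alloc_fails) {
		fake_iput(inode);
		return NULL;
	}
	slot->inode = inode;
	return slot;
}

/* Refcount left behind by a failing fill_super-like path. */
static int refcount_after_failure(bool extra_iput)
{
	struct fake_inode inode = { .refcount = 1 };
	struct fake_dentry slot = { NULL };

	if (!fake_make_root(&inode, &slot, true) && extra_iput)
		fake_iput(&inode);	/* second put of the same reference */

	return inode.refcount;
}
```

Without the extra put the count ends at 0; with it the count goes to -1, which is the kind of imbalance the BUG check in the report catches.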



Re: [PATCH v4 3/9] drivers/firewire/core-iso.c: Convert to use vm_insert_range

2018-12-18 Thread Russell King - ARM Linux
On Tue, Dec 18, 2018 at 01:52:46AM +0530, Souptick Joarder wrote:
> Convert to use vm_insert_range to map range of kernel memory
> to user vma.
> 
> Signed-off-by: Souptick Joarder 
> Reviewed-by: Matthew Wilcox 
> ---
>  drivers/firewire/core-iso.c | 15 ++-
>  1 file changed, 2 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/firewire/core-iso.c b/drivers/firewire/core-iso.c
> index 35e784c..7bf28bb 100644
> --- a/drivers/firewire/core-iso.c
> +++ b/drivers/firewire/core-iso.c
> @@ -107,19 +107,8 @@ int fw_iso_buffer_init(struct fw_iso_buffer *buffer, 
> struct fw_card *card,
>  int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer,
> struct vm_area_struct *vma)
>  {
> - unsigned long uaddr;
> - int i, err;
> -
> - uaddr = vma->vm_start;
> - for (i = 0; i < buffer->page_count; i++) {
> - err = vm_insert_page(vma, uaddr, buffer->pages[i]);
> - if (err)
> - return err;
> -
> - uaddr += PAGE_SIZE;
> - }
> -
> - return 0;
> + return vm_insert_range(vma, vma->vm_start, buffer->pages,
> + buffer->page_count);

This looks functionally equivalent.  Note that if we go with my
proposal to your patch 4, that would cause an issue for this
implementation.

Maybe we need two functions, but that then causes problems with
which function should be used (which makes it easy to get wrong.)

I'm beginning to wonder if the risks of causing regressions and
introducing bugs are actually worth the effort of trying to clean
this up.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v5 0/5] mm: Randomize free memory

2018-12-18 Thread Rafael J. Wysocki
On Monday, December 17, 2018 5:32:10 PM CET Dan Williams wrote:
> On Mon, Dec 17, 2018 at 2:12 AM Rafael J. Wysocki  wrote:
> >
> > On Saturday, December 15, 2018 2:48:30 AM CET Dan Williams wrote:
> > > Changes since v4: [1]
> > > * Default the randomization to off and enable it dynamically based on
> > >   the detection of a memory side cache advertised by platform firmware.
> > >   In the case of x86 this enumeration comes from the ACPI HMAT. (Michal
> > >   and Mel)
> > > * Improve the changelog of the patch that introduces the shuffling to
> > >   clarify the motivation and better explain the tradeoffs. (Michal and
> > >   Mel)
> > > * Include the required HMAT enabling in the series.
> > >
> > > [1]: 
> > > https://lkml.kernel.org/r/153922180166.838512.8260339805733812034.st...@dwillia2-desk3.amr.corp.intel.com
> > >
> > > ---
> > >
> > > Quote patch 3:
> > >
> > > Randomization of the page allocator improves the average utilization of
> > > a direct-mapped memory-side-cache. Memory side caching is a platform
> > > capability that Linux has been previously exposed to in HPC
> > > (high-performance computing) environments on specialty platforms. In
> > > that instance it was a smaller pool of high-bandwidth-memory relative to
> > > higher-capacity / lower-bandwidth DRAM. Now, this capability is going to
> > > be found on general purpose server platforms where DRAM is a cache in
> > > front of higher latency persistent memory [2].
> > >
> > > Robert offered an explanation of the state of the art of Linux
> > > interactions with memory-side-caches [3], and I copy it here:
> > >
> > > It's been a problem in the HPC space:
> > > 
> > > http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/
> > >
> > > A kernel module called zonesort is available to try to help:
> > > https://software.intel.com/en-us/articles/xeon-phi-software
> > >
> > > and this abandoned patch series proposed that for the kernel:
> > > https://lkml.org/lkml/2017/8/23/195
> > >
> > > Dan's patch series doesn't attempt to ensure buffers won't conflict, 
> > > but
> > > also reduces the chance that the buffers will. This will make 
> > > performance
> > > more consistent, albeit slower than "optimal" (which is near 
> > > impossible
> > > to attain in a general-purpose kernel).  That's better than forcing
> > > users to deploy remedies like:
> > > "To eliminate this gradual degradation, we have added a Stream
> > >  measurement to the Node Health Check that follows each job;
> > >  nodes are rebooted whenever their measured memory bandwidth
> > >  falls below 300 GB/s."
> > >
> > > A replacement for zonesort was merged upstream in commit cc9aec03e58f
> > > "x86/numa_emulation: Introduce uniform split capability". With this
> > > numa_emulation capability, memory can be split into cache sized
> > > ("near-memory" sized) numa nodes. A bind operation to such a node, and
> > > disabling workloads on other nodes, enables full cache performance.
> > > However, once the workload exceeds the cache size then cache conflicts
> > > are unavoidable. While HPC environments might be able to tolerate
> > > time-scheduling of cache sized workloads, for general purpose server
> > > platforms, the oversubscribed cache case will be the common case.
> > >
> > > The worst case scenario is that a server system owner benchmarks a
> > > workload at boot with an un-contended cache only to see that performance
> > > degrade over time, even below the average cache performance due to
> > > excessive conflicts. Randomization clips the peaks and fills in the
> > > valleys of cache utilization to yield steady average performance.
> > >
> > > See patch 3 for more details.
> > >
> > > [2]: 
> > > https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operating-modes/
> > > [3]: https://lkml.org/lkml/2018/9/22/54
> >
> > Has hibernation been tested with this series applied?
> 
> It has not. Is QEMU sufficient? What's your concern?

Well, hibernation does quite a bit of memory management and that involves
free memory too.  I'm not expecting any particular issues, but I may be
overlooking something and I would like to know that it doesn't break before
the changes go in.

QEMU should be sufficient, but let me talk to the power lab folks if they can
test that for you.

Is there a git branch with these changes available somewhere?



Re: [PATCH 6/6] psi: introduce psi monitor

2018-12-18 Thread Peter Zijlstra
On Mon, Dec 17, 2018 at 05:21:05PM -0800, Suren Baghdasaryan wrote:
> On Mon, Dec 17, 2018 at 8:22 AM Peter Zijlstra  wrote:

> > How well has this thing been fuzzed? Custom string parser, yay!
> 
> Honestly, not much. Normal cases and some obvious corner cases. Will
> check if I can use some fuzzer to get more coverage or will write a
> script.
> I'm not thrilled about writing a custom parser, so if there is a
> better way to handle this please advise.

The grammar seems fairly simple, something like:

  some-full = "some" | "full" ;
  threshold-abs = integer ;
  threshold-pct = integer, { "%" } ;
  threshold = threshold-abs | threshold-pct ;
  window = integer ;
  trigger = some-full, space, threshold, space, window ;

And that could even be expressed as two scanf formats:

 "%4s %u%% %u" , "%4s %u %u"

which then gets you something like:

  char type[5];

  if (sscanf(input, "%4s %u%% %u", type, &threshold, &window) == 3) {
    // do pct thing
  } else if (sscanf(input, "%4s %u %u", type, &threshold, &window) == 3) {
    // do abs thing
  } else return -EFAIL;

  if (!strcmp(type, "some")) {
// some
  } else if (!strcmp(type, "full")) {
// full
  } else return -EFAIL;

  // do more

which seems like a lot less error prone. Alternatively you can use 4
formats:

  "some %u%% %u" "some %u %u"
  "full %u%% %u" "full %u %u"

and avoid the whole 'type' thing.
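
A self-contained user-space sketch of the two-format approach above, assuming plain libc sscanf() rather than the kernel's. struct trigger, enum trigger_type and parse_trigger() are hypothetical names introduced for illustration, and -1 stands in for whatever error code the real code would return.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum trigger_type { TRIGGER_SOME, TRIGGER_FULL };

struct trigger {
	enum trigger_type type;
	bool is_pct;		/* threshold was given with a trailing '%' */
	unsigned int threshold;
	unsigned int window;
};

/* Parse "<some|full> <threshold>[%] <window>"; return 0 on success, -1 on error. */
static int parse_trigger(const char *input, struct trigger *t)
{
	char type[5];	/* 4 characters plus the terminating NUL */

	/* Try the percentage form first, then the absolute form. */
	if (sscanf(input, "%4s %u%% %u", type, &t->threshold, &t->window) == 3)
		t->is_pct = true;
	else if (sscanf(input, "%4s %u %u", type, &t->threshold, &t->window) == 3)
		t->is_pct = false;
	else
		return -1;

	if (!strcmp(type, "some"))
		t->type = TRIGGER_SOME;
	else if (!strcmp(type, "full"))
		t->type = TRIGGER_FULL;
	else
		return -1;

	return 0;
}
```

Note that this sketch does not reject trailing garbage after the window value; a stricter version could add a %n conversion and compare the consumed length against strlen(input).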




Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit

2018-12-18 Thread Peter Zijlstra
On Tue, Dec 18, 2018 at 06:18:42PM +0900, Sergey Senozhatsky wrote:
>   Hello,
>   
>   RFC
> 
> A painful subject:
> 
> I just noticed that stock systemd (no systemd debugging enabled) on my
> x86 box write()-s during shutdown to devkmsg more than before, so old
> devkmsg ratelimits do not apply:
> 
> $ sudo journalctl -n 4 -f | grep "kernel: printk: systemd-shutdow"
>  kernel: printk: systemd-shutdow: 35 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 31 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 35 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 36 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 36 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 36 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 36 output lines suppressed due to 
> ratelimiting
>  kernel: printk: systemd-shutdow: 35 output lines suppressed due to 
> ratelimiting
> 
> I know that there is a "kernel.printk_devkmsg" interface; do we
> expect every systemd-enabled distro to find that out and to tweak
> kernel.printk_devkmsg or shall we change the default devkmsg
> ratelimit instead?

How about we complain to systemd instead?


Re: [PATCH v2] kmemleak: Turn kmemleak_lock to raw spinlock on RT

2018-12-18 Thread He Zhe



On 2018/12/6 03:14, Sebastian Andrzej Siewior wrote:
> On 2018-12-05 21:53:37 [+0800], He Zhe wrote:
>> For call trace 1:
> …
>> Since kmemleak would most likely be used to debug in environments where
>> we would not expect as great performance as without it, and kfree() has raw 
>> locks
>> in its main path and other debug function paths, I suppose it wouldn't hurt 
>> that
>> we change to raw locks.
> okay.
>
>>>> From what I reached above, this is RT-only and happens on v4.18 and v4.19.
>>>>
>>>> The call trace above is caused by grabbing kmemleak_lock and then getting
>>>> scheduled and then re-grabbing kmemleak_lock. Using raw lock can also solve
>>>> this problem.
>>> But this is a reader / writer lock. And if I understand the other part
>>> of the thread then it needs multiple readers.
>> For call trace 2:
>>
>> I don't get what "it needs multiple readers" exactly means here.
>>
>> In this call trace, the kmemleak_lock is grabbed as write lock, and then 
>> scheduled
>> away, and then grabbed again as write lock from another path. It's a
>> write->write locking, compared to the discussion in the other part of the 
>> thread.
>>
>> This is essentially because kmemleak hooks on the very low level memory
>> allocation and free operations. After scheduled away, it can easily re-enter 
>> itself.
>> We need raw locks to prevent this from happening.
> With raw locks you wouldn't have multiple readers at the same time.
> Maybe you wouldn't have recursion but since you can't have multiple
> readers you would add lock contention where was none (because you could
> have two readers at the same time).

Sorry for the slow reply.

OK, I finally understand your concern. As the commit log said, I wanted to use
a raw rwlock but didn't find a DEFINE helper for it. Since kmemleak is not
expected to deliver great performance anyway, I turned to a raw spinlock
instead. And yes, this would make performance worse.

Maybe I am missing the reason, but why don't we have a rwlock_types_raw.h that
defines raw rwlock helpers for RT? With that, we could cleanly replace
kmemleak_lock with a raw rwlock.

Or should we just define a raw rwlock from a basic type, like arch_rwlock_t,
only in kmemleak?

>
>>> Couldn't we just get rid of that kfree() or move it somewhere else?
>>> I mean, if the memory is freed on CPU-down and allocated again on CPU-up,
>>> then we could skip that, right? Just allocate it and don't free it
>>> because the CPU will likely come up again.
>> For call trace 1:
>>
>> I went through the CPU hotplug code and found that the allocation of the
>> problematic data, cpuc->shared_regs, is done in intel_pmu_cpu_prepare. And
>> the free is done in intel_pmu_cpu_dying. They are handlers triggered by two
>> different perf events.
>>
>> It seems we can hardly form a convincing method that holds the data while
>> CPUs are off and then uses it again. raw locks would be easy and good enough.
> Why not allocate the memory in intel_pmu_cpu_prepare() if it is not
> already there (otherwise skip the allocation) and in
> intel_pmu_cpu_dying() not free it. It looks easy.

Thanks for your suggestion. I've sent the change for call trace 1 to mainline
mailing list. Hopefully it can be accepted.

Zhe

>
>> Thanks,
>> Zhe
> Sebastian
>



Re: [PATCH v4 1/5] Bluetooth: hci_qca: use wait_until_sent() for power pulses

2018-12-18 Thread Balakrishna Godavarthi

Hi Johan,

On 2018-12-18 14:43, Johan Hovold wrote:

On Mon, Dec 17, 2018 at 07:43:26PM +0530, Balakrishna Godavarthi wrote:

wcn3990 requires a power pulse to turn ON/OFF along with
regulators. Sometimes we observe that the power pulses are sent
out with some delay, due to queuing of these commands. This
causes synchronization issues with the chip, which in turn delay the
chip setup or may end up causing communication issues.

Signed-off-by: Balakrishna Godavarthi 
---
v4:
 * used serdev_device_write_buf() instead of serdev_device_write()

v3:
  * no change.
v2:
  * Updated function qca_send_power_pulse()
  * addressed reviewer comments.

v1:
 * initial patch

---
 drivers/bluetooth/hci_qca.c | 37 
+

 1 file changed, 13 insertions(+), 24 deletions(-)

diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index f036c8f98ea3..d8bc77c8c9b9 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1013,11 +1013,9 @@ static inline void host_set_baudrate(struct 
hci_uart *hu, unsigned int speed)

hci_uart_set_baudrate(hu, speed);
 }

-static int qca_send_power_pulse(struct hci_dev *hdev, u8 cmd)
+static int qca_send_power_pulse(struct hci_uart *hu, u8 cmd)
 {
-   struct hci_uart *hu = hci_get_drvdata(hdev);
-   struct qca_data *qca = hu->priv;
-   struct sk_buff *skb;
+   int ret;

/* These power pulses are single byte command which are sent
 * at required baudrate to wcn3990. On wcn3990, we have an external
@@ -1029,19 +1027,16 @@ static int qca_send_power_pulse(struct hci_dev 
*hdev, u8 cmd)

 * save power. Disabling hardware flow control is mandatory while
 * sending power pulses to SoC.
 */
-   bt_dev_dbg(hdev, "sending power pulse %02x to SoC", cmd);
-
-   skb = bt_skb_alloc(sizeof(cmd), GFP_KERNEL);
-   if (!skb)
-   return -ENOMEM;
-
+   bt_dev_dbg(hu->hdev, "sending power pulse %02x to SoC", cmd);
hci_uart_set_flow_control(hu, true);
+   ret = serdev_device_write_buf(hu->serdev, &cmd, sizeof(cmd));
+   if (ret < 0) {
+   bt_dev_err(hu->hdev, "failed to send power pulse %02x to SoC",
+  cmd);
+   return ret;
+   }


As I mentioned earlier, serdev_device_write_buf() can buffer less than
sizeof(cmd) bytes if the tty driver's write buffer is full (and return
the number of bytes buffered).

How you want to deal with that is up to you and the bluetooth
maintainers, but I think you want to at least log it even if you choose
to ignore it.

Johan
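
One way to act on the short-write remark above is sketched here in user space. fake_port, fake_write_buf() and send_power_pulse() are hypothetical stand-ins, not the serdev API; the only behaviour assumed from the thread is that the write helper may buffer fewer bytes than requested and returns the number it buffered.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a tty write buffer with limited free space. */
struct fake_port {
	size_t free_bytes;
};

/* Buffers as many bytes as fit and returns that count (possibly 0). */
static int fake_write_buf(struct fake_port *port, const unsigned char *buf,
			  size_t len)
{
	size_t n = len < port->free_bytes ? len : port->free_bytes;

	(void)buf;	/* the payload itself does not matter for this sketch */
	port->free_bytes -= n;
	return (int)n;
}

/* Returns 0 only if the whole single-byte command was buffered. */
static int send_power_pulse(struct fake_port *port, unsigned char cmd)
{
	int ret = fake_write_buf(port, &cmd, sizeof(cmd));

	if (ret < 0)
		return ret;

	if ((size_t)ret != sizeof(cmd)) {
		/* A short write is not a negative return, so log it explicitly. */
		fprintf(stderr, "power pulse %02x: only %d of %zu bytes buffered\n",
			cmd, ret, sizeof(cmd));
		return -1;
	}

	return 0;
}
```

The point of the sketch is that a check for ret < 0 alone would treat a full buffer as success, because the helper reports it as a count of 0 rather than as an error.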


[Bala]: Thanks for reminding me about the buffer size.
We use qca_send_power_pulse(), which calls serdev_write_buf(), to send
single-byte commands to the chip during power ON or power OFF,
i.e. as soon as we open the port or just before we close it.


During power on:
Ideally, opening the port guarantees that the tty buffer is empty,
as we have not queued any data yet.
Even if we assume the buffer is full, serdev_device_write_buf()
will return -1 (as the buffer is full), and I already check the
return status to log the write_buf failure.


During power off:
Yes, here we may face an issue, as some data may already be queued
in the buffer due to previous transactions.
I think we can handle this by calling serdev_device_write_flush()
and then calling serdev_device_write_buf().


--
Regards
Balakrishna.


Re: [PATCH 05/12] PCI: aardvark: add suspend to RAM support

2018-12-18 Thread Rafael J. Wysocki
On Monday, December 17, 2018 3:54:26 PM CET Miquel Raynal wrote:
> Hi Rafael,
> 
> "Rafael J. Wysocki"  wrote on Thu, 13 Dec 2018
> 22:50:51 +0100:
> 
> > On Thursday, December 13, 2018 3:30:00 PM CET Miquel Raynal wrote:
> > > Hi Lorenzo,
> > >   
> > > > > If that's really the case, then I can see how one device and its
> > > > > children are suspended and the irq for it is disabled but the providing
> > > > > devices (clk, regulator, bus controller, etc.) are still fully active
> > > > > and not suspended but in fact completely usable and able to service
> > > > > interrupts. If that all makes sense, then I would answer the question
> > > > > with a definitive "yes it's all fine" because the clk consumer could be
> > > > > in the NOIRQ phase of its suspend but the clk provider wouldn't have
> > > > > even started suspending yet when clk_disable_unprepare() is called.
> > > > 
> > > > That's a very good summary and addresses my concern, I still question this
> > > > patch correctness (and many others that carry out clk operations in S2R
> > > > NOIRQ phase), they may work but do not tell me they are rock solid given
> > > > your accurate summary above.  
> > > 
> > > I understand your concern but I don't see any alternative right now
> > > and a deep rework of the PM core to respect such dependency is not
> > > something that can be done in a reasonable amount of time.  
> > 
> > Maybe you don't need to rework anything. :-)
> > 
> > Have you considered using device links?
> 
> Absolutely, yes :) I am actively working on it in parallel, you can
> check the third version there [1]. Stephen Boyd has a slightly
> different idea of how it should be done, I will propose a v4 this week,
> I can add you in copy if you are interested!
> 
> Anyway, there is one thing that is still missing:
> * Let's have device A that requests clock B
> * With the device link series, A is linked (as a child) to B.
> * A suspend/resume hooks handle things in the NOIRQ phase.

Why do you need them to run in the "noirq" phase in the first place?

> * B suspend/resume hooks handle things in the default phase.
> 
> What I expected during a suspend:
> 1/ ->suspend_noirq(device A)
> 2/ ->suspend(clock B)

This expectation is not in agreement with the documented suspend code flow,
however.

Each phase of it is carried out for *all* devices completely before getting
to the next phase, "prepare" first, then "suspend", "suspend_late" and
"suspend_noirq", in this order.

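The phase-by-phase ordering described above can be simulated in a few lines of user-space C. This is only an illustrative model: the device names "A" (the clock consumer) and "B" (the clock provider), the order of devices within a phase, and simulate_suspend() are assumptions made for the sketch, not the PM core's implementation.

```c
#include <stddef.h>
#include <stdio.h>

static const char *const phases[] = {
	"prepare", "suspend", "suspend_late", "suspend_noirq",
};
static const char *const devices[] = { "A", "B" };

/*
 * Write "phase(device) " entries into log in the order the callbacks
 * would run: the outer loop walks the phases, so every device finishes
 * one phase before any device enters the next.
 */
static void simulate_suspend(char *log, size_t size)
{
	size_t used = 0;
	size_t p, d;

	if (size == 0)
		return;
	log[0] = '\0';

	for (p = 0; p < sizeof(phases) / sizeof(phases[0]); p++) {
		for (d = 0; d < sizeof(devices) / sizeof(devices[0]); d++) {
			int n = snprintf(log + used, size - used, "%s(%s) ",
					 phases[p], devices[d]);

			if (n < 0 || (size_t)n >= size - used)
				return;	/* log buffer too small, stop early */
			used += (size_t)n;
		}
	}
}
```

In this model suspend(B) always appears before suspend_noirq(A), whatever the ordering of A and B inside a single phase.
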
> Unfortunately, device links do not seem to enforce any priority between
> phases (default/late/noirq) and what happens is:
> 1/ ->suspend(B)
> 2/ ->suspend_noirq(A)
> Which makes no sense in my case. Hence, I had to request the clock
> suspend/resume callbacks to be upgraded to the NOIRQ phase as well (I
> don't have a better solution for now). This is still under discussion
> in a thread you have been recently added to by Bjorn, see [2].
> 
> So when I told you I was not confident in "reworking the PM core to
> respect such dependency", this is what I was referring to. I am
> definitely ready to help, but I don't feel I can do it alone.
> 
> [1] https://www.spinics.net/lists/linux-clk/msg32824.html
> [2] https://marc.info/?l=linux-pm&m=154465198510735&w=2

The rework you seem to be talking about is not possible, I'm afraid.



[PATCH] regmap: regmap-irq: Remove default irq type setting from core

2018-12-18 Thread Matti Vaittinen
The common code should not set IRQ type. Read HW defaults to the
cache at startup instead of forcing type to EDGE_BOTH. If
default setting is needed this should be done via normal
mechanisms or by chip specific code if normal mechanisms are not
suitable for some reason. Common regmap-irq code should not have
defaults hard-coded but keep the HW/boot defaults untouched.

Signed-off-by: Matti Vaittinen 
---
So let's try removing the hard-coded default setting from generic
regmap-irq code as discussed with Mark here:
https://lore.kernel.org/lkml/20181217180722.gg27...@sirena.org.uk/

Core code should not care about the default trigger level - such
settings should be done by code which knows the target
platform/board.

I was not able to test this change as I have no max77620 which seems to
be the only user of regmap-irq type-setting in tree as of now.

The patch was created on top of the regulator-next tree, with
"regmap: irq: handle HW using separate rising/falling edge interrupts"
from Bartosz Golaszewski cherry-picked. This should still cleanly apply
on regmap-tree.

 drivers/base/regmap/regmap-irq.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/base/regmap/regmap-irq.c b/drivers/base/regmap/regmap-irq.c
index 603b1554f81c..8b216b2e2c19 100644
--- a/drivers/base/regmap/regmap-irq.c
+++ b/drivers/base/regmap/regmap-irq.c
@@ -625,26 +625,20 @@ int regmap_add_irq_chip(struct regmap *map, int irq, int 
irq_flags,
}
 
if (chip->num_type_reg && !chip->type_in_mask) {
-   for (i = 0; i < chip->num_irqs; i++) {
-   reg = chip->irqs[i].type_reg_offset / map->reg_stride;
-   d->type_buf_def[reg] |= chip->irqs[i].type_rising_mask |
-   chip->irqs[i].type_falling_mask;
-   }
for (i = 0; i < chip->num_type_reg; ++i) {
if (!d->type_buf_def[i])
continue;
 
reg = chip->type_base +
(i * map->reg_stride * d->type_reg_stride);
-   if (chip->type_invert)
-   ret = regmap_irq_update_bits(d, reg,
-   d->type_buf_def[i], 0xFF);
-   else
-   ret = regmap_irq_update_bits(d, reg,
-   d->type_buf_def[i], 0x0);
-   if (ret != 0) {
-   dev_err(map->dev,
-   "Failed to set type in 0x%x: %x\n",
+
+   ret = regmap_read(map, reg, &d->type_buf_def[i]);
+
+   if (d->chip->type_invert)
+   d->type_buf_def[i] = ~d->type_buf_def[i];
+
+   if (ret) {
+   dev_err(map->dev, "Failed to get type defaults 
at 0x%x: %d\n",
reg, ret);
goto err_alloc;
}
-- 
2.14.3


-- 
Matti Vaittinen
ROHM Semiconductors

~~~ "I don't think so," said Rene Descartes.  Just then, he vanished ~~~


Re: kernel BUG at fs/inode.c:LINE!

2018-12-18 Thread Amir Goldstein
On Tue, Dec 18, 2018 at 12:43 PM Ian Kent  wrote:
>
> On Mon, 2018-12-17 at 07:21 +, Al Viro wrote:
> > On Sun, Dec 16, 2018 at 10:11:04PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:d14b746c6c1c Add linux-next specific files for 20181214
> > > git tree:   linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1370634740
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=1da6d2d18f803140
> > > dashboard link: 
> > > https://syzkaller.appspot.com/bug?extid=5399ed0832693e29f392
> > > compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> > > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=101032b340
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1653406340
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+5399ed0832693e29f...@syzkaller.appspotmail.com
> > >
> > >  slab_pre_alloc_hook mm/slab.h:423 [inline]
> > >  slab_alloc mm/slab.c:3365 [inline]
> > >  kmem_cache_alloc+0x2c4/0x730 mm/slab.c:3539
> > >  __d_alloc+0xc8/0xb90 fs/dcache.c:1599
> > > [ cut here ]
> > > kernel BUG at fs/inode.c:1566!
> > >  d_alloc_anon fs/dcache.c:1698 [inline]
> > >  d_make_root+0x43/0xc0 fs/dcache.c:1885
> > >  autofs_fill_super+0x6f1/0x1c30 fs/autofs/inode.c:273
> >
> > Huh?  BUG is in iput(), AFAICS, so the stack trace is rather misreported.
> > iput() can be called by d_make_root(), provided that dentry allocation
> > fails.  So the most straightforward interpretation would be that we
> > had an allocation failure (injected?), followed by iput() of the inode
> > passed to d_make_root().  Which happened to find I_CLEAR in ->i_state
> > of that inode somehow, which should be impossible short of seriously
> > buggered inode refcounting somewhere - the inode has just been returned
> > by new_inode(), which clears i_state, and it would have to have passed
> > clear_inode() (i.e. has been through inode eviction) since then...
>
> Sorry Al, that's my bad.
>
> See 
> https://www.ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-possible-inode-leak-in-autofs_fill_super.patch
>
> I think this will fix it, I'll forward it to Andrew if you agree:
>
> autofs - fix handling of d_make_root() return in autofs_fill_super()

You realize you can just revert that patch.
d_make_root() can take NULL inode as argument.
At the very least, please mention the offending commit with Fixes tag.

Thanks,
Amir.

>
> From: Ian Kent 
>
> A previous change to handle a possible inode leak in autofs_fill_super()
> added an iput() on d_make_root() failure but d_make_root() already puts
> the passed in inode on failure.
>
> Reported-by: syzbot+5399ed0832693e29f...@syzkaller.appspotmail.com
> Signed-off-by: Ian Kent 
> ---
>  fs/autofs/inode.c |4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c
> index 501833cc49a8..953f76b95172 100644
> --- a/fs/autofs/inode.c
> +++ b/fs/autofs/inode.c
> @@ -271,7 +271,7 @@ int autofs_fill_super(struct super_block *s, void *data, 
> int silent)
> }
> root = d_make_root(root_inode);
> if (!root)
> -   goto fail_iput;
> +   goto fail_ino;
> pipe = NULL;
>
> root->d_fsdata = ino;
> @@ -347,8 +347,6 @@ int autofs_fill_super(struct super_block *s, void *data, 
> int silent)
>  fail_dput:
> dput(root);
> goto fail_free;
> -fail_iput:
> -   iput(root_inode);
>  fail_ino:
> autofs_free_ino(ino);
>  fail_free:
>


[PATCH net-next 07/12] net: hns3: update coalesce param per second

2018-12-18 Thread Peng Li
The coalesce parameters are updated once every 100 NAPI polls. When a
ping test follows a high-rate flow, the update can come late: ping sends
one packet per second, so it may take a long time before NAPI poll has
been called 100 times.

This patch updates the coalesce parameters once per second instead of
once every 100 NAPI polls. The update still cannot be guaranteed to be
perfectly timely, but the lag is very short.

Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 8 +++-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 4 
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 354eca8..d060029 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -240,7 +240,6 @@ static void hns3_vector_gl_rl_init(struct 
hns3_enet_tqp_vector *tqp_vector,
tqp_vector->tx_group.coal.int_gl = HNS3_INT_GL_50K;
tqp_vector->rx_group.coal.int_gl = HNS3_INT_GL_50K;
 
-   tqp_vector->int_adapt_down = HNS3_INT_ADAPT_DOWN_START;
tqp_vector->rx_group.coal.flow_level = HNS3_FLOW_LOW;
tqp_vector->tx_group.coal.flow_level = HNS3_FLOW_LOW;
 }
@@ -2846,10 +2845,10 @@ static void hns3_update_new_int_gl(struct 
hns3_enet_tqp_vector *tqp_vector)
struct hns3_enet_ring_group *tx_group = &tqp_vector->tx_group;
bool rx_update, tx_update;
 
-   if (tqp_vector->int_adapt_down > 0) {
-   tqp_vector->int_adapt_down--;
+   /* update param every 1000ms */
+   if (time_before(jiffies,
+   tqp_vector->last_jiffies + msecs_to_jiffies(1000)))
return;
-   }
 
if (rx_group->coal.gl_adapt_enable) {
rx_update = hns3_get_new_int_gl(rx_group);
@@ -2866,7 +2865,6 @@ static void hns3_update_new_int_gl(struct 
hns3_enet_tqp_vector *tqp_vector)
}
 
tqp_vector->last_jiffies = jiffies;
-   tqp_vector->int_adapt_down = HNS3_INT_ADAPT_DOWN_START;
 }
 
 static int hns3_nic_common_poll(struct napi_struct *napi, int budget)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h 
b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 4e4b48b..e55995e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -476,8 +476,6 @@ enum hns3_link_mode_bits {
 #define HNS3_INT_RL_MAX0x00EC
 #define HNS3_INT_RL_ENABLE_MASK0x40
 
-#define HNS3_INT_ADAPT_DOWN_START  100
-
 struct hns3_enet_coalesce {
u16 int_gl;
u8 gl_adapt_enable;
@@ -512,8 +510,6 @@ struct hns3_enet_tqp_vector {
 
char name[HNAE3_INT_NAME_LEN];
 
-   /* when 0 should adjust interrupt coalesce parameter */
-   u8 int_adapt_down;
unsigned long last_jiffies;
 } cacheline_internodealigned_in_smp;
 
-- 
1.9.1
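
For readers unfamiliar with the time-based check used in the patch above, here is a user-space sketch of a wraparound-safe "not yet due" test on a free-running tick counter. tick_before(), vector_state and should_update() are hypothetical names, and a 32-bit counter is assumed for brevity; the kernel's jiffies helpers are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if tick a comes before tick b, even across counter wraparound. */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

struct vector_state {
	uint32_t last_update;	/* tick of the last parameter update */
};

/*
 * Return true (and record the time) only when at least `interval`
 * ticks have passed since the last update.
 */
static bool should_update(struct vector_state *v, uint32_t now,
			  uint32_t interval)
{
	if (tick_before(now, v->last_update + interval))
		return false;

	v->last_update = now;
	return true;
}
```

Comparing the signed difference rather than the raw values is what keeps the check correct when the counter wraps between two updates.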



[PATCH net-next 12/12] net: hns3: fix a SSU buffer checking bug

2018-12-18 Thread Peng Li
From: Yunsheng Lin 

When calculating the SSU buffer, the driver first allocates the tx and
rx private buffers; the remaining buffer is then used for the rx
shared buffer. The remaining buffer size should be
bigger than or equal to shared_std, which is the
minimum shared buffer size required by the driver, but
currently, if the remaining buffer size is equal to
shared_std, the check returns failure, which causes the SSU buffer
allocation to fail.

This patch fixes the problem by rounding up shared_std before
checking that the remaining buffer size is bigger than or equal to
shared_std.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility 
Layer Support")
Signed-off-by: Yunsheng Lin 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index f847fde..d0e84de 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1403,10 +1403,11 @@ static bool  hclge_is_rx_buf_ok(struct hclge_dev *hdev,
shared_buf_tc = pfc_enable_num * aligned_mps +
(tc_num - pfc_enable_num) * aligned_mps / 2 +
aligned_mps;
-   shared_std = max_t(u32, shared_buf_min, shared_buf_tc);
+   shared_std = roundup(max_t(u32, shared_buf_min, shared_buf_tc),
+HCLGE_BUF_SIZE_UNIT);
 
rx_priv = hclge_get_rx_priv_buff_alloced(buf_alloc);
-   if (rx_all <= rx_priv + shared_std)
+   if (rx_all < rx_priv + shared_std)
return false;
 
shared_buf = rounddown(rx_all - rx_priv, HCLGE_BUF_SIZE_UNIT);
-- 
1.9.1
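
The boundary condition fixed by the SSU buffer patch above can be sketched in plain C. This is only an illustration under assumptions, not the driver code: BUF_SIZE_UNIT stands in for HCLGE_BUF_SIZE_UNIT, and the helper names are hypothetical. Rounding the minimum up first matters because the shared buffer is later rounded down to the same unit; once both sides are aligned, an exact fit is acceptable and the comparison can be a strict "<".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's HCLGE_BUF_SIZE_UNIT (256 bytes). */
#define BUF_SIZE_UNIT 256u

/* Round x up to a multiple of unit (unit must be non-zero). */
static uint32_t round_up_u32(uint32_t x, uint32_t unit)
{
	return ((x + unit - 1) / unit) * unit;
}

/* Round x down to a multiple of unit (unit must be non-zero). */
static uint32_t round_down_u32(uint32_t x, uint32_t unit)
{
	return (x / unit) * unit;
}

/*
 * Return true when the space left after the private buffers can hold the
 * minimum shared buffer once that minimum is aligned to BUF_SIZE_UNIT.
 */
static bool rx_shared_buf_fits(uint32_t rx_all, uint32_t rx_priv,
			       uint32_t shared_min)
{
	uint32_t shared_std = round_up_u32(shared_min, BUF_SIZE_UNIT);

	/* "<" rather than "<=": an exact fit is good enough. */
	if (rx_all < rx_priv + shared_std)
		return false;

	return true;
}
```

With an unaligned minimum of 300 bytes, the aligned requirement becomes 512 bytes, so a remainder of exactly 300 bytes is rejected while a remainder of exactly 512 bytes is accepted.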



[PATCH net-next 09/12] net: hns3: synchronize speed and duplex from phy when phy link up

2018-12-18 Thread Peng Li
The driver calls phy_connect_direct() and registers
hclge_mac_adjust_link() to synchronize the mac speed and duplex from
the phy. It is better to synchronize them only when the phy link is up.

Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
index d8ef436..dabb843 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
@@ -179,6 +179,10 @@ static void hclge_mac_adjust_link(struct net_device *netdev)
int duplex, speed;
int ret;
 
+   /* When phy link down, do nothing */
+   if (netdev->phydev->link == 0)
+   return;
+
speed = netdev->phydev->speed;
duplex = netdev->phydev->duplex;
 
-- 
1.9.1



[PATCH net-next 06/12] net: hns3: fix incomplete uninitialization of IRQ in the hns3_nic_uninit_vector_data()

2018-12-18 Thread Peng Li
From: Huazhong Tan 

In hns3_nic_uninit_vector_data(), the procedure that uninitializes
the tqp_vector's IRQ does not set affinity_notify to NULL or change
its init flag. This patch fixes it. Also, for simplification, the local
variable tqp_vector is used instead of priv->tqp_vector[i].

Fixes: 424eb834a9be ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index e7cde08..354eca8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3181,12 +3181,12 @@ static int hns3_nic_uninit_vector_data(struct hns3_nic_priv *priv)
 
hns3_free_vector_ring_chain(tqp_vector, _ring_chain);
 
-   if (priv->tqp_vector[i].irq_init_flag == HNS3_VECTOR_INITED) {
-   (void)irq_set_affinity_hint(
-   priv->tqp_vector[i].vector_irq,
-   NULL);
-   free_irq(priv->tqp_vector[i].vector_irq,
->tqp_vector[i]);
+   if (tqp_vector->irq_init_flag == HNS3_VECTOR_INITED) {
+   irq_set_affinity_notifier(tqp_vector->vector_irq,
+ NULL);
+   irq_set_affinity_hint(tqp_vector->vector_irq, NULL);
+   free_irq(tqp_vector->vector_irq, tqp_vector);
+   tqp_vector->irq_init_flag = HNS3_VECTOR_NOT_INITED;
}
 
priv->ring_data[i].ring->irq_init_flag = HNS3_VECTOR_NOT_INITED;
-- 
1.9.1



[PATCH net-next 03/12] net: hns3: fix napi_disable not return problem

2018-12-18 Thread Peng Li
From: Huazhong Tan 

While doing DOWN, the call to napi_disable() may never return, since
napi_complete() in hns3_nic_common_poll() is never called once
HNS3_NIC_STATE_DOWN is set. So we need to call napi_complete() before
checking HNS3_NIC_STATE_DOWN.

Fixes: ff0699e04b97 ("net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set")
Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 2081e2e..e7cde08 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2909,8 +2909,8 @@ static int hns3_nic_common_poll(struct napi_struct *napi, int budget)
if (!clean_complete)
return budget;
 
-   if (likely(!test_bit(HNS3_NIC_STATE_DOWN, >state)) &&
-   napi_complete(napi)) {
+   if (napi_complete(napi) &&
+   likely(!test_bit(HNS3_NIC_STATE_DOWN, >state))) {
hns3_update_new_int_gl(tqp_vector);
hns3_mask_vector_irq(tqp_vector, 1);
}
-- 
1.9.1
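
The napi_disable() fix above comes down to the evaluation order of a short-circuit "&&". The following is a minimal user-space sketch of that effect, not the driver code: struct poll_ctx and the functions below are hypothetical stand-ins for the NAPI state and for napi_complete().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the polling state; not the kernel's napi_struct. */
struct poll_ctx {
	bool scheduled;   /* cleared once the poll is completed */
	bool down;        /* the device is being brought down */
	int irq_unmasked; /* counts how often the interrupt was re-enabled */
};

/* Stand-in for napi_complete(): mark the poll as finished. */
static bool complete_poll(struct poll_ctx *ctx)
{
	ctx->scheduled = false;
	return true;
}

/* Old order: completion is skipped whenever ctx->down is set. */
static void finish_poll_old(struct poll_ctx *ctx)
{
	if (!ctx->down && complete_poll(ctx))
		ctx->irq_unmasked++;
}

/* New order: always complete first, then decide whether to unmask. */
static void finish_poll_new(struct poll_ctx *ctx)
{
	if (complete_poll(ctx) && !ctx->down)
		ctx->irq_unmasked++;
}
```

In the old order a context that is both scheduled and down stays scheduled forever, which is the condition a waiter in the disable path would spin on. In the new order it is always completed, and the interrupt is still left masked while the device is down.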



[PATCH net-next 01/12] net: hns3: fix error handling in the hns3_get_vector_ring_chain

2018-12-18 Thread Peng Li
From: Huazhong Tan 

When hns3_get_vector_ring_chain() fails in
hns3_nic_init_vector_data(), the caller should do the error handling
instead of returning directly.

Also, cur_chain should be freed instead of chain, and head->next should
be set to NULL in the error handling of hns3_get_vector_ring_chain().

This patch fixes them.

Fixes: 73b907a083b8 ("net: hns3: bugfix for buffer not free problem during resetting")
Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 69142a3..2081e2e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2993,9 +2993,10 @@ static int hns3_get_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
cur_chain = head->next;
while (cur_chain) {
chain = cur_chain->next;
-   devm_kfree(>dev, chain);
+   devm_kfree(>dev, cur_chain);
cur_chain = chain;
}
+   head->next = NULL;
 
return -ENOMEM;
 }
@@ -3086,7 +3087,7 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
ret = hns3_get_vector_ring_chain(tqp_vector,
 _ring_chain);
if (ret)
-   return ret;
+   goto map_ring_fail;
 
ret = h->ae_algo->ops->map_ring_to_vector(h,
tqp_vector->vector_irq, _ring_chain);
-- 
1.9.1
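
The chain cleanup fixed in the patch above follows a common pattern: save the next pointer, free the current node, then advance, and finally reset the head so nothing dangles. Below is a minimal user-space sketch of that pattern, assuming malloc/free in place of the devm allocators; struct chain_node and both helpers are hypothetical names, not the driver's.

```c
#include <stdlib.h>

/* Hypothetical singly linked node; the head itself is not heap-allocated. */
struct chain_node {
	struct chain_node *next;
};

/* Append count nodes after head; returns 0 on success, -1 on failure. */
static int chain_build(struct chain_node *head, int count)
{
	struct chain_node *tail = head;
	int i;

	head->next = NULL;
	for (i = 0; i < count; i++) {
		struct chain_node *node = calloc(1, sizeof(*node));

		if (!node)
			return -1;
		tail->next = node;
		tail = node;
	}

	return 0;
}

/* Free every node after head and reset head->next; returns nodes freed. */
static int chain_free_after(struct chain_node *head)
{
	struct chain_node *cur = head->next;
	int freed = 0;

	while (cur) {
		struct chain_node *next = cur->next; /* save before freeing */

		free(cur); /* free the current node, not the saved one */
		cur = next;
		freed++;
	}
	head->next = NULL; /* leave no dangling pointer behind */

	return freed;
}
```

Freeing the saved next pointer instead of the current node, as the old code did, leaks the first node and leaves the loop reading memory that has already been released. Resetting head->next also makes a second cleanup call harmless.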



Re: [PATCH] perf/x86/intel: Avoid unnecessary reallocations of memory allocated in cpu hotplug prepare state

2018-12-18 Thread Peter Zijlstra
On Tue, Dec 18, 2018 at 06:30:33PM +0800, zhe...@windriver.com wrote:
> Besides, in preempt-rt full mode, the freeing can happen in atomic context and
> thus cause the following BUG.

Hurm, I thought we fixed all those long ago..

And no, the patch is horrible; that's what we have things like
x86_pmu::cpu_dead() for.


[PATCH net-next 04/12] net: hns3: update some variables while hclge_reset()/hclgevf_reset() done

2018-12-18 Thread Peng Li
From: Huazhong Tan 

When hclge_reset() completes successfully, it should update the
last_reset_time, set reset_fail_cnt to 0, and set reset_type of
hnae3_ae_dev to HNAE3_NONE_RESET.

Also when hclgevf_reset() completes successfully, it should update
the last_reset_time, and set reset_type of hnae3_ae_dev to
HNAE3_NONE_RESET.

Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c   | 5 -
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3fe08cf..a8a2ccf 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2810,7 +2810,6 @@ static void hclge_reset(struct hclge_dev *hdev)
 */
ae_dev->reset_type = hdev->reset_type;
hdev->reset_count++;
-   hdev->last_reset_time = jiffies;
/* perform reset of the stack & ae device for a client */
ret = hclge_notify_roce_client(hdev, HNAE3_DOWN_CLIENT);
if (ret)
@@ -2873,6 +2872,10 @@ static void hclge_reset(struct hclge_dev *hdev)
if (ret)
goto err_reset;
 
+   hdev->last_reset_time = jiffies;
+   hdev->reset_fail_cnt = 0;
+   ae_dev->reset_type = HNAE3_NONE_RESET;
+
return;
 
 err_reset_lock:
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 86596ee..54ba93a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1342,6 +1342,9 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
 
rtnl_unlock();
 
+   hdev->last_reset_time = jiffies;
+   ae_dev->reset_type = HNAE3_NONE_RESET;
+
return ret;
 err_reset_lock:
rtnl_unlock();
-- 
1.9.1



[PATCH net-next 00/12] net: hns3: code optimizations & bugfixes for HNS3 driver

2018-12-18 Thread Peng Li
This patchset includes bugfixes and code optimizations for the HNS3
ethernet controller driver

Fuyun Liang (1):
  net: hns3: remove 1000M/half support of phy

Huazhong Tan (6):
  net: hns3: fix error handling in the hns3_get_vector_ring_chain
  net: hns3: uninitialize pci in the hclgevf_uninit
  net: hns3: fix napi_disable not return problem
  net: hns3: update some variables while hclge_reset()/hclgevf_reset()
done
  net: hns3: remove unnecessary configuration recapture while resetting
  net: hns3: fix incomplete uninitialization of IRQ in the
hns3_nic_uninit_vector_data()

Peng Li (2):
  net: hns3: update coalesce param per second
  net: hns3: synchronize speed and duplex from phy when phy link up

Yunsheng Lin (3):
  net: hns3: getting tx and dv buffer size through firmware
  net: hns3: aligning buffer size in SSU to 256 bytes
  net: hns3: fix a SSU buffer checking bug

 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c| 29 ---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h|  4 -
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |  5 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 95 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h|  3 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c|  6 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  5 +-
 7 files changed, 88 insertions(+), 59 deletions(-)

-- 
1.9.1



[PATCH net-next 10/12] net: hns3: getting tx and dv buffer size through firmware

2018-12-18 Thread Peng Li
From: Yunsheng Lin 

This patch adds support for getting the tx and dv buffer sizes from
firmware, because different versions of the hardware require different
tx and dv buffer sizes.

This patch also adds dv_buf_size to the tc's private buffer size even if
pfc is not enabled for the tc.

Signed-off-by: Yunsheng Lin 
Signed-off-by: Peng Li 
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |  5 ++-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 41 --
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h|  3 ++
 3 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 4771780..f23042b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -420,7 +420,9 @@ struct hclge_pf_res_cmd {
 #define HCLGE_PF_VEC_NUM_M GENMASK(7, 0)
__le16 pf_intr_vector_number;
__le16 pf_own_fun_number;
-   __le32 rsv[3];
+   __le16 tx_buf_size;
+   __le16 dv_buf_size;
+   __le32 rsv[2];
 };
 
 #define HCLGE_CFG_OFFSET_S 0
@@ -839,6 +841,7 @@ struct hclge_serdes_lb_cmd {
 #define HCLGE_TOTAL_PKT_BUF	0x108000 /* 1.03125M bytes */
 #define HCLGE_DEFAULT_DV   0xA000   /* 40k byte */
 #define HCLGE_DEFAULT_NON_DCB_DV   0x7800  /* 30K byte */
+#define HCLGE_NON_DCB_ADDITIONAL_BUF   0x200   /* 512 byte */
 
 #define HCLGE_TYPE_CRQ 0
 #define HCLGE_TYPE_CSQ 1
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index b66eee9..c52e903 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -687,6 +687,18 @@ static int hclge_query_pf_resource(struct hclge_dev *hdev)
hdev->num_tqps = __le16_to_cpu(req->tqp_num);
hdev->pkt_buf_size = __le16_to_cpu(req->buf_size) << HCLGE_BUF_UNIT_S;
 
+   if (req->tx_buf_size)
+   hdev->tx_buf_size =
+   __le16_to_cpu(req->tx_buf_size) << HCLGE_BUF_UNIT_S;
+   else
+   hdev->tx_buf_size = HCLGE_DEFAULT_TX_BUF;
+
+   if (req->dv_buf_size)
+   hdev->dv_buf_size =
+   __le16_to_cpu(req->dv_buf_size) << HCLGE_BUF_UNIT_S;
+   else
+   hdev->dv_buf_size = HCLGE_DEFAULT_DV;
+
if (hnae3_dev_roce_supported(hdev)) {
hdev->roce_base_msix_offset =
hnae3_get_field(__le16_to_cpu(req->msixcap_localid_ba_rocee),
@@ -1376,9 +1388,10 @@ static bool  hclge_is_rx_buf_ok(struct hclge_dev *hdev,
pfc_enable_num = hclge_get_pfc_enalbe_num(hdev);
 
if (hnae3_dev_dcb_supported(hdev))
-   shared_buf_min = 2 * hdev->mps + HCLGE_DEFAULT_DV;
+   shared_buf_min = 2 * hdev->mps + hdev->dv_buf_size;
else
-   shared_buf_min = 2 * hdev->mps + HCLGE_DEFAULT_NON_DCB_DV;
+   shared_buf_min = hdev->mps + HCLGE_NON_DCB_ADDITIONAL_BUF
+   + hdev->dv_buf_size;
 
shared_buf_tc = pfc_enable_num * hdev->mps +
(tc_num - pfc_enable_num) * hdev->mps / 2 +
@@ -1391,8 +1404,15 @@ static bool  hclge_is_rx_buf_ok(struct hclge_dev *hdev,
 
shared_buf = rx_all - rx_priv;
buf_alloc->s_buf.buf_size = shared_buf;
-   buf_alloc->s_buf.self.high = shared_buf;
-   buf_alloc->s_buf.self.low =  2 * hdev->mps;
+   if (hnae3_dev_dcb_supported(hdev)) {
+   buf_alloc->s_buf.self.high = shared_buf - hdev->dv_buf_size;
+   buf_alloc->s_buf.self.low = buf_alloc->s_buf.self.high
+   - hdev->mps / 2;
+   } else {
+   buf_alloc->s_buf.self.high = hdev->mps +
+   HCLGE_NON_DCB_ADDITIONAL_BUF;
+   buf_alloc->s_buf.self.low = hdev->mps / 2;
+   }
 
for (i = 0; i < HCLGE_MAX_TC_NUM; i++) {
if ((hdev->hw_tc_map & BIT(i)) &&
@@ -1419,11 +1439,11 @@ static int hclge_tx_buffer_calc(struct hclge_dev *hdev,
for (i = 0; i < HCLGE_MAX_TC_NUM; i++) {
struct hclge_priv_buf *priv = _alloc->priv_buf[i];
 
-   if (total_size < HCLGE_DEFAULT_TX_BUF)
+   if (total_size < hdev->tx_buf_size)
return -ENOMEM;
 
if (hdev->hw_tc_map & BIT(i))
-   priv->tx_buf_size = HCLGE_DEFAULT_TX_BUF;
+   priv->tx_buf_size = hdev->tx_buf_size;
else
priv->tx_buf_size = 0;
 
@@ -1469,11 +1489,12 @@ static int hclge_rx_buffer_calc(struct hclge_dev *hdev,
priv->wl.low = aligned_mps;
priv->wl.high = priv->wl.low + aligned_mps;
 

[PATCH net-next 11/12] net: hns3: aligning buffer size in SSU to 256 bytes

2018-12-18 Thread Peng Li
From: Yunsheng Lin 

The hardware expects the buffer sizes configured in the SSU to be
aligned to 256 bytes. This patch aligns the buffer sizes to 256 bytes
using the roundup() or rounddown() function.

Signed-off-by: Yunsheng Lin 
Signed-off-by: Peng Li 
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 43 +-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index c52e903..f847fde 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -26,6 +26,8 @@
 #define HCLGE_STATS_READ(p, offset) (*((u64 *)((u8 *)(p) + (offset))))
 #define HCLGE_MAC_STATS_FIELD_OFF(f) (offsetof(struct hclge_mac_stats, f))
 
+#define HCLGE_BUF_SIZE_UNIT	256
+
 static int hclge_set_mac_mtu(struct hclge_dev *hdev, int new_mps);
 static int hclge_init_vlan_config(struct hclge_dev *hdev);
 static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev);
@@ -693,12 +695,16 @@ static int hclge_query_pf_resource(struct hclge_dev *hdev)
else
hdev->tx_buf_size = HCLGE_DEFAULT_TX_BUF;
 
+   hdev->tx_buf_size = roundup(hdev->tx_buf_size, HCLGE_BUF_SIZE_UNIT);
+
if (req->dv_buf_size)
hdev->dv_buf_size =
__le16_to_cpu(req->dv_buf_size) << HCLGE_BUF_UNIT_S;
else
hdev->dv_buf_size = HCLGE_DEFAULT_DV;
 
+   hdev->dv_buf_size = roundup(hdev->dv_buf_size, HCLGE_BUF_SIZE_UNIT);
+
if (hnae3_dev_roce_supported(hdev)) {
hdev->roce_base_msix_offset =
hnae3_get_field(__le16_to_cpu(req->msixcap_localid_ba_rocee),
@@ -1380,48 +1386,50 @@ static bool  hclge_is_rx_buf_ok(struct hclge_dev *hdev,
 {
u32 shared_buf_min, shared_buf_tc, shared_std;
int tc_num, pfc_enable_num;
-   u32 shared_buf;
+   u32 shared_buf, aligned_mps;
u32 rx_priv;
int i;
 
tc_num = hclge_get_tc_num(hdev);
pfc_enable_num = hclge_get_pfc_enalbe_num(hdev);
+   aligned_mps = roundup(hdev->mps, HCLGE_BUF_SIZE_UNIT);
 
if (hnae3_dev_dcb_supported(hdev))
-   shared_buf_min = 2 * hdev->mps + hdev->dv_buf_size;
+   shared_buf_min = 2 * aligned_mps + hdev->dv_buf_size;
else
-   shared_buf_min = hdev->mps + HCLGE_NON_DCB_ADDITIONAL_BUF
+   shared_buf_min = aligned_mps + HCLGE_NON_DCB_ADDITIONAL_BUF
+ hdev->dv_buf_size;
 
-   shared_buf_tc = pfc_enable_num * hdev->mps +
-   (tc_num - pfc_enable_num) * hdev->mps / 2 +
-   hdev->mps;
+   shared_buf_tc = pfc_enable_num * aligned_mps +
+   (tc_num - pfc_enable_num) * aligned_mps / 2 +
+   aligned_mps;
shared_std = max_t(u32, shared_buf_min, shared_buf_tc);
 
rx_priv = hclge_get_rx_priv_buff_alloced(buf_alloc);
if (rx_all <= rx_priv + shared_std)
return false;
 
-   shared_buf = rx_all - rx_priv;
+   shared_buf = rounddown(rx_all - rx_priv, HCLGE_BUF_SIZE_UNIT);
buf_alloc->s_buf.buf_size = shared_buf;
if (hnae3_dev_dcb_supported(hdev)) {
buf_alloc->s_buf.self.high = shared_buf - hdev->dv_buf_size;
buf_alloc->s_buf.self.low = buf_alloc->s_buf.self.high
-   - hdev->mps / 2;
+   - roundup(aligned_mps / 2, HCLGE_BUF_SIZE_UNIT);
} else {
-   buf_alloc->s_buf.self.high = hdev->mps +
+   buf_alloc->s_buf.self.high = aligned_mps +
HCLGE_NON_DCB_ADDITIONAL_BUF;
-   buf_alloc->s_buf.self.low = hdev->mps / 2;
+   buf_alloc->s_buf.self.low =
+   roundup(aligned_mps / 2, HCLGE_BUF_SIZE_UNIT);
}
 
for (i = 0; i < HCLGE_MAX_TC_NUM; i++) {
if ((hdev->hw_tc_map & BIT(i)) &&
(hdev->tm_info.hw_pfc_map & BIT(i))) {
-   buf_alloc->s_buf.tc_thrd[i].low = hdev->mps;
-   buf_alloc->s_buf.tc_thrd[i].high = 2 * hdev->mps;
+   buf_alloc->s_buf.tc_thrd[i].low = aligned_mps;
+   buf_alloc->s_buf.tc_thrd[i].high = 2 * aligned_mps;
} else {
buf_alloc->s_buf.tc_thrd[i].low = 0;
-   buf_alloc->s_buf.tc_thrd[i].high = hdev->mps;
+   buf_alloc->s_buf.tc_thrd[i].high = aligned_mps;
}
}
 
@@ -1461,7 +1469,6 @@ static int hclge_tx_buffer_calc(struct hclge_dev *hdev,
 static int hclge_rx_buffer_calc(struct hclge_dev *hdev,
struct hclge_pkt_buf_alloc *buf_alloc)
 {
-#define HCLGE_BUF_SIZE_UNIT	128
u32 rx_all = hdev->pkt_buf_size, aligned_mps;
  

[PATCH net-next 08/12] net: hns3: remove 1000M/half support of phy

2018-12-18 Thread Peng Li
From: Fuyun Liang 

Our phy does not support 1000M/half, so this patch removes 1000M/half
from PHY_SUPPORTED_FEATURES.

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
index 741cb3b..d8ef436 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
@@ -12,7 +12,7 @@
 SUPPORTED_TP | \
 PHY_10BT_FEATURES | \
 PHY_100BT_FEATURES | \
-PHY_1000BT_FEATURES)
+SUPPORTED_1000baseT_Full)
 
 enum hclge_mdio_c22_op_seq {
HCLGE_MDIO_C22_WRITE = 1,
-- 
1.9.1



[PATCH net-next 05/12] net: hns3: remove unnecessary configuration recapture while resetting

2018-12-18 Thread Peng Li
From: Huazhong Tan 

When doing a reset, it is unnecessary to get the hardware's default
configuration again; doing so overwrites the user's
configuration.

Fixes: 4ed340ab8f49 ("net: hns3: Add reset process in hclge_main")
Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index a8a2ccf..b66eee9 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -7380,19 +7380,6 @@ static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev)
return ret;
}
 
-   ret = hclge_get_cap(hdev);
-   if (ret) {
-   dev_err(>dev, "get hw capability error, ret = %d.\n",
-   ret);
-   return ret;
-   }
-
-   ret = hclge_configure(hdev);
-   if (ret) {
-   dev_err(>dev, "Configure dev error, ret = %d.\n", ret);
-   return ret;
-   }
-
ret = hclge_map_tqp(hdev);
if (ret) {
dev_err(>dev, "Map tqp error, ret = %d.\n", ret);
-- 
1.9.1



[PATCH net-next 02/12] net: hns3: uninitialize pci in the hclgevf_uninit

2018-12-18 Thread Peng Li
From: Huazhong Tan 

hclgevf_pci_reset() only uninitializes and initializes
the msi, so if the initialization fails, hclgevf_uninit_hdev()
does not need to uninitialize the msi, but does need to uninitialize
the pci; otherwise the pci resources are never freed.

Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR")
Signed-off-by: Huazhong Tan 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 75327dc..86596ee 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2401,9 +2401,9 @@ static void hclgevf_uninit_hdev(struct hclgevf_dev *hdev)
if (test_bit(HCLGEVF_STATE_IRQ_INITED, >state)) {
hclgevf_misc_irq_uninit(hdev);
hclgevf_uninit_msi(hdev);
-   hclgevf_pci_uninit(hdev);
}
 
+   hclgevf_pci_uninit(hdev);
hclgevf_cmd_uninit(hdev);
 }
 
-- 
1.9.1



[PATCH v3] f2fs: fix sbi->extent_list corruption issue

2018-12-18 Thread Sahitya Tummala
When there is a failure in f2fs_fill_super() after/during
the recovery of fsync'd nodes, it frees the current sbi and
retries. This time the mount is successful, but the files
that got recovered before the retry still hold the extent tree,
whose extent node list is corrupted since sbi and sbi->extent_list
have been freed. The list_del corruption issue is observed when the
file system is getting unmounted and the extent nodes of those
recovered files are being freed in the context below.

list_del corruption. prev->next should be fff1e1ef5480, but was (null)
<...>
kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
lr : __list_del_entry_valid+0x94/0xb4
pc : __list_del_entry_valid+0x94/0xb4
<...>
Call trace:
__list_del_entry_valid+0x94/0xb4
__release_extent_node+0xb0/0x114
__free_extent_tree+0x58/0x7c
f2fs_shrink_extent_tree+0xdc/0x3b0
f2fs_leave_shrinker+0x28/0x7c
f2fs_put_super+0xfc/0x1e0
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
__cleanup_mnt+0x1c/0x28
task_work_run+0x48/0xd0
do_notify_resume+0x678/0xe98
work_pending+0x8/0x14

Fix this by not creating extents for those recovered files if shrinker is
not registered yet. Once mount is successful and shrinker is registered,
those files can have extents again.

Signed-off-by: Sahitya Tummala 
---
v3:
-do not create extents itself in the first place for those recovered files,
instead of cleaning it up later via sync/evict_inodes.

 fs/f2fs/f2fs.h | 11 ++-
 fs/f2fs/shrinker.c |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 7cec897..1380f07 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2695,10 +2695,19 @@ static inline bool is_dot_dotdot(const struct qstr *str)
 
 static inline bool f2fs_may_extent_tree(struct inode *inode)
 {
-   if (!test_opt(F2FS_I_SB(inode), EXTENT_CACHE) ||
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+
+   if (!test_opt(sbi, EXTENT_CACHE) ||
is_inode_flag_set(inode, FI_NO_EXTENT))
return false;
 
+   /*
+* for recovered files during mount do not create extents
+* if shrinker is not registered.
+*/
+   if (list_empty(>s_list))
+   return false;
+
return S_ISREG(inode->i_mode);
 }
 
diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
index 9e13db9..a467aca 100644
--- a/fs/f2fs/shrinker.c
+++ b/fs/f2fs/shrinker.c
@@ -135,6 +135,6 @@ void f2fs_leave_shrinker(struct f2fs_sb_info *sbi)
f2fs_shrink_extent_tree(sbi, __count_extent_cache(sbi));
 
spin_lock(_list_lock);
-   list_del(>s_list);
+   list_del_init(>s_list);
spin_unlock(_list_lock);
 }
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
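
The f2fs patch above relies on the difference between removing a list entry and removing it with re-initialization: only the latter leaves the entry pointing at itself, so a later emptiness test on that entry is meaningful. The sketch below reimplements a tiny circular doubly linked list in user space to show this. It is an illustration, not the kernel's list.h, and all names are hypothetical.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list node, modelled on the kernel idea. */
struct list_node {
	struct list_node *prev;
	struct list_node *next;
};

/* An initialised node points at itself and therefore counts as empty. */
static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Insert n directly after head. */
static void list_add_after(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and re-initialise it, so list_is_empty(n) is true afterwards. */
static void list_remove_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}
```

With this shape, "entry is empty" can double as "entry is not registered on any list", which is how the new check in f2fs_may_extent_tree() uses s_list before the shrinker is joined and after it is left.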



[PATCH] cpuidle: New timer events oriented governor for tickless systems

2018-12-18 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

The venerable menu governor does some things that are quite
questionable in my view.

First, it includes timer wakeups in the pattern detection data and
mixes them up with wakeups from other sources which in some cases
causes it to expect what essentially would be a timer wakeup in a
time frame in which no timer wakeups are possible (because it knows
the time until the next timer event and that is later than the
expected wakeup time).

Second, it uses the extra exit latency limit based on the predicted
idle duration and depending on the number of tasks waiting on I/O,
even though those tasks may run on a different CPU when they are
woken up.  Moreover, the time ranges used by it for the sleep length
correction factors depend on whether or not there are tasks waiting
on I/O, which again doesn't imply anything in particular, and they
are not correlated to the list of available idle states in any way
whatever.

Also, the pattern detection code in menu may end up considering
values that are too large to matter at all, in which cases running
it is a waste of time.

A major rework of the menu governor would be required to address
these issues and the performance of at least some workloads (tuned
specifically to the current behavior of the menu governor) is likely
to suffer from that.  It is thus better to introduce an entirely new
governor without them and let everybody use the governor that works
better with their actual workloads.

The new governor introduced here, the timer events oriented (TEO)
governor, uses the same basic strategy as menu: it always tries to
find the deepest idle state that can be used in the given conditions.
However, it applies a different approach to that problem.

First, it doesn't use "correction factors" for the time till the
closest timer, but instead it tries to correlate the measured idle
duration values with the available idle states and use that
information to pick up the idle state that is most likely to "match"
the upcoming CPU idle interval.

Second, it doesn't take the number of "I/O waiters" into account at
all and the pattern detection code in it avoids taking timer wakeups
into account.  It also only uses idle duration values less than the
current time till the closest timer (with the tick excluded) for that
purpose.
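
The basic strategy shared by both governors, picking the deepest idle state usable in the given conditions, can be sketched roughly as below. This is not the TEO code: the struct, the function name and the parameters are hypothetical, the states are assumed to be ordered from shallowest to deepest with non-decreasing target residency and exit latency, and the statistics-based refinement described above is left out.

```c
#include <stddef.h>

/* Hypothetical description of one idle state. */
struct idle_state {
	unsigned int target_residency_us; /* minimum worthwhile idle time */
	unsigned int exit_latency_us;     /* time needed to wake up */
};

/*
 * Return the index of the deepest state whose target residency fits in the
 * expected idle time and whose exit latency respects the latency limit.
 * State 0 is the fallback when no deeper state qualifies.
 */
static size_t pick_deepest_state(const struct idle_state *states, size_t count,
				 unsigned int idle_us,
				 unsigned int latency_limit_us)
{
	size_t chosen = 0;
	size_t i;

	for (i = 1; i < count; i++) {
		if (states[i].target_residency_us > idle_us ||
		    states[i].exit_latency_us > latency_limit_us)
			break; /* deeper states only get more demanding */
		chosen = i;
	}

	return chosen;
}
```

In this sketch the governors would differ only in how idle_us is obtained: the time till the closest timer, possibly replaced by a shorter estimate when recent idle intervals suggest an earlier wakeup.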

Signed-off-by: Rafael J. Wysocki 
---

The only code change (except for updates of comments) from the last RFC (v8)
is the rating adjustment in the new governor so that menu still is the default
on tickless systems.  It is now possible to build the kernel without menu when
the new one is selected, however.

Apart from that, the new governor's help text in Kconfig has been updated and
there is a documentation update describing it (on top of my linux-next branch
which should be present in linux-next proper).

Of course, testing it and reporting results will still be appreciated. :-)

---
 Documentation/admin-guide/pm/cpuidle.rst |   86 ++
 drivers/cpuidle/Kconfig  |   11 
 drivers/cpuidle/governors/Makefile   |1 
 drivers/cpuidle/governors/teo.c  |  443 +++
 4 files changed, 540 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===
--- /dev/null
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -0,0 +1,443 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Timer events oriented CPU idle governor
+ *
+ * Copyright (C) 2018 Intel Corporation
+ * Author: Rafael J. Wysocki 
+ *
+ * The idea of this governor is based on the observation that on many systems
+ * timer events are two or more orders of magnitude more frequent than any
+ * other interrupts, so they are likely to be the most significant source of CPU
+ * wakeups from idle states.  Moreover, information about what happened in the
+ * (relatively recent) past can be used to estimate whether or not the deepest
+ * idle state with target residency within the time to the closest timer is
+ * likely to be suitable for the upcoming idle time of the CPU and, if not, then
+ * which of the shallower idle states to choose.
+ *
+ * Of course, non-timer wakeup sources are more important in some use cases and
+ * they can be covered by taking a few most recent idle time intervals of the
+ * CPU into account.  However, even in that case it is not necessary to consider
+ * idle duration values greater than the time till the closest timer, as the
+ * patterns that they may belong to produce average values close enough to
+ * the time till the closest timer (sleep length) anyway.
+ *
+ * Thus this governor estimates whether or not the upcoming idle time of the CPU
+ * is likely to be significantly shorter than the sleep length and selects an
+ * idle state for it in accordance with that, as follows:
+ *
+ * - Find an idle state on the basis of the sleep length and state statistics
+ *   collected over time:
+ *
+ *   o Find the deepest idle 

Re: [PATCH v3 1/3] staging: greybus: gpio: switch GPIO portions to use GPIOLIB_IRQCHIP

2018-12-18 Thread Johan Hovold
On Thu, Nov 22, 2018 at 10:37:16PM +0530, Nishad Kamdar wrote:
> Convert the GPIO driver to use the GPIO irqchip library
> GPIOLIB_IRQCHIP instead of reimplementing the same.
> 
> Signed-off-by: Nishad Kamdar 
> ---
> Changes in v2:
>  - Retained irq.h and irqdomain.h headers.
>  - Dropped function gb_gpio_irqchip_add() and
>called gpiochip_irqchip_add() from probe().
>  - Referred 
> https://lkml.kernel.org/r/1476054589-28422-1-git-send-email-linus.wall...@linaro.org.

Thanks for the update, and sorry about the late review. This looks
mostly good now, except for a couple minor things pointed out below.

You also included the conversion to gpiochip_get_data() (as Linus also
did in his patch) although that's really a separate change and should go
in its own patch. Please break that bit out in a follow-up patch.

Also note that someone did a bunch of random whitespace changes to this
file in the staging tree, so it will not apply cleanly any more.

> ---
>  drivers/staging/greybus/Kconfig |   1 +
>  drivers/staging/greybus/gpio.c  | 184 
>  2 files changed, 24 insertions(+), 161 deletions(-)
> 
> diff --git a/drivers/staging/greybus/Kconfig b/drivers/staging/greybus/Kconfig
> index ab096bcef98c..b571e4e8060b 100644
> --- a/drivers/staging/greybus/Kconfig
> +++ b/drivers/staging/greybus/Kconfig
> @@ -148,6 +148,7 @@ if GREYBUS_BRIDGED_PHY
>  config GREYBUS_GPIO
>   tristate "Greybus GPIO Bridged PHY driver"
>   depends on GPIOLIB
> + select GPIOLIB_IRQCHIP
>   ---help---
> Select this option if you have a device that follows the
> Greybus GPIO Bridged PHY Class specification.
> diff --git a/drivers/staging/greybus/gpio.c b/drivers/staging/greybus/gpio.c
> index b1d4698019a1..2ec54744171d 100644
> --- a/drivers/staging/greybus/gpio.c
> +++ b/drivers/staging/greybus/gpio.c
> @@ -9,9 +9,9 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "greybus.h"
> @@ -39,15 +39,8 @@ struct gb_gpio_controller {
>  
>   struct gpio_chip	chip;
>   struct irq_chip irqc;

Turns out struct gpio_chip will have an irqchip whenever
CONFIG_GPIOLIB_IRQCHIP is selected so you can drop this one too.

> - struct irq_chip *irqchip;
> - struct irq_domain   *irqdomain;
> - unsigned int		irq_base;
> - irq_flow_handler_t  irq_handler;
> - unsigned int		irq_default_type;
>   struct mutex		irq_lock;
>  };
> -#define gpio_chip_to_gb_gpio_controller(chip) \
> - container_of(chip, struct gb_gpio_controller, chip)
>  #define irq_data_to_gpio_chip(d) (d->domain->host_data)
>  
>  static int gb_gpio_line_count_operation(struct gb_gpio_controller *ggc)
> @@ -276,7 +269,7 @@ static void _gb_gpio_irq_set_type(struct gb_gpio_controller *ggc,
>  static void gb_gpio_irq_mask(struct irq_data *d)
>  {
>   struct gpio_chip *chip = irq_data_to_gpio_chip(d);
> - struct gb_gpio_controller *ggc = gpio_chip_to_gb_gpio_controller(chip);
> + struct gb_gpio_controller *ggc = gpiochip_get_data(chip);

So please split these changes into a separate patch as they are not
related to the irqchip changes.
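
To make the difference between the two lookups concrete, here is a minimal
userspace sketch. Everything named mock_* is a hypothetical stand-in invented
for illustration, not the gpiolib API; only the two ideas (recovering the
driver structure from an embedded member versus reading back a stored data
pointer) come from the patch under review.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); illustration only. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical mock of a chip that can carry an opaque driver pointer. */
struct mock_chip {
	void *data;
};

/* Hypothetical mock of a driver structure that embeds the chip. */
struct mock_controller {
	int line_max;
	struct mock_chip chip;
};

/* Mock of "register the chip together with driver data": stores the pointer. */
static void mock_chip_add_data(struct mock_chip *chip, void *data)
{
	assert(chip != NULL);
	chip->data = data;
}

/* Mock of the accessor style the patch converts to. */
static void *mock_chip_get_data(struct mock_chip *chip)
{
	assert(chip != NULL);
	return chip->data;
}

/* Old style: recover the controller from the embedded member's address. */
static struct mock_controller *controller_from_embedded(struct mock_chip *chip)
{
	assert(chip != NULL);
	return mock_container_of(chip, struct mock_controller, chip);
}
```

Both lookups yield the same controller as long as the data pointer was
registered with the chip, which is why the accessor conversion is independent
of the irqchip rework and can be reviewed on its own.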

Oh, and don't forget to update the TODO file now that the conversion is
done. :)

Thanks,
Johan


Re: [PATCH] r8a66597: Fix a possible concurrency use-after-free bug in r8a66597_endpoint_disable()

2018-12-18 Thread Greg KH
On Tue, Dec 18, 2018 at 06:00:20PM +0800, Jia-Ju Bai wrote:
> The function r8a66597_endpoint_disable() and r8a66597_urb_enqueue() may
> be concurrently executed.
> The two functions both access a possible shared variable "hep->hcpriv".
> 
> This shared variable is freed by r8a66597_endpoint_disable() via the
> call path:
> r8a66597_endpoint_disable
>   kfree(hep->hcpriv) (line 1995 in Linux-4.19)
> 
> This variable is read by r8a66597_urb_enqueue() via the call path:
> r8a66597_urb_enqueue
>   spin_lock_irqsave(&r8a66597->lock);
>   init_pipe_info
> enable_r8a66597_pipe
>   pipe = hep->hcpriv (line 802 in Linux-4.19)
> 
> The read operation is protected by a spinlock, but the free operation
> is not protected by this spinlock, thus a concurrency use-after-free bug
> may occur.
> 
> To fix this bug, the spin-lock and spin-unlock function calls in
> r8a66597_endpoint_disable() are moved to protect the free operation.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
>  drivers/usb/host/r8a66597-hcd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/r8a66597-hcd.c b/drivers/usb/host/r8a66597-hcd.c
> index 984892dd72f5..1495ce14ad22 100644
> --- a/drivers/usb/host/r8a66597-hcd.c
> +++ b/drivers/usb/host/r8a66597-hcd.c
> @@ -1991,13 +1991,14 @@ static void r8a66597_endpoint_disable(struct usb_hcd *hcd,
>   return;
>   pipenum = pipe->info.pipenum;
>  
> + spin_lock_irqsave(&r8a66597->lock, flags);

Don't you also need the __acquires/__releases markings on this function
in order to properly annotate it, like the rest of the driver has?

Otherwise this seems to look good to me.
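
For readers following the race described in the changelog, a minimal
userspace sketch of the idea (take the same lock around the free that the
reader already takes around the dereference) could look like the following.
It uses a pthread mutex in place of the driver's spinlock, and struct
mock_endpoint and the mock_* functions are hypothetical names, not code from
r8a66597-hcd.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an endpoint with a shared private pointer. */
struct mock_endpoint {
	pthread_mutex_t lock;
	int *hcpriv;		/* plays the role of hep->hcpriv */
};

static void mock_endpoint_init(struct mock_endpoint *ep, int value)
{
	pthread_mutex_init(&ep->lock, NULL);
	ep->hcpriv = malloc(sizeof(*ep->hcpriv));
	assert(ep->hcpriv != NULL);
	*ep->hcpriv = value;
}

/* Reader side: dereferences the pointer only while holding the lock. */
static int mock_enqueue(struct mock_endpoint *ep)
{
	int ret = -1;	/* -1 means "endpoint already disabled" */

	pthread_mutex_lock(&ep->lock);
	if (ep->hcpriv)
		ret = *ep->hcpriv;
	pthread_mutex_unlock(&ep->lock);
	return ret;
}

/* Free side: the free and the NULL store happen under the same lock. */
static void mock_disable(struct mock_endpoint *ep)
{
	pthread_mutex_lock(&ep->lock);
	free(ep->hcpriv);
	ep->hcpriv = NULL;
	pthread_mutex_unlock(&ep->lock);
}
```

With both paths serialized on one lock, the reader sees either the live
object or NULL, never a freed pointer.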

thanks,

greg k-h


Re: [PATCH] perf/x86/intel: Avoid unnecessary reallocations of memory allocated in cpu hotplug prepare state

2018-12-18 Thread Sebastian Andrzej Siewior
On 2018-12-18 18:30:33 [+0800], zhe...@windriver.com wrote:
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3398,13 +3398,16 @@ ssize_t intel_event_sysfs_show(char *page, u64 config)
>   return x86_event_sysfs_show(page, config, event);
>  }
>  
> -struct intel_shared_regs *allocate_shared_regs(int cpu)
> +void allocate_shared_regs(struct intel_shared_regs **pregs, int cpu)
>  {
> - struct intel_shared_regs *regs;
> + struct intel_shared_regs *regs = *pregs;
>   int i;
>  
> - regs = kzalloc_node(sizeof(struct intel_shared_regs),
> - GFP_KERNEL, cpu_to_node(cpu));
> + if (regs)
> + memset(regs, 0, sizeof(struct intel_shared_regs));
> + else
> + regs = *pregs = kzalloc_node(sizeof(struct intel_shared_regs),
> +  GFP_KERNEL, cpu_to_node(cpu));
>   if (regs) {
>   /*
>* initialize the locks to keep lockdep happy

void allocate_shared_regs(int cpu)
{
	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
	struct intel_shared_regs *regs = cpuc->shared_regs;
	int i;

	if (!regs)
		regs = kmalloc_node(sizeof(struct intel_shared_regs),
				    GFP_KERNEL, cpu_to_node(cpu));
	if (!regs)
		return;
	memset(regs, 0, sizeof(struct intel_shared_regs));
	for (i = 0; i < EXTRA_REG_MAX; i++)
		raw_spin_lock_init(&regs->regs[i].lock);

	cpuc->shared_regs = regs;
}


> @@ -3414,20 +3417,21 @@ struct intel_shared_regs *allocate_shared_regs(int cpu)
>  
>   regs->core_id = -1;
>   }
> - return regs;
>  }
>  
> -static struct intel_excl_cntrs *allocate_excl_cntrs(int cpu)
> +static void allocate_excl_cntrs(struct intel_excl_cntrs **pc, int cpu)
>  {
> - struct intel_excl_cntrs *c;
> + struct intel_excl_cntrs *c = *pc;
>  
> - c = kzalloc_node(sizeof(struct intel_excl_cntrs),
> -  GFP_KERNEL, cpu_to_node(cpu));
> + if (c)
> + memset(c, 0, sizeof(struct intel_excl_cntrs));
> + else
> + c = *pc = kzalloc_node(sizeof(struct intel_excl_cntrs),
> +GFP_KERNEL, cpu_to_node(cpu));
>   if (c) {
>   raw_spin_lock_init(&c->lock);
>   c->core_id = -1;
>   }
> - return c;
>  }

static void allocate_excl_cntrs(int cpu)
{
	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
	struct intel_excl_cntrs *c = cpuc->excl_cntrs;

	if (!c)
		c = kmalloc_node(sizeof(struct intel_excl_cntrs),
				 GFP_KERNEL, cpu_to_node(cpu));
	if (!c)
		return;
	memset(c, 0, sizeof(struct intel_excl_cntrs));
	raw_spin_lock_init(&c->lock);
	c->core_id = -1;
	cpuc->excl_cntrs = c;
}
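
The allocate-or-reuse pattern suggested above can be tried out in userspace
with a small sketch. It swaps kmalloc_node() for malloc() and uses
hypothetical mock_* types, so it only illustrates the control flow (allocate
once, reset on every later call), not the real per-CPU code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the counters structure. */
struct mock_cntrs {
	int core_id;
	int refcnt;
};

/* Hypothetical stand-in for the per-CPU slot that owns the allocation. */
struct mock_cpu_events {
	struct mock_cntrs *excl_cntrs;
};

/*
 * Allocate the structure on first use; on later calls keep the existing
 * allocation and only reset its contents.
 */
static void mock_allocate_cntrs(struct mock_cpu_events *cpuc)
{
	struct mock_cntrs *c;

	assert(cpuc != NULL);
	c = cpuc->excl_cntrs;
	if (!c)
		c = malloc(sizeof(*c));
	if (!c)
		return;		/* allocation failed, slot stays NULL */
	memset(c, 0, sizeof(*c));
	c->core_id = -1;
	cpuc->excl_cntrs = c;
}
```

Calling mock_allocate_cntrs() twice on the same slot leaves the pointer
unchanged and the fields back at their initial values, which is the behaviour
the review asks for across repeated hotplug prepare calls.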


>  static void intel_pmu_cpu_dying(int cpu)
>  {
> - struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
> - struct intel_shared_regs *pc;
> -
> - pc = cpuc->shared_regs;
> - if (pc) {
> - if (pc->core_id == -1 || --pc->refcnt == 0)

I think the ->refcnt member can go, too. It is now only incremented, for
no reason.

> - kfree(pc);
> - cpuc->shared_regs = NULL;
> - }
> -
> - free_excl_cntrs(cpu);
> -
>   fini_debug_store_on_cpu(cpu);
>  
>   if (x86_pmu.counter_freezing)

Sebastian


Re: [RFC][PATCH] printk: increase devkmsg write() ratelimit

2018-12-18 Thread Sergey Senozhatsky
On (12/18/18 11:48), Peter Zijlstra wrote:
> > I know that there is a "kernel.printk_devkmsg" interface; do we
> > expect every systemd-enabled distro to find that out and to tweak
> > kernel.printk_devkmsg or shall we change the default devkmsg
> > ratelimit instead?
> 
> How about we complain to systemd instead?

We certainly can. As far as I understand, they log shutdown events
(including errors and warnings): what they kill, what they stop,
what they umount, etc. The more partitions and services there are
(I guess), the more things they need to umount, kill, stop; hence,
the more messages. I kinda can imagine what they will answer ;)

The messages below (and a bunch of others) are getting ratelimited.
I'm not sure what will happen should any of those steps fail and
print warnings. My guess would be that we could well ratelimit
those warnings, too:

...
 systemd[1]: Unmounting /home...
 systemd[1]: Unmounting Temporary Directory (/tmp)...
 systemd[1]: Unmounted Temporary Directory (/tmp).
 systemd[1]: Stopped target Swap.
 systemd[1]: Unmounted /boot.
 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/a0737dff-e797-44f0-aea7-d0df1107ff63.
 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/5d773b72-e200-4d11-a219-176d62a16d8d.
 systemd[1]: Unmounted /home.
 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/35319ddc-9b92-4ab0-aaa4-9922db636a5e.
 systemd[1]: Unmounted /media/edev.
 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/da00daaf-5601-4531-912e-bd69103b379d.
 systemd[1]: Unmounted /media/dump.
 systemd[1]: Reached target Unmount All Filesystems.
 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/b52da2df-161b-4c33-b700-277d95b9672f.
 systemd[1]: Removed slice system-systemd\x2dfsck.slice.
 systemd[1]: Stopped target Local File Systems (Pre).
 systemd[1]: Stopped Create Static Device Nodes in /dev.
 systemd[1]: Stopped Create System Users.
 systemd[1]: Stopped Remount Root and Kernel File Systems.
 systemd[1]: Reached target Shutdown.
 systemd[1]: Reached target Final Step.
 systemd[1]: Starting Reboot...

-ss

