date:20190122

Re: [PATCH kernel] vfio-pci/nvlink2: Fix ancient gcc warnings

2019-01-22 Thread Alex Williamson

Hi Geert,

The below patch comes about from the build regressions and improvements
list you've sent out, but something doesn't add up that we'd be testing
with an old compiler where initialization with { 0 } generates a
"missing braces around initialization" warning.  Is this really the
case or are we missing something here?  There's no harm that I can see
with Alexey's fix, but are these really just false positives from a
compiler bug that we should selectively ignore if the "fix" is less
clean?  Thanks,

Alex

On Wed, 23 Jan 2019 15:07:11 +1100
Alexey Kardashevskiy  wrote:

> Using the {0} construct as a generic initializer is perfectly fine in C,
> however due to a bug in old gcc there is a warning:
> 
>   + /kisskb/src/drivers/vfio/pci/vfio_pci_nvlink2.c: warning: (near
> initialization for 'cap.header') [-Wmissing-braces]:  => 181:9
> 
> Since for whatever reason we still want to compile the modern kernel
> with such an old gcc without warnings, this changes the capabilities
> initialization.
> 
> The gcc bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  drivers/vfio/pci/vfio_pci_nvlink2.c | 30 ++---
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c 
> b/drivers/vfio/pci/vfio_pci_nvlink2.c
> index 054a2cf..91d945b 100644
> --- a/drivers/vfio/pci/vfio_pci_nvlink2.c
> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
> @@ -178,11 +178,11 @@ static int vfio_pci_nvgpu_add_capability(struct 
> vfio_pci_device *vdev,
>   struct vfio_pci_region *region, struct vfio_info_cap *caps)
>  {
>   struct vfio_pci_nvgpu_data *data = region->data;
> - struct vfio_region_info_cap_nvlink2_ssatgt cap = { 0 };
> -
> - cap.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT;
> - cap.header.version = 1;
> - cap.tgt = data->gpu_tgt;
> + struct vfio_region_info_cap_nvlink2_ssatgt cap = {
> + .header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT,
> + .header.version = 1,
> + .tgt = data->gpu_tgt
> + };
>  
>   return vfio_info_add_capability(caps, , sizeof(cap));
>  }
> @@ -365,18 +365,18 @@ static int vfio_pci_npu2_add_capability(struct 
> vfio_pci_device *vdev,
>   struct vfio_pci_region *region, struct vfio_info_cap *caps)
>  {
>   struct vfio_pci_npu2_data *data = region->data;
> - struct vfio_region_info_cap_nvlink2_ssatgt captgt = { 0 };
> - struct vfio_region_info_cap_nvlink2_lnkspd capspd = { 0 };
> + struct vfio_region_info_cap_nvlink2_ssatgt captgt = {
> + .header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT,
> + .header.version = 1,
> + .tgt = data->gpu_tgt
> + };
> + struct vfio_region_info_cap_nvlink2_lnkspd capspd = {
> + .header.id = VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD,
> + .header.version = 1,
> + .link_speed = data->link_speed
> + };
>   int ret;
>  
> - captgt.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT;
> - captgt.header.version = 1;
> - captgt.tgt = data->gpu_tgt;
> -
> - capspd.header.id = VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD;
> - capspd.header.version = 1;
> - capspd.link_speed = data->link_speed;
> -
>   ret = vfio_info_add_capability(caps, , sizeof(captgt));
>   if (ret)
>   return ret;
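
For reference, a minimal sketch of the two initialization styles being
compared in the patch above. The structure layout, the names and the
EXAMPLE_CAP_ID value are hypothetical stand-ins, not the real VFIO
definitions; the point is only that both styles should yield the same
object, with members that are not named left zeroed.

```c
#include <stdint.h>

#define EXAMPLE_CAP_ID 4	/* placeholder value, not a real VFIO capability id */

/* Hypothetical stand-ins for the capability header and capability. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

struct cap_ssatgt {
	struct cap_header header;	/* nested struct as the first member */
	uint64_t tgt;
};

/* Old style: zero the object with { 0 }, then assign field by field. */
static struct cap_ssatgt make_cap_zero_then_assign(uint64_t tgt)
{
	struct cap_ssatgt cap = { 0 };	/* the line old gcc warns about */

	cap.header.id = EXAMPLE_CAP_ID;
	cap.header.version = 1;
	cap.tgt = tgt;
	return cap;
}

/* New style: designated initializers; unnamed members are zeroed. */
static struct cap_ssatgt make_cap_designated(uint64_t tgt)
{
	struct cap_ssatgt cap = {
		.header.id = EXAMPLE_CAP_ID,
		.header.version = 1,
		.tgt = tgt,
	};

	return cap;
}
```

With the designated form there is no { 0 } left for the old compiler to
complain about, which is presumably why the patch takes that route.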

Re: [PATCH] ath: move spin_lock_bh to spin_lock in tasklet

2019-01-22 Thread Kalle Valo

姜智伟  writes:

> Will do, thanks! 

Also, don't send HTML mail :) Mailing lists drop those automatically.

-- 
Kalle Valo

Re: linux-next: Fixes tag needs some work in the cpufreq-arm tree

2019-01-22 Thread Stephen Rothwell

Hi Viresh,

On Fri, 18 Jan 2019 11:08:02 +0530 Viresh Kumar  wrote:
>
> I missed looking into that. You must be running some sort of sanity
> checks on the branch itself. Could you tell me exactly what you are
> doing so that I can try the same?

I have attached my current script.  I run this on the range of new
commits for each tree each day.

Suggestions welcome! :-)

-- 
Cheers,
Stephen Rothwell


check_fixes
Description: application/shellscript



Re: [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM

2019-01-22 Thread Michael S. Tsirkin

On Wed, Jan 23, 2019 at 01:03:46AM +0800, Tiwei Bie wrote:
> This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
> When this feature is negotiated, driver will use the barriers
> suitable for hardware devices.
> 
> Signed-off-by: Tiwei Bie 

Could you please add a bit more explanation in the commit log?
E.g. which configurations are broken without this patch?
How severe is the problem?

I'm trying to decide whether this belongs in 5.0 or 5.1.

> ---
>  drivers/virtio/virtio_ring.c   | 8 
>  include/uapi/linux/virtio_config.h | 6 ++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755484e3..27d3f057493e 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1609,6 +1609,9 @@ static struct virtqueue *vring_create_virtqueue_packed(
>   !context;
>   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>  
> + if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> + vq->weak_barriers = false;
> +
>   vq->packed.ring_dma_addr = ring_dma_addr;
>   vq->packed.driver_event_dma_addr = driver_event_dma_addr;
>   vq->packed.device_event_dma_addr = device_event_dma_addr;
> @@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
> index,
>   !context;
>   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>  
> + if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> + vq->weak_barriers = false;
> +
>   vq->split.queue_dma_addr = 0;
>   vq->split.queue_size_in_bytes = 0;
>  
> @@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device 
> *vdev)
>   break;
>   case VIRTIO_F_RING_PACKED:
>   break;
> + case VIRTIO_F_ORDER_PLATFORM:
> + break;
>   default:
>   /* We don't understand this bit. */
>   __virtio_clear_bit(vdev, i);
> diff --git a/include/uapi/linux/virtio_config.h 
> b/include/uapi/linux/virtio_config.h
> index 1196e1c1d4f6..ff8e7dc9d4dd 100644
> --- a/include/uapi/linux/virtio_config.h
> +++ b/include/uapi/linux/virtio_config.h
> @@ -78,6 +78,12 @@
>  /* This feature indicates support for the packed virtqueue layout. */
>  #define VIRTIO_F_RING_PACKED 34
>  
> +/*
> + * This feature indicates that memory accesses by the driver and the
> + * device are ordered in a way described by the platform.
> + */
> +#define VIRTIO_F_ORDER_PLATFORM  36
> +
>  /*
>   * Does the device support Single Root I/O Virtualization?
>   */
> -- 
> 2.17.1

Re: question about head_64.S

2019-01-22 Thread Cao jin

On 1/22/19 9:08 PM, Kirill A. Shutemov wrote:
> On Tue, Jan 22, 2019 at 03:31:25PM +0800, Cao jin wrote:
>> Hi, Kirill,
>>

>>> 2.
>>> Why does gdt64 have the following definition?
>>>
>>> gdt64:
>>> .word   gdt_end - gdt
>>> .long   0
>>> .word   0
>>> .quad   0
>>>
>>> obviously, gdt64 stores the GDTR content under x86_64, which is 10 bytes
>>> long, so why not just:
>>>
>>> gdt64:
>>> .word   gdt_end - gdt
>>> .quad   0
>>>
>>> With the above modification, it still boots.
>>>
>>
>> Seems you introduced gdt64 code in commit beebaccd50, could you help
>> with this question?
> 
> Looks like you are right. I've got confused at some point.
> 
> Could you prepare a patch?

Sure.

> 
>> This also reminds me of another question, about adjust_got, which you
>> also introduced. I have not yet managed to set up a test environment
>> with an ld version older than 2.24, so I want to ask quickly here: does
>> it make sense to start adjusting the GOT from its 4th entry? As far as
>> I know, the first 3 entries are special ones, which (I guess) will not
>> be used.
> 
> No.
> 
> These 3 entries are reserved for special symbols (like entry 0 for
> _DYNAMIC). It means linker should not use these entries for normal
> symbols, but it doesn't mean that they don't need to be adjusted during
> the load.
> 

Thanks for the info! BTW, could you tell me how you set up the test
environment?

I tried CentOS 6, but its GCC version is too old to compile the kernel;
then I tried Fedora 28 with binutils-2.20.51.0.2-5.48.el6.x86_64.rpm from
CentOS 6, but ld reported errors; and then I tried compiling the binutils
source at tag 2.23, which stopped at the configure phase :(

-- 
Sincerely,
Cao jin
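
To make the size argument above concrete, here is a small sketch that
models the two gdt64 layouts as packed C structs. The struct and field
names are hypothetical; only the directive sequences (.word/.long/
.word/.quad versus .word/.quad) come from the question. It assumes a
compiler that supports __attribute__((packed)), such as gcc or clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the existing definition: .word, .long, .word, .quad */
struct gdt64_current {
	uint16_t limit;
	uint32_t pad0;
	uint16_t pad1;
	uint64_t pad2;
} __attribute__((packed));

/* Layout of the proposed definition: .word limit, .quad base */
struct gdt64_proposed {
	uint16_t limit;
	uint64_t base;
} __attribute__((packed));

/* The existing definition reserves 16 bytes, the proposed one 10. */
static_assert(sizeof(struct gdt64_current) == 16, "2 + 4 + 2 + 8 bytes");
static_assert(sizeof(struct gdt64_proposed) == 10, "2 + 8 bytes");

/* Helpers so the sizes can also be checked at run time. */
static size_t gdt64_current_size(void)
{
	return sizeof(struct gdt64_current);
}

static size_t gdt64_proposed_size(void)
{
	return sizeof(struct gdt64_proposed);
}
```

In both layouts the 16-bit limit sits at offset 0 and the following
8 bytes start at offset 2, so the first 10 bytes are laid out the same
way; the existing definition just carries 6 extra bytes after them.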

Re: [virtio-dev] [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM

2019-01-22 Thread Michael S. Tsirkin

On Wed, Jan 23, 2019 at 11:08:04AM +0800, Jason Wang wrote:
> 
> On 2019/1/23 1:03 AM, Tiwei Bie wrote:
> > This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
> > When this feature is negotiated, driver will use the barriers
> > suitable for hardware devices.
> > 
> > Signed-off-by: Tiwei Bie 
> > ---
> >   drivers/virtio/virtio_ring.c   | 8 
> >   include/uapi/linux/virtio_config.h | 6 ++
> >   2 files changed, 14 insertions(+)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..27d3f057493e 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -1609,6 +1609,9 @@ static struct virtqueue 
> > *vring_create_virtqueue_packed(
> > !context;
> > vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > +   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> > +   vq->weak_barriers = false;
> > +
> > vq->packed.ring_dma_addr = ring_dma_addr;
> > vq->packed.driver_event_dma_addr = driver_event_dma_addr;
> > vq->packed.device_event_dma_addr = device_event_dma_addr;
> > @@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
> > index,
> > !context;
> > vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > +   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> > +   vq->weak_barriers = false;
> > +
> > vq->split.queue_dma_addr = 0;
> > vq->split.queue_size_in_bytes = 0;
> > @@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device 
> > *vdev)
> > break;
> > case VIRTIO_F_RING_PACKED:
> > break;
> > +   case VIRTIO_F_ORDER_PLATFORM:
> > +   break;
> > default:
> > /* We don't understand this bit. */
> > __virtio_clear_bit(vdev, i);
> > diff --git a/include/uapi/linux/virtio_config.h 
> > b/include/uapi/linux/virtio_config.h
> > index 1196e1c1d4f6..ff8e7dc9d4dd 100644
> > --- a/include/uapi/linux/virtio_config.h
> > +++ b/include/uapi/linux/virtio_config.h
> > @@ -78,6 +78,12 @@
> >   /* This feature indicates support for the packed virtqueue layout. */
> >   #define VIRTIO_F_RING_PACKED  34
> > +/*
> > + * This feature indicates that memory accesses by the driver and the
> > + * device are ordered in a way described by the platform.
> > + */
> > +#define VIRTIO_F_ORDER_PLATFORM36
> > +
> >   /*
> >* Does the device support Single Root I/O Virtualization?
> >*/
> 
> 
> I wonder whether or not this is sufficient. Does a DMA barrier imply an
> MMIO barrier? It looks like it does not.

IIUC we don't need an mmio barrier because we are using a
serializing API: Documentation/memory-barriers.txt says:

Note that, when using writel(), a prior
 wmb() is not needed to guarantee that the cache coherent memory writes
 have completed before writing to the MMIO region.


> See ia64/include/asm/barrier.h:
> 
>  * Note: "mb()" and its variants cannot be used as a fence to order
>  * accesses to memory mapped I/O registers.  For that, mf.a needs to
>  * be used.  However, we don't want to always use mf.a because (a)
>  * it's (presumably) much slower than mf and (b) mf.a is supported for
>  * sequential memory pages only.
>  */
> #define mb()    ia64_mf()
> #define rmb()   mb()
> #define wmb()   mb()
> 
> #define dma_rmb()   mb()
> #define dma_wmb()   mb()
> 
> Thanks

Frankly no idea about ia64. Sorry. Are any less esoteric platforms
affected?


-- 
MST
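
For readers following along, a minimal sketch of the decision the patch
under discussion adds: when the feature bit is negotiated, the queue
stops using the weaker barriers. The example_vdev type and the helper
names are hypothetical simplifications; only the bit number 36 and the
"weak_barriers = false" assignment are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ORDER_PLATFORM 36	/* bit number from the patch */

/* Hypothetical, simplified device: just a 64-bit feature mask. */
struct example_vdev {
	uint64_t features;
};

/* Test a single negotiated feature bit. */
static bool example_has_feature(const struct example_vdev *vdev,
				unsigned int bit)
{
	return (vdev->features >> bit) & 1ULL;
}

/*
 * Start from whatever the transport asked for and force the stronger
 * barriers when VIRTIO_F_ORDER_PLATFORM has been negotiated.
 */
static bool example_pick_weak_barriers(const struct example_vdev *vdev,
				       bool transport_default)
{
	bool weak_barriers = transport_default;

	if (example_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
		weak_barriers = false;

	return weak_barriers;
}
```

This only models the selection itself; which barrier instructions are
emitted for each setting is architecture specific and is not shown.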

[PATCH v2 3/4] locking/qspinlock_stat: Separate out the PV specific stat counts

2019-01-22 Thread Waiman Long

Some of the statistics counts are for PV qspinlocks only and are not
applicable if PARAVIRT_SPINLOCKS isn't configured. So make those counts
dependent on the PARAVIRT_SPINLOCKS config option now.

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock_stat.h | 129 +---
 1 file changed, 81 insertions(+), 48 deletions(-)

diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index 31728f6..7a0a848 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -49,6 +49,7 @@
  * There may be slight difference between pv_kick_wake and pv_kick_unlock.
  */
 enum qlock_stats {
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
qstat_pv_hash_hops,
qstat_pv_kick_unlock,
qstat_pv_kick_wake,
@@ -60,6 +61,7 @@ enum qlock_stats {
qstat_pv_wait_early,
qstat_pv_wait_head,
qstat_pv_wait_node,
+#endif
qstat_lock_pending,
qstat_lock_slowpath,
qstat_lock_use_node2,
@@ -80,6 +82,7 @@ enum qlock_stats {
 #include 
 
 static const char * const qstat_names[qstat_num + 1] = {
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
[qstat_pv_hash_hops]   = "pv_hash_hops",
[qstat_pv_kick_unlock] = "pv_kick_unlock",
[qstat_pv_kick_wake]   = "pv_kick_wake",
@@ -91,6 +94,7 @@ enum qlock_stats {
[qstat_pv_wait_early]  = "pv_wait_early",
[qstat_pv_wait_head]   = "pv_wait_head",
[qstat_pv_wait_node]   = "pv_wait_node",
+#endif
[qstat_lock_pending]   = "lock_pending",
[qstat_lock_slowpath]  = "lock_slowpath",
[qstat_lock_use_node2] = "lock_use_node2",
@@ -104,6 +108,20 @@ enum qlock_stats {
  * Per-cpu counters
  */
 static DEFINE_PER_CPU(unsigned long, qstats[qstat_num]);
+
+/*
+ * Increment the PV qspinlock statistical counters
+ */
+static inline void qstat_inc(enum qlock_stats stat, bool cond)
+{
+   if (cond)
+   this_cpu_inc(qstats[stat]);
+}
+
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+/*
+ * PV specific per-cpu counters
+ */
 static DEFINE_PER_CPU(u64, pv_kick_time);
 
 /*
@@ -178,6 +196,69 @@ static ssize_t qstat_read(struct file *file, char __user 
*user_buf,
 }
 
 /*
+ * PV hash hop count
+ */
+static inline void qstat_hop(int hopcnt)
+{
+   this_cpu_add(qstats[qstat_pv_hash_hops], hopcnt);
+}
+
+/*
+ * Replacement function for pv_kick()
+ */
+static inline void __pv_kick(int cpu)
+{
+   u64 start = sched_clock();
+
+   per_cpu(pv_kick_time, cpu) = start;
+   pv_kick(cpu);
+   this_cpu_add(qstats[qstat_pv_latency_kick], sched_clock() - start);
+}
+
+/*
+ * Replacement function for pv_wait()
+ */
+static inline void __pv_wait(u8 *ptr, u8 val)
+{
+   u64 *pkick_time = this_cpu_ptr(_kick_time);
+
+   *pkick_time = 0;
+   pv_wait(ptr, val);
+   if (*pkick_time) {
+   this_cpu_add(qstats[qstat_pv_latency_wake],
+sched_clock() - *pkick_time);
+   qstat_inc(qstat_pv_kick_wake, true);
+   }
+}
+
+#define pv_kick(c) __pv_kick(c)
+#define pv_wait(p, v)  __pv_wait(p, v)
+
+#else /* CONFIG_PARAVIRT_SPINLOCKS */
+static ssize_t qstat_read(struct file *file, char __user *user_buf,
+ size_t count, loff_t *ppos)
+{
+   char buf[64];
+   int cpu, counter, len;
+   u64 stat = 0;
+
+   /*
+* Get the counter ID stored in file->f_inode->i_private
+*/
+   counter = (long)file_inode(file)->i_private;
+
+   if (counter >= qstat_num)
+   return -EBADF;
+
+   for_each_possible_cpu(cpu)
+   stat += per_cpu(qstats[counter], cpu);
+   len = snprintf(buf, sizeof(buf) - 1, "%llu\n", stat);
+
+   return simple_read_from_buffer(user_buf, count, ppos, buf, len);
+}
+#endif /* CONFIG_PARAVIRT_SPINLOCKS */
+
+/*
  * Function to handle write request
  *
  * When counter = reset_cnts, reset all the counter values.
@@ -250,54 +331,6 @@ static int __init init_qspinlock_stat(void)
 }
 fs_initcall(init_qspinlock_stat);
 
-/*
- * Increment the PV qspinlock statistical counters
- */
-static inline void qstat_inc(enum qlock_stats stat, bool cond)
-{
-   if (cond)
-   this_cpu_inc(qstats[stat]);
-}
-
-/*
- * PV hash hop count
- */
-static inline void qstat_hop(int hopcnt)
-{
-   this_cpu_add(qstats[qstat_pv_hash_hops], hopcnt);
-}
-
-/*
- * Replacement function for pv_kick()
- */
-static inline void __pv_kick(int cpu)
-{
-   u64 start = sched_clock();
-
-   per_cpu(pv_kick_time, cpu) = start;
-   pv_kick(cpu);
-   this_cpu_add(qstats[qstat_pv_latency_kick], sched_clock() - start);
-}
-
-/*
- * Replacement function for pv_wait()
- */
-static inline void __pv_wait(u8 *ptr, u8 val)
-{
-   u64 *pkick_time = this_cpu_ptr(_kick_time);
-
-   *pkick_time = 0;
-   pv_wait(ptr, val);
-   if (*pkick_time) {
-   this_cpu_add(qstats[qstat_pv_latency_wake],
-
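
The pattern used by this patch, sketched in isolation: the PV-only
enum entries and their names are wrapped in the same #ifdef, so the
name table stays in step with the enum in either configuration. All
identifiers below (including the EXAMPLE_PV_STATS switch) are
hypothetical and only mirror the shape of the real code.

```c
#include <stddef.h>

/* Build with -DEXAMPLE_PV_STATS to include the PV-only counters. */
enum example_stats {
#ifdef EXAMPLE_PV_STATS
	example_pv_hash_hops,
	example_pv_kick_wake,
#endif
	example_lock_pending,
	example_lock_slowpath,
	example_num,	/* total number of counters in this build */
};

/* Designated initializers keep each name tied to its enum value. */
static const char * const example_names[example_num] = {
#ifdef EXAMPLE_PV_STATS
	[example_pv_hash_hops]  = "pv_hash_hops",
	[example_pv_kick_wake]  = "pv_kick_wake",
#endif
	[example_lock_pending]  = "lock_pending",
	[example_lock_slowpath] = "lock_slowpath",
};

/* Look up a counter name; out-of-range values give NULL. */
static const char *example_stat_name(int stat)
{
	if (stat < 0 || stat >= example_num)
		return NULL;
	return example_names[stat];
}
```

Because the array is indexed by the enum values rather than by
position, dropping the PV block shifts the remaining entries without
any further changes to the table.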

[PATCH v2 0/4] locking/qspinlock: Handle > 4 nesting levels

2019-01-22 Thread Waiman Long

 v2:
  - Use the simple trylock loop as suggested by PeterZ.

The current code allows up to 4 levels of nested slowpath spinlock calls.
That should be enough for the process, soft irq, hard irq, and NMI
contexts. In the unfortunate event of nested NMIs with a slowpath
spinlock call at each of the previous levels, we are going to run out
of usable MCS nodes for queuing.

In this case, we fall back to a simple TAS lock and spin on the lock
cacheline until the lock is free. This is not the most elegant solution
but is simple enough.

Patch 1 implements the TAS loop when all the existing MCS nodes are
occupied.

Patches 2-4 enhance the locking statistics code to track the new code
path as well as enabling it on other architectures such as ARM64.

By setting MAX_NODES to 1, we can have some usage of the new code path
during the booting process as demonstrated by the stat counter values
shown below on a 1-socket 22-core 44-thread x86-64 system after booting
up the new kernel.

  lock_no_node=20
  lock_pending=29660
  lock_slowpath=172714

Waiman Long (4):
  locking/qspinlock: Handle > 4 slowpath nesting levels
  locking/qspinlock_stat: Track the no MCS node available case
  locking/qspinlock_stat: Separate out the PV specific stat counts
  locking/qspinlock_stat: Allow QUEUED_LOCK_STAT for all archs

 arch/Kconfig|   7 ++
 arch/x86/Kconfig|   8 ---
 kernel/locking/qspinlock.c  |  18 -
 kernel/locking/qspinlock_stat.h | 150 +---
 4 files changed, 120 insertions(+), 63 deletions(-)

-- 
1.8.3.1

[PATCH v2 4/4] locking/qspinlock_stat: Allow QUEUED_LOCK_STAT for all archs

2019-01-22 Thread Waiman Long

The QUEUED_LOCK_STAT option to report queued spinlock statistics was
previously allowed only on the x86 architecture. Now that queued
spinlocks are used in multiple architectures, allow QUEUED_LOCK_STAT to
be enabled for all architectures that use queued spinlocks. This
option is listed as part of the general architecture-dependent options.

Signed-off-by: Waiman Long 
---
 arch/Kconfig | 7 +++
 arch/x86/Kconfig | 8 
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 4cfb6de..c82e32f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -885,6 +885,13 @@ config HAVE_ARCH_PREL32_RELOCATIONS
  architectures, and don't require runtime relocation on relocatable
  kernels.
 
+config QUEUED_LOCK_STAT
+   bool "Queued spinlock statistics"
+   depends on QUEUED_SPINLOCKS && DEBUG_FS
+   ---help---
+ Enable the collection of statistical data on the slowpath
+ behavior of queued spinlocks and report them on debugfs.
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b4a7f3..872e681 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -784,14 +784,6 @@ config PARAVIRT_SPINLOCKS
 
  If you are unsure how to answer this question, answer Y.
 
-config QUEUED_LOCK_STAT
-   bool "Paravirt queued spinlock statistics"
-   depends on PARAVIRT_SPINLOCKS && DEBUG_FS
-   ---help---
- Enable the collection of statistical data on the slowpath
- behavior of paravirtualized queued spinlocks and report
- them on debugfs.
-
 source "arch/x86/xen/Kconfig"
 
 config KVM_GUEST
-- 
1.8.3.1

[PATCH v2 2/4] locking/qspinlock_stat: Track the no MCS node available case

2019-01-22 Thread Waiman Long

Track the number of slowpath locking operations that are being done
without any MCS node available, as well as renaming lock_index[123] to
make them more descriptive.

Using these stat counters is one way to find out if a code path is
being exercised.

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock.c  |  3 ++-
 kernel/locking/qspinlock_stat.h | 21 +++--
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 0875053..21ee51b 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -422,6 +422,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 
val)
 * simple enough.
 */
if (unlikely(idx >= MAX_NODES)) {
+   qstat_inc(qstat_lock_no_node, true);
while (!queued_spin_trylock(lock))
cpu_relax();
goto release;
@@ -432,7 +433,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 
val)
/*
 * Keep counts of non-zero index values:
 */
-   qstat_inc(qstat_lock_idx1 + idx - 1, idx);
+   qstat_inc(qstat_lock_use_node2 + idx - 1, idx);
 
/*
 * Ensure that we increment the head node->count before initialising
diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index 42d3d8d..31728f6 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -30,6 +30,13 @@
  *   pv_wait_node  - # of vCPU wait's at a non-head queue node
  *   lock_pending  - # of locking operations via pending code
  *   lock_slowpath - # of locking operations via MCS lock queue
+ *   lock_use_node2- # of locking operations that use 2nd percpu node
+ *   lock_use_node3- # of locking operations that use 3rd percpu node
+ *   lock_use_node4- # of locking operations that use 4th percpu node
+ *   lock_no_node  - # of locking operations without using percpu node
+ *
+ * Subtracting lock_use_node[234] from lock_slowpath will give you
+ * lock_use_node1.
  *
  * Writing to the "reset_counters" file will reset all the above counter
  * values.
@@ -55,9 +62,10 @@ enum qlock_stats {
qstat_pv_wait_node,
qstat_lock_pending,
qstat_lock_slowpath,
-   qstat_lock_idx1,
-   qstat_lock_idx2,
-   qstat_lock_idx3,
+   qstat_lock_use_node2,
+   qstat_lock_use_node3,
+   qstat_lock_use_node4,
+   qstat_lock_no_node,
qstat_num,  /* Total number of statistical counters */
qstat_reset_cnts = qstat_num,
 };
@@ -85,9 +93,10 @@ enum qlock_stats {
[qstat_pv_wait_node]   = "pv_wait_node",
[qstat_lock_pending]   = "lock_pending",
[qstat_lock_slowpath]  = "lock_slowpath",
-   [qstat_lock_idx1]  = "lock_index1",
-   [qstat_lock_idx2]  = "lock_index2",
-   [qstat_lock_idx3]  = "lock_index3",
+   [qstat_lock_use_node2] = "lock_use_node2",
+   [qstat_lock_use_node3] = "lock_use_node3",
+   [qstat_lock_use_node4] = "lock_use_node4",
+   [qstat_lock_no_node]   = "lock_no_node",
[qstat_reset_cnts] = "reset_counters",
 };
 
-- 
1.8.3.1
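
A small sketch of the counter arithmetic described in this patch: a
non-zero node index idx bumps the counter at lock_use_node2 + idx - 1,
and the first-node count is derived by subtraction, taking the patch
comment at face value. The names are hypothetical, the sketch assumes
idx is in the range 0..3, and it deliberately leaves out the
lock_no_node case.

```c
#include <stdint.h>

enum example_stats {
	example_lock_slowpath,
	example_lock_use_node2,
	example_lock_use_node3,
	example_lock_use_node4,
	example_num,
};

/*
 * Account for one queued slowpath operation that used per-cpu node
 * number idx (0 = first node). Only non-zero indexes get a counter.
 */
static void example_count_queued(uint64_t counters[example_num], int idx)
{
	counters[example_lock_slowpath]++;
	if (idx)
		counters[example_lock_use_node2 + idx - 1]++;
}

/* Derived value: operations that used the first per-cpu node. */
static uint64_t example_lock_use_node1(const uint64_t counters[example_num])
{
	return counters[example_lock_slowpath] -
	       (counters[example_lock_use_node2] +
		counters[example_lock_use_node3] +
		counters[example_lock_use_node4]);
}
```

The offset trick relies on the three lock_use_node counters being
adjacent in the enum, which is why the patch keeps them together.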

[PATCH v2 1/4] locking/qspinlock: Handle > 4 slowpath nesting levels

2019-01-22 Thread Waiman Long

Four queue nodes per cpu are allocated to enable up to 4 nesting levels
using the per-cpu nodes. Nested NMIs are possible in some architectures.
Still it is very unlikely that we will ever hit more than 4 nested
levels with contention in the slowpath.

When that rare condition happens, however, it is likely that the system
will hang or crash shortly after that. It is not good and we need to
handle this exception case.

This is done by spinning directly on the lock using repeated trylock.
This alternative code path should only be used when there are nested
NMIs. Assuming that the locks used by those NMI handlers will not be
heavily contended, simple TAS locking should work out.

Suggested-by: Peter Zijlstra 
Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 8a8c3c2..0875053 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -412,6 +412,21 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 
val)
idx = node->count++;
tail = encode_tail(smp_processor_id(), idx);
 
+   /*
+* 4 nodes are allocated based on the assumption that there will
+* not be nested NMIs taking spinlocks. That may not be true in
+* some architectures even though the chance of needing more than
+* 4 nodes will still be extremely unlikely. When that happens,
+* we fall back to spinning on the lock directly without using
+* any MCS node. This is not the most elegant solution, but is
+* simple enough.
+*/
+   if (unlikely(idx >= MAX_NODES)) {
+   while (!queued_spin_trylock(lock))
+   cpu_relax();
+   goto release;
+   }
+
node = grab_mcs_node(node, idx);
 
/*
-- 
1.8.3.1
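
A userspace sketch of the fallback described above, using C11 atomics
instead of the kernel primitives. Everything here is hypothetical
(example_lock, example_trylock and so on); it only illustrates the
idea of spinning on a trylock once the node index reaches the limit,
and it does not model the MCS queue at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define EXAMPLE_MAX_NODES 4	/* mirrors the 4 per-cpu nodes */

/* Hypothetical test-and-set lock: 0 = free, 1 = held. */
struct example_lock {
	atomic_int locked;
};

/* One attempt to take the lock; returns true on success. */
static bool example_trylock(struct example_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->locked,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

static void example_unlock(struct example_lock *lock)
{
	atomic_store_explicit(&lock->locked, 0, memory_order_release);
}

/* True when no queue node is left for this nesting level. */
static bool example_needs_fallback(int idx)
{
	return idx >= EXAMPLE_MAX_NODES;
}

/* Fallback path: spin on trylock until the lock is obtained. */
static void example_tas_lock(struct example_lock *lock)
{
	while (!example_trylock(lock))
		;	/* the kernel version calls cpu_relax() here */
}
```

The trade-off is the usual one for TAS locks: no fairness and more
cacheline traffic under contention, which the patch accepts because
this path is expected to be hit only very rarely.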

Re: [PATCH] capabilities:: annotate implicit fall through

2019-01-22 Thread James Morris

On Mon, 14 Jan 2019, Mathieu Malaterre wrote:

> There is a plan to build the kernel with -Wimplicit-fallthrough and
> this place in the code produced a warning (W=1).
> 
> In this particular case, put the fall through comment on a single
> line so as to match the regular expression expected by GCC.
> 
> This commit removes the following warning:
> 
>   kernel/capability.c:95:3: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
> 
> Signed-off-by: Mathieu Malaterre 

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git 
next-general

-- 
James Morris
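
As an illustration of the annotation style mentioned in the quoted
patch description, here is a hypothetical switch statement (not the
code from kernel/capability.c). The single-line comment placed directly
before the next case label is one of the forms GCC matches at the
default -Wimplicit-fallthrough level.

```c
/* Hypothetical example: count how many steps run for a given version. */
static int example_steps_for_version(int version)
{
	int steps = 0;

	switch (version) {
	case 1:
		steps += 1;
		/* fall through */
	case 2:
		steps += 1;
		break;
	default:
		steps = -1;
		break;
	}

	return steps;
}
```

If the comment is split over several lines or followed by other
statements, the compiler may no longer treat it as an annotation, which
appears to be what the patch was correcting.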

[v6 2/3] drm/msm/dpu: Integrate interconnect API in MDSS

2019-01-22 Thread Jayant Shekhar

The interconnect framework is designed to provide a
standard kernel interface to control the settings of
the interconnects on a SoC.

The interconnect API uses a consumer/provider-based model,
where the providers are the interconnect buses and the
consumers could be various drivers.

MDSS is one of the interconnect consumers which uses the
interconnect APIs to get the path between endpoints and
set its bandwidth requirement for the given interconnected
path.

Changes in v2:
- Remove error log and unnecessary check (Jordan Crouse)

Changes in v3:
- Code cleanup involving variable name change, removal
  of extra parentheses and variables (Matthias Kaehlcke)

Changes in v4:
- Add comments, spacings, tabs, proper port name
  and icc macro (Georgi Djakov)

Changes in v5:
- Commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
- Change to new icc_set APIs (Doug Anderson)

Signed-off-by: Sravanthi Kollukuduru1 
Signed-off-by: Jayant Shekhar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 49 +---
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index 38576f8..38daf8a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -4,11 +4,15 @@
  */
 
 #include "dpu_kms.h"
+#include 
 
 #define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)
 
 #define HW_INTR_STATUS 0x0010
 
+/* Max BW defined in KBps */
+#define MAX_BW 680
+
 struct dpu_mdss {
struct msm_mdss base;
void __iomem *mmio;
@@ -16,8 +20,30 @@ struct dpu_mdss {
u32 hwversion;
struct dss_module_power mp;
struct dpu_irq_controller irq_controller;
+   struct icc_path *path[2];
+   u32 num_paths;
 };
 
+static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev,
+   struct dpu_mdss *dpu_mdss)
+{
+   struct icc_path *path0 = of_icc_get(dev->dev, "mdp0-mem");
+   struct icc_path *path1 = of_icc_get(dev->dev, "mdp1-mem");
+
+   if (IS_ERR(path0))
+   return PTR_ERR(path0);
+
+   dpu_mdss->path[0] = path0;
+   dpu_mdss->num_paths = 1;
+
+   if (!IS_ERR(path1)) {
+   dpu_mdss->path[1] = path1;
+   dpu_mdss->num_paths++;
+   }
+
+   return 0;
+}
+
 static irqreturn_t dpu_mdss_irq(int irq, void *arg)
 {
struct dpu_mdss *dpu_mdss = arg;
@@ -127,7 +153,11 @@ static int dpu_mdss_enable(struct msm_mdss *mdss)
 {
struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
struct dss_module_power *mp = _mdss->mp;
-   int ret;
+   int ret, i;
+   u64 avg_bw = dpu_mdss->num_paths ? MAX_BW / dpu_mdss->num_paths : 0;
+
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_set_bw(dpu_mdss->path[i], avg_bw, kBps_to_icc(MAX_BW));
 
ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true);
if (ret)
@@ -140,12 +170,15 @@ static int dpu_mdss_disable(struct msm_mdss *mdss)
 {
struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
struct dss_module_power *mp = _mdss->mp;
-   int ret;
+   int ret, i;
 
ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, false);
if (ret)
DPU_ERROR("clock disable failed, ret:%d\n", ret);
 
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_set_bw(dpu_mdss->path[i], 0, 0);
+
return ret;
 }
 
@@ -155,6 +188,7 @@ static void dpu_mdss_destroy(struct drm_device *dev)
struct msm_drm_private *priv = dev->dev_private;
struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss);
struct dss_module_power *mp = _mdss->mp;
+   int i;
 
pm_runtime_disable(dev->dev);
_dpu_mdss_irq_domain_fini(dpu_mdss);
@@ -162,6 +196,9 @@ static void dpu_mdss_destroy(struct drm_device *dev)
msm_dss_put_clk(mp->clk_config, mp->num_clk);
devm_kfree(>dev, mp->clk_config);
 
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_put(dpu_mdss->path[i]);
+
if (dpu_mdss->mmio)
devm_iounmap(>dev, dpu_mdss->mmio);
dpu_mdss->mmio = NULL;
@@ -200,6 +237,10 @@ int dpu_mdss_init(struct drm_device *dev)
}
dpu_mdss->mmio_len = resource_size(res);
 
+   ret = dpu_mdss_parse_data_bus_icc_path(dev, dpu_mdss);
+   if (ret)
+   return ret;
+
mp = _mdss->mp;
ret = msm_dss_parse_clock(pdev, mp);
if (ret) {
@@ -221,14 +262,14 @@ int dpu_mdss_init(struct drm_device *dev)
goto irq_error;
}
 
+   priv->mdss = _mdss->base;
+
pm_runtime_enable(dev->dev);
 
pm_runtime_get_sync(dev->dev);
dpu_mdss->hwversion = readl_relaxed(dpu_mdss->mmio);
pm_runtime_put_sync(dev->dev);
 
-   priv->mdss = _mdss->base;
-
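
A simplified, self-contained model of the path handling in the patch
above: the first path is required, the second is optional, and the
average bandwidth is split evenly across however many paths were found.
All names are hypothetical, a NULL pointer stands in for a failed
lookup (the real code uses IS_ERR()), and the -ENODEV return value is
an assumption of this sketch rather than something the patch specifies.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the path bookkeeping in struct dpu_mdss. */
struct example_mdss {
	const void *path[2];
	uint32_t num_paths;
};

/* Record the mandatory first path and, if present, the second one. */
static int example_parse_paths(struct example_mdss *m,
			       const void *path0, const void *path1)
{
	if (!path0)
		return -ENODEV;	/* assumed error code for this sketch */

	m->path[0] = path0;
	m->num_paths = 1;

	if (path1) {
		m->path[1] = path1;
		m->num_paths++;
	}

	return 0;
}

/* Split the maximum bandwidth evenly; zero paths means zero bandwidth. */
static uint64_t example_avg_bw(const struct example_mdss *m, uint64_t max_bw)
{
	return m->num_paths ? max_bw / m->num_paths : 0;
}
```

The num_paths check in example_avg_bw() mirrors the guard in the patch
that avoids dividing by zero when no path has been set up.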

Re: [PATCH v2 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

2019-01-22 Thread Ching Huang

On Tue, 2019-01-22 at 21:41 -0500, Martin K. Petersen wrote:
> Ching,
> 
> > This patch series is against mkp's 5.1/scsi-queue.
> 
> Applied to 5.1/scsi-queue. Thank you.
> 
> PS. Your file permissions are odd. I always have to change your diffs
> from 755 to 644 before applying.
> 
Thanks to Martin and Dan for the help.

The file permission problem also confuses me.
I used Evolution mail on CentOS 6.x to submit the patches.
The mail content format is plain text, preformatted.
I inserted the diff text file into the mail; the diff file listing is below.

-rw-r--r--. 1 root root 1663 Jan 16 04:11 p1.txt

I don't know why or when its permissions changed from 644 to 755.

[v6 3/3] dt-bindings: msm/disp: Introduce interconnect bindings for MDSS on SDM845

2019-01-22 Thread Jayant Shekhar

Add interconnect properties such as the interconnect provider specifier
and the edge source and destination ports, which are required by the
interconnect API to configure the interconnect path for MDSS.

Changes in v2:
- none

Changes in v3:
- Remove common property definitions (Rob Herring)

Changes in v4:
- Use port macros and change port string names (Georgi Djakov)

Changes in v5:
- None

Changes in v6:
- None

Signed-off-by: Sravanthi Kollukuduru 
Signed-off-by: Jayant Shekhar 
---
 Documentation/devicetree/bindings/display/msm/dpu.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt 
b/Documentation/devicetree/bindings/display/msm/dpu.txt
index ad2e883..a61dd40 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -28,6 +28,11 @@ Required properties:
 - #address-cells: number of address cells for the MDSS children. Should be 1.
 - #size-cells: Should be 1.
 - ranges: parent bus address space is the same as the child bus address space.
+- interconnects : interconnect path specifier for MDSS according to
+  Documentation/devicetree/bindings/interconnect/interconnect.txt. Should be
+  2 paths corresponding to 2 AXI ports.
+- interconnect-names : MDSS will have 2 port names to differentiate between the
+  2 interconnect paths defined with interconnect specifier.
 
 Optional properties:
 - assigned-clocks: list of clock specifiers for clocks needing rate assignment
@@ -86,6 +91,11 @@ Example:
interrupt-controller;
#interrupt-cells = <1>;
 
+   interconnects = <_hlos MASTER_MDP0 _hlos SLAVE_EBI1>,
+   <_hlos MASTER_MDP1 _hlos SLAVE_EBI1>;
+
+   interconnect-names = "mdp0-mem", "mdp1-mem";
+
iommus = <_iommu 0>;
 
#address-cells = <2>;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

[v6 0/3] Use interconnect API in MDSS on SDM845

2019-01-22 Thread Jayant Shekhar

The interconnect API provides an interface for consumer drivers to express
their bandwidth needs in the SoC. This data is aggregated and the on-chip
interconnect hardware is configured to the appropriate power/performance
profile.
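
As a rough illustration of the aggregation step described above, here is a minimal userspace sketch. It is not kernel code: `struct bw_request` and `aggregate()` are hypothetical names, and the policy shown (sum the average requests, take the largest peak request) is an assumption; an actual interconnect provider may aggregate differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one bandwidth request; not a kernel type. */
struct bw_request {
	uint32_t avg_kbps;	/* average bandwidth the consumer asks for */
	uint32_t peak_kbps;	/* peak bandwidth the consumer asks for */
};

/*
 * Aggregate the requests that share one node.
 * Assumption: averages are summed and the largest peak wins.
 */
static struct bw_request aggregate(const struct bw_request *reqs, size_t n)
{
	struct bw_request total = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		total.avg_kbps += reqs[i].avg_kbps;
		if (reqs[i].peak_kbps > total.peak_kbps)
			total.peak_kbps = reqs[i].peak_kbps;
	}
	return total;
}
```

With two consumers asking for 100/400 and 200/300 (average/peak), this model yields an aggregate of 300 average and 400 peak.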

MDSS is one of the interconnect consumers which uses the interconnect APIs
to get the path between endpoints and set its bandwidth requirements
for the given interconnected path.

Subsequently, there is a clean up patch to remove all the references
of the DPU custom bus scaling.

There is a corresponding DT patch with the source and destination ports
defined for the display driver, which will be sent separately.

Changes in v2:
- Remove error log and unnecessary check (Jordan Crouse)
- Fixed build error due to partial clean up

Changes in v3:
- Remove common property definitions (Rob Herring)
- Code clean up involving variable name change, removal
  of extra parentheses and variables (Matthias Kaehlcke)
- Condense multiple lines into a single line (Sean Paul)

Changes in v4:
   - Add comments, spacings, tabs, proper port name and icc macro
   - Use port macros and change port string names (Georgi Djakov)

Changes in v5:
   - Updated commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
   - Change icc_set to icc_set_bw (Doug Anderson)

Jayant Shekhar (3):
  drm/msm/dpu: clean up references of DPU custom bus scaling
  drm/msm/dpu: Integrate interconnect API in MDSS
  dt-bindings: msm/disp: Introduce interconnect bindings for MDSS on
SDM845

 .../devicetree/bindings/display/msm/dpu.txt|  10 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c  | 174 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h  |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c   |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c   |  49 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.c   |  47 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.h   |  68 
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h  |  22 +--
 8 files changed, 144 insertions(+), 243 deletions(-)

[v6 1/3] drm/msm/dpu: clean up references of DPU custom bus scaling

2019-01-22 Thread Jayant Shekhar

Since the interconnect bus framework has landed
upstream, the existing references to custom bus scaling
need to be cleaned up.

Changes in v2:
- Fixed build error due to partial clean up

Changes in v3:
- Condense multiple lines into a single line (Sean Paul)

Changes in v4:
- None

Changes in v5:
- None

Changes in v6:
- None

Signed-off-by: Sravanthi Kollukuduru 
Signed-off-by: Jayant Shekhar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c| 174 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h|   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.c |  47 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.h |  68 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h|  22 +--
 6 files changed, 89 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
index 22e84b3..c75536e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
@@ -84,7 +84,6 @@ static void _dpu_core_perf_calc_crtc(struct dpu_kms *kms,
struct dpu_core_perf_params *perf)
 {
struct dpu_crtc_state *dpu_cstate;
-   int i;
 
if (!kms || !kms->catalog || !crtc || !state || !perf) {
DPU_ERROR("invalid parameters\n");
@@ -95,35 +94,24 @@ static void _dpu_core_perf_calc_crtc(struct dpu_kms *kms,
memset(perf, 0, sizeof(struct dpu_core_perf_params));
 
if (!dpu_cstate->bw_control) {
-   for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-   perf->bw_ctl[i] = kms->catalog->perf.max_bw_high *
+   perf->bw_ctl = kms->catalog->perf.max_bw_high *
1000ULL;
-   perf->max_per_pipe_ib[i] = perf->bw_ctl[i];
-   }
+   perf->max_per_pipe_ib = perf->bw_ctl;
perf->core_clk_rate = kms->perf.max_core_clk_rate;
} else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_MINIMUM) {
-   for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-   perf->bw_ctl[i] = 0;
-   perf->max_per_pipe_ib[i] = 0;
-   }
+   perf->bw_ctl = 0;
+   perf->max_per_pipe_ib = 0;
perf->core_clk_rate = 0;
} else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_FIXED) {
-   for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-   perf->bw_ctl[i] = kms->perf.fix_core_ab_vote;
-   perf->max_per_pipe_ib[i] = kms->perf.fix_core_ib_vote;
-   }
+   perf->bw_ctl = kms->perf.fix_core_ab_vote;
+   perf->max_per_pipe_ib = kms->perf.fix_core_ib_vote;
perf->core_clk_rate = kms->perf.fix_core_clk_rate;
}
 
DPU_DEBUG(
-   "crtc=%d clk_rate=%llu core_ib=%llu core_ab=%llu llcc_ib=%llu llcc_ab=%llu mem_ib=%llu mem_ab=%llu\n",
+   "crtc=%d clk_rate=%llu core_ib=%llu core_ab=%llu\n",
crtc->base.id, perf->core_clk_rate,
-   perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_MNOC],
-   perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_MNOC],
-   perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_LLCC],
-   perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_LLCC],
-   perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_EBI],
-   perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_EBI]);
+   perf->max_per_pipe_ib, perf->bw_ctl);
 }
 
 int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
@@ -136,7 +124,6 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
struct dpu_crtc_state *dpu_cstate;
struct drm_crtc *tmp_crtc;
struct dpu_kms *kms;
-   int i;
 
if (!crtc || !state) {
DPU_ERROR("invalid crtc\n");
@@ -158,31 +145,25 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
/* obtain new values */
_dpu_core_perf_calc_crtc(kms, crtc, state, _cstate->new_perf);
 
-   for (i = DPU_POWER_HANDLE_DBUS_ID_MNOC;
-   i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-   bw_sum_of_intfs = dpu_cstate->new_perf.bw_ctl[i];
-   curr_client_type = dpu_crtc_get_client_type(crtc);
+   bw_sum_of_intfs = dpu_cstate->new_perf.bw_ctl;
+   curr_client_type = dpu_crtc_get_client_type(crtc);
 
-   drm_for_each_crtc(tmp_crtc, crtc->dev) {
-   if (_dpu_core_perf_crtc_is_power_on(tmp_crtc) &&
-   (dpu_crtc_get_client_type(tmp_crtc) ==
-   curr_client_type) &&
-   (tmp_crtc != crtc)) {
-   struct dpu_crtc_state *tmp_cstate =
-

Re: LTP case read_all_proc fails on qemux86-64 since 5.0-rc1

2019-01-22 Thread Jens Axboe

On Jan 22, 2019, at 8:13 PM, He Zhe  wrote:
> 
> 
> LTP case read_all_proc(read_all -d /proc -q -r 10) often, but not every time, 
> fails with the following call traces, since 600335205b8d "ide: convert to 
> blk-mq"(5.0-rc1) till now(5.0-rc3).
> 
> qemu-system-x86_64 -drive file=rootfs.ext4,if=virtio,format=raw -object 
> rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 
> -nographic -m 16192 -smp cpus=12 -cpu core2duo -enable-kvm -serial mon:stdio 
> -serial null -kernel bzImage -append 'root=/dev/vda rw highres=off 
> console=ttyS0 mem=16192M'
> 
> tst_test.c:1085: INFO: Timeout per run is 0h 05m 00s
> [   47.080156] Warning: /proc/ide/hd?/settings interface is obsolete, and 
> will be removed soon!
> [   47.085330] [ cut here ]
> [   47.085810] kernel BUG at block/blk-mq.c:767!
> [   47.086498] invalid opcode:  [#1] PREEMPT SMP PTI
> [   47.087022] CPU: 5 PID: 146 Comm: kworker/5:1H Not tainted 5.0.0-rc3 #1
> [   47.087858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
> [   47.088992] Workqueue: kblockd blk_mq_run_work_fn
> [   47.089469] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0
> [   47.090035] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 
> 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008
> [   47.091930] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002
> [   47.092458] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: 
> 0006
> [   47.093181] RDX:  RSI: 0001 RDI: 
> 9e1ea13c
> [   47.093906] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: 
> 
> [   47.094626] R10: 0001 R11: 0012 R12: 
> 9e1ea1033a40
> [   47.095347] R13: 9e1ea13a8d00 R14: 9e1ea13a9000 R15: 
> 0046
> [   47.096071] FS:  () GS:9e1ea4b4() 
> knlGS:
> [   47.096898] CS:  0010 DS:  ES:  CR0: 80050033
> [   47.097477] CR2: 003fda41fda0 CR3: 0003d8e6a000 CR4: 
> 06e0
> [   47.098203] DR0:  DR1:  DR2: 
> 
> [   47.098929] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   47.099650] Call Trace:
> [   47.099910]  
> [   47.100128]  blk_mq_requeue_request+0x58/0x60
> [   47.100576]  ide_requeue_and_plug+0x20/0x50
> [   47.101014]  ide_intr+0x21a/0x230
> [   47.101362]  ? idecd_open+0xc0/0xc0
> [   47.101735]  __handle_irq_event_percpu+0x43/0x1e0
> [   47.102214]  handle_irq_event_percpu+0x32/0x80
> [   47.102668]  handle_irq_event+0x39/0x60
> [   47.103074]  handle_edge_irq+0xe8/0x1c0
> [   47.103470]  handle_irq+0x20/0x30
> [   47.103819]  do_IRQ+0x46/0xe0
> [   47.104128]  common_interrupt+0xf/0xf
> [   47.104505]  
> [   47.104731] RIP: 0010:ide_output_data+0xbc/0x100
> [   47.105201] Code: 74 22 8d 41 ff 85 c9 74 24 49 8d 54 40 02 41 0f b7 00 66 
> 41 89 01 49 83 c0 02 49 39 d0 75 ef 5b 41 5c 5d c3 4c 89 c6 445
> [   47.107092] RSP: 0018:bd508059bb18 EFLAGS: 00010246 ORIG_RAX: 
> ffdd
> [   47.107862] RAX: 9e1ea13a8800 RBX: 9e1ea13a9000 RCX: 
> 
> [   47.108581] RDX: 0170 RSI: 9e1ea13c012c RDI: 
> 
> [   47.109293] RBP: bd508059bb28 R08: 9e1ea13c0120 R09: 
> 0170
> [   47.110016] R10: 000d R11: 000c R12: 
> 9e1ea13a8800
> [   47.110731] R13: 000c R14: 9e1ea13c R15: 
> 7530
> [   47.111446]  ide_transfer_pc+0x216/0x310
> [   47.111848]  ? __const_udelay+0x3d/0x40
> [   47.112236]  ? ide_execute_command+0x85/0xb0
> [   47.112668]  ? ide_pc_intr+0x3f0/0x3f0
> [   47.113051]  ? ide_check_atapi_device+0x110/0x110
> [   47.113524]  ide_issue_pc+0x178/0x240
> [   47.113901]  ide_cd_do_request+0x15c/0x350
> [   47.114314]  ide_queue_rq+0x180/0x6b0
> [   47.114686]  ? blk_mq_get_driver_tag+0xa1/0x110
> [   47.115153]  blk_mq_dispatch_rq_list+0x90/0x550
> [   47.115606]  ? __queue_delayed_work+0x63/0x90
> [   47.116054]  ? deadline_fifo_request+0x41/0x90
> [   47.116506]  blk_mq_do_dispatch_sched+0x80/0x100
> [   47.116976]  blk_mq_sched_dispatch_requests+0xfc/0x170
> [   47.117491]  __blk_mq_run_hw_queue+0x6f/0xd0
> [   47.117941]  blk_mq_run_work_fn+0x1b/0x20
> [   47.118342]  process_one_work+0x14c/0x450
> [   47.118747]  worker_thread+0x4a/0x440
> [   47.119125]  kthread+0x105/0x140
> [   47.119456]  ? process_one_work+0x450/0x450
> [   47.119880]  ? kthread_park+0x90/0x90
> [   47.120251]  ret_from_fork+0x35/0x40
> [   47.120619] Modules linked in:
> [   47.120952] ---[ end trace 4562f716e88fdefe ]---
> [   47.121423] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0
> [   47.121981] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 
> 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008
> [   47.123851] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002
> [

[v6 2/3] drm/msm/dpu: Integrate interconnect API in MDSS

2019-01-22 Thread Jayant Shekhar

The interconnect framework is designed to provide a
standard kernel interface to control the settings of
the interconnects on a SoC.

The interconnect API uses a consumer/provider-based model,
where the providers are the interconnect buses and the
consumers could be various drivers.

MDSS is one of the interconnect consumers which uses the
interconnect APIs to get the path between endpoints and
set its bandwidth requirement for the given interconnected
path.
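
To illustrate the path handling in the diff below, here is a minimal userspace sketch rather than the driver code itself. It models two things the patch does: the first path is mandatory while the second is optional, and the average-bandwidth budget is divided evenly across the paths that were found. The names `struct mdss_model`, `model_parse_paths()` and `model_avg_bw_per_path()` are hypothetical, and plain strings stand in for the path handles.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PATHS 2

/* Hypothetical stand-in for the driver state; not the kernel struct. */
struct mdss_model {
	const char *path[MODEL_MAX_PATHS];
	uint32_t num_paths;
};

/*
 * Record the paths that were found. The first path is mandatory and the
 * second is optional. Returns 0 on success, -1 if the first is missing.
 */
static int model_parse_paths(struct mdss_model *m,
			     const char *p0, const char *p1)
{
	m->num_paths = 0;
	if (!p0)
		return -1;
	m->path[m->num_paths++] = p0;
	if (p1)
		m->path[m->num_paths++] = p1;
	return 0;
}

/* Split an average-bandwidth budget evenly; 0 when no path exists. */
static uint64_t model_avg_bw_per_path(const struct mdss_model *m,
				      uint64_t max_bw)
{
	return m->num_paths ? max_bw / m->num_paths : 0;
}
```

In this model, a budget of 1000 gives 500 per path when both paths exist and 1000 when only the first one does.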

Changes in v2:
- Remove error log and unnecessary check (Jordan Crouse)

Changes in v3:
- Code clean up involving variable name change, removal
  of extra parentheses and variables (Matthias Kaehlcke)

Changes in v4:
- Add comments, spacings, tabs, proper port name
  and icc macro (Georgi Djakov)

Changes in v5:
- Commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
- Change to the new icc_set APIs (Doug Anderson)

Signed-off-by: Sravanthi Kollukuduru1 
Signed-off-by: Jayant Shekhar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 49 +---
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index 38576f8..38daf8a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -4,11 +4,15 @@
  */
 
 #include "dpu_kms.h"
+#include 
 
 #define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)
 
 #define HW_INTR_STATUS 0x0010
 
+/* Max BW defined in KBps */
+#define MAX_BW 680
+
 struct dpu_mdss {
struct msm_mdss base;
void __iomem *mmio;
@@ -16,8 +20,30 @@ struct dpu_mdss {
u32 hwversion;
struct dss_module_power mp;
struct dpu_irq_controller irq_controller;
+   struct icc_path *path[2];
+   u32 num_paths;
 };
 
+static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev,
+   struct dpu_mdss *dpu_mdss)
+{
+   struct icc_path *path0 = of_icc_get(dev->dev, "mdp0-mem");
+   struct icc_path *path1 = of_icc_get(dev->dev, "mdp1-mem");
+
+   if (IS_ERR(path0))
+   return PTR_ERR(path0);
+
+   dpu_mdss->path[0] = path0;
+   dpu_mdss->num_paths = 1;
+
+   if (!IS_ERR(path1)) {
+   dpu_mdss->path[1] = path1;
+   dpu_mdss->num_paths++;
+   }
+
+   return 0;
+}
+
 static irqreturn_t dpu_mdss_irq(int irq, void *arg)
 {
struct dpu_mdss *dpu_mdss = arg;
@@ -127,7 +153,11 @@ static int dpu_mdss_enable(struct msm_mdss *mdss)
 {
struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
struct dss_module_power *mp = _mdss->mp;
-   int ret;
+   int ret, i;
+   u64 avg_bw = dpu_mdss->num_paths ? MAX_BW / dpu_mdss->num_paths : 0;
+
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_set_bw(dpu_mdss->path[i], avg_bw, kBps_to_icc(MAX_BW));
 
ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true);
if (ret)
@@ -140,12 +170,15 @@ static int dpu_mdss_disable(struct msm_mdss *mdss)
 {
struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
struct dss_module_power *mp = _mdss->mp;
-   int ret;
+   int ret, i;
 
ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, false);
if (ret)
DPU_ERROR("clock disable failed, ret:%d\n", ret);
 
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_set_bw(dpu_mdss->path[i], 0, 0);
+
return ret;
 }
 
@@ -155,6 +188,7 @@ static void dpu_mdss_destroy(struct drm_device *dev)
struct msm_drm_private *priv = dev->dev_private;
struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss);
struct dss_module_power *mp = _mdss->mp;
+   int i;
 
pm_runtime_disable(dev->dev);
_dpu_mdss_irq_domain_fini(dpu_mdss);
@@ -162,6 +196,9 @@ static void dpu_mdss_destroy(struct drm_device *dev)
msm_dss_put_clk(mp->clk_config, mp->num_clk);
devm_kfree(>dev, mp->clk_config);
 
+   for (i = 0; i < dpu_mdss->num_paths; i++)
+   icc_put(dpu_mdss->path[i]);
+
if (dpu_mdss->mmio)
devm_iounmap(>dev, dpu_mdss->mmio);
dpu_mdss->mmio = NULL;
@@ -200,6 +237,10 @@ int dpu_mdss_init(struct drm_device *dev)
}
dpu_mdss->mmio_len = resource_size(res);
 
+   ret = dpu_mdss_parse_data_bus_icc_path(dev, dpu_mdss);
+   if (ret)
+   return ret;
+
mp = _mdss->mp;
ret = msm_dss_parse_clock(pdev, mp);
if (ret) {
@@ -221,14 +262,14 @@ int dpu_mdss_init(struct drm_device *dev)
goto irq_error;
}
 
+   priv->mdss = _mdss->base;
+
pm_runtime_enable(dev->dev);
 
pm_runtime_get_sync(dev->dev);
dpu_mdss->hwversion = readl_relaxed(dpu_mdss->mmio);
pm_runtime_put_sync(dev->dev);
 
-   priv->mdss = _mdss->base;
-

Re: rcutorture: meaning of "End of test: RCU_HOTPLUG"

2019-01-22 Thread Paul E. McKenney

On Tue, Jan 22, 2019 at 04:42:19PM +0800, Su Yue wrote:
> Thanks for your quick reply! Paul
> 
> On 1/22/19 12:01 PM, Paul E. McKenney wrote:
> >On Tue, Jan 22, 2019 at 11:40:53AM +0800, Su Yue wrote:
> >>Hi, guys
> >>   While running rcutorture tests with "onoff_interval", some tests
> >>failed and results show like:
> >>
> >>=
> >>[  316.354501] srcud-torture:--- End of test: RCU_HOTPLUG:
> >>nreaders=1 nfakewriters=4 stat_interval=60 verbose=2
> >>test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fq\
> >>s_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0
> >>test_boost_interval=7 test_boost_duration=4 shutdown_secs=0
> >>stall_cpu=0 stall_cpu_holdoff=10 stall_cpu_irqsoff=0 n_ba\
> >>rrier_cbs=0 onoff_interval=3 onoff_holdoff=0
> >>
> >>
> >>I am wondering about the meaning of "RCU_HOTPLUG". Is it expected because
> >>CPU hotplug is enabled in the test? Or does it just represent another type of
> >>failure?
> >
> >This says that at least one CPU hotplug operation failed, that is,
> >the CPU didn't actually come online or go offline as requested.  If you
> >are introducing CPU hotplug to an architecture, this usually indicates
> >that you have bugs in your CPU-hotplug code.  Or it might be that
> 
> It should hit that case, since there are no RCU CPU stall warnings.
> 
> >RCU grace periods failed to progress -- though this would normally
> >also result in RCU CPU stall warnings.
> >
> >There should be lines containing "ver:" in your console output.  What
> >does one of the later ones say?
> >
> 
> The line says:
> ==
> [  318.850175] busted_srcud-torture: rtc:   (null) ver:
> 27040 tfle: 0 rta: 27040 rtaf: 0 rtf: 27027 rtmbe: 0 rtbe: 0 rtbke:
> 0 rtbre: 0 rtbf: 0 rtb: 0 \
> nt: 9497 onoff: 2639/2639:2640/5310 40,373:10,355 162868:67542
> (HZ=1000) barrier: 0/0:0

Yes, you have many more offline attempts than successes, which is
why RCU_HOTPLUG was printed.

> =
> 
> And here are useful errors:
> =
> kern  :info  : [  135.379693] KVM setup async PF for cpu 1
> kern  :info  : [  135.381412] kvm-stealtime: cpu 1, msr 23fd16180
> kern  :alert : [  135.386897] busted_srcud-torture:torture_onoff

Just so you know, busted_srcud can sometimes fail by design.  Hence
the "busted" in the name.  But failure didn't happen this time.

> task: onlined 1
> kern  :alert : [  135.408241] busted_srcud-torture:torture_onoff
> task: offlining 1
> kern  :info  : [  135.423310] Unregister pv shared memory for cpu 1
> kern  :info  : [  135.427940] smpboot: CPU 1 is now offline
> kern  :alert : [  135.430106] busted_srcud-torture:torture_onoff
> task: offlined 1
> kern  :alert : [  135.436404] busted_srcud-torture:torture_onoff
> task: offlining 0
> kern  :alert : [  135.446173] busted_srcud-torture:torture_onoff
> task: offline 0 failed: errno -16
> kern  :alert : [  135.453076] busted_srcud-torture:torture_onoff
> task: offlining 0
> kern  :alert : [  135.457461] busted_srcud-torture:torture_onoff
> task: offline 0 failed: errno -16
> 
> 
> =
> There are only two CPUs on the VM. Torture tries to offline the last one,
> but -EBUSY occurred.
> 
> I spent time to understand kernel/torture.c.
> There is torture_onoff():
> 
>     while (!torture_must_stop()) {
>         cpu = (torture_random() >> 4) % (maxcpu + 1);
>         if (!torture_offline(cpu,
>                              _offline_attempts, _offline_successes,
>                              _offline, _offline, _offline))
>             torture_online(cpu,
>                            _online_attempts, _online_successes,
>                            _online, _online, _online);
>         schedule_timeout_interruptible(onoff_interval);
>     }
> 
> torture_offline() and torture_online() don't check beforehand whether the
> current CPU is the only usable one.

That does appear to be the case, and that would be a problem with
the CONFIG_BOOTPARAM_HOTPLUG_CPU0 listed below.

Good catch!
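
One way to picture the problem and a possible guard is the userspace model below. It is only a sketch under stated assumptions, not the patch proposed in this thread: `struct hotplug_model`, `count_online()` and `model_offline()` are hypothetical, and the model simply declines to offline the last online CPU before the request is counted as an attempt.

```c
#include <stdbool.h>

/* Hypothetical userspace model of hotplug bookkeeping; not kernel code. */
struct hotplug_model {
	unsigned int online_mask;	/* bit N set means CPU N is online */
	long offline_attempts;
	long offline_successes;
};

/* Count the set bits in the mask, i.e. the number of online CPUs. */
static unsigned int count_online(unsigned int mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Try to offline one CPU. A request against a CPU that is already
 * offline, or against the last online CPU, is skipped before it is
 * counted, so it cannot make attempts and successes diverge.
 */
static bool model_offline(struct hotplug_model *m, unsigned int cpu)
{
	if (cpu >= 32 || !(m->online_mask & (1u << cpu)))
		return false;
	if (count_online(m->online_mask) <= 1)
		return false;

	m->offline_attempts++;
	m->online_mask &= ~(1u << cpu);
	m->offline_successes++;
	return true;
}
```

In a two-CPU model, offlining CPU 1 succeeds and a following request to offline CPU 0 is skipped, leaving attempts and successes equal.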

> Our test machines are configured with CONFIG_BOOTPARAM_HOTPLUG_CPU0. If
> there is only one online and hotpluggable CPU, then
> n_offline_successes != n_offline_attempts, which causes "End of test:
> RCU_HOTPLUG".
> 
> Do I misunderstand something above? Feel free to correct me.

Does the following patch help?

Thanx, Paul



diff --git a/kernel/torture.c b/kernel/torture.c
index a03ff722352b..2b6700ca2a43 100644
--- a/kernel/torture.c
+++

[PATCH V2 4/6] misc/pvpanic: add pvpanic acpi driver

2019-01-22 Thread Peng Hao

Move the pvpanic ACPI driver into a separate file and modify the code
to fit the framework.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic/Kconfig|  9 +
 drivers/misc/pvpanic/Makefile   |  1 +
 drivers/misc/pvpanic/pvpanic-acpi.c | 77 +
 3 files changed, 87 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic-acpi.c

diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
index 3e612c6..d274130 100644
--- a/drivers/misc/pvpanic/Kconfig
+++ b/drivers/misc/pvpanic/Kconfig
@@ -5,3 +5,12 @@ config PVPANIC
  This driver provides support for the pvpanic device.  pvpanic is
  a paravirtualized device provided by QEMU; it lets a virtual machine
  (guest) communicate panic events to the host.
+
+if PVPANIC
+
+config PVPANIC_ACPI
+   tristate "pvpanic acpi driver"
+   depends on ACPI
+   default PVPANIC
+
+endif
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
index 6394224..c5b73ca 100644
--- a/drivers/misc/pvpanic/Makefile
+++ b/drivers/misc/pvpanic/Makefile
@@ -3,3 +3,4 @@
 # Copyright (c) 2018 ZTE Ltd.
 
 obj-$(CONFIG_PVPANIC)+= pvpanic.o
+obj-$(CONFIG_PVPANIC_ACPI)  += pvpanic-acpi.o
diff --git a/drivers/misc/pvpanic/pvpanic-acpi.c b/drivers/misc/pvpanic/pvpanic-acpi.c
new file mode 100644
index 000..a6153fa
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic-acpi.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ *  pvpanic acpi driver.
+ *
+ *  Copyright (C) 2019 ZTE Ltd.
+ *  Author: Peng Hao
+ */
+#include 
+#include 
+#include 
+#include 
+#include "pvpanic.h"
+
+static int pvpanic_add(struct acpi_device *device);
+static int pvpanic_remove(struct acpi_device *device);
+
+static const struct acpi_device_id pvpanic_device_ids[] = {
+   { "QEMU0001", 0 },
+   { "", 0 }
+};
+MODULE_DEVICE_TABLE(acpi, pvpanic_device_ids);
+
+static struct acpi_driver pvpanic_driver = {
+   .name = "pvpanic",
+   .class ="QEMU",
+   .ids =  pvpanic_device_ids,
+   .ops =  {
+   .add =  pvpanic_add,
+   .remove =   pvpanic_remove,
+   },
+   .owner =THIS_MODULE,
+};
+
+static acpi_status
+pvpanic_walk_resources(struct acpi_resource *res, void *context)
+{
+   struct resource r;
+   int ret = 0;
+   struct device *dev = context;
+
+   memset(, 0, sizeof(r));
+   if (acpi_dev_resource_io(res, ) || acpi_dev_resource_memory(res, ))
+   ret = pvpanic_add_device(dev, );
+
+   if (!ret)
+   return AE_OK;
+
+   return AE_ERROR;
+}
+static int pvpanic_add(struct acpi_device *device)
+{
+   int ret;
+   acpi_status status;
+
+   ret = acpi_bus_get_status(device);
+   if (ret < 0)
+   return ret;
+
+   if (!device->status.enabled || !device->status.functional)
+   return -ENODEV;
+
+   status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
+pvpanic_walk_resources, >dev);
+
+   if (ACPI_FAILURE(status))
+   return -ENODEV;
+
+   return 0;
+}
+
+static int pvpanic_remove(struct acpi_device *device)
+{
+   pvpanic_remove_device();
+   return 0;
+}
+
+module_acpi_driver(pvpanic_driver);
-- 
1.8.3.1

[PATCH V2 1/6] misc/pvpanic: preparing for pvpanic driver framework

2019-01-22 Thread Peng Hao

Prepare for the pvpanic driver framework: create a pvpanic driver
directory and move the current driver file into the new directory.

Signed-off-by: Peng Hao 
---
 drivers/misc/Kconfig | 9 +
 drivers/misc/Makefile| 2 +-
 drivers/misc/pvpanic/Kconfig | 7 +++
 drivers/misc/pvpanic/Makefile| 5 +
 drivers/misc/{ => pvpanic}/pvpanic.c | 0
 5 files changed, 14 insertions(+), 9 deletions(-)
 create mode 100644 drivers/misc/pvpanic/Kconfig
 create mode 100644 drivers/misc/pvpanic/Makefile
 rename drivers/misc/{ => pvpanic}/pvpanic.c (100%)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f417b06..aa3a805 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -513,14 +513,7 @@ config MISC_RTSX
tristate
default MISC_RTSX_PCI || MISC_RTSX_USB
 
-config PVPANIC
-   tristate "pvpanic device support"
-   depends on HAS_IOMEM && (ACPI || OF)
-   help
- This driver provides support for the pvpanic device.  pvpanic is
- a paravirtualized device provided by QEMU; it lets a virtual machine
- (guest) communicate panic events to the host.
-
+source "drivers/misc/pvpanic/Kconfig"
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index e39ccbb..cfe20b3 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP)+= aspeed-lpc-snoop.o
 obj-$(CONFIG_PCI_ENDPOINT_TEST)+= pci_endpoint_test.o
 obj-$(CONFIG_OCXL) += ocxl/
 obj-y  += cardreader/
-obj-$(CONFIG_PVPANIC)  += pvpanic.o
+obj-$(CONFIG_PVPANIC)  += pvpanic/
diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
new file mode 100644
index 000..3e612c6
--- /dev/null
+++ b/drivers/misc/pvpanic/Kconfig
@@ -0,0 +1,7 @@
+config PVPANIC
+   tristate "pvpanic device support"
+   depends on HAS_IOMEM && (ACPI || OF)
+   help
+ This driver provides support for the pvpanic device.  pvpanic is
+ a paravirtualized device provided by QEMU; it lets a virtual machine
+ (guest) communicate panic events to the host.
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
new file mode 100644
index 000..6394224
--- /dev/null
+++ b/drivers/misc/pvpanic/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (c) 2018 ZTE Ltd.
+
+obj-$(CONFIG_PVPANIC)+= pvpanic.o
diff --git a/drivers/misc/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c
similarity index 100%
rename from drivers/misc/pvpanic.c
rename to drivers/misc/pvpanic/pvpanic.c
-- 
1.8.3.1

[PATCH V2 5/6] misc/pvpanic: add pvpanic mmio driver

2019-01-22 Thread Peng Hao

Make the pvpanic mmio driver a separate file and modify the code
to fit the framework.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic/Kconfig  |  4 +++
 drivers/misc/pvpanic/Makefile |  1 +
 drivers/misc/pvpanic/pvpanic-of.c | 53 +++
 3 files changed, 58 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic-of.c

diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
index d274130..47f8709 100644
--- a/drivers/misc/pvpanic/Kconfig
+++ b/drivers/misc/pvpanic/Kconfig
@@ -13,4 +13,8 @@ config PVPANIC_ACPI
depends on ACPI
default PVPANIC
 
+config PVPANIC_OF
+   tristate "pvpanic mmio driver"
+   depends on OF
+
 endif
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
index c5b73ca..63ef0db 100644
--- a/drivers/misc/pvpanic/Makefile
+++ b/drivers/misc/pvpanic/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_PVPANIC)+= pvpanic.o
 obj-$(CONFIG_PVPANIC_ACPI)  += pvpanic-acpi.o
+obj-$(CONFIG_PVPANIC_OF)+= pvpanic-of.o
diff --git a/drivers/misc/pvpanic/pvpanic-of.c 
b/drivers/misc/pvpanic/pvpanic-of.c
new file mode 100644
index 000..73ca5f3
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic-of.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ *  pvpanic of driver.
+ *
+ *  Copyright (C) 2019 ZTE Ltd.
+ *  Author: Peng Hao 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "pvpanic.h"
+
+static int pvpanic_mmio_probe(struct platform_device *pdev)
+{
+   struct resource *res;
+   int ret;
+
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (!res)
+   return -EINVAL;
+
+   ret = pvpanic_add_device(&pdev->dev, res);
+   if (ret)
+   return -ENODEV;
+
+   return 0;
+}
+
+static int pvpanic_mmio_remove(struct platform_device *pdev)
+{
+   pvpanic_remove_device();
+   return 0;
+}
+
+static const struct of_device_id pvpanic_mmio_match[] = {
+   { .compatible = "qemu,pvpanic-mmio", },
+   {}
+};
+
+static struct platform_driver pvpanic_mmio_driver = {
+   .driver = {
+   .name = "pvpanic-mmio",
+   .of_match_table = pvpanic_mmio_match,
+   },
+   .probe = pvpanic_mmio_probe,
+   .remove = pvpanic_mmio_remove,
+};
+
+module_platform_driver(pvpanic_mmio_driver);
-- 
1.8.3.1

[PATCH V2 0/6] add pvpanic driver framework

2019-01-22 Thread Peng Hao

The QEMU community requires an additional PCI device to emulate the
pvpanic device, so that some architectures do not have to occupy precious
memory space below 4G.
Previously, I added the PCI driver directly to the original driver,
which made the whole driver file look a bit cluttered. So Andy Shevchenko
suggested:
"I would recommend to split it in a way how it's done for ChipIdea USB driver, 
for example. (drivers/usb/chipidea if I'm not mistaken)".


Peng Hao (6):
  misc/pvpanic: preparing for pvpanic driver framework
  misc/pvpanic: Add pvpanic driver framework
  misc/pvpanic: add API for pvpanic driver framework
  misc/pvpanic: add pvpanic acpi driver
  misc/pvpanic: add pvpanic mmio driver
  misc/pvpanic: add pvpanic pci driver

 drivers/misc/Kconfig|   9 +-
 drivers/misc/Makefile   |   2 +-
 drivers/misc/pvpanic.c  | 192 
 drivers/misc/pvpanic/Kconfig|  25 +
 drivers/misc/pvpanic/Makefile   |   8 ++
 drivers/misc/pvpanic/pvpanic-acpi.c |  77 +++
 drivers/misc/pvpanic/pvpanic-of.c   |  53 ++
 drivers/misc/pvpanic/pvpanic-pci.c  |  56 +++
 drivers/misc/pvpanic/pvpanic.c  | 131 
 drivers/misc/pvpanic/pvpanic.h  |  14 +++
 10 files changed, 366 insertions(+), 201 deletions(-)
 delete mode 100644 drivers/misc/pvpanic.c
 create mode 100644 drivers/misc/pvpanic/Kconfig
 create mode 100644 drivers/misc/pvpanic/Makefile
 create mode 100644 drivers/misc/pvpanic/pvpanic-acpi.c
 create mode 100644 drivers/misc/pvpanic/pvpanic-of.c
 create mode 100644 drivers/misc/pvpanic/pvpanic-pci.c
 create mode 100644 drivers/misc/pvpanic/pvpanic.c
 create mode 100644 drivers/misc/pvpanic/pvpanic.h

-- 
1.8.3.1

[PATCH V2 6/6] misc/pvpanic: add new pvpanic pci driver

2019-01-22 Thread Peng Hao

Add new pvpanic pci driver to pvpanic driver framework.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic/Kconfig   |  5 
 drivers/misc/pvpanic/Makefile  |  1 +
 drivers/misc/pvpanic/pvpanic-pci.c | 56 ++
 3 files changed, 62 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic-pci.c

diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
index 47f8709..46b6e05 100644
--- a/drivers/misc/pvpanic/Kconfig
+++ b/drivers/misc/pvpanic/Kconfig
@@ -17,4 +17,9 @@ config PVPANIC_OF
tristate "pvpanic mmio driver"
depends on OF
 
+config PVPANIC_PCI
+   tristate "pvpanic pci driver"
+   depends on PCI
+   default PVPANIC
+
 endif
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
index 63ef0db..7c71f85 100644
--- a/drivers/misc/pvpanic/Makefile
+++ b/drivers/misc/pvpanic/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_PVPANIC)+= pvpanic.o
 obj-$(CONFIG_PVPANIC_ACPI)  += pvpanic-acpi.o
 obj-$(CONFIG_PVPANIC_OF)+= pvpanic-of.o
+obj-$(CONFIG_PVPANIC_PCI)   += pvpanic-pci.o
diff --git a/drivers/misc/pvpanic/pvpanic-pci.c 
b/drivers/misc/pvpanic/pvpanic-pci.c
new file mode 100644
index 000..b4f453b
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic-pci.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ *  pvpanic pci driver.
+ *
+ *  Copyright (C) 2019 ZTE Ltd.
+ *  Author: Peng Hao 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "pvpanic.h"
+
+#define PCI_VENDOR_ID_REDHAT 0x1b36
+#define PCI_DEVICE_ID_REDHAT_PVPANIC 0x0101
+
+static const struct pci_device_id pvpanic_pci_id_tbl[]  = {
+   { PCI_DEVICE(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_PVPANIC),},
+   {}
+};
+
+static int pvpanic_pci_probe(struct pci_dev *pdev,
+const struct pci_device_id *ent)
+{
+   int ret;
+   struct resource res;
+
+   ret = pcim_enable_device(pdev);
+   if (ret < 0)
+   return ret;
+
+   memset(&res, 0, sizeof(res));
+   res.start = pci_resource_start(pdev, 0);
+   res.end = pci_resource_end(pdev, 0);
+   res.flags = IORESOURCE_MEM;
+   ret = pvpanic_add_device(&pdev->dev, &res);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static void pvpanic_pci_remove(struct pci_dev *pdev)
+{
+   pvpanic_remove_device();
+}
+
+static struct pci_driver pvpanic_pci_driver = {
+   .name = "pvpanic-pci",
+   .id_table = pvpanic_pci_id_tbl,
+   .probe =pvpanic_pci_probe,
+   .remove =   pvpanic_pci_remove,
+};
+
+module_pci_driver(pvpanic_pci_driver);
-- 
1.8.3.1

[PATCH V2 2/6] misc/pvpanic: Add pvpanic driver framework

2019-01-22 Thread Peng Hao

Add the pvpanic driver framework. The original pvpanic acpi/of driver
is split into two separate files, and the code is adapted to the
framework, in follow-up patches.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic/pvpanic.c | 171 ++---
 1 file changed, 39 insertions(+), 132 deletions(-)

diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c
index 595ac06..6380540 100644
--- a/drivers/misc/pvpanic/pvpanic.c
+++ b/drivers/misc/pvpanic/pvpanic.c
@@ -8,15 +8,20 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#include 
+#include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 
-static void __iomem *base;
+static struct {
+   struct platform_device *pdev;
+   void __iomem *base;
+   bool is_ioport;
+} pvpanic_data = {
+   .pdev = NULL,
+   .is_ioport = false,
+};
 
 #define PVPANIC_PANICKED(1 << 0)
 
@@ -27,7 +32,7 @@
 static void
 pvpanic_send_event(unsigned int event)
 {
-   iowrite8(event, base);
+   iowrite8(event, pvpanic_data.base);
 }
 
 static int
@@ -43,150 +48,52 @@
.priority = 1, /* let this called before broken drm_fb_helper */
 };
 
-#ifdef CONFIG_ACPI
-static int pvpanic_add(struct acpi_device *device);
-static int pvpanic_remove(struct acpi_device *device);
-
-static const struct acpi_device_id pvpanic_device_ids[] = {
-   { "QEMU0001", 0 },
-   { "", 0 }
-};
-MODULE_DEVICE_TABLE(acpi, pvpanic_device_ids);
-
-static struct acpi_driver pvpanic_driver = {
-   .name = "pvpanic",
-   .class ="QEMU",
-   .ids =  pvpanic_device_ids,
-   .ops =  {
-   .add =  pvpanic_add,
-   .remove =   pvpanic_remove,
-   },
-   .owner =THIS_MODULE,
-};
-
-static acpi_status
-pvpanic_walk_resources(struct acpi_resource *res, void *context)
+static int pvpanic_platform_probe(struct platform_device *pdev)
 {
-   struct resource r;
-
-   if (acpi_dev_resource_io(res, &r)) {
-   base = ioport_map(r.start, resource_size(&r));
-   return AE_OK;
-   } else if (acpi_dev_resource_memory(res, &r)) {
-   base = ioremap(r.start, resource_size(&r));
-   return AE_OK;
+   struct device *dev = &pdev->dev;
+   struct resource *res;
+   void __iomem *base;
+
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (res) {
+   base = devm_ioremap_resource(dev, res);
+   if (IS_ERR(base))
+   return -ENODEV;
+   } else {
+   res = platform_get_resource(pdev, IORESOURCE_IO, 0);
+   if (!res)
+   return -ENODEV;
+
+   base = ioport_map(res->start, resource_size(res));
+   if (!base)
+   return -ENODEV;
+   pvpanic_data.is_ioport = true;
}
 
-   return AE_ERROR;
-}
-
-static int pvpanic_add(struct acpi_device *device)
-{
-   int ret;
-
-   ret = acpi_bus_get_status(device);
-   if (ret < 0)
-   return ret;
-
-   if (!device->status.enabled || !device->status.functional)
-   return -ENODEV;
-
-   acpi_walk_resources(device->handle, METHOD_NAME__CRS,
-   pvpanic_walk_resources, NULL);
-
-   if (!base)
-   return -ENODEV;
-
+   pvpanic_data.base = base;
atomic_notifier_chain_register(&panic_notifier_list,
   &pvpanic_panic_nb);
 
return 0;
 }
 
-static int pvpanic_remove(struct acpi_device *device)
+static int pvpanic_platform_remove(struct platform_device *pdev)
 {
-
atomic_notifier_chain_unregister(&panic_notifier_list,
 &pvpanic_panic_nb);
-   iounmap(base);
-
-   return 0;
-}
-
-static int pvpanic_register_acpi_driver(void)
-{
-   return acpi_bus_register_driver(&pvpanic_driver);
-}
-
-static void pvpanic_unregister_acpi_driver(void)
-{
-   acpi_bus_unregister_driver(&pvpanic_driver);
-}
-#else
-static int pvpanic_register_acpi_driver(void)
-{
-   return -ENODEV;
-}
 
-static void pvpanic_unregister_acpi_driver(void) {}
-#endif
-
-static int pvpanic_mmio_probe(struct platform_device *pdev)
-{
-   struct resource *mem;
-
-   mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   if (!mem)
-   return -EINVAL;
-
-   base = devm_ioremap_resource(&pdev->dev, mem);
-   if (IS_ERR(base))
-   return PTR_ERR(base);
-
-   atomic_notifier_chain_register(&panic_notifier_list,
-  &pvpanic_panic_nb);
-
-   return 0;
-}
-
-static int pvpanic_mmio_remove(struct platform_device *pdev)
-{
-
-   atomic_notifier_chain_unregister(&panic_notifier_list,
-&pvpanic_panic_nb);
+   if (pvpanic_data.is_ioport)
+   iounmap(pvpanic_data.base);
 
return 0;
 }
 
-static const

[PATCH V2 3/6] misc/pvpanic: add API for pvpanic driver framework

2019-01-22 Thread Peng Hao

Add pvpanic_add/remove_device API. Follow-up patches will use them to
add/remove specific drivers into framework.

Signed-off-by: Peng Hao 
---
 drivers/misc/pvpanic/pvpanic.c | 32 
 drivers/misc/pvpanic/pvpanic.h | 14 ++
 2 files changed, 46 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic.h

diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c
index 227ab4e..f842ee4 100644
--- a/drivers/misc/pvpanic/pvpanic.c
+++ b/drivers/misc/pvpanic/pvpanic.c
@@ -48,6 +48,38 @@
.priority = 1, /* let this called before broken drm_fb_helper */
 };
 
+int pvpanic_add_device(struct device *dev, struct resource *res)
+{
+   struct platform_device *pdev;
+   int ret;
+
+   pdev = platform_device_alloc("pvpanic", -1);
+   if (!pdev)
+   return -ENOMEM;
+
+   pdev->dev.parent = dev;
+
+   ret = platform_device_add_resources(pdev, res, 1);
+   if (ret)
+   goto err;
+
+   ret = platform_device_add(pdev);
+   if (ret)
+   goto err;
+   pvpanic_data.pdev = pdev;
+
+   return 0;
+err:
+   platform_device_put(pdev);
+   return -1;
+}
+
+void pvpanic_remove_device(void)
+{
+   platform_device_unregister(pvpanic_data.pdev);
+   pvpanic_data.pdev = NULL;
+}
+
 static int pvpanic_platform_probe(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
diff --git a/drivers/misc/pvpanic/pvpanic.h b/drivers/misc/pvpanic/pvpanic.h
new file mode 100644
index 000..a72ca59
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* pvpanic driver framework header file
+ *
+ * Copyright (C) 2019 ZTE Ltd.
+ * Author: Peng Hao 
+ */
+
+#ifndef __DRIVERS_MISC_PVPANIC_H
+#define __DRIVERS_MISC_PVPANIC_H
+
+extern int pvpanic_add_device(struct device *dev, struct resource *res);
+extern void pvpanic_remove_device(void);
+
+#endif
-- 
1.8.3.1
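
A note for readers following the series: the error label in pvpanic_add_device() above returns -1 rather than the value held in ret, so callers cannot tell the failures apart. Below is a minimal userspace sketch of the same goto-based cleanup that hands the original error back instead. The fake_* helpers are hypothetical stand-ins for the platform_device_* calls and are not kernel APIs; the error codes they return are chosen only for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct platform_device; not a kernel type. */
struct fake_pdev {
	int refs;
};

static struct fake_pdev fake_storage;
static int fake_put_calls;	/* counts cleanup calls, for illustration */

/* Each stand-in fails when its "fail" argument is non-zero. */
static struct fake_pdev *fake_alloc(int fail)
{
	if (fail)
		return NULL;
	fake_storage.refs = 1;
	return &fake_storage;
}

static int fake_add_resources(struct fake_pdev *p, int fail)
{
	(void)p;
	return fail ? -ENOMEM : 0;
}

static int fake_add(struct fake_pdev *p, int fail)
{
	(void)p;
	return fail ? -EBUSY : 0;
}

static void fake_put(struct fake_pdev *p)
{
	p->refs--;
	fake_put_calls++;
}

/*
 * Same shape as pvpanic_add_device(): allocate, attach resources,
 * register, and drop the reference on failure. The only difference
 * is that the error label returns ret instead of -1.
 */
static int add_device_sketch(int fail_alloc, int fail_res, int fail_add)
{
	struct fake_pdev *pdev;
	int ret;

	pdev = fake_alloc(fail_alloc);
	if (!pdev)
		return -ENOMEM;

	ret = fake_add_resources(pdev, fail_res);
	if (ret)
		goto err;

	ret = fake_add(pdev, fail_add);
	if (ret)
		goto err;

	return 0;
err:
	fake_put(pdev);
	return ret;
}
```

With this shape a front end such as the PCI probe, which already does "return ret", would pass a meaningful errno value up the stack.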

LTP case read_all_proc fails on qemux86-64 since 5.0-rc1

2019-01-22 Thread He Zhe



The LTP case read_all_proc (read_all -d /proc -q -r 10) often, but not every
time, fails with the following call trace, since 600335205b8d ("ide: convert
to blk-mq", 5.0-rc1) up to now (5.0-rc3).

qemu-system-x86_64 -drive file=rootfs.ext4,if=virtio,format=raw -object 
rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 
-nographic -m 16192 -smp cpus=12 -cpu core2duo -enable-kvm -serial mon:stdio 
-serial null -kernel bzImage -append 'root=/dev/vda rw highres=off 
console=ttyS0 mem=16192M'

tst_test.c:1085: INFO: Timeout per run is 0h 05m 00s
[   47.080156] Warning: /proc/ide/hd?/settings interface is obsolete, and will 
be removed soon!
[   47.085330] [ cut here ]
[   47.085810] kernel BUG at block/blk-mq.c:767!
[   47.086498] invalid opcode:  [#1] PREEMPT SMP PTI
[   47.087022] CPU: 5 PID: 146 Comm: kworker/5:1H Not tainted 5.0.0-rc3 #1
[   47.087858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
[   47.088992] Workqueue: kblockd blk_mq_run_work_fn
[   47.089469] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0
[   47.090035] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 
48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008
[   47.091930] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002
[   47.092458] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: 0006
[   47.093181] RDX:  RSI: 0001 RDI: 9e1ea13c
[   47.093906] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: 
[   47.094626] R10: 0001 R11: 0012 R12: 9e1ea1033a40
[   47.095347] R13: 9e1ea13a8d00 R14: 9e1ea13a9000 R15: 0046
[   47.096071] FS:  () GS:9e1ea4b4() 
knlGS:
[   47.096898] CS:  0010 DS:  ES:  CR0: 80050033
[   47.097477] CR2: 003fda41fda0 CR3: 0003d8e6a000 CR4: 06e0
[   47.098203] DR0:  DR1:  DR2: 
[   47.098929] DR3:  DR6: fffe0ff0 DR7: 0400
[   47.099650] Call Trace:
[   47.099910]  
[   47.100128]  blk_mq_requeue_request+0x58/0x60
[   47.100576]  ide_requeue_and_plug+0x20/0x50
[   47.101014]  ide_intr+0x21a/0x230
[   47.101362]  ? idecd_open+0xc0/0xc0
[   47.101735]  __handle_irq_event_percpu+0x43/0x1e0
[   47.102214]  handle_irq_event_percpu+0x32/0x80
[   47.102668]  handle_irq_event+0x39/0x60
[   47.103074]  handle_edge_irq+0xe8/0x1c0
[   47.103470]  handle_irq+0x20/0x30
[   47.103819]  do_IRQ+0x46/0xe0
[   47.104128]  common_interrupt+0xf/0xf
[   47.104505]  
[   47.104731] RIP: 0010:ide_output_data+0xbc/0x100
[   47.105201] Code: 74 22 8d 41 ff 85 c9 74 24 49 8d 54 40 02 41 0f b7 00 66 
41 89 01 49 83 c0 02 49 39 d0 75 ef 5b 41 5c 5d c3 4c 89 c6 445
[   47.107092] RSP: 0018:bd508059bb18 EFLAGS: 00010246 ORIG_RAX: 
ffdd
[   47.107862] RAX: 9e1ea13a8800 RBX: 9e1ea13a9000 RCX: 
[   47.108581] RDX: 0170 RSI: 9e1ea13c012c RDI: 
[   47.109293] RBP: bd508059bb28 R08: 9e1ea13c0120 R09: 0170
[   47.110016] R10: 000d R11: 000c R12: 9e1ea13a8800
[   47.110731] R13: 000c R14: 9e1ea13c R15: 7530
[   47.111446]  ide_transfer_pc+0x216/0x310
[   47.111848]  ? __const_udelay+0x3d/0x40
[   47.112236]  ? ide_execute_command+0x85/0xb0
[   47.112668]  ? ide_pc_intr+0x3f0/0x3f0
[   47.113051]  ? ide_check_atapi_device+0x110/0x110
[   47.113524]  ide_issue_pc+0x178/0x240
[   47.113901]  ide_cd_do_request+0x15c/0x350
[   47.114314]  ide_queue_rq+0x180/0x6b0
[   47.114686]  ? blk_mq_get_driver_tag+0xa1/0x110
[   47.115153]  blk_mq_dispatch_rq_list+0x90/0x550
[   47.115606]  ? __queue_delayed_work+0x63/0x90
[   47.116054]  ? deadline_fifo_request+0x41/0x90
[   47.116506]  blk_mq_do_dispatch_sched+0x80/0x100
[   47.116976]  blk_mq_sched_dispatch_requests+0xfc/0x170
[   47.117491]  __blk_mq_run_hw_queue+0x6f/0xd0
[   47.117941]  blk_mq_run_work_fn+0x1b/0x20
[   47.118342]  process_one_work+0x14c/0x450
[   47.118747]  worker_thread+0x4a/0x440
[   47.119125]  kthread+0x105/0x140
[   47.119456]  ? process_one_work+0x450/0x450
[   47.119880]  ? kthread_park+0x90/0x90
[   47.120251]  ret_from_fork+0x35/0x40
[   47.120619] Modules linked in:
[   47.120952] ---[ end trace 4562f716e88fdefe ]---
[   47.121423] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0
[   47.121981] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 
48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008
[   47.123851] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002
[   47.124393] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: 0006
[   47.125108] RDX:  RSI: 0001 RDI: 9e1ea13c
[   47.125819] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: 
[   47.126539] R10:

Re: [virtio-dev] [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM

2019-01-22 Thread Jason Wang




On 2019/1/23 1:03 AM, Tiwei Bie wrote:

This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
When this feature is negotiated, driver will use the barriers
suitable for hardware devices.

Signed-off-by: Tiwei Bie 
---
  drivers/virtio/virtio_ring.c   | 8 
  include/uapi/linux/virtio_config.h | 6 ++
  2 files changed, 14 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755484e3..27d3f057493e 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1609,6 +1609,9 @@ static struct virtqueue *vring_create_virtqueue_packed(
!context;
vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
  
+	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
+		vq->weak_barriers = false;
+
vq->packed.ring_dma_addr = ring_dma_addr;
vq->packed.driver_event_dma_addr = driver_event_dma_addr;
vq->packed.device_event_dma_addr = device_event_dma_addr;
@@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
!context;
vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
  
+	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
+		vq->weak_barriers = false;
+
vq->split.queue_dma_addr = 0;
vq->split.queue_size_in_bytes = 0;
  
@@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device *vdev)
break;
case VIRTIO_F_RING_PACKED:
break;
+   case VIRTIO_F_ORDER_PLATFORM:
+   break;
default:
/* We don't understand this bit. */
__virtio_clear_bit(vdev, i);
diff --git a/include/uapi/linux/virtio_config.h 
b/include/uapi/linux/virtio_config.h
index 1196e1c1d4f6..ff8e7dc9d4dd 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -78,6 +78,12 @@
  /* This feature indicates support for the packed virtqueue layout. */
  #define VIRTIO_F_RING_PACKED  34
  
+/*
+ * This feature indicates that memory accesses by the driver and the
+ * device are ordered in a way described by the platform.
+ */
+#define VIRTIO_F_ORDER_PLATFORM36
+
  /*
   * Does the device support Single Root I/O Virtualization?
   */



I wonder whether this is sufficient. Does a DMA barrier imply an MMIO
barrier? It looks like it does not.


See arch/ia64/include/asm/barrier.h:

 * Note: "mb()" and its variants cannot be used as a fence to order
 * accesses to memory mapped I/O registers.  For that, mf.a needs to
 * be used.  However, we don't want to always use mf.a because (a)
 * it's (presumably) much slower than mf and (b) mf.a is supported for
 * sequential memory pages only.
 */
#define mb()    ia64_mf()
#define rmb()   mb()
#define wmb()   mb()

#define dma_rmb()   mb()
#define dma_wmb()   mb()

Thanks
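
As background for the question: the patch only clears vq->weak_barriers, and the ring helpers then choose between an SMP-style and a DMA-style barrier based on that flag. Here is a small userspace model of that selection. It assumes the behaviour of the virtio_wmb() helper in include/linux/virtio_ring.h as I understand it; the enum, struct, and function names are invented for the sketch, and no real barrier instruction is issued.

```c
#include <stdbool.h>

/* Illustrative labels only; the kernel issues real barrier instructions. */
enum barrier_kind {
	BARRIER_VIRT,	/* models virt_wmb(): ordering between CPUs */
	BARRIER_DMA,	/* models dma_wmb(): ordering for coherent DMA memory */
};

/* Hypothetical, reduced stand-in for struct vring_virtqueue. */
struct vq_sketch {
	bool weak_barriers;
};

/*
 * Mirrors the two added lines in the patch: when the (modelled)
 * VIRTIO_F_ORDER_PLATFORM feature is negotiated, weak barriers
 * are switched off regardless of what the transport asked for.
 */
static void vq_sketch_init(struct vq_sketch *vq, bool weak_barriers,
			   bool order_platform)
{
	vq->weak_barriers = weak_barriers;
	if (order_platform)
		vq->weak_barriers = false;
}

/* Which kind of write barrier the ring code would pick for this queue. */
static enum barrier_kind wmb_kind(const struct vq_sketch *vq)
{
	return vq->weak_barriers ? BARRIER_VIRT : BARRIER_DMA;
}
```

In this model neither branch is an MMIO barrier, which seems to be exactly the gap the ia64 comment points at.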

Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain

2019-01-22 Thread Michael S. Tsirkin

On Tue, Jan 22, 2019 at 11:59:31AM -0800, Stefano Stabellini wrote:
> On Mon, 21 Jan 2019, Peng Fan wrote:
> > on i.MX8QM, M4_1 is communicating with DomU using rpmsg with a fixed
> > address as the dma mem buffer which is predefined.
> > 
> > Without this patch, the flow is:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> dma_map_page
> >-> __swiotlb_map_page
> > ->swiotlb_map_page
> > ->__dma_map_area(phys_to_virt(dma_to_phys(dev, 
> > dev_addr)), size, dir);
> > However we are using per device dma area for rpmsg, phys_to_virt
> > could not return a correct virtual address for virtual address in
> > vmalloc area. Then kernel panic.
> > 
> > With this patch, vring_use_dma_api will return false, and
> > vring_map_one_sg will return sg_phys(sg) which is the correct phys
> > address in the predefined memory region.
> > vring_map_one_sg -> vring_use_dma_api
> >  -> sg_phys(sg)
> > 
> > Signed-off-by: Peng Fan 
> > ---
> >  drivers/virtio/virtio_ring.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..8993d7cb3592 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct 
> > virtqueue *_vq,
> >  
> >  static bool vring_use_dma_api(struct virtio_device *vdev)
> >  {
> > +   struct device *dma_dev = vdev->dev.parent;
> > +
> > if (!virtio_has_iommu_quirk(vdev))
> > return true;
> >  
> > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device 
> > *vdev)
> >  * the DMA API if we're a Xen guest, which at least allows
> >  * all of the sensible Xen configurations to work correctly.
> >  */
> > -   if (xen_domain())
> > +   if (xen_domain() && !dma_dev->dma_mem)
> > return true;
> >  
> > return false;
> 
> I can see you spotted a real issue, but this is not the right fix. We
> just need something a bit more flexible than xen_domain(): there are
> many kinds of Xen domains on different architectures, we basically want
> to enable this (return true from vring_use_dma_api) only when the xen
> swiotlb is meant to be used. Does the appended patch fix the issue you
> have?
> 
> ---
> 
> xen: introduce xen_vring_use_dma
> 
> From: Stefano Stabellini 
> 
> Export xen_swiotlb on arm and arm64.
> 
> Use xen_swiotlb to determine when vring should use dma APIs to map the
> ring: when xen_swiotlb is enabled the dma API is required. When it is
> disabled, it is not required.
> 
> Reported-by: Peng Fan 
> Signed-off-by: Stefano Stabellini 
> 
> diff --git a/arch/arm/include/asm/xen/swiotlb-xen.h 
> b/arch/arm/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 000..455ade5
> --- /dev/null
> +++ b/arch/arm/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include 
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index cb44aa2..8592863 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -21,6 +21,8 @@
>  #include 
>  #include 
>  
> +int xen_swiotlb __read_mostly;
> +
>  unsigned long xen_get_swiotlb_free_pages(unsigned int order)
>  {
>   struct memblock_region *reg;
> @@ -189,6 +191,7 @@ int __init xen_mm_init(void)
>   struct gnttab_cache_flush cflush;
>   if (!xen_initial_domain())
>   return 0;
> + xen_swiotlb = 1;
>   xen_swiotlb_init(1, false);
>   xen_dma_ops = &xen_swiotlb_dma_ops;
>  
> diff --git a/arch/arm64/include/asm/xen/swiotlb-xen.h 
> b/arch/arm64/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 000..455ade5
> --- /dev/null
> +++ b/arch/arm64/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755..bf8badc 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -260,7 +260,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>* the DMA API if we're a Xen guest, which at least allows
>* all of the sensible Xen configurations to work correctly.
>*/
> - if (xen_domain())
> + if (xen_vring_use_dma())
>   return true;
>  
>   return false;
> diff --git a/include/xen/arm/swiotlb-xen.h b/include/xen/arm/swiotlb-xen.h
> new file mode 100644
> index 000..2aac7c4
> --- /dev/null
> +++ b/include/xen/arm/swiotlb-xen.h
> @@ -0,0 +1,10 @@
> +#ifndef _ASM_ARM_XEN_SWIOTLB_XEN_H
> +#define _ASM_ARM_XEN_SWIOTLB_XEN_H
> +
> +#ifdef CONFIG_SWIOTLB_XEN
> +extern int xen_swiotlb;
> +#else
> +#define xen_swiotlb (0)
> +#endif
> +
> +#endif
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index 0e21567..74a536d 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -46,4 +46,10 @@ enum xen_domain_type {
>  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
>
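
The decision being debated can be modelled in userspace as a pure function. The sketch below follows Peng Fan's hunk quoted above, with the three kernel predicates replaced by plain booleans; the function and parameter names are illustrative only. The xen_vring_use_dma() variant is not modelled because its definition is cut off in this excerpt.

```c
#include <stdbool.h>

/*
 * Userspace model of vring_use_dma_api() with Peng Fan's change applied.
 *   has_iommu_quirk : models virtio_has_iommu_quirk(vdev)
 *   is_xen_domain   : models xen_domain()
 *   has_dma_mem     : models dma_dev->dma_mem != NULL
 */
static bool use_dma_api_sketch(bool has_iommu_quirk, bool is_xen_domain,
			       bool has_dma_mem)
{
	/* Without the quirk the device always goes through the DMA API. */
	if (!has_iommu_quirk)
		return true;

	/* Xen guest, and no per-device coherent memory pool was declared. */
	if (is_xen_domain && !has_dma_mem)
		return true;

	/* Otherwise the ring is mapped directly with sg_phys(). */
	return false;
}
```

The i.MX8QM case from the report corresponds to (true, true, true), which now yields false and so takes the sg_phys() path.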

[GIT PULL] Thermal management updates for v5.0-rc4

2019-01-22 Thread Zhang Rui

Hi, Linus,

Please pull from
  git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git for-rc

to receive the latest Thermal management updates for v5.0-rc4 with
top-most commit 129699bb8c7572106b5bbb2407c2daee4727ccad:

  drivers: thermal: int340x_thermal: Fix sysfs race condition (2019-01-18 15:23:04 +0800)

on top of commit bfeffd155283772bbe78c6a05dec7c0128ee500c:

  Linux 5.0-rc1 (2019-01-06 17:08:20 -0800)

Specifics:

- Fix a race condition in which sysfs could be accessed before the
necessary initialization in the int340x thermal driver. (Aaron Hill)

- Fix a NULL vs IS_ERR() check in the int340x thermal driver. (Dan
Carpenter)

thanks,
rui


Aaron Hill (1):
  drivers: thermal: int340x_thermal: Fix sysfs race condition

Dan Carpenter (1):
  thermal: int340x_thermal: Fix a NULL vs IS_ERR() check

 .../int340x_thermal/processor_thermal_device.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)
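
For context on the second fix, the usual pitfall is that a function reporting failure through an encoded error pointer never returns NULL, so a NULL check cannot catch the failure. Below is a self-contained userspace sketch. It assumes simplified re-implementations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and the lookup function is hypothetical, not the one touched by the commit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the kernel convention: the top 4095 addresses
 * are reserved for encoding negative errno values. */
#define MAX_ERRNO_SKETCH 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static int zone_storage = 42;

/* Hypothetical lookup that signals failure with an error pointer. */
static int *lookup_sketch(bool fail)
{
	if (fail)
		return err_ptr(-ENODEV);
	return &zone_storage;
}
```

A caller that writes "if (!ptr)" after lookup_sketch(true) would treat the error pointer as valid, since it is non-NULL; is_err() is the check that matches the function's contract.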

Re: [PATCH] staging: ks7010: remove unnecessary parentheses

2019-01-22 Thread Joe Perches

On Tue, 2019-01-22 at 21:18 -0500, Matt McCoy wrote:
> Remove unnecessary parentheses reported by checkpatch.
[]
> diff --git a/drivers/staging/ks7010/ks_hostif.c 
> b/drivers/staging/ks7010/ks_hostif.c
[]
> @@ -171,7 +171,7 @@ int get_current_ap(struct ks_wlan_private *priv, struct 
> link_ap_info *ap_info)
>  "- rate_set_size=%d\n",
>  ap->bssid[0], ap->bssid[1], ap->bssid[2],
>  ap->bssid[3], ap->bssid[4], ap->bssid[5],
> -&(ap->ssid.body[0]),
> +&ap->ssid.body[0],
>  ap->rate_set.body[0], ap->rate_set.body[1],
>  ap->rate_set.body[2], ap->rate_set.body[3],
>  ap->rate_set.body[4], ap->rate_set.body[5],

This bit:

[]
netdev_dbg(priv->net_dev, "Link AP\n"
   "- bssid=%02X:%02X:%02X:%02X:%02X:%02X\n"
[]
   ap->bssid[0], ap->bssid[1], ap->bssid[2],

should instead use the vsprintf %pM extension

"- bssid: %pM\n"
[]
ap->bssid,
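
To illustrate what that buys: %pM takes a pointer to the six address bytes and prints them colon-separated, so the six separate arguments collapse into one. %pM is a kernel-only printk extension, so the userspace helper below, format_mac_sketch(), is a hypothetical stand-in that produces the lowercase colon-separated form the kernel documentation shows for %pM.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Userspace stand-in for what printk's %pM emits for a 6-byte
 * MAC address: "xx:xx:xx:xx:xx:xx" in lowercase hex.
 * Returns the number of characters written (17 on success),
 * following snprintf() semantics.
 */
static int format_mac_sketch(char *buf, size_t len,
			     const unsigned char mac[6])
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Note the case difference from the existing %02X format string, in case anything parses that debug output.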

Re: [PATCH] dt-bindings: sdhci-omap: Add properties for using external dma

2019-01-22 Thread Chunyan Zhang

On Tue, 22 Jan 2019 at 18:17, Faiz Abbas  wrote:
>
> Hi Chunyan,
>
> +Rob Herring
>
> On 22/01/19 2:17 PM, Chunyan Zhang wrote:
> > sdhci-omap can support both external dma controller via dmaengine
> > framework as well as ADMA which standard SD host controller
> > provides.
> >
> > Signed-off-by: Chunyan Zhang 
> > Signed-off-by: Faiz Abbas 
> > ---
>
> Thanks for fixing this. However, this change should be part of the

Right, I actually used "--in-reply-to" with the Message-ID of Rob's
mail, but it seems it did not work as I expected; I meant this patch to
appear as a reply to Rob's mail for review.

> series (with a change log and a version number). I will send this as a
> part of my series after I get some Acks on the driver patches.

Ok, thanks.

Chunyan

>
> Thanks,
> Faiz

Re: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range

2019-01-22 Thread Jerome Glisse

On Wed, Jan 23, 2019 at 10:17:45AM +0800, Peter Xu wrote:
> On Tue, Jan 22, 2019 at 12:02:24PM -0500, Jerome Glisse wrote:
> > On Tue, Jan 22, 2019 at 05:39:35PM +0800, Peter Xu wrote:
> > > On Mon, Jan 21, 2019 at 09:05:35AM -0500, Jerome Glisse wrote:
> > > 
> > > [...]
> > > 
> > > > > + change_protection(dst_vma, start, start + len, newprot,
> > > > > + !enable_wp, 0);
> > > > 
> > > > So setting dirty_accountable bring us to that code in mprotect.c:
> > > > 
> > > > if (dirty_accountable && pte_dirty(ptent) &&
> > > > (pte_soft_dirty(ptent) ||
> > > >  !(vma->vm_flags & VM_SOFTDIRTY))) {
> > > > ptent = pte_mkwrite(ptent);
> > > > }
> > > > 
> > > > > My understanding is that you want to set write flag when enable_wp
> > > > > is false and you want to set the write flag unconditionally, right?
> > > 
> > > Right.
> > > 
> > > > 
> > > > If so then you should really move the change_protection() flags
> > > > patch before this patch and add a flag for setting pte write flags.
> > > > 
> > > > > Otherwise the above is broken as it will only set the write flag
> > > > for pte that were dirty and i am guessing so far you always were
> > > > lucky because pte were all dirty (change_protection will preserve
> > > > dirtyness) when you write protected them.
> > > > 
> > > > So i believe the above is broken or at very least unclear if what
> > > > you really want is to only set write flag to pte that have the
> > > > dirty flag set.
> > > 
> > > You are right, if we build the tree until this patch it won't work for
> > > all the cases.  It'll only work if the page was at least writable
> > > before and also it's dirty (as you explained).  Sorry to be unclear
> > > about this, maybe I should at least mention that in the commit message
> > > but I totally forgot it.
> > > 
> > > All these problems are solved in later on patches, please feel free to
> > > have a look at:
> > > 
> > >   mm: merge parameters for change_protection()
> > >   userfaultfd: wp: apply _PAGE_UFFD_WP bit
> > >   userfaultfd: wp: handle COW properly for uffd-wp
> > > 
> > > Note that even in the follow up patches IMHO we can't directly change
> > > the write permission since the page can be shared by other processes
> > > (e.g., the zero page or COW pages).  But the general idea is the same
> > > as you explained.
> > > 
> > > I tried to avoid squashing these stuff altogether as explained
> > > previously.  Also, this patch can be seen as a standalone patch to
> > > introduce the new interface which seems to make sense too, and it is
> > > indeed still working in many cases so I see the latter patches as
> > > enhancement of this one.  Please let me know if you still want me to
> > > have all these stuff squashed, or if you'd like me to squash some of
> > > them.
> > 
> > > Yeah, I had a look at those after looking at this one. You should just
> > > re-order the patches: this one first and then the one that adds the new
> > > flag, then the ones that add the new userfaultfd feature. Otherwise you are
> > > adding a userfaultfd feature that is broken midway, i.e. it is added
> > > broken and then you fix it. Someone bisecting things might get hurt
> > > by that. It is better to add and change everything you need and then
> > > add the new feature so that the new feature will work as intended.
> > > 
> > > So no squashing, just change the order, i.e. add the userfaultfd code
> > > last.
> 
> Yes this makes sense, I'll do that in v2.  Thanks for the suggestion!

Note that before doing a v2 I would really like to see some proof of why
you need a new page table flag; see my reply to:
userfaultfd: wp: add WP pagetable tracking to x86

As I believe you can identify COW or KSM from UFD write protect without
a pte flag.

Cheers,
Jérôme

linux-next: manual merge of the slave-dma tree with Linus' tree

2019-01-22 Thread Stephen Rothwell

Hi Vinod,

Today's linux-next merge of the slave-dma tree got a conflict in:

  drivers/dma/imx-sdma.c

between commit:

  750afb08ca71 ("cross-tree: phase out dma_zalloc_coherent()")

from Linus' tree and commit:

  ceaf52265148 ("dmaengine: imx-sdma: pass ->dev to dma_alloc_coherent() API")

from the slave-dma tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/dma/imx-sdma.c
index 86708fb9bda1,af14a8d6efa8..
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@@ -1182,8 -1189,8 +1189,8 @@@ static int sdma_request_channel0(struc
  {
int ret = -EBUSY;
  
-   sdma->bd0 = dma_alloc_coherent(NULL, PAGE_SIZE, &sdma->bd0_phys,
 -  sdma->bd0 = dma_zalloc_coherent(sdma->dev, PAGE_SIZE, &sdma->bd0_phys,
 -  GFP_NOWAIT);
++  sdma->bd0 = dma_alloc_coherent(sdma->dev, PAGE_SIZE, &sdma->bd0_phys,
 + GFP_NOWAIT);
if (!sdma->bd0) {
ret = -ENOMEM;
goto out;
@@@ -1205,8 -1212,8 +1212,8 @@@ static int sdma_alloc_bd(struct sdma_de
u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor);
int ret = 0;
  
-   desc->bd = dma_alloc_coherent(NULL, bd_size, &desc->bd_phys,
- GFP_NOWAIT);
 -  desc->bd = dma_zalloc_coherent(desc->sdmac->sdma->dev, bd_size,
 - &desc->bd_phys, GFP_NOWAIT);
++  desc->bd = dma_alloc_coherent(desc->sdmac->sdma->dev, bd_size,
++&desc->bd_phys, GFP_NOWAIT);
if (!desc->bd) {
ret = -ENOMEM;
goto out;



Re: [PATCH v2 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

2019-01-22 Thread Martin K. Petersen



Ching,

> This patch series is against mkp's 5.1/scsi-queue.

Applied to 5.1/scsi-queue. Thank you.

PS. Your file permissions are odd. I always have to change your diffs
from 755 to 644 before applying.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times

2019-01-22 Thread Jerome Glisse

On Wed, Jan 23, 2019 at 10:12:41AM +0800, Peter Xu wrote:
> On Tue, Jan 22, 2019 at 11:53:10AM -0500, Jerome Glisse wrote:
> > On Tue, Jan 22, 2019 at 04:22:38PM +0800, Peter Xu wrote:
> > > On Mon, Jan 21, 2019 at 10:55:36AM -0500, Jerome Glisse wrote:
> > > > On Mon, Jan 21, 2019 at 03:57:01PM +0800, Peter Xu wrote:
> > > > > The idea comes from a discussion between Linus and Andrea [1].
> > > > > 
> > > > > Before this patch we only allow a page fault to retry once.  We achieved
> > > > > this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
> > > > > handle_mm_fault() the second time.  This was majorly used to avoid
> > > > > unexpected starvation of the system by looping over forever to handle
> > > > > the page fault on a single page.  However that should hardly happen, and
> > > > > after all for each code path to return a VM_FAULT_RETRY we'll first wait
> > > > > for a condition (during which time we should possibly yield the cpu) to
> > > > > happen before VM_FAULT_RETRY is really returned.
> > > > > 
> > > > > This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
> > > > > flag when we receive VM_FAULT_RETRY.  It means that the page fault
> > > > > handler now can retry the page fault for multiple times if necessary
> > > > > without the need to generate another page fault event. Meanwhile we
> > > > > still keep the FAULT_FLAG_TRIED flag so page fault handler can still
> > > > > identify whether a page fault is the first attempt or not.
> > > > 
> > > > So there is nothing protecting starvation after this patch ? AFAICT.
> > > > > Do we have sufficient proof that we never have a scenario where one process
> > > > might starve fault another ?
> > > > 
> > > > For instance some page locking could starve one process.
> > > 
> > > Hi, Jerome,
> > > 
> > > Do you mean lock_page()?
> > > 
> > > AFAIU lock_page() will only yield the process itself until the lock is
> > > released, so IMHO it's not really starving the process but a natural
> > > behavior.  After all the process may not continue without handling the
> > > page fault correctly.
> > > 
> > > Or when you say "starvation" do you mean that we might return
> > > VM_FAULT_RETRY from handle_mm_fault() continuously so we'll looping
> > > over and over inside the page fault handler?
> > 
> > That one ie every time we retry someone else is holding the lock and
> > thus lock_page_or_retry() will continuously retry. Some process just
> > get unlucky ;)
> > 
> > With existing code because we remove the retry flag then on the second
> > try we end up waiting for the page lock while holding the mmap_sem so
> > we know that we are in line for the page lock and we will get it once
> > it is our turn.
> 
> Ah I see. :)  It's indeed a valid question.
> 
> Firstly note that even after this patch we can still identify whether
> we're at the first attempt or not by checking against FAULT_FLAG_TRIED
> (it will be applied to the fault flag in all the retries but not in
> the first attempt). So IMHO this change might suit if we want to
> keep the old behavior [1]:
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 9f5e323e883e..44942c78bb92 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable);
>  int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
>  unsigned int flags)
>  {
> -   if (flags & FAULT_FLAG_ALLOW_RETRY) {
> +   if (!(flags & FAULT_FLAG_TRIED)) {
> /*
>  * CAUTION! In this case, mmap_sem is not released
>  * even though return 0.

I need to check how FAULT_FLAG_TRIED have been use so far, but yes
it looks like this would keep the existing behavior intact.

> 
> But at the same time I'm stepping back trying to see the whole
> picture... My understanding is that this is really a policy that we
> can decide, and a trade off between "being polite or not on the
> mmap_sem", that when taking the page lock in slow path we either:
> 
>   (1) release mmap_sem before waiting, polite enough but uncertain to
>   finally have the lock, or,
> 
>   (2) keep mmap_sem before waiting, not polite enough but certain to
>   take the lock.
> 
> We did (2) before on the retries because in existing code we only allow
> to retry once, so we can't fail on the 2nd attempt.  That seems to be
> a good reason for being "impolite" - we took the mmap_sem without
> considering others because we've been "polite" once.  I'm not that
> experienced in mm development but AFAIU solution 2 is only reducing
> our chance of starvation but adding that chance of starvation to other
> processes that want the mmap_sem instead.  So IMHO the starvation
> issue always existed even before this patch, and it looks natural and
> sane to me so far...  And if with that in mind, I can't say that above
> change at [1] would be better, and maybe, it'll be even more fair

Re: [PATCH v5 2/2] kexec, KEYS: Make use of platform keyring for signature verify

2019-01-22 Thread Dave Young

On 01/21/19 at 05:59pm, Kairui Song wrote:
> This patch lets kexec_file_load make use of the .platform keyring as a
> fallback if it fails to verify a PE signed image against the secondary or
> builtin keyring, making it possible to verify kernel images signed with
> preboot keys as well.
> 
> This commit adds a VERIFY_USE_PLATFORM_KEYRING similar to the previous
> VERIFY_USE_SECONDARY_KEYRING, indicating that verify_pkcs7_signature
> should verify the signature using the platform keyring. Also, decrease
> the log level of the error message when verification fails with -ENOKEY,
> so that trying multiple times with different keyrings does not generate
> extra noise.
> 
> Signed-off-by: Kairui Song 
> ---
>  arch/x86/kernel/kexec-bzimage64.c | 13 ++---
>  certs/system_keyring.c| 13 -
>  include/linux/verification.h  |  1 +
>  3 files changed, 23 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> index 7d97e432cbbc..2c007abd3d40 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -534,9 +534,16 @@ static int bzImage64_cleanup(void *loader_data)
>  #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
>  static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
>  {
> - return verify_pefile_signature(kernel, kernel_len,
> -VERIFY_USE_SECONDARY_KEYRING,
> -VERIFYING_KEXEC_PE_SIGNATURE);
> + int ret;
> + ret = verify_pefile_signature(kernel, kernel_len,
> +   VERIFY_USE_SECONDARY_KEYRING,
> +   VERIFYING_KEXEC_PE_SIGNATURE);
> + if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) {
> + ret = verify_pefile_signature(kernel, kernel_len,
> +   VERIFY_USE_PLATFORM_KEYRING,
> +   VERIFYING_KEXEC_PE_SIGNATURE);
> + }
> + return ret;
>  }
>  #endif
>  
> diff --git a/certs/system_keyring.c b/certs/system_keyring.c
> index 4690ef9cda8a..7085c286f4bd 100644
> --- a/certs/system_keyring.c
> +++ b/certs/system_keyring.c
> @@ -240,11 +240,22 @@ int verify_pkcs7_signature(const void *data, size_t len,
>  #else
>   trusted_keys = builtin_trusted_keys;
>  #endif
> + } else if (trusted_keys == VERIFY_USE_PLATFORM_KEYRING) {
> +#ifdef CONFIG_INTEGRITY_PLATFORM_KEYRING
> + trusted_keys = platform_trusted_keys;
> +#else
> + trusted_keys = NULL;
> +#endif
> + if (!trusted_keys) {
> + ret = -ENOKEY;
> + pr_devel("PKCS#7 platform keyring is not available\n");
> + goto error;
> + }
>   }
>   ret = pkcs7_validate_trust(pkcs7, trusted_keys);
>   if (ret < 0) {
>   if (ret == -ENOKEY)
> - pr_err("PKCS#7 signature not signed with a trusted key\n");
> + pr_devel("PKCS#7 signature not signed with a trusted key\n");
>   goto error;
>   }
>  
> diff --git a/include/linux/verification.h b/include/linux/verification.h
> index cfa4730d607a..018fb5f13d44 100644
> --- a/include/linux/verification.h
> +++ b/include/linux/verification.h
> @@ -17,6 +17,7 @@
>   * should be used.
>   */
>  #define VERIFY_USE_SECONDARY_KEYRING ((struct key *)1UL)
> +#define VERIFY_USE_PLATFORM_KEYRING  ((struct key *)2UL)
>  
>  /*
>   * The use to which an asymmetric key is being put.
> -- 
> 2.20.1
> 

For kexec_file part

Acked-by: Dave Young 

Thanks
Dave
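
The fallback order described in the patch can be sketched in plain C, outside the kernel. This is only an illustration, assuming a hypothetical callback in place of verify_pefile_signature(); the enum, the callback type, and the two sample verifiers are made up for the example, and ENOKEY gets a fallback definition in case the host headers lack it.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef ENOKEY
#define ENOKEY 126	/* fallback for hosts whose errno.h lacks it */
#endif

/* Hypothetical stand-ins for the keyring selectors in the patch. */
enum keyring { KEYRING_SECONDARY, KEYRING_PLATFORM };

/* Hypothetical callback standing in for verify_pefile_signature(). */
typedef int (*verify_fn)(enum keyring which);

/*
 * Try the secondary keyring first.  Only a "no matching key" result
 * falls through to the platform keyring; any other error is final.
 */
static int verify_with_fallback(verify_fn verify, bool platform_enabled)
{
	int ret = verify(KEYRING_SECONDARY);

	if (ret == -ENOKEY && platform_enabled)
		ret = verify(KEYRING_PLATFORM);
	return ret;
}

/* Sample verifier: the signing key is known only to the platform keyring. */
static int key_only_in_platform(enum keyring which)
{
	return which == KEYRING_PLATFORM ? 0 : -ENOKEY;
}

/* Sample verifier: the image is rejected regardless of keyring. */
static int malformed_image(enum keyring which)
{
	(void)which;
	return -EINVAL;
}
```

With these samples, verify_with_fallback(key_only_in_platform, true) succeeds through the second lookup, while the same call with the platform keyring disabled keeps returning -ENOKEY, and a non-ENOKEY error is never retried.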

[PATCH] staging: ks7010: remove unnecessary parentheses

2019-01-22 Thread Matt McCoy

Remove unnecessary parentheses reported by checkpatch.

Signed-off-by: Matt McCoy 
---
 drivers/staging/ks7010/ks_hostif.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/ks7010/ks_hostif.c b/drivers/staging/ks7010/ks_hostif.c
index 065bce1..d938b09 100644
--- a/drivers/staging/ks7010/ks_hostif.c
+++ b/drivers/staging/ks7010/ks_hostif.c
@@ -35,7 +35,7 @@ static inline u8 get_byte(struct ks_wlan_private *priv)
 {
u8 data;
 
-   data = *(priv->rxp)++;
+   data = *priv->rxp++;
/* length check in advance ! */
--(priv->rx_size);
return data;
@@ -171,7 +171,7 @@ int get_current_ap(struct ks_wlan_private *priv, struct link_ap_info *ap_info)
   "- rate_set_size=%d\n",
   ap->bssid[0], ap->bssid[1], ap->bssid[2],
   ap->bssid[3], ap->bssid[4], ap->bssid[5],
-  &(ap->ssid.body[0]),
+  &ap->ssid.body[0],
   ap->rate_set.body[0], ap->rate_set.body[1],
   ap->rate_set.body[2], ap->rate_set.body[3],
   ap->rate_set.body[4], ap->rate_set.body[5],
@@ -732,7 +732,7 @@ void hostif_scan_indication(struct ks_wlan_private *priv)
netdev_dbg(priv->net_dev, " scan_ind_count=%d :: aplist.size=%d\n",
priv->scan_ind_count, priv->aplist.size);
get_ap_information(priv, (struct ap_info *)(priv->rxp),
-  &(priv->aplist.ap[priv->scan_ind_count - 1]));
+  &priv->aplist.ap[priv->scan_ind_count - 1]);
priv->aplist.size = priv->scan_ind_count;
} else {
netdev_dbg(priv->net_dev, " count over :: scan_ind_count=%d\n",
-- 
2.7.4
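
Why dropping the parentheses in get_byte() is safe: postfix ++ and -> share the highest precedence level in C and group left to right, so *priv->rxp++ already parses as *((priv->rxp)++). A small userspace sketch of that equivalence follows; the struct and function names are hypothetical and only mirror the shape of the driver code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the receive cursor used by get_byte(). */
struct rx_cursor {
	const unsigned char *rxp;
	size_t rx_size;
};

static const unsigned char sample[] = { 0x11, 0x22, 0x33 };

/* Form before the patch, with explicit parentheses. */
static unsigned char get_byte_parens(struct rx_cursor *c)
{
	unsigned char data = *(c->rxp)++;

	--(c->rx_size);
	return data;
}

/* Form after the patch, relying on operator precedence. */
static unsigned char get_byte_plain(struct rx_cursor *c)
{
	unsigned char data = *c->rxp++;

	--c->rx_size;
	return data;
}

/* Walk the same buffer with both forms and compare every step. */
static bool both_forms_agree(const unsigned char *buf, size_t len)
{
	struct rx_cursor a = { buf, len };
	struct rx_cursor b = { buf, len };

	while (a.rx_size > 0) {
		if (get_byte_parens(&a) != get_byte_plain(&b))
			return false;
		if (a.rxp != b.rxp || a.rx_size != b.rx_size)
			return false;
	}
	return true;
}
```

Both helpers read the byte under the cursor and then advance the pointer stored in the struct, not the struct pointer itself.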

Re: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range

2019-01-22 Thread Peter Xu

On Tue, Jan 22, 2019 at 12:02:24PM -0500, Jerome Glisse wrote:
> On Tue, Jan 22, 2019 at 05:39:35PM +0800, Peter Xu wrote:
> > On Mon, Jan 21, 2019 at 09:05:35AM -0500, Jerome Glisse wrote:
> > 
> > [...]
> > 
> > > > +   change_protection(dst_vma, start, start + len, newprot,
> > > > +   !enable_wp, 0);
> > > 
> > > So setting dirty_accountable bring us to that code in mprotect.c:
> > > 
> > > if (dirty_accountable && pte_dirty(ptent) &&
> > > (pte_soft_dirty(ptent) ||
> > >  !(vma->vm_flags & VM_SOFTDIRTY))) {
> > > ptent = pte_mkwrite(ptent);
> > > }
> > > 
> > > My understanding is that you want to set write flag when enable_wp
> > > > is false and you want to set the write flag unconditionally, right ?
> > 
> > Right.
> > 
> > > 
> > > If so then you should really move the change_protection() flags
> > > patch before this patch and add a flag for setting pte write flags.
> > > 
> > > > Otherwise the above is broken as it will only set the write flag
> > > > for pte that were dirty and i am guessing so far you always were
> > > > lucky because pte were all dirty (change_protection will preserve
> > > > dirtiness) when you write protected them.
> > > 
> > > So i believe the above is broken or at very least unclear if what
> > > you really want is to only set write flag to pte that have the
> > > dirty flag set.
> > 
> > You are right, if we build the tree until this patch it won't work for
> > all the cases.  It'll only work if the page was at least writable
> > before and also it's dirty (as you explained).  Sorry to be unclear
> > about this, maybe I should at least mention that in the commit message
> > but I totally forgot it.
> > 
> > All these problems are solved in later on patches, please feel free to
> > have a look at:
> > 
> >   mm: merge parameters for change_protection()
> >   userfaultfd: wp: apply _PAGE_UFFD_WP bit
> >   userfaultfd: wp: handle COW properly for uffd-wp
> > 
> > Note that even in the follow up patches IMHO we can't directly change
> > the write permission since the page can be shared by other processes
> > (e.g., the zero page or COW pages).  But the general idea is the same
> > as you explained.
> > 
> > I tried to avoid squashing these stuff altogether as explained
> > previously.  Also, this patch can be seen as a standalone patch to
> > introduce the new interface which seems to make sense too, and it is
> > indeed still working in many cases so I see the latter patches as
> > enhancement of this one.  Please let me know if you still want me to
> > have all these stuff squashed, or if you'd like me to squash some of
> > them.
> 
> Yeah, I had a look at those after looking at this one. You should just
> re-order the patches: this one first and then the one that adds the new
> flag, then the ones that add the new userfaultfd feature. Otherwise you are
> adding a userfaultfd feature that is broken midway, i.e. it is added
> broken and then you fix it. Someone bisecting things might get hurt
> by that. It is better to add and change everything you need and then
> add the new feature so that the new feature will work as intended.
> 
> So no squashing, just change the order, i.e. add the userfaultfd code
> last.

Yes this makes sense, I'll do that in v2.  Thanks for the suggestion!

-- 
Peter Xu
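
For readers following the dirty_accountable discussion, the mprotect.c condition quoted above can be modelled with plain booleans. This is a rough sketch, not kernel code: the function name is made up, and the four parameters simply stand in for dirty_accountable, pte_dirty(), pte_soft_dirty() and (vma->vm_flags & VM_SOFTDIRTY).

```c
#include <stdbool.h>

/*
 * Model of the quoted condition: returns true when the pte would get
 * pte_mkwrite() applied.  A clean pte never qualifies, whatever the
 * caller passes for dirty_accountable.
 */
static bool pte_regains_write(bool dirty_accountable, bool dirty,
			      bool soft_dirty, bool vma_softdirty)
{
	return dirty_accountable && dirty && (soft_dirty || !vma_softdirty);
}
```

In this model, removing write protection (dirty_accountable set) restores the write bit for a dirty pte but leaves a clean pte read-only, which is the case Jerome points out.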

Re: [PATCH] kprobes: no need to check return value of debugfs_create functions

2019-01-22 Thread Masami Hiramatsu

On Tue, 22 Jan 2019 16:21:46 +0100
Greg Kroah-Hartman  wrote:

> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: "Naveen N. Rao" 
> Cc: Anil S Keshavamurthy 
> Cc: "David S. Miller" 
> Cc: Masami Hiramatsu 
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  kernel/kprobes.c | 25 ++---
>  1 file changed, 6 insertions(+), 19 deletions(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index f4ddfdd2d07e..7287e7de2350 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -2566,33 +2566,20 @@ static const struct file_operations fops_kp = {
>  
>  static int __init debugfs_kprobe_init(void)
>  {
> - struct dentry *dir, *file;
> + struct dentry *dir;
>   unsigned int value = 1;
>  
>   dir = debugfs_create_dir("kprobes", NULL);
> - if (!dir)
> - return -ENOMEM;

Here, I think IS_ERR(dir) is OK for debugfs_create_file(),
but dir == NULL has different meaning. I think we'd better
keep this check. (I see, -ENOMEM will be no good...)

Thank you,

>  
> - file = debugfs_create_file("list", 0400, dir, NULL,
> - &debugfs_kprobes_operations);
> - if (!file)
> - goto error;
> + debugfs_create_file("list", 0400, dir, NULL,
> + &debugfs_kprobes_operations);
>  
> - file = debugfs_create_file("enabled", 0600, dir,
> - &value, &fops_kp);
> - if (!file)
> - goto error;
> + debugfs_create_file("enabled", 0600, dir, &value, &fops_kp);
>  
> - file = debugfs_create_file("blacklist", 0400, dir, NULL,
> - &debugfs_kprobe_blacklist_ops);
> - if (!file)
> - goto error;
> + debugfs_create_file("blacklist", 0400, dir, NULL,
> + &debugfs_kprobe_blacklist_ops);
>  
>   return 0;
> -
> -error:
> - debugfs_remove(dir);
> - return -ENOMEM;
>  }
>  
>  late_initcall(debugfs_kprobe_init);
> -- 
> 2.20.1
> 


-- 
Masami Hiramatsu

Re: [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times

2019-01-22 Thread Peter Xu

On Tue, Jan 22, 2019 at 11:53:10AM -0500, Jerome Glisse wrote:
> On Tue, Jan 22, 2019 at 04:22:38PM +0800, Peter Xu wrote:
> > On Mon, Jan 21, 2019 at 10:55:36AM -0500, Jerome Glisse wrote:
> > > On Mon, Jan 21, 2019 at 03:57:01PM +0800, Peter Xu wrote:
> > > > The idea comes from a discussion between Linus and Andrea [1].
> > > > 
> > > > Before this patch we only allow a page fault to retry once.  We achieved
> > > > this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
> > > > handle_mm_fault() the second time.  This was majorly used to avoid
> > > > unexpected starvation of the system by looping over forever to handle
> > > > the page fault on a single page.  However that should hardly happen, and
> > > > after all for each code path to return a VM_FAULT_RETRY we'll first wait
> > > > for a condition (during which time we should possibly yield the cpu) to
> > > > happen before VM_FAULT_RETRY is really returned.
> > > > 
> > > > This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
> > > > flag when we receive VM_FAULT_RETRY.  It means that the page fault
> > > > handler now can retry the page fault for multiple times if necessary
> > > > without the need to generate another page fault event. Meanwhile we
> > > > still keep the FAULT_FLAG_TRIED flag so page fault handler can still
> > > > identify whether a page fault is the first attempt or not.
> > > 
> > > So there is nothing protecting starvation after this patch ? AFAICT.
> > > > Do we have sufficient proof that we never have a scenario where one process
> > > might starve fault another ?
> > > 
> > > For instance some page locking could starve one process.
> > 
> > Hi, Jerome,
> > 
> > Do you mean lock_page()?
> > 
> > AFAIU lock_page() will only yield the process itself until the lock is
> > released, so IMHO it's not really starving the process but a natural
> > behavior.  After all the process may not continue without handling the
> > page fault correctly.
> > 
> > Or when you say "starvation" do you mean that we might return
> > VM_FAULT_RETRY from handle_mm_fault() continuously so we'll looping
> > over and over inside the page fault handler?
> 
> That one ie every time we retry someone else is holding the lock and
> thus lock_page_or_retry() will continuously retry. Some process just
> get unlucky ;)
> 
> With existing code because we remove the retry flag then on the second
> try we end up waiting for the page lock while holding the mmap_sem so
> we know that we are in line for the page lock and we will get it once
> it is our turn.

Ah I see. :)  It's indeed a valid question.

Firstly note that even after this patch we can still identify whether
we're at the first attempt or not by checking against FAULT_FLAG_TRIED
(it will be applied to the fault flag in all the retries but not in
the first attempt). So IMHO this change might suit if we want to
keep the old behavior [1]:

diff --git a/mm/filemap.c b/mm/filemap.c
index 9f5e323e883e..44942c78bb92 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable);
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 unsigned int flags)
 {
-   if (flags & FAULT_FLAG_ALLOW_RETRY) {
+   if (!(flags & FAULT_FLAG_TRIED)) {
/*
 * CAUTION! In this case, mmap_sem is not released
 * even though return 0.
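
A side note on the condition in the hunk above: in C, unary ! binds tighter than binary &, so a "first attempt" test on FAULT_FLAG_TRIED needs parentheses around the mask. A minimal userspace sketch, assuming illustrative flag values (the real definitions live in include/linux/mm.h) and two helper names made up for this example:

```c
#include <stdbool.h>

/* Illustrative values only; see include/linux/mm.h for the real ones. */
#define FAULT_FLAG_ALLOW_RETRY	0x04u
#define FAULT_FLAG_TRIED	0x20u

/* Intended test: true only while FAULT_FLAG_TRIED is not yet set. */
static bool is_first_attempt(unsigned int flags)
{
	return !(flags & FAULT_FLAG_TRIED);
}

/*
 * How the compiler parses "!flags & FAULT_FLAG_TRIED": the negation
 * yields 0 or 1, and masking that with 0x20 is always 0.
 */
static bool without_parentheses(unsigned int flags)
{
	return (!flags) & FAULT_FLAG_TRIED;
}
```

With these values the unparenthesized form is false for every input, so the branch guarded by it would never be taken.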

But at the same time I'm stepping back trying to see the whole
picture... My understanding is that this is really a policy that we
can decide, and a trade off between "being polite or not on the
mmap_sem", that when taking the page lock in slow path we either:

  (1) release mmap_sem before waiting, polite enough but uncertain to
  finally have the lock, or,

  (2) keep mmap_sem before waiting, not polite enough but certain to
  take the lock.

We did (2) before on the retries because in existing code we only allow
to retry once, so we can't fail on the 2nd attempt.  That seems to be
a good reason for being "impolite" - we took the mmap_sem without
considering others because we've been "polite" once.  I'm not that
experienced in mm development but AFAIU solution 2 is only reducing
our chance of starvation but adding that chance of starvation to other
processes that want the mmap_sem instead.  So IMHO the starvation
issue always existed even before this patch, and it looks natural and
sane to me so far...  And with that in mind, I can't say that the above
change at [1] would be better, and maybe it'll be even more fair if
we always release the mmap_sem first in this case (assuming
that we'll eventually get that lock, though we might pay for it with
more retries)?

Or, is there a way to constantly starve the process that handles the
page fault that I've totally missed?

Thanks,

-- 
Peter Xu

Re: [PATCH v3 1/1] arm64: dts: sdm845: wireup the thermal trip points to cpufreq

2019-01-22 Thread Matthias Kaehlcke

Hi Amit,

On Mon, Jan 21, 2019 at 11:38:34PM +0530, Amit Kucheria wrote:
> Since all cpus in the big and little clusters, respectively, are in the
> same frequency domain, use all of them for mitigation in the
> cooling-map. We end up with two cooling devices - one each for the big
> and little clusters.
> 
> We throttle lightly at the first trip point, just removing the boost
> frequency. At the next trip point we allow ourselves to be throttled to
> any extent.
> 
> Signed-off-by: Amit Kucheria 
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 225 +--
>  1 file changed, 209 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index c27cbd3bcb0a..878f661d16eb 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  / {
>   interrupt-parent = <>;
> @@ -99,6 +100,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x0>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_0>;
>   L2_0: l2-cache {
>   compatible = "cache";
> @@ -114,6 +116,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x100>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_100>;
>   L2_100: l2-cache {
>   compatible = "cache";
> @@ -126,6 +129,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x200>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_200>;
>   L2_200: l2-cache {
>   compatible = "cache";
> @@ -138,6 +142,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x300>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_300>;
>   L2_300: l2-cache {
>   compatible = "cache";
> @@ -150,6 +155,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x400>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_400>;
>   L2_400: l2-cache {
>   compatible = "cache";
> @@ -162,6 +168,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x500>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_500>;
>   L2_500: l2-cache {
>   compatible = "cache";
> @@ -174,6 +181,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x600>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_600>;
>   L2_600: l2-cache {
>   compatible = "cache";
> @@ -186,6 +194,7 @@
>   compatible = "qcom,kryo385";
>   reg = <0x0 0x700>;
>   enable-method = "psci";
> + #cooling-cells = <2>;
>   next-level-cache = <&L2_700>;
>   L2_700: l2-cache {
>   compatible = "cache";
> @@ -1691,18 +1700,41 @@
>   thermal-sensors = < 1>;
>  
>   trips {
> - cpu_alert0: trip0 {
> + cpu0_alert1: trip-point@0 {
>   temperature = <75000>;

In my observations a 'switch on/threshold' temperature of 75 degrees
leads to aggressive throttling with IPA when the temperature is above
this threshold:

[  716.760804] cpu_cooling_ratelimit: 31 callbacks suppressed
[  716.760836] cpu cpu4: Cooling state set to 10. New max freq = 192
[  716.773390] power_allocator_ratelimit: 15 callbacks suppressed
[  716.773405] thermal thermal_zone5: Controlling power: control_temp=95000 last_temp=73500, curr_temp=75200 total_requested_power=39025 total_granted_power=18654
[  749.609336] cpu_cooling_ratelimit: 45 callbacks suppressed
[  749.609371] cpu cpu4: Cooling state set to 11. New max freq = 1843200
[  749.624300] power_allocator_ratelimit: 24 callbacks suppressed
[  749.624323] thermal thermal_zone5: Controlling power: control_temp=95000 last_temp=70800, curr_temp=77200

Re: [PATCH] workqueue: Try to catch flush_work() without INIT_WORK().

2019-01-22 Thread Daniel Jordan

On Wed, Jan 23, 2019 at 09:44:12AM +0900, Tetsuo Handa wrote:
> Daniel Jordan wrote:
> > On Sat, Jan 19, 2019 at 11:41:22AM +0900, Tetsuo Handa wrote:
> > > On 2019/01/19 4:48, Daniel Jordan wrote:
> > > > On Sat, Jan 19, 2019 at 02:04:58AM +0900, Tetsuo Handa wrote:
> > > > __queue_work has a sanity check already for work, but using list_empty.
> > > > Seems slightly better to be consistent?
> > > > 
> > > 
> > > list_empty() won't work, for "struct work_struct" is embedded into a struct
> > > which is allocated by kzalloc().
> > 
> > Please check list_empty's definition again, it compares the address of the node
> > to its next pointer, so it should work for a zeroed node.  I'll reiterate that
> > it seems slightly better to be consistent in "is work_struct initialized?"
> > checks, but it's not a big deal and I'm fine either way.
> 
> You are talking about
> 
>   if (WARN_ON(!list_empty(&work->entry))) {
>   spin_unlock(&pwq->pool->lock);
>   return;
>   }
> 
> part in __queue_work(), aren't you? But since flush_work() is used for waiting for
> a work to complete, that work can be either queued state (list_empty() == false) or
> not queued state (list_empty() == true). Thus, I don't think that flush_work() can
> use list_empty() for checking whether that work was initialized.

Oh, you're right, sorry for the noise!

> [PATCH v2] workqueue: Try to catch flush_work() without INIT_WORK().
> 
> syzbot found a flush_work() caller who forgot to call INIT_WORK()
> because that work_struct was allocated by kzalloc() [1]. But the message
> 
>   INFO: trying to register non-static key.
>   the code is fine but needs lockdep annotation.
>   turning off the locking correctness validator.
> 
> by lock_map_acquire() is failing to tell that INIT_WORK() is missing.
> 
> Since flush_work() without INIT_WORK() is a bug, and INIT_WORK() should
> set ->func field to non-zero, let's warn if ->func field is zero.
> 
> [1] https://syzkaller.appspot.com/bug?id=a5954455fcfa51c29ca2ab55b203076337e1c770
> 
> Signed-off-by: Tetsuo Handa 
> ---
>  kernel/workqueue.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 392be4b..a503ad9 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2908,6 +2908,9 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
>   if (WARN_ON(!wq_online))
>   return false;
>  
> + if (WARN_ON(!work->func))
> + return false;
> +
>   if (!from_cancel) {
>   lock_map_acquire(&work->lockdep_map);
>   lock_map_release(&work->lockdep_map);

Thanks for updating the changelog.  FWIW, you can add

Reviewed-by: Daniel Jordan
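
The two candidate checks discussed in this thread can be compared in a small userspace model. It is only a sketch under simplifying assumptions: the types and helpers below are made up and mimic just the "entry" and "func" members of a work item, with static zero-initialized storage playing the role of kzalloc().

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked node, shaped like the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical miniature of a work item: list linkage plus callback. */
struct work_item {
	struct list_node entry;
	void (*func)(struct work_item *);
};

/* Static storage is zero-filled, like memory returned by kzalloc(). */
static struct work_item zeroed_item;

/* Same test as list_empty(): a node is empty when it points at itself. */
static bool list_node_empty(const struct list_node *head)
{
	return head->next == head;
}

/* What a proper initialization sets up: self-linked entry and a callback. */
static void work_item_init(struct work_item *w, void (*fn)(struct work_item *))
{
	w->entry.next = &w->entry;
	w->entry.prev = &w->entry;
	w->func = fn;
}

/* The check the patch adds: an initialized item has a non-NULL callback. */
static bool work_item_initialized(const struct work_item *w)
{
	return w->func != NULL;
}

static void noop(struct work_item *w)
{
	(void)w;
}
```

A zeroed item reports "not empty" because its next pointer is NULL rather than self-referencing, but so does a legitimately queued item, which is why the callback pointer is the more reliable signal for a flush path.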

Re: possible deadlock in __do_page_fault

2019-01-22 Thread Tetsuo Handa

Joel Fernandes wrote:
> > Why do we need to call fallocate() synchronously with ashmem_mutex held?
> > Why can't we call fallocate() asynchronously from WQ_MEM_RECLAIM workqueue
> > context so that we can call fallocate() with ashmem_mutex not held?
> > 
> > I don't know how ashmem works, but as far as I can guess, offloading is
> > possible as long as other operations which depend on the completion of
> > fallocate() operation (e.g. read()/mmap(), querying/changing pinned status)
> > wait for completion of asynchronous fallocate() operation (like a draft
> > patch shown below is doing).
> 
> This adds a bit of complexity, I am worried if it will introduce more
> bugs especially because ashmem is going away in the long term, in favor of
> memfd - and if its worth adding more complexity / maintenance burden to it.

I don't care about migrating to memfd. I care about when bugs are fixed.

> 
> I am wondering if we can do this synchronously, without using a workqueue.
> All you would need is a temporary list of areas to punch. In
> ashmem_shrink_scan, you would create this list under mutex and then once you
> release the mutex, you can go through this list and do the fallocate followed
> by the wake up of waiters on the wait queue, right? If you can do it this
> way, then it would be better IMO.

Are you sure that none of the locks held before doing a GFP_KERNEL allocation
interferes with the lock dependencies used by fallocate()? If yes, we can do without a
workqueue context (like a draft patch shown below). Since I don't understand
what locks are potentially involved, I offloaded to a clean workqueue context.

Anyway, I need your checks regarding whether this approach is waiting for
completion at all locations which need to wait for completion.

---
 drivers/staging/android/ashmem.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index 90a8a9f1ac7d..6a267563cb66 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -75,6 +75,9 @@ struct ashmem_range {
 /* LRU list of unpinned pages, protected by ashmem_mutex */
 static LIST_HEAD(ashmem_lru_list);
 
+static atomic_t ashmem_shrink_inflight = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(ashmem_shrink_wait);
+
 /*
  * long lru_count - The count of pages on our LRU list.
  *
@@ -292,6 +295,7 @@ static ssize_t ashmem_read_iter(struct kiocb *iocb, struct 
iov_iter *iter)
int ret = 0;
 
mutex_lock(&ashmem_mutex);
+   wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight));
 
/* If size is not set, or set to 0, always return EOF. */
if (asma->size == 0)
@@ -359,6 +363,7 @@ static int ashmem_mmap(struct file *file, struct 
vm_area_struct *vma)
int ret = 0;
 
mutex_lock(&ashmem_mutex);
+   wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight));
 
/* user needs to SET_SIZE before mapping */
if (!asma->size) {
@@ -438,7 +443,6 @@ static int ashmem_mmap(struct file *file, struct 
vm_area_struct *vma)
 static unsigned long
 ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
-   struct ashmem_range *range, *next;
unsigned long freed = 0;
 
/* We might recurse into filesystem code, so bail out if necessary */
@@ -448,17 +452,27 @@ ashmem_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
if (!mutex_trylock(&ashmem_mutex))
return -1;
 
-   list_for_each_entry_safe(range, next, &ashmem_lru_list, lru) {
+   while (!list_empty(&ashmem_lru_list)) {
+   struct ashmem_range *range =
+   list_first_entry(&ashmem_lru_list, typeof(*range), lru);
loff_t start = range->pgstart * PAGE_SIZE;
loff_t end = (range->pgend + 1) * PAGE_SIZE;
+   struct file *f = range->asma->file;
 
-   range->asma->file->f_op->fallocate(range->asma->file,
-   FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-   start, end - start);
+   get_file(f);
+   atomic_inc(&ashmem_shrink_inflight);
range->purged = ASHMEM_WAS_PURGED;
lru_del(range);
 
freed += range_size(range);
+   mutex_unlock(&ashmem_mutex);
+   f->f_op->fallocate(f,
+  FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+  start, end - start);
+   fput(f);
+   if (atomic_dec_and_test(&ashmem_shrink_inflight))
+   wake_up_all(&ashmem_shrink_wait);
+   mutex_lock(&ashmem_mutex);
if (--sc->nr_to_scan <= 0)
break;
}
@@ -713,6 +727,7 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, 
unsigned long cmd,
return -EFAULT;
 
mutex_lock(&ashmem_mutex);
+   wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight));
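
For comparison, the variant Joel describes in the quoted text (collect the ranges while holding the mutex, then punch them once it has been dropped) might look roughly like the userspace sketch below. It is only an illustration under stated assumptions: the mutex is modelled as a plain flag, punch_hole() stands in for the fallocate() punch, and every name in it is hypothetical rather than taken from ashmem.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RANGES 8

/* Hypothetical stand-in for struct ashmem_range. */
struct range {
	long start, len;
	bool on_lru;
	bool purged;
};

static bool lock_held;           /* models ashmem_mutex as a simple flag */
static int punched_while_locked; /* counts punches issued under the lock */

static void lock(void)   { assert(!lock_held); lock_held = true; }
static void unlock(void) { assert(lock_held);  lock_held = false; }

/*
 * Stand-in for the fallocate() hole punch. It may take other locks,
 * so it must not run while our own lock is held.
 */
static void punch_hole(struct range *r)
{
	if (lock_held)
		punched_while_locked++;
	r->purged = true;
}

/* Returns the number of bytes handed over for punching. */
static long shrink_scan(struct range *lru, size_t n, size_t nr_to_scan)
{
	struct range *todo[MAX_RANGES];
	size_t i, cnt = 0;
	long freed = 0;

	lock();
	for (i = 0; i < n && cnt < nr_to_scan && cnt < MAX_RANGES; i++) {
		if (!lru[i].on_lru)
			continue;
		lru[i].on_lru = false;	/* plays the role of lru_del() */
		todo[cnt++] = &lru[i];
		freed += lru[i].len;
	}
	unlock();

	/* The slow work happens only after the lock has been dropped. */
	for (i = 0; i < cnt; i++)
		punch_hole(todo[i]);

	return freed;
}
```

The sketch leaves out what the draft patch above still has to solve: pinning the file across the unlocked window and making readers wait until the in-flight punches have completed.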

Re: [alsa-devel] [PATCH] ASoC: soc-core: Fix null pointer dereference in soc_find_component

2019-01-22 Thread Pierre-Louis Bossart



On 1/22/19 7:36 PM, Curtis Malainey wrote:
> Curtis Malainey | Software Engineer | cujomalai...@google.com | 650-898-3849
>
> On Wed, Jan 23, 2019 at 4:11 AM Pierre-Louis Bossart
>  wrote:
> >
> > > The issue was that we were seeing a memory corruption bug on AMD
> > > Chromebooks with that function already (not observed on Intel). I was
> > > testing some SOF integrations and was seeing this in the kernel logs.
> > > I had Dylan verify my logic before I sent the patch because it took so
> > > long to identify the bug and it was traced to the patch that introduced
> > > soc_init_platform.
> > >
> > > [   10.922112] cz-da7219-max98357a AMD7219:00: ASoC: CPU DAI
> > > designware-i2s.1.auto not registered
> > > [   10.922122] cz-da7219-max98357a AMD7219:00:
> > > devm_snd_soc_register_card(acpd7219m98357) failed: -517
> > > [   11.001411] cz-da7219-max98357a AMD7219:00: ASoC: Both platform
> > > name/of_node are set for amd-max98357-play
> > > [   11.001423] cz-da7219-max98357a AMD7219:00: ASoC: failed to init
> > > link amd-max98357-play
> > > [   11.001431] cz-da7219-max98357a AMD7219:00:
> > > devm_snd_soc_register_card(acpd7219m98357) failed: -22
> > > [   11.001577] cz-da7219-max98357a: probe of AMD7219:00 failed with error -22
> > >
> > > of_node was never getting set but the pointer was becoming populated
> > > (outside of the probe call) which traced to soc_init_platform function
> > > which was not reallocating memory on an EPROBE_DEFER even though it was
> > > getting freed by devm. I am not very familiar with devm but my local
> > > maintainers say that it should be freeing the memory even on a
> > > PROBE_DEFER.
> > > The patch should mirror the memory behaviour in
> > > snd_soc_init_multicodec which also reallocates its memory on every
> > > probe. I'm not sure how the patch is causing you to defer, is your
> > > component list corrupt?
> > >
> > > Sorry for the duplicate spam, forgot to send via plain text mode,
> > > re-sending for the mailing list so it gets accepted.
> >
> > There is no defer issue with the intel stuff, but we call this routine
> > multiple times
> >
> > snd_soc_register_card
> >
> > --soc_init_dai_link
> >
> > snd_soc_init_platform
> >
> > -- soc_soc_bind_card
> >
> > snd_soc_instantiate_card
> >
> > -- soc_check_tplg_fes
> >
> >  snd_soc_init_platform << ALLOC1
> >
> > soc_init_dai_link
> >
> > --snd_soc_init_platform << ALLOC2
> >
> Ah that explains it, in my testing I didn't have the patch that
> brought in the call from within tplg_fes
> >
> > Initially dai_link->legacy_platform is 0, so gets set after the
> > first devm_kzalloc (ALLOC1) and after that we always allocate new memory
> > (ALLOC2). The end result is that whatever we set in soc_check_tplg_fes
> > is lost with the new/unnecessary alloc.
> >
> > I would guess your solution is also a work-around, if devm_ effectively
> > freed the memory then the pointer would become NULL. Or maybe the
> > issue is that no one actually resets it.
> >
> Yes, it's a workaround to fix the memory issue. If you set the
> platform in the machine driver the code will ignore it and not reset
> it. That being said, that is not a foolproof workaround and a better
> solution is definitely needed. We could go and clean up the pointers
> in soc_instantiate_card based on the flag being set. That way we only
> reallocate on a NULL pointer like we used to but still don't affect
> statically allocated memory. I will draft a patch, test it on the AMD
> device, and reply to this thread later with it. Pierre, can you test it as
> well?
>
> I am curious why soc_check_tplg_fes is calling snd_soc_init_platform.
> It should have already been called earlier, in soc_init_dai_link at
> the beginning of snd_soc_register_card so the memory should already be
> initialized. Unless I am missing somewhere where links are getting
> added between the calls.


This is actually a second-order problem; the main issue I have is that
the very first call to init_dai_link fails with the new DEFER_PROBE
handling.


I don't quite understand what the Linaro/AMD folks are doing, but I trust
their changes are legitimate. To move forward, maybe it's not worth
spending too much time on a grand unification of string theory; there
are simpler solutions. The Intel machine drivers already get the
platform driver name as a platform_data argument, so we could modify
the dailinks' platform names before even registering the card. I tested
with the attached proof-of-concept patch. It adds 2 lines of code per
machine driver if we use a common helper (after the transition to the
"modern" dailink representation that's needed anyway), so maybe it's
better in the end? The override we care about is really the automatic
handling of all the hard-coded front-ends; the platform-name override
isn't really a battle I want to pick or spend time on.




From 5680c64b09964b134e20bf96142d1ce5dcf0f77f Mon Sep 17 00:00:00 2001
From: Pierre-Louis Bossart 
Date: Tue, 22 Jan 2019 18:53:43 -0600
Subject: [PATCH] ASoC: add helper to change platform name for all dailinks

To reuse the same machine drivers with Atom/SST, Skylake and SOF, we
need to change the default platform_name (or platforms->name in the
"modern" representation).

So far, this override was done with an automatic

Re: [LSF/MM TOPIC] Page flags, can we free up space ?

2019-01-22 Thread Mike Kravetz

On 1/22/19 12:17 PM, Jerome Glisse wrote:
> So lately I have been looking at page flags, and we are using 6 flags
> for memory reclaim and compaction:
> 
> PG_referenced
> PG_lru
> PG_active
> PG_workingset
> PG_reclaim
> PG_unevictable
> 
> On top of which you can add the page anonymous flag (anonymous or
> shared memory)
> PG_anon // does not exist, lower bit of page->mapping
> 
> And also the movable flag (which aliases with KSM)
> PG_movable // does not exist, lower bit of page->mapping
> 
> 
> So I would like to explore whether there is a way to express the same amount
> of information with fewer bits. My methodology is to exhaustively list
> all the possible states (valid combinations of the above flags) and then to
> see how we change from one state to another (what event triggers the change,
> like mlock(), page being referenced, ...) and under which rules (i.e. do we
> hold the page lock, zone lock, ...).
> 
> My hope is that there might be some way to use fewer bits to express the
> same thing. I am doing this because, for my work on generic page write
> protection (i.e. KSM for file-backed pages), which I talked about last year and
> want to talk about again ;), I will need to unalias the movable bit from
> the KSM bit.
> 
> 
> Right now this is more of a tentative effort, i.e. I do not know if I will succeed.
> In any case I can report on failure or success and discuss my findings to
> get people's opinions on the matter.
> 
> 
> I think everyone interested in mm will be interested in this topic :)

Explicitly adding Matthew on Cc as I am pretty sure he has been working
in this area.

-- 
Mike Kravetz
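
The counting exercise proposed in the quoted text can be sketched in a few lines of C: enumerate every combination of the six flags, keep the ones a validity rule accepts, and see how many bits are needed to number the survivors. The two rules in state_is_valid() are invented purely for illustration and make no claim about which combinations the mm code really allows; the bit positions are likewise local to the sketch.

```c
#include <assert.h>

/* Bit positions are local to this sketch, not the kernel's PG_* values. */
enum {
	F_REFERENCED  = 1u << 0,
	F_LRU         = 1u << 1,
	F_ACTIVE      = 1u << 2,
	F_WORKINGSET  = 1u << 3,
	F_RECLAIM     = 1u << 4,
	F_UNEVICTABLE = 1u << 5,
	F_ALL         = (1u << 6) - 1,
};

/* Smallest number of bits that can number n distinct states. */
static unsigned int bits_needed(unsigned int n)
{
	unsigned int bits = 0;

	while ((1u << bits) < n)
		bits++;
	return bits;
}

/*
 * HYPOTHETICAL validity rules, invented for illustration only:
 *  1. no other flag is set unless F_LRU is set;
 *  2. F_ACTIVE and F_UNEVICTABLE are never set together.
 */
static int state_is_valid(unsigned int s)
{
	if (!(s & F_LRU))
		return s == 0;
	return !((s & F_ACTIVE) && (s & F_UNEVICTABLE));
}

/* Walk all 64 raw combinations and count the ones the rules accept. */
static unsigned int count_valid_states(void)
{
	unsigned int s, n = 0;

	for (s = 0; s <= F_ALL; s++)
		n += state_is_valid(s);
	return n;
}
```

With these made-up rules, 25 of the 64 raw combinations survive, so 5 bits would be enough instead of 6. Whether anything like that holds for the real flags is exactly the open question of the proposed topic.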

[PATCH] acpi_pm: Reduce PMTMR counter read contention

2019-01-22 Thread Zhenzhong Duan

On a large system with many CPUs, using PMTMR as the clock source can
have a significant impact on overall system performance for the
following reasons:
 1) There is a single PMTMR counter shared by all the CPUs.
 2) PMTMR counter reading is a very slow operation.

PMTMR may end up as the default clock source when, for example,
the TSC clock calibration exceeds the allowable tolerance and HPET is
disabled by nohpet on the kernel command line. Sometimes the performance
slowdown can be so severe that the system may crash because of an NMI
watchdog soft lockup. Logs:

[   20.181521] clocksource: acpi_pm: mask: 0xff max_cycles: 0xff,
max_idle_ns: 2085701024 ns
[   44.273786] BUG: soft lockup - CPU#48 stuck for 23s! [swapper/48:0]
[   44.279992] BUG: soft lockup - CPU#49 stuck for 23s! [migration/49:307]
[   44.285169] BUG: soft lockup - CPU#50 stuck for 23s! [migration/50:313]

Commit f99fd22e4d4b ("x86/hpet: Reduce HPET counter read contention")
fixed a similar issue for HPET; this patch adapts that design to PMTMR.

Signed-off-by: Zhenzhong Duan 
Tested-by: Kin Cho 
Cc: Daniel Lezcano 
Cc: Thomas Gleixner 
Cc: Waiman Long 
Cc: Srinivas Eeda 
---
 drivers/clocksource/acpi_pm.c | 101 +-
 1 file changed, 100 insertions(+), 1 deletion(-)

diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 1961e35..8b522eb 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -32,12 +32,111 @@
  */
 u32 pmtmr_ioport __read_mostly;
 
-static inline u32 read_pmtmr(void)
+static inline u32 pmtmr_readl(void)
 {
/* mask the output to 24 bits */
return inl(pmtmr_ioport) & ACPI_PM_MASK;
 }
 
+#if defined(CONFIG_SMP) && defined(CONFIG_64BIT)
+/*
+ * Reading the PMTMR counter is a very slow operation. If a large number of
+ * CPUs are trying to access the PMTMR counter simultaneously, it can cause
+ * massive delay and slow down system performance dramatically. This may
+ * happen when PMTMR is the default clock source instead of TSC. For a
+ * really large system with hundreds of CPUs, the slowdown may be so
+ * severe that it may actually crash the system because of a NMI watchdog
+ * soft lockup, for example.
+ *
+ * If multiple CPUs are trying to access the PMTMR counter at the same time,
+ * we don't actually need to read the counter multiple times. Instead, the
+ * other CPUs can use the counter value read by the first CPU in the group.
+ *
+ * This special feature is only enabled on x86-64 systems. It is unlikely
+ * that 32-bit x86 systems will have enough CPUs to require this feature
+ * with its associated locking overhead. And we also need 64-bit atomic
+ * read.
+ *
+ * The lock and the pmtmr value are stored together and can be read in a
+ * single atomic 64-bit read. It is explicitly assumed that arch_spinlock_t
+ * is 32 bits in size.
+ */
+union pmtmr_lock {
+   struct {
+   arch_spinlock_t lock;
+   u32 value;
+   };
+   u64 lockval;
+};
+
+static union pmtmr_lock pmtmr __cacheline_aligned = {
+   { .lock = __ARCH_SPIN_LOCK_UNLOCKED, },
+};
+
+static u32 read_pmtmr(void)
+{
+   unsigned long flags;
+   union pmtmr_lock old, new;
+
+   BUILD_BUG_ON(sizeof(union pmtmr_lock) != 8);
+
+   /*
+* Read PMTMR directly if in NMI.
+*/
+   if (in_nmi())
+   return (u64)pmtmr_readl();
+
+   /*
+* Read the current state of the lock and PMTMR value atomically.
+*/
+   old.lockval = READ_ONCE(pmtmr.lockval);
+
+   if (arch_spin_is_locked(&old.lock))
+   goto contended;
+
+   local_irq_save(flags);
+   if (arch_spin_trylock(&pmtmr.lock)) {
+   new.value = pmtmr_readl();
+   /*
+* Use WRITE_ONCE() to prevent store tearing.
+*/
+   WRITE_ONCE(pmtmr.value, new.value);
+   arch_spin_unlock(&pmtmr.lock);
+   local_irq_restore(flags);
+   return (u64)new.value;
+   }
+   local_irq_restore(flags);
+
+contended:
+   /*
+* Contended case
+* --
+* Wait until the PMTMR value change or the lock is free to indicate
+* its value is up-to-date.
+*
+* It is possible that old.value has already contained the latest
+* PMTMR value while the lock holder was in the process of releasing
+* the lock. Checking for lock state change will enable us to return
+* the value immediately instead of waiting for the next PMTMR reader
+* to come along.
+*/
+   do {
+   cpu_relax();
+   new.lockval = READ_ONCE(pmtmr.lockval);
+   } while ((new.value == old.value) && arch_spin_is_locked(&new.lock));
+
+   return (u64)new.value;
+}
+#else
+/*
+ * For UP or 32-bit.
+ */
+static inline u32 read_pmtmr(void)
+{
+   return pmtmr_readl();
+}
+#endif
+
 u32 acpi_pm_read_verified(void)
 {
u32 v1 =
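
The core trick in the patch, keeping the lock word and the last counter value in one 64-bit union so that both can be sampled with a single load, can be tried out in userspace with the sketch below. It is a single-threaded illustration under stated assumptions: a plain uint32_t replaces arch_spinlock_t, an ordinary load replaces READ_ONCE(), and the names pmtmr_snapshot, publish() and snapshot() are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PM_MASK 0xffffffu	/* the timer value is masked to 24 bits */

/* Userspace stand-in: a plain 32-bit word replaces arch_spinlock_t. */
union pmtmr_snapshot {
	struct {
		uint32_t lock;	/* non-zero means "a reader holds the lock" */
		uint32_t value;	/* last counter value that was published */
	};
	uint64_t lockval;	/* both fields viewed as one 64-bit word */
};

_Static_assert(sizeof(union pmtmr_snapshot) == 8,
	       "lock and value must share one 64-bit word");

/* Publish a new counter value while leaving the lock word untouched. */
static void publish(union pmtmr_snapshot *shared, uint32_t raw)
{
	shared->value = raw & PM_MASK;
}

/* Copy lock state and value together through a single 64-bit load. */
static union pmtmr_snapshot snapshot(const union pmtmr_snapshot *shared)
{
	union pmtmr_snapshot s;

	s.lockval = shared->lockval;
	return s;
}
```

Because the copy is taken as one word, a contended reader can compare its old and new snapshots and tell whether either the value or the lock state has changed, which is what the loop in the contended path relies on.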

[PATCH -next] perf: xgene: Remove set but not used variable 'config'

2019-01-22 Thread YueHaibing

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/perf/xgene_pmu.c: In function 'xgene_perf_stop':
drivers/perf/xgene_pmu.c:1055:6: warning:
 variable 'config' set but not used [-Wunused-but-set-variable]

It has never been used since its introduction.

Signed-off-by: YueHaibing 
---
 drivers/perf/xgene_pmu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index d4ec048..27574e8 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -1052,7 +1052,6 @@ static void xgene_perf_start(struct perf_event *event, 
int flags)
 static void xgene_perf_stop(struct perf_event *event, int flags)
 {
struct hw_perf_event *hw = &event->hw;
-   u64 config;
 
if (hw->state & PERF_HES_UPTODATE)
return;
@@ -1064,7 +1063,6 @@ static void xgene_perf_stop(struct perf_event *event, int 
flags)
if (hw->state & PERF_HES_UPTODATE)
return;
 
-   config = hw->config;
xgene_perf_read(event);
hw->state |= PERF_HES_UPTODATE;
 }

Re: [PATCH net-next 1/1] net: stmmac: implement the SIOCGHWTSTAMP ioctl

2019-01-22 Thread David Miller

From: Artem Panfilov 
Date: Sun, 20 Jan 2019 19:05:15 +0300

> This patch adds support for the SIOCGHWTSTAMP ioctl which enables user
> processes to read the current hwtstamp_config settings
> non-destructively.
> 
> Signed-off-by: Artem Panfilov 

Applied, thanks.

Re: [alsa-devel] [PATCH] ASoC: soc-core: Fix null pointer dereference in soc_find_component

2019-01-22 Thread Curtis Malainey

Curtis Malainey | Software Engineer | cujomalai...@google.com | 650-898-3849


On Wed, Jan 23, 2019 at 4:11 AM Pierre-Louis Bossart
 wrote:
>
>
> > The issue was that we were seeing a memory corruption bug on AMD
> > Chromebooks with that function already (not observed on Intel). I was
> > testing some SOF integrations and was seeing this in the kernel logs.
> > I had Dylan verify my logic before I sent the patch because it took so
> > long to identify the bug and it was traced to the patch that introduced
> > soc_init_platform.
> >
> > [   10.922112] cz-da7219-max98357a AMD7219:00: ASoC: CPU DAI
> > designware-i2s.1.auto not registered
> > [   10.922122] cz-da7219-max98357a AMD7219:00:
> > devm_snd_soc_register_card(acpd7219m98357) failed: -517
> > [   11.001411] cz-da7219-max98357a AMD7219:00: ASoC: Both platform
> > name/of_node are set for amd-max98357-play
> > [   11.001423] cz-da7219-max98357a AMD7219:00: ASoC: failed to init
> > link amd-max98357-play
> > [   11.001431] cz-da7219-max98357a AMD7219:00:
> > devm_snd_soc_register_card(acpd7219m98357) failed: -22
> > [   11.001577] cz-da7219-max98357a: probe of AMD7219:00 failed with error 
> > -22
> >
> > of_node was never getting set but the pointer was becoming populated
> > (outside of the probe call) which traced to soc_init_platform function
> > which was not reallocating memory on an EPROBE_DEFER even though it was
> > getting freed by devm. I am not very familiar with devm but my local
> > maintainers say that it should be freeing the memory even on a
> > PROBE_DEFER.
> > The patch should mirror the memory behaviour in
> > snd_soc_init_multicodec which also reallocates its memory on every
> > probe. I'm not sure how the patch is causing you to defer, is your
> > component list corrupt?
> >
> > Sorry for the duplicate spam, forgot to send via plain text mode,
> > re-sending for the mailing list so it gets accepted.
>
> There is no defer issue with the intel stuff, but we call this routine
> multiple times
>
> snd_soc_register_card
>
> --soc_init_dai_link
>
> snd_soc_init_platform
>
> -- soc_soc_bind_card
>
> snd_soc_instantiate_card
>
> -- soc_check_tplg_fes
>
>  snd_soc_init_platform << ALLOC1
>
> soc_init_dai_link
>
> --snd_soc_init_platform << ALLOC2
>
Ah that explains it, in my testing I didn't have the patch that
brought in the call from within tplg_fes
>
> Initially dai_link->legacy_platform is 0, so gets set after the
> first devm_kzalloc (ALLOC1) and after that we always allocate new memory
> (ALLOC2). The end result is that whatever we set in soc_check_tplg_fes
> is lost with the new/unnecessary alloc.
>
> I would guess your solution is also a work-around, if devm_ effectively
> freed the memory then the pointer would become NULL. Or maybe the
> issue is that no one actually resets it.
>
>
Yes, it's a workaround to fix the memory issue. If you set the
platform in the machine driver the code will ignore it and not reset
it. That being said, that is not a foolproof workaround and a better
solution is definitely needed. We could go and clean up the pointers
in soc_instantiate_card based on the flag being set. That way we only
reallocate on a NULL pointer like we used to but still don't affect
statically allocated memory. I will draft a patch, test it on the AMD
device, and reply to this thread later with it. Pierre, can you test it as
well?

I am curious why soc_check_tplg_fes is calling snd_soc_init_platform.
It should have already been called earlier, in soc_init_dai_link at
the beginning of snd_soc_register_card so the memory should already be
initialized. Unless I am missing somewhere where links are getting
added between the calls.
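
The ALLOC1/ALLOC2 effect discussed in this thread can be reproduced outside the kernel with the sketch below. It is a simplified model under stated assumptions: a small static pool stands in for devm_kzalloc(), the structures carry only the one field needed, and the function names (init_platform_always, init_platform_once) and the "sof-audio" string used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct platform { const char *name; };
struct dai_link { struct platform *platform; };

/* Tiny static pool standing in for devm_kzalloc(); hypothetical helper. */
static struct platform pool[8];
static size_t pool_used;

static struct platform *pool_zalloc(void)
{
	struct platform *p;

	if (pool_used == sizeof(pool) / sizeof(pool[0]))
		return NULL;
	p = &pool[pool_used++];
	memset(p, 0, sizeof(*p));
	return p;
}

/* Allocates on every call: settings stored via the old pointer are lost. */
static int init_platform_always(struct dai_link *link)
{
	struct platform *p = pool_zalloc();

	if (!p)
		return -1;
	link->platform = p;
	return 0;
}

/* Allocates only when nothing is attached yet, so earlier settings survive. */
static int init_platform_once(struct dai_link *link)
{
	if (link->platform)
		return 0;
	return init_platform_always(link);
}
```

Calling init_platform_always() twice with a name such as "sof-audio" stored in between leaves the link pointing at a fresh, empty structure, which mirrors the override from soc_check_tplg_fes getting lost. The allocate-once variant keeps the name, but as noted above it depends on the pointer being reset to NULL whenever the backing memory is released.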

Re: [PATCH 0/5] mips: cleanup debugfs usage

2019-01-22 Thread Paul Burton

Hello,

Greg Kroah-Hartman wrote:
> When calling debugfs code, there is no need to ever check the return
> value of the call, as no logic should ever change if a call works
> properly or not.  Fix up a bunch of mips-specific code to not care about
> the results of debugfs.
> 
> Greg Kroah-Hartman (5):
> mips: cavium: no need to check return value of debugfs_create
> functions
> mips: ralink: no need to check return value of debugfs_create
> functions
> mips: mm: no need to check return value of debugfs_create functions
> mips: math-emu: no need to check return value of debugfs_create
> functions
> mips: kernel: no need to check return value of debugfs_create
> functions
> 
> arch/mips/cavium-octeon/oct_ilm.c | 31 ---
> arch/mips/kernel/mips-r2-to-r6-emul.c | 21 --
> arch/mips/kernel/segment.c| 15 +++--
> arch/mips/kernel/setup.c  |  7 +-
> arch/mips/kernel/spinlock_test.c  | 21 --
> arch/mips/kernel/unaligned.c  | 16 --
> arch/mips/math-emu/me-debugfs.c   | 23 
> arch/mips/mm/sc-debugfs.c | 15 +++--
> arch/mips/ralink/bootrom.c|  8 +--
> 9 files changed, 28 insertions(+), 129 deletions(-)

Series applied to mips-next.

Thanks,
Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.bur...@mips.com to report it. ]

Re: [PATCH v1 00/11] MIPS: ath79: move towards proper OF support

2019-01-22 Thread Paul Burton

Hello,

Oleksij Rempel wrote:
> These patches are taken from OpenWRT, rebased and tested with kernel
> v5.0-rt1 on DPTechnics DPT-Module (Atheros AR9331) by me.
> 
> Since one dt-bindings header is touched, I added DT maintainers to the
> TO/CC.
> 
> Felix Fietkau (6):
> MIPS: ath79: add helpers for setting clocks and expose the ref clock
> MIPS: ath79: move legacy "wdt" and "uart" clock aliases out of soc
> init
> MIPS: ath79: pass PLL base to clock init functions
> MIPS: ath79: make specifying the reference clock in DT optional
> MIPS: ath79: support setting up clock via DT on all SoC types
> MIPS: ath79: export switch MDIO reference clock
> 
> John Crispin (5):
> MIPS: ath79: drop legacy IRQ code
> MIPS: ath79: drop machfiles
> MIPS: ath79: drop legacy pci code
> MIPS: ath79: drop platform device registration code
> MIPS: ath79: drop !OF clock code
> 
> arch/mips/Kconfig|   1 -
> arch/mips/ath79/Kconfig  |  73 -
> arch/mips/ath79/Makefile |  23 +-
> arch/mips/ath79/clock.c  | 342 ++-
> arch/mips/ath79/common.h |   5 -
> arch/mips/ath79/dev-common.c | 159 ---
> arch/mips/ath79/dev-common.h |  18 --
> arch/mips/ath79/dev-gpio-buttons.c   |  56 
> arch/mips/ath79/dev-gpio-buttons.h   |  23 --
> arch/mips/ath79/dev-leds-gpio.c  |  54 
> arch/mips/ath79/dev-leds-gpio.h  |  21 --
> arch/mips/ath79/dev-spi.c|  38 ---
> arch/mips/ath79/dev-spi.h|  22 --
> arch/mips/ath79/dev-usb.c| 242 
> arch/mips/ath79/dev-usb.h|  17 --
> arch/mips/ath79/dev-wmac.c   | 155 --
> arch/mips/ath79/dev-wmac.h   |  17 --
> arch/mips/ath79/irq.c| 169 ---
> arch/mips/ath79/mach-ap121.c |  92 --
> arch/mips/ath79/mach-ap136.c | 156 ---
> arch/mips/ath79/mach-ap81.c  | 100 ---
> arch/mips/ath79/mach-db120.c | 136 -
> arch/mips/ath79/mach-pb44.c  | 128 -
> arch/mips/ath79/mach-ubnt-xm.c   | 126 -
> arch/mips/ath79/machtypes.h  |  28 --
> arch/mips/ath79/pci.c| 273 --
> arch/mips/ath79/pci.h|  35 ---
> arch/mips/ath79/setup.c  |  78 +-
> arch/mips/include/asm/mach-ath79/ath79.h |   4 -
> arch/mips/pci/Makefile   |   1 +
> arch/mips/pci/fixup-ath79.c  |  21 ++
> include/dt-bindings/clock/ath79-clk.h|   4 +-
> 32 files changed, 185 insertions(+), 2432 deletions(-)
> delete mode 100644 arch/mips/ath79/dev-common.c
> delete mode 100644 arch/mips/ath79/dev-common.h
> delete mode 100644 arch/mips/ath79/dev-gpio-buttons.c
> delete mode 100644 arch/mips/ath79/dev-gpio-buttons.h
> delete mode 100644 arch/mips/ath79/dev-leds-gpio.c
> delete mode 100644 arch/mips/ath79/dev-leds-gpio.h
> delete mode 100644 arch/mips/ath79/dev-spi.c
> delete mode 100644 arch/mips/ath79/dev-spi.h
> delete mode 100644 arch/mips/ath79/dev-usb.c
> delete mode 100644 arch/mips/ath79/dev-usb.h
> delete mode 100644 arch/mips/ath79/dev-wmac.c
> delete mode 100644 arch/mips/ath79/dev-wmac.h
> delete mode 100644 arch/mips/ath79/irq.c
> delete mode 100644 arch/mips/ath79/mach-ap121.c
> delete mode 100644 arch/mips/ath79/mach-ap136.c
> delete mode 100644 arch/mips/ath79/mach-ap81.c
> delete mode 100644 arch/mips/ath79/mach-db120.c
> delete mode 100644 arch/mips/ath79/mach-pb44.c
> delete mode 100644 arch/mips/ath79/mach-ubnt-xm.c
> delete mode 100644 arch/mips/ath79/machtypes.h
> delete mode 100644 arch/mips/ath79/pci.c
> delete mode 100644 arch/mips/ath79/pci.h
> create mode 100644 arch/mips/pci/fixup-ath79.c

Series applied to mips-next.

Thanks,
Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.bur...@mips.com to report it. ]

Re: [PATCH net-next v2 0/4] bridge: implement Multicast Router Discovery (RFC4286)

2019-01-22 Thread David Miller

From: Linus Lüssing 
Date: Mon, 21 Jan 2019 07:26:24 +0100

> This patchset adds initial Multicast Router Discovery support to
> the Linux bridge (RFC4286). With MRD it is possible to detect multicast
> routers and mark bridge ports and forward multicast packets to such routers
> accordingly.
> 
> So far, multicast routers are detected via IGMP/MLD queries and PIM
> messages in the Linux bridge. As there is only one active, selected
> querier at a time RFC4541 ("Considerations for Internet Group Management
> Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping
> Switches") section 2.1.1.a) recommends snooping Multicast Router
> Advertisements as provided by MRD (RFC4286).
> 
> 
> The first two patches are refactoring some existing code which is reused
> for parsing the Multicast Router Advertisements later in the fourth
> patch. The third patch lets the bridge join the all-snoopers multicast
> address to be able to reliably receive the Multicast Router
> Advertisements.
 ...

Series applied, thanks!

Re: [PATCH v3 10/10] arm64: dts: qcom: sdm845: Add Q6V5 MSS node

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 16:28 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson
>  wrote:
> >
> > From: Sibi Sankar 
> >
> > This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs.
> >
> > Signed-off-by: Sibi Sankar 
> > Reviewed-by: Douglas Anderson 
> > Signed-off-by: Bjorn Andersson 
> > ---
> >
> > Changes since v2:
> > - Picked up Sibi's patch
> > - Fixed reg to work with address/size-cells as 2
> >
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 
> >  1 file changed, 58 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index 5cc2615461da..78df5f1bce2d 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -1617,6 +1617,64 @@
> > clock-names = "xo";
> > };
> >
> > +   mss_pil: remoteproc@408 {
> > +   compatible = "qcom,sdm845-mss-pil";
> > +   reg = <0 0x0408 0 0x408>, <0 0x0418 0 0x48>;
> > +   reg-names = "qdsp6", "rmb";
> > +
> > +   interrupts-extended =
> > +   < GIC_SPI 266 IRQ_TYPE_EDGE_RISING>,
> > +   <_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
> > +   <_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
> > +   <_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
> > +   <_smp2p_in 3 IRQ_TYPE_EDGE_RISING>,
> > +   <_smp2p_in 7 IRQ_TYPE_EDGE_RISING>;
> > +   interrupt-names = "wdog", "fatal", "ready",
> > + "handover", "stop-ack",
> > + "shutdown-ack";
> > +
> > +   clocks = < GCC_MSS_CFG_AHB_CLK>,
> > +< GCC_MSS_Q6_MEMNOC_AXI_CLK>,
> > +< GCC_BOOT_ROM_AHB_CLK>,
> > +< GCC_MSS_GPLL0_DIV_CLK_SRC>,
> > +< GCC_MSS_SNOC_AXI_CLK>,
> > +< GCC_MSS_MFAB_AXIS_CLK>,
> > +< GCC_PRNG_AHB_CLK>,
> > +< RPMH_CXO_CLK>;
> > +   clock-names = "iface", "bus", "mem", "gpll0_mss",
> > + "snoc_axi", "mnoc_axi", "prng", "xo";
> > +
> > +   qcom,smem-states = <_smp2p_out 0>;
> > +   qcom,smem-state-names = "stop";
> > +
> > +   resets = <_reset AOSS_CC_MSS_RESTART>,
> > +<_reset PDC_MODEM_SYNC_RESET>;
> > +   reset-names = "mss_restart", "pdc_reset";
> > +
> > +   qcom,halt-regs = <_mutex_regs 0x23000 0x25000 
> > 0x24000>;
> > +
> > +   power-domains = <_qmp AOSS_QMP_LS_MODEM>,
> > +   < SDM845_CX>,
> > +   < SDM845_MX>,
> > +   < SDM845_MSS>;
> > +   power-domain-names = "load_state", "cx", "mx", 
> > "mss";
> > +
> > +   mba {
> > +   memory-region = <_region>;
> > +   };
> > +
> > +   mpss {
> > +   memory-region = <_region>;
> > +   };
> > +
> > +   glink-edge {
> > +   interrupts =  > IRQ_TYPE_EDGE_RISING>;
> > +   label = "modem";
> > +   qcom,remote-pid = <1>;
> > +   mboxes = <_shared 12>;
> > +   };
> > +   };
> > +
> > sdhc_2: sdhci@8804000 {
> 
> Can you please sort by unit address now that you have a device tree
> that has more stuff?
> 

Of course, sorry for missing that.

Regards,
Bjorn

Re: [PATCH V2] livepatch: fix non-static warnings

2019-01-22 Thread Nicholas Mc Guire

On Tue, Jan 22, 2019 at 11:30:30AM -0500, Joe Lawrence wrote:
> On 12/18/18 10:18 AM, Joe Lawrence wrote:
> >On 12/18/2018 03:49 AM, Miroslav Benes wrote:
> >>On Mon, 17 Dec 2018, Joe Lawrence wrote:
> >>
> >>>I'm just being picky about its documentation and how we should note its
> >>>usage in the v3 patch.  Think that s/__noclone/used/g of the v2 commit
> >>>message would be sufficient?
> >>
> >>We could rephrase it. After all it is not only about symbol names in the
> >>symbol table. The traceable/patchable code has to be present...
> >>
> >>"Sparse reported warnings about non-static symbols. For the variables
> >>a simple static attribute is fine - for the functions referenced by
> >>livepatch via klp_func the symbol-names must be unmodified in the
> >>symbol table and the patchable code has to be emitted.
> >>
> >>Attach __used attribute to the shared statically declared functions."
> >>
> >>?
> >
> >That works for me.
> >
> Hi Nicholas,
> 
> Did you still want to post a v3 for this fix?  I think there were only a few
> v3 suggestions (link tag, tag order, __used attribute, and commit msg
> phrasing.)
>
yup - will go cleanup and repost.

thx!
hofrat

Re: [GIT PULL] clk fixes for v5.0-rc3

2019-01-22 Thread pr-tracker-bot

The pull request you sent on Tue, 22 Jan 2019 14:37:29 -0800:

> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git 
> tags/clk-fixes-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0b0d4be6b4880c7199d41afe4d9a3f20f47fd9bb

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 16:40 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Tue, Jan 22, 2019 at 4:26 PM Bjorn Andersson
>  wrote:
> > > > +   clocks = <_board>;
> > > > +   clock-names = "xo";
> > >
> > > I've found that nearly all the places that refer to xo_board are wrong
> > > and should actually point to '< RPMH_CXO_CLK>'.  Maybe yours
> > > should too?
> > >
> >
> > Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the
> > cxo (or cxo2) pad of the SoC. So you're definitely right in that this
> > should be referencing the actual 19.2MHz clock.
> >
> > We've kept referring to this as xo_board, as we don't handle probe
> > deferral when gcc will probe earlier than rpmcc in the boot and for
> > other non-clock drivers the fear of actually hitting 0 on the refcounter
> > for this (you don't want to disable the cxo while running the system).
> 
> Note that, as defined in the device tree, "xo_board" is actually 38.4.
> IIUC that is not actually a fake/bogus clock but represents the actual
> crystal on the board.  There's a divide by 2 in the CPU though so most
> peripherals consider "xo" as 19.2.
> 

There's the 38.4MHz XO connected to the PMIC, but the signal going into
the CXO_IN pad of the SoC is supposed to come from LNBBCLK1 and be
19.2MHz.

> ...OK, confirmed.  The actual RF_XO_CLK pin on the board is truly
> connected to 38.4.
> 

And the three RF clocks from the PMIC are all ticking at 38.4MHz.


The "xo" I need here is the LNBBCLK1 (RPMH_CXO_CLK in clk-rpmh), for the
purpose of preventing the root clock from being turned off if apps goes to
suspend while the modem is booting, before it has had a chance to tell
RPM(h) that it needs it to be on.

Regards,
Bjorn

[PATCH] i2c: imx: fix inconsistent IS_ERR and PTR_ERR

2019-01-22 Thread Gustavo A. R. Silva

Fix inconsistent IS_ERR and PTR_ERR in i2c_imx_dma_request.

The proper pointer to be passed as argument is dma->chan_tx.

This bug was detected with the help of Coccinelle.

Fixes: 5b3a23a3cc94 ("i2c: imx: notify about real errors on dma i2c_imx_dma_request")
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/i2c/busses/i2c-imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 09b124547669..42fed40198a0 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -287,7 +287,7 @@ static int i2c_imx_dma_request(struct imx_i2c_struct *i2c_imx,
 
 	dma->chan_tx = dma_request_chan(dev, "tx");
 	if (IS_ERR(dma->chan_tx)) {
-		ret = PTR_ERR(dma->chan_rx);
+		ret = PTR_ERR(dma->chan_tx);
 		if (ret != -ENODEV && ret != -EPROBE_DEFER)
 			dev_err(dev, "can't request DMA tx channel (%d)\n", ret);
 		goto fail_al;
-- 
2.20.1
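
For readers following along outside the kernel tree, the mismatch fixed above can be sketched in userspace C. The demo_* helpers below are simplified stand-ins written for this illustration (assumptions, not the kernel's err.h); they only mimic the idea that an error code is encoded in the pointer that was just tested.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: simplified userspace stand-ins for ERR_PTR/PTR_ERR/IS_ERR. */
#define DEMO_MAX_ERRNO 4095

static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int demo_is_err(const void *ptr)
{
	/* Error codes live in the top DEMO_MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Hypothetical struct mirroring the two channel pointers in the patch. */
struct demo_dma {
	void *chan_tx;
	void *chan_rx;
};

/* Buggy shape: tests chan_tx but decodes chan_rx, which is still NULL here. */
static long demo_request_buggy(struct demo_dma *dma, void *tx_result)
{
	dma->chan_tx = tx_result;
	if (demo_is_err(dma->chan_tx))
		return demo_ptr_err(dma->chan_rx);
	return 0;
}

/* Fixed shape: decodes the same pointer that was tested. */
static long demo_request_fixed(struct demo_dma *dma, void *tx_result)
{
	dma->chan_tx = tx_result;
	if (demo_is_err(dma->chan_tx))
		return demo_ptr_err(dma->chan_tx);
	return 0;
}
```

With the buggy shape, a failed "tx" request reports 0, so the caller's error path sees no error code at all.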

Re: [PATCH reset-next 2/2] reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t

2019-01-22 Thread Randy Dunlap

On 1/22/19 4:33 PM, Florian Fainelli wrote:
> On 32-bit architectures defining resource_size_t as 64-bit (because of
> PAE), we can run into a linker failure because of the modulo and the
> division against resource_size(), replace the two problematic operations
> with an alignment check on the register resource (instead of modulo),
> and the division with DIV_ROUND_CLOSEST_ULL().
> 
> Reported-by: Randy Dunlap 
> Fixes: c196cdc7659d ("reset: Add Broadcom STB SW_INIT reset controller 
> driver")
> Signed-off-by: Florian Fainelli 
> ---
>  drivers/reset/reset-brcmstb.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/reset/reset-brcmstb.c b/drivers/reset/reset-brcmstb.c
> index 01ab1f71518b..c4cab8b5052d 100644
> --- a/drivers/reset/reset-brcmstb.c
> +++ b/drivers/reset/reset-brcmstb.c
> @@ -91,7 +91,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev)
>   return -ENOMEM;
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - if (resource_size(res) % SW_INIT_BANK_SIZE) {
> + if (!IS_ALIGNED(res->start, SW_INIT_BANK_SIZE) ||
> + !IS_AGLINED(resource_size(res), SW_INIT_BANK_SIZE)) {
>   dev_err(kdev, "incorrect register range\n");
>   return -EINVAL;
>   }
> @@ -103,7 +104,8 @@ static int brcmstb_reset_probe(struct platform_device 
> *pdev)
>   dev_set_drvdata(kdev, priv);
>  
>   priv->rcdev.owner = THIS_MODULE;
> - priv->rcdev.nr_resets = (resource_size(res) / SW_INIT_BANK_SIZE) * 32;
> + priv->rcdev.nr_resets = DIV_ROUND_CLOSEST_ULL(resource_size(res),
> +   SW_INIT_BANK_SIZE) * 32;
>   priv->rcdev.ops = _reset_ops;
>   priv->rcdev.of_node = kdev->of_node;
>   /* Use defaults: 1 cell and simple xlate function */
> 

Hi Florian,

This gives me:

  CC  drivers/reset/reset-brcmstb.o
../drivers/reset/reset-brcmstb.c: In function ‘brcmstb_reset_probe’:
../drivers/reset/reset-brcmstb.c:95:6: error: implicit declaration of function ‘IS_AGLINED’ [-Werror=implicit-function-declaration]
  !IS_AGLINED(resource_size(res), SW_INIT_BANK_SIZE)) {
  ^


but if the typo is fixed, it is fine :) then you can add:

Acked-by: Randy Dunlap 

Thanks.

-- 
~Randy

Re: [PATCH 01/15] habanalabs: add skeleton driver

2019-01-22 Thread Joe Perches

On Wed, 2019-01-23 at 02:00 +0200, Oded Gabbay wrote:
> This patch adds the habanalabs skeleton driver. The driver does nothing at
> this stage except very basic operations. It contains the minimal code to
> insmod and rmmod the driver and to create a /dev/hlX file per PCI device.

trivial notes:

> 
> diff --git a/drivers/misc/habanalabs/Makefile 
> b/drivers/misc/habanalabs/Makefile
[]
> \ No newline at end of file

You should fix these.  There are at least a couple of them.

> diff --git a/drivers/misc/habanalabs/device.c 
> b/drivers/misc/habanalabs/device.c
[]
> @@ -0,0 +1,331 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2016-2018 HabanaLabs, Ltd.
> + * All Rights Reserved.
> + */

Add #define pr_fmt(fmt) "habanalabs: " fmt

> +
> +#include "habanalabs.h"

or add it in this file


> +static int device_setup_cdev(struct hl_device *hdev, struct class *hclass,
> + int minor, const struct file_operations *fops)
> +{
> + int err, devno = MKDEV(hdev->major, minor);
> + struct cdev *hdev_cdev = &hdev->cdev;
> + char name[8];
> +
> + sprintf(name, "hl%d", hdev->id);

Might overflow name one day
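
A bounded alternative could look roughly like the sketch below. It is plain userspace C; demo_format_name() is a hypothetical helper made up for this note, not something from the driver.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "hl<id>" into a fixed-size buffer and
 * report truncation instead of writing past the end.
 */
static int demo_format_name(char *buf, size_t len, int id)
{
	int n = snprintf(buf, len, "hl%d", id);

	/* snprintf returns the length it wanted to write, excluding the NUL. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With an 8-byte buffer this accepts ids up to five digits and returns -1 beyond that, while the buffer stays NUL-terminated either way.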

> +
> + cdev_init(hdev_cdev, fops);
> + hdev_cdev->owner = THIS_MODULE;
> + err = cdev_add(hdev_cdev, devno, 1);
> + if (err) {
> + pr_err("habanalabs: Failed to add char device %s", name);

So #define pr_fmt can auto prefix these and this would be

pr_err("Failed to add char device %s\n", name);

missing terminating '\n' btw
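
To illustrate how the prefixing works, here is a userspace sketch. demo_pr_err() is a stand-in invented for this note that formats into a buffer; the kernel's pr_err() applies pr_fmt() in the same way but goes to the log.

```c
#include <stdio.h>
#include <string.h>

/* Defined once near the top of the file, before any logging call. */
#define pr_fmt(fmt) "habanalabs: " fmt

/*
 * Assumption: stand-in for pr_err() that writes into a caller buffer.
 * String literal concatenation glues the prefix onto the format.
 */
#define demo_pr_err(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), __VA_ARGS__)

static int demo_report(char *out, size_t len, const char *name)
{
	/* No "habanalabs: " in the call site, and the '\n' is present. */
	return demo_pr_err(out, len, "Failed to add char device %s\n", name);
}
```

Each call site then carries only the message itself, and the prefix cannot drift between messages.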

> + goto err_cdev_add;
> + }
> +
> + hdev->dev = device_create(hclass, NULL, devno, NULL, "%s", name);
> + if (IS_ERR(hdev->dev)) {
> + pr_err("habanalabs: Failed to create device %s\n", name);

And this would be:
pr_err("Failed to create device %s\n", name);


etc...

> +static int device_early_init(struct hl_device *hdev)
> +{
> + switch (hdev->asic_type) {
> + case ASIC_GOYA:
> + sprintf(hdev->asic_name, "GOYA");

strcpy, or perhaps better still, strlcpy

> +int hl_device_init(struct hl_device *hdev, struct class *hclass)
> +{
[]
> + dev_notice(hdev->dev,
> + "Successfully added device to habanalabs driver\n");

This is mostly aligned to open parenthesis, but perhaps
it could check with scripts/checkpatch.pl --strict and
see if you agree with anything it bleats.

> +int hl_poll_timeout_memory(struct hl_device *hdev, u64 addr,
> + u32 timeout_us, u32 *val)
> +{
> + /*
> +  * pReturnVal is defined as volatile because it points to HOST memory,
> +  * which is being written to by the device. Therefore, we can't use
> +  * locks to synchronize it and it is not a memory-mapped register space
> +  */
> + volatile u32 *pReturnVal = (volatile u32 *) addr;

It'd be nice to avoid hungarian and camelcase

> + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us);
> +
> + might_sleep();
> +
> + for (;;) {
> + *val = *pReturnVal;
> + if (*val)
> + break;
> + if (ktime_compare(ktime_get(), timeout) > 0) {
> + *val = *pReturnVal;
> + break;
> + }
> + usleep_range((100 >> 2) + 1, 100);
> + }
> +
> + return (*val ? 0 : -ETIMEDOUT);

Unnecessary parentheses

> diff --git a/drivers/misc/habanalabs/habanalabs_drv.c 
> b/drivers/misc/habanalabs/habanalabs_drv.c
[]
> +static struct pci_device_id ids[] = {
> + { PCI_DEVICE(PCI_VENDOR_ID_HABANALABS, PCI_IDS_GOYA), },
> + { 0, }
> +};

static const?

> diff --git a/drivers/misc/habanalabs/include/habanalabs_device_if.h 
> b/drivers/misc/habanalabs/include/habanalabs_device_if.h
[]
> +struct hl_bd {
> + __u64   ptr;
> + __u32   len;
> + union {
> + struct {
> + __u32   repeat:16;
> + __u32   res1:8;
> + __u32   repeat_valid:1;
> + __u32   res2:7;
> + };
> + __u32   ctl;
> + };
> +};

Maybe use the appropriate little-endian __le types instead of __u,
with whatever cpu_to_le / le_to_cpu conversions are necessary.
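
A rough userspace sketch of that idea follows. The mask names and the bit positions are assumptions read off the quoted bitfield order as a little-endian compiler would lay it out (repeat in bits 0-15, repeat_valid at bit 24); the device's real layout is not confirmed by this thread.

```c
#include <stdint.h>

/* Assumed positions, derived from the quoted bitfield order. */
#define DEMO_BD_REPEAT_MASK	0x0000ffffu
#define DEMO_BD_REPEAT_VALID	(1u << 24)

/* Build the control word with explicit masks instead of bitfields. */
static uint32_t demo_bd_ctl(uint16_t repeat, int valid)
{
	uint32_t ctl = (uint32_t)repeat & DEMO_BD_REPEAT_MASK;

	if (valid)
		ctl |= DEMO_BD_REPEAT_VALID;
	return ctl;
}

/* Store a 32-bit value as little-endian bytes on any host. */
static void demo_put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Bitfield layout is chosen by the compiler and ABI, so explicit masks plus a fixed byte order keep the descriptor identical on big-endian and little-endian hosts.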

Re: [PATCH] KVM: VMX: Fix vm entry failure caused by invalid vmexit controls

2019-01-22 Thread Changbin Du

On Tue, Jan 22, 2019 at 09:00:45AM -0800, Sean Christopherson wrote:
> On Tue, Jan 22, 2019 at 11:29:52PM +0800, Changbin Du wrote:
> > The commit c73da3f ("KVM: VMX: Properly handle dynamic VM Entry/Exit
> > controls") has a typo that cause invalid vmexit controls. The
> > VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL is against _vmentry_control.
> > 
> > KVM: entry failed, hardware error 0x7
> > EAX= EBX= ECX= EDX=000206c2
> > ESI= EDI= EBP= ESP=
> > EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
> > ES =   9300
> > CS =f000   9b00
> > SS =   9300
> > DS =   9300
> > FS =   9300
> > GS =   9300
> > LDT=   8200
> > TR =   8b00
> > GDT=  
> > IDT=  
> > CR0=6010 CR2= CR3= CR4=
> > DR0= DR1= DR2=
> > DR3= DR6=0ff0 DR7=0400
> > EFER=
> > 
> > Fixes: c73da3f ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls")
> > Signed-off-by: Changbin Du 
> 
> Patch already submitted[1].
> 
> Paolo/Radim, the VM-Exit fix needs to be queued asap.  The fix for the
> objtool warning[2] should also go into v5.0.
> 
Seconded. This bug breaks KVM on some old machines!

> [1] https://patchwork.kernel.org/patch/10763351/
> [2] https://patchwork.kernel.org/patch/10765309/
> 
> 
> 
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index f6915f10e584..0762fcab8fc9 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -2344,7 +2344,7 @@ static __init int setup_vmcs_config(struct 
> > vmcs_config *vmcs_conf,
> > case 37: /* AAT100 */
> > case 44: /* BC86,AAY89,BD102 */
> > case 46: /* BA97 */
> > -   _vmexit_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> > +   _vmentry_control &= 
> > ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> > _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> > pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> > "does not work properly. Using 
> > workaround\n");
> > -- 
> > 2.19.1
> > 

-- 
Cheers,
Changbin Du
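
The effect of the typo can be sketched in userspace C. The DEMO_* bit values and struct demo_ctrls below are illustrative placeholders for this note, not the definitions from the kernel headers.

```c
#include <stdint.h>

/* Placeholder bit values, chosen only so the two bits differ. */
#define DEMO_VM_ENTRY_LOAD_PERF_GLOBAL_CTRL	0x00002000u
#define DEMO_VM_EXIT_LOAD_PERF_GLOBAL_CTRL	0x00001000u

struct demo_ctrls {
	uint32_t entry;
	uint32_t exit;
};

/* Buggy shape: the entry bit is cleared in the exit controls. */
static void demo_workaround_buggy(struct demo_ctrls *c)
{
	c->exit &= ~DEMO_VM_ENTRY_LOAD_PERF_GLOBAL_CTRL;
	c->exit &= ~DEMO_VM_EXIT_LOAD_PERF_GLOBAL_CTRL;
}

/* Fixed shape: each bit is cleared in the control word it belongs to. */
static void demo_workaround_fixed(struct demo_ctrls *c)
{
	c->entry &= ~DEMO_VM_ENTRY_LOAD_PERF_GLOBAL_CTRL;
	c->exit &= ~DEMO_VM_EXIT_LOAD_PERF_GLOBAL_CTRL;
}
```

In the buggy shape the entry control keeps the bit that was meant to be disabled, and the exit control loses whatever unrelated feature happens to share that bit position.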

Re: [PATCH v3 02/10] arm64: dts: qcom: sdm845: Define rmtfs memory

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 15:26 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Mon, Jan 21, 2019 at 9:51 PM Bjorn Andersson
>  wrote:
> >
> > Define the rmtfs memory node, as described in version 10 of the memory
> > map.
> >
> > Signed-off-by: Bjorn Andersson 
> > ---
> >
> > Changes since v2:
> > - New patch
> >
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 +
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index cdcac3704c13..64f57cc5c61a 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -72,6 +72,15 @@
> > #size-cells = <2>;
> > ranges;
> >
> > +   rmtfs@85d0 {
> > +   compatible = "qcom,rmtfs-mem";
> > +   reg = <0 0x85d0 0 0x20>;
> > +   no-map;
> > +
> > +   qcom,client-id = <1>;
> > +   qcom,vmid = <15>;
> > +   };
> 
> Ah, I saw this after I posted my comments to patch #1.  I guess this
> is the same as this node we have in our cheza board file downstream
> (need to get that posted upstream soon):
> 
> rmtfs@88f0 {
>   compatible = "qcom,rmtfs-mem";
>   reg = <0x0 0x88f0 0x0 0x80>;
>   no-map;
> 
>   qcom,client-id = <1>;
> };
> 
> That brings up a few things:
> 
> 1. You should add a node label here.  This allows us to act on the
> node more easily from board files, like setting it to disabled or
> changing it.
> 

I'll make sure to label it.

> 2. In https://crrev.com/c/1119572, the argument was made that the size
> of this carveout is board-specific.  That makes it hard to put it in
> sdm845.dts.
> 

I don't think I've seen a modern platform where this isn't 2MB, so I
think it's safe to add it to the platform. But a label sounds good, if
someone out there has a custom modem firmware with some odd changes in
this area.

I'll label it, to make it possible to move, resize or reclaim in boards.

Regards,
Bjorn

Re: [PATCH 1/2] thermal/int340x_thermal: Add additional UUIDs

2019-01-22 Thread Joel Stanley

Hello Rui,

On Tue, 4 Dec 2018 at 02:12, Zhang Rui  wrote:
> On Wed, 2018-10-10 at 01:30 -0700, Matthew Garrett wrote:
> > Platforms support more DPTF policies than the driver currently
> > exposes.
> > Add them. This effectively reverts
> > 31908f45a583e8f21db37f402b6e8d5739945afd which removed several UUIDs
> > without explaining why.
> >
> I'm going to apply this patch series, just with two minor changes,
> 1. 31908f45a583e8f21db37f402b6e8d5739945afd does not follow the git
> commit description style 'commit <12+ chars of sha1> ("")'
> 2. the UUIDs were removed previously because these policies were not
> used.

Which tree did this series get applied to?

Cheers,

Joel

>
> thanks,
> rui
>
> > Signed-off-by: Matthew Garrett 
> > Cc: Zhang Rui 
> > Cc: Nisha Aram 
> > ---
> >  drivers/thermal/int340x_thermal/int3400_thermal.c | 14
> > ++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/drivers/thermal/int340x_thermal/int3400_thermal.c
> > b/drivers/thermal/int340x_thermal/int3400_thermal.c
> > index e26b01c05e82..51c9097eaf7a 100644
> > --- a/drivers/thermal/int340x_thermal/int3400_thermal.c
> > +++ b/drivers/thermal/int340x_thermal/int3400_thermal.c
> > @@ -22,6 +22,13 @@ enum int3400_thermal_uuid {
> >   INT3400_THERMAL_PASSIVE_1,
> >   INT3400_THERMAL_ACTIVE,
> >   INT3400_THERMAL_CRITICAL,
> > + INT3400_THERMAL_ADAPTIVE_PERFORMANCE,
> > + INT3400_THERMAL_EMERGENCY_CALL_MODE,
> > + INT3400_THERMAL_PASSIVE_2,
> > + INT3400_THERMAL_POWER_BOSS,
> > + INT3400_THERMAL_VIRTUAL_SENSOR,
> > + INT3400_THERMAL_COOLING_MODE,
> > + INT3400_THERMAL_HARDWARE_DUTY_CYCLING,
> >   INT3400_THERMAL_MAXIMUM_UUID,
> >  };
> >
> > @@ -29,6 +36,13 @@ static char
> > *int3400_thermal_uuids[INT3400_THERMAL_MAXIMUM_UUID] = {
> >   "42A441D6-AE6A-462b-A84B-4A8CE79027D3",
> >   "3A95C389-E4B8-4629-A526-C52C88626BAE",
> >   "97C68AE7-15FA-499c-B8C9-5DA81D606E0A",
> > + "63BE270F-1C11-48FD-A6F7-3AF253FF3E2D",
> > + "5349962F-71E6-431D-9AE8-0A635B710AEE",
> > + "9E04115A-AE87-4D1C-9500-0F3E340BFE75",
> > + "F5A35014-C209-46A4-993A-EB56DE7530A1",
> > + "6ED722A7-9240-48A5-B479-31EEF723D7CF",
> > + "16CAF1B7-DD38-40ED-B1C1-1B8A1913D531",
> > + "BE84BABF-C4D4-403D-B495-3128FD44dAC1",
> >  };
> >
> >  struct int3400_thermal_priv {

Re: [PATCH] workqueue: Try to catch flush_work() without INIT_WORK().

2019-01-22 Thread Tetsuo Handa

Daniel Jordan wrote:
> On Sat, Jan 19, 2019 at 11:41:22AM +0900, Tetsuo Handa wrote:
> > On 2019/01/19 4:48, Daniel Jordan wrote:
> > > On Sat, Jan 19, 2019 at 02:04:58AM +0900, Tetsuo Handa wrote:
> > > __queue_work has a sanity check already for work, but using list_empty.  
> > > Seems
> > > slightly better to be consistent?
> > > 
> > 
> > list_empty() won't work, for "struct work_struct" is embedded into a struct
> > which is allocated by kzalloc().
> 
> Please check list_empty's definition again, it compares the address of the 
> node
> to its next pointer, so it should work for a zeroed node.  I'll reiterate that
> it seems slightly better to be consistent in "is work_struct initialized?"
> checks, but it's not a big deal and I'm fine either way.

You are talking about

	if (WARN_ON(!list_empty(&work->entry))) {
		spin_unlock(&pwq->pool->lock);
		return;
	}

part in __queue_work(), aren't you? But since flush_work() is used for waiting
for a work to complete, that work can be either in the queued state
(list_empty() == false) or in the not-queued state (list_empty() == true).
Thus, I don't think that flush_work() can use list_empty() for checking
whether that work was initialized.
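
To make the list_empty() point concrete, here is a userspace sketch. The demo_* names are simplified stand-ins for the kernel's list helpers, written for this illustration only.

```c
#include <string.h>

/* Assumption: minimal stand-in for the kernel's struct list_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* A list is empty when the head points back at itself. */
static int demo_list_empty(const struct demo_list_head *head)
{
	return head->next == head;
}

static void demo_init_list_head(struct demo_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Mimic a node that came from kzalloc() and was never initialized. */
static void demo_zero_node(struct demo_list_head *head)
{
	memset(head, 0, sizeof(*head));
}
```

A zeroed node has next == NULL, so it reads as "not empty", which is the same answer a properly initialized and queued work gives. In flush_work() that answer is legitimate for a queued work, so it cannot be used there to tell the two cases apart.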

[PATCH v2] workqueue: Try to catch flush_work() without INIT_WORK().

syzbot found a flush_work() caller who forgot to call INIT_WORK()
because that work_struct was allocated by kzalloc() [1]. But the message

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.

by lock_map_acquire() is failing to tell that INIT_WORK() is missing.

Since flush_work() without INIT_WORK() is a bug, and INIT_WORK() should
set ->func field to non-zero, let's warn if ->func field is zero.

[1] 
https://syzkaller.appspot.com/bug?id=a5954455fcfa51c29ca2ab55b203076337e1c770

Signed-off-by: Tetsuo Handa 
---
 kernel/workqueue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 392be4b..a503ad9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2908,6 +2908,9 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	if (WARN_ON(!wq_online))
 		return false;
 
+	if (WARN_ON(!work->func))
+		return false;
+
 	if (!from_cancel) {
 		lock_map_acquire(&work->lockdep_map);
 		lock_map_release(&work->lockdep_map);
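
The check added above can be sketched in userspace C as follows. struct demo_work and demo_flush() are hypothetical, and the sketch assumes, as on mainstream platforms, that zero-filled memory reads back as a NULL function pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical work item: only the callback field matters here. */
struct demo_work {
	void (*func)(struct demo_work *work);
};

static void demo_handler(struct demo_work *work)
{
	(void)work;
}

/* Stand-in for INIT_WORK(): the callback becomes non-NULL. */
static void demo_init_work(struct demo_work *work)
{
	work->func = demo_handler;
}

/* Refuse to flush a work whose callback was never set. Returns 1 if allowed. */
static int demo_flush(const struct demo_work *work)
{
	if (!work->func)
		return 0;
	return 1;
}
```

A work allocated with calloc() and never initialized is rejected, while the same object passes once demo_init_work() has run, regardless of whether it is currently queued.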

Re: [PATCH v3 01/10] arm64: dts: qcom: sdm845: Update PIL region memory map

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 15:16 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson
>  wrote:
> >
> > Update existing and add all missing PIL regions to the reserved memory
> > map, as described in version 10.
> >
> > Signed-off-by: Bjorn Andersson 
> > ---
> >
> > Changes since v2:
> > - New patch
> >
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 61 ++--
> >  1 file changed, 58 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index 0ec827394e92..cdcac3704c13 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -89,12 +89,47 @@
> > };
> >
> > memory@8620 {
> > -   reg = <0 0x8620 0 0x2d0>;
> > +   reg = <0 0x8620 0 0x10>;
> > no-map;
> > };
> >
> > -   wlan_msa_mem: memory@9670 {
> > -   reg = <0 0x9670 0 0x10>;
> > +   memory@8630 {
> > +   reg = <0 0x8630 0 0x480>;
> > +   no-map;
> > +   };
> 
> I know it's not a problem upstream (yet), but downstream this collides
> with a memory region in the cheza board.  We have:
> 
> rmtfs@88f0 {
>   compatible = "qcom,rmtfs-mem";
>   reg = <0x0 0x88f0 0x0 0x80>;
>   no-map;
> 
>   qcom,client-id = <1>;
> };
> 
> ...and the above region overlays it since it goes till 0x8ab0
> 

Digging through the table again I see that there's another level here,
so it seems only the first 44MB of these 78MB are reserved for non-APSS
things. So this should actually be 0x2c0 long.

I will update this and we'll have one conflict less.
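
As an aside, this kind of collision is easy to check mechanically. The sketch below is a generic half-open interval test in userspace C; the values in the usage note are made up for illustration and are not taken from the memory map.

```c
#include <stdint.h>

/*
 * Two regions [start, start + size) overlap when each one starts
 * before the other one ends.
 */
static int demo_regions_overlap(uint64_t a_start, uint64_t a_size,
				uint64_t b_start, uint64_t b_size)
{
	return a_start < b_start + b_size && b_start < a_start + a_size;
}
```

For example, a region at 0x1000 of size 0x800 overlaps one at 0x1400 of size 0x100, and shrinking the first to 0x300 removes the conflict, which is the same shape as shortening the reservation here.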

> 
> > +
> > +   memory@8ab0 {
> > +   reg = <0 0x8ab0 0 0x140>;
> > +   no-map;
> > +   };
> > +
> > +   memory@8bf0 {
> > +   reg = <0 0x8bf0 0 0x50>;
> > +   no-map;
> > +   };
> > +
> > +   ipa_fw_mem: memory@8c40 {
> > +   reg = <0 0x8c40 0 0x1>;
> > +   no-map;
> > +   };
> > +
> > +   ipa_gsi_mem: memory@8c41 {
> > +   reg = <0 0x8c41 0 0x5000>;
> > +   no-map;
> > +   };
> > +
> > +   memory@8c415000 {
> > +   reg = <0 0x8c415000 0 0x2000>;
> > +   no-map;
> > +   };
> > +
> > +   adsp_mem: memory@8c50 {
> > +   reg = <0 0x8c50 0 0x1a0>;
> > +   no-map;
> > +   };
> > +
> > +   wlan_msa_mem: memory@8df0 {
> 
> Your patch moves 'wlan_msa_mem' from 0x9670 to 0x8df0.  Is
> that OK?  I haven't been involved in all of the previous discussions
> but if everything is all OK w/ the device tree just moving this chunk
> around (without any other coordination w/ firmware) it seems really
> weird that we even need to specify it in the device tree.  ...but
> maybe I shouldn't open this can of worms.  You can pretend I didn't
> say anything.
> 

0x9670 seems to be reserved for the sensor core, so either WiFi
wasn't actually tested before, or more likely its firmware is position
independent.

Most (all?) firmware is position independent, but the security
configuration prevents us from relocating it. One such example is that
the ADSP in the newer firmware versions is not allowed to execute from
the old memory region.

Regards,
Bjorn

Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes

2019-01-22 Thread Doug Anderson

Hi,

On Tue, Jan 22, 2019 at 4:26 PM Bjorn Andersson
 wrote:
> > > +   clocks = <&xo_board>;
> > > +   clock-names = "xo";
> >
> > I've found that nearly all the places that refer to xo_board are wrong
> > and should actually point to '< RPMH_CXO_CLK>'.  Maybe yours
> > should too?
> >
>
> Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the
> cxo (or cxo2) pad of the SoC. So you're definitely right in that this
> should be referencing the actual 19.2MHz clock.
>
> We've kept referring to this as xo_board, as we don't handle probe
> deferral when gcc will probe earlier than rpmcc in the boot and for
> other non-clock drivers the fear of actually hitting 0 on the refcounter
> for this (you don't want to disable the cxo while running the system).

Note that, as defined in the device tree, "xo_board" is actually 38.4.
IIUC that is not actually a fake/bogus clock but represents the actual
crystal on the board.  There's a divide by 2 in the CPU though so most
peripherals consider "xo" as 19.2.

...OK, confirmed.  The actual RF_XO_CLK pin on the board is truly
connected to 38.4.

-Doug

[PATCH reset-next 1/2] reset: brcmstb: Make it tristate

2019-01-22 Thread Florian Fainelli

The driver can be built as a module just fine, so let's make it
selectable as such.

Reported-by: Paul Gortmaker 
Signed-off-by: Florian Fainelli 
---
 drivers/reset/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/reset/Kconfig b/drivers/reset/Kconfig
index 1ca03c57e049..d9a02b7f90cf 100644
--- a/drivers/reset/Kconfig
+++ b/drivers/reset/Kconfig
@@ -41,7 +41,8 @@ config RESET_BERLIN
  This enables the reset controller driver for Marvell Berlin SoCs.
 
 config RESET_BRCMSTB
-   bool "Broadcom STB reset controller" if COMPILE_TEST
+   tristate "Broadcom STB reset controller"
+   depends on ARCH_BRCMSTB || COMPILE_TEST
default ARCH_BRCMSTB
help
  This enables the reset controller driver for Broadcom STB SoCs using
-- 
2.17.1

[PATCH reset-next 2/2] reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t

2019-01-22 Thread Florian Fainelli

On 32-bit architectures defining resource_size_t as 64-bit (because of
PAE), we can run into a linker failure because of the modulo and the
division against resource_size(), replace the two problematic operations
with an alignment check on the register resource (instead of modulo),
and the division with DIV_ROUND_CLOSEST_ULL().

Reported-by: Randy Dunlap 
Fixes: c196cdc7659d ("reset: Add Broadcom STB SW_INIT reset controller driver")
Signed-off-by: Florian Fainelli 
---
 drivers/reset/reset-brcmstb.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/reset/reset-brcmstb.c b/drivers/reset/reset-brcmstb.c
index 01ab1f71518b..c4cab8b5052d 100644
--- a/drivers/reset/reset-brcmstb.c
+++ b/drivers/reset/reset-brcmstb.c
@@ -91,7 +91,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev)
return -ENOMEM;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   if (resource_size(res) % SW_INIT_BANK_SIZE) {
+   if (!IS_ALIGNED(res->start, SW_INIT_BANK_SIZE) ||
+   !IS_AGLINED(resource_size(res), SW_INIT_BANK_SIZE)) {
dev_err(kdev, "incorrect register range\n");
return -EINVAL;
}
@@ -103,7 +104,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev)
dev_set_drvdata(kdev, priv);
 
priv->rcdev.owner = THIS_MODULE;
-   priv->rcdev.nr_resets = (resource_size(res) / SW_INIT_BANK_SIZE) * 32;
+   priv->rcdev.nr_resets = DIV_ROUND_CLOSEST_ULL(resource_size(res),
+ SW_INIT_BANK_SIZE) * 32;
priv->rcdev.ops = _reset_ops;
priv->rcdev.of_node = kdev->of_node;
/* Use defaults: 1 cell and simple xlate function */
-- 
2.17.1
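
For context, a mask-based alignment test avoids the 64-bit modulo entirely, but it is only equivalent to the modulo when the alignment is a power of two. The sketch below uses a hypothetical power-of-two bank size and a plain shift in place of the division (a swapped-in technique for illustration, not what DIV_ROUND_CLOSEST_ULL() does); the value of SW_INIT_BANK_SIZE is not shown in this thread.

```c
#include <stdint.h>

/* Same idea as a mask-based IS_ALIGNED(): valid for power-of-two 'a' only. */
#define DEMO_IS_ALIGNED(x, a) ((((uint64_t)(x)) & ((uint64_t)(a) - 1)) == 0)

/* Hypothetical bank size, chosen as a power of two for this sketch. */
#define DEMO_BANK_SHIFT	5
#define DEMO_BANK_SIZE	(1u << DEMO_BANK_SHIFT)

/* Both the start and the size of the range must be bank-aligned. */
static int demo_range_ok(uint64_t start, uint64_t size)
{
	return DEMO_IS_ALIGNED(start, DEMO_BANK_SIZE) &&
	       DEMO_IS_ALIGNED(size, DEMO_BANK_SIZE);
}

/* A shift replaces the 64-bit division, so no libgcc helper is needed. */
static uint32_t demo_nr_resets(uint64_t size)
{
	return (uint32_t)(size >> DEMO_BANK_SHIFT) * 32;
}
```

Note that for a non-power-of-two alignment such as 0x18, the mask test rejects 0x30 even though 0x30 % 0x18 is 0, so the two checks are not interchangeable in that case.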

[PATCH reset-next 0/2] reset: brcmstb: Misc fixes

2019-01-22 Thread Florian Fainelli

Hi Philipp,

These two patches fix some recent issues brought up by Paul and Randy,
feel free to squash into c196cdc7659d ("reset: Add Broadcom STB SW_INIT
reset controller driver") since this is only in reset/next and
linux-next so far.

Thank you!

Florian Fainelli (2):
  reset: brcmstb: Make it tristate
  reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t

 drivers/reset/Kconfig | 3 ++-
 drivers/reset/reset-brcmstb.c | 6 --
 2 files changed, 6 insertions(+), 3 deletions(-)

-- 
2.17.1

Re: [PATCH] block: aoe: no need to check return value of debugfs_create functions

2019-01-22 Thread Ed Cashin

On Tue, Jan 22, 2019 at 6:29 PM Omar Sandoval  wrote:
...
> Now entry is uninitialized here when we assign it to d->debugfs.

Thanks for noticing that.

Re: [PATCH v3 01/10] arm64: dts: qcom: sdm845: Update PIL region memory map

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 15:10 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Tue, Jan 22, 2019 at 11:24 AM Bjorn Andersson
>  wrote:
> >
> > On Tue 22 Jan 10:58 PST 2019, Stephen Boyd wrote:
> >
> > > Quoting Bjorn Andersson (2019-01-21 21:51:03)
> > > > @@ -103,10 +138,30 @@
> > > > no-map;
> > > > };
> > > >
> > > > +   venus_mem: memory@9580 {
> > > > +   reg = <0 0x9580 0 0x50>;
> > > > +   no-map;
> > > > +   };
> > > > +
> > > > +   cdsp_mem: memory@95d0 {
> > > > +   reg = <0 0x95d0 0 0x80>;
> > > > +   no-map;
> > > > +   };
> > > > +
> > > > mba_region: memory@9650 {
> > > > reg = <0 0x9650 0 0x20>;
> > > > no-map;
> > > > };
> > > > +
> > > > +   slpi_mem: memory@9670 {
> > > > +   reg = <0 0x9670 0 0x140>;
> > > > +   no-map;
> > > > +   };
> > > > +
> > > > +   spss_mem: memory@97b0 {
> > > > +   reg = <0 0x97b0 0 0x10>;
> > > > +   no-map;
> > > > +   };
> > > > };
> > > >
> > >
> > > What's the plan if certain configurations don't use all these carveouts?
> > > Can we mark the reservation nodes as status = "disabled", or the reverse
> > > and mark them as status = "ok" in all boards, and then reclaim the
> > > memory for peripherals we don't care to use?
> > >
> >
> > The code path that picks these up does look for "status", so I suggest
> > that we leave them all enabled in the platform dtsi and then let the
> > device's reclaim them as needed.
> 
> Does that mean we should add labels for all of the sub-nodes so that
> boards can easily mark them "disabled"?
> 

That sounds reasonable, I'll dig up some labels for the unlabeled nodes
as well.

Thanks,
Bjorn

Re: [PATCH v3 10/10] arm64: dts: qcom: sdm845: Add Q6V5 MSS node

2019-01-22 Thread Doug Anderson

Hi,

On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson
 wrote:
>
> From: Sibi Sankar 
>
> This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs.
>
> Signed-off-by: Sibi Sankar 
> Reviewed-by: Douglas Anderson 
> Signed-off-by: Bjorn Andersson 
> ---
>
> Changes since v2:
> - Picked up Sibi's patch
> - Fixed reg to work with address/size-cells as 2
>
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 
>  1 file changed, 58 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index 5cc2615461da..78df5f1bce2d 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -1617,6 +1617,64 @@
> clock-names = "xo";
> };
>
> +   mss_pil: remoteproc@408 {
> +   compatible = "qcom,sdm845-mss-pil";
> +   reg = <0 0x0408 0 0x408>, <0 0x0418 0 0x48>;
> +   reg-names = "qdsp6", "rmb";
> +
> +   interrupts-extended =
> +   < GIC_SPI 266 IRQ_TYPE_EDGE_RISING>,
> +   <_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
> +   <_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
> +   <_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
> +   <_smp2p_in 3 IRQ_TYPE_EDGE_RISING>,
> +   <_smp2p_in 7 IRQ_TYPE_EDGE_RISING>;
> +   interrupt-names = "wdog", "fatal", "ready",
> + "handover", "stop-ack",
> + "shutdown-ack";
> +
> +   clocks = < GCC_MSS_CFG_AHB_CLK>,
> +< GCC_MSS_Q6_MEMNOC_AXI_CLK>,
> +< GCC_BOOT_ROM_AHB_CLK>,
> +< GCC_MSS_GPLL0_DIV_CLK_SRC>,
> +< GCC_MSS_SNOC_AXI_CLK>,
> +< GCC_MSS_MFAB_AXIS_CLK>,
> +< GCC_PRNG_AHB_CLK>,
> +< RPMH_CXO_CLK>;
> +   clock-names = "iface", "bus", "mem", "gpll0_mss",
> + "snoc_axi", "mnoc_axi", "prng", "xo";
> +
> +   qcom,smem-states = <_smp2p_out 0>;
> +   qcom,smem-state-names = "stop";
> +
> +   resets = <_reset AOSS_CC_MSS_RESTART>,
> +<_reset PDC_MODEM_SYNC_RESET>;
> +   reset-names = "mss_restart", "pdc_reset";
> +
> +   qcom,halt-regs = <_mutex_regs 0x23000 0x25000 
> 0x24000>;
> +
> +   power-domains = <_qmp AOSS_QMP_LS_MODEM>,
> +   < SDM845_CX>,
> +   < SDM845_MX>,
> +   < SDM845_MSS>;
> +   power-domain-names = "load_state", "cx", "mx", "mss";
> +
> +   mba {
> +   memory-region = <_region>;
> +   };
> +
> +   mpss {
> +   memory-region = <_region>;
> +   };
> +
> +   glink-edge {
> +   interrupts =  IRQ_TYPE_EDGE_RISING>;
> +   label = "modem";
> +   qcom,remote-pid = <1>;
> +   mboxes = <_shared 12>;
> +   };
> +   };
> +
> sdhc_2: sdhci@8804000 {

Can you please sort by unit address now that you have a device tree
that has more stuff?

-Doug

[PATCH] drm/vmwgfx: Replace PTR_RET with PTR_ERR_OR_ZERO

2019-01-22 Thread Gustavo A. R. Silva

PTR_RET is deprecated and will be removed soon.

Use PTR_ERR_OR_ZERO instead.

Notice that these are the last instances of PTR_RET in the
whole codebase.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index f2d13a72c05d..137cb1a4d6b0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2521,7 +2521,7 @@ static int vmw_cmd_dx_clear_rendertarget_view(struct 
vmw_private *dev_priv,
SVGA3dCmdDXClearRenderTargetView body;
} *cmd = container_of(header, typeof(*cmd), header);
 
-   return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_rt,
+   return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_rt,
   cmd->body.renderTargetViewId));
 }
 
@@ -2542,7 +2542,7 @@ static int vmw_cmd_dx_clear_depthstencil_view(struct 
vmw_private *dev_priv,
SVGA3dCmdDXClearDepthStencilView body;
} *cmd = container_of(header, typeof(*cmd), header);
 
-   return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_ds,
+   return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_ds,
   cmd->body.depthStencilViewId));
 }
 
@@ -2916,7 +2916,7 @@ static int vmw_cmd_dx_genmips(struct vmw_private 
*dev_priv,
SVGA3dCmdDXGenMips body;
} *cmd = container_of(header, typeof(*cmd), header);
 
-   return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_sr,
+   return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_sr,
   cmd->body.shaderResourceViewId));
 }
 
-- 
2.20.1

Re: [PATCH 5/8 v4] dma: k3dma: Add support for dma-channel-mask

2019-01-22 Thread John Stultz

On Thu, Jan 17, 2019 at 9:14 AM Manivannan Sadhasivam
 wrote:
>
> /* Skip the channels which are masked */
> if ((d->dma_channel_mask) & BIT(pch))
> continue;

Per the discussion w/ Vinod and Rob, I think I'll leave this bit be,
so we use the channels in the bitmask.

> PS: Use BIT() macro where applicable.

But this suggestion I've integrated for v5.

Thanks so much for the review!
-john
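
A small userspace sketch of the semantics described above, where set bits mark channels that may be used. DEMO_BIT() and demo_first_usable() are hypothetical names for illustration; the driver's actual loop is not shown in this thread.

```c
/* Stand-in for the kernel's BIT() macro. */
#define DEMO_BIT(n) (1UL << (n))

/*
 * Return the first physical channel whose bit is set in the mask,
 * or -1 if the mask allows none of the first nr_channels channels.
 */
static int demo_first_usable(unsigned long mask, unsigned int nr_channels)
{
	unsigned int pch;

	for (pch = 0; pch < nr_channels; pch++) {
		/* Skip channels that are not listed in the mask. */
		if (!(mask & DEMO_BIT(pch)))
			continue;
		return (int)pch;
	}
	return -1;
}
```

With a mask of 0x0c and eight channels, channels 0 and 1 are skipped and channel 2 is the first one handed out.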

Re: [PATCH 7/8] infiniband: usnic: no need to check return value of debugfs_create functions

2019-01-22 Thread Parvi Kaustubhi (pkaustub)

usnic driver was tested with this change.

Acked-by: Parvi Kaustubhi 

> On Jan 22, 2019, at 7:17 AM, Greg Kroah-Hartman  
> wrote:
> 
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: Christian Benvenuti 
> Cc: Nelson Escobar 
> Cc: Parvi Kaustubhi 
> Cc: Doug Ledford 
> Cc: Jason Gunthorpe 
> Cc: linux-r...@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman 
> ---
> drivers/infiniband/hw/usnic/usnic_debugfs.c | 26 -
> 1 file changed, 26 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_debugfs.c 
> b/drivers/infiniband/hw/usnic/usnic_debugfs.c
> index a3115709fb03..e5a3f02fb078 100644
> --- a/drivers/infiniband/hw/usnic/usnic_debugfs.c
> +++ b/drivers/infiniband/hw/usnic/usnic_debugfs.c
> @@ -113,42 +113,21 @@ static const struct file_operations flowinfo_ops = {
> void usnic_debugfs_init(void)
> {
>   debugfs_root = debugfs_create_dir(DRV_NAME, NULL);
> - if (IS_ERR(debugfs_root)) {
> - usnic_err("Failed to create debugfs root dir, check if debugfs 
> is enabled in kernel configuration\n");
> - goto out_clear_root;
> - }
> 
>   flows_dentry = debugfs_create_dir("flows", debugfs_root);
> - if (IS_ERR_OR_NULL(flows_dentry)) {
> - usnic_err("Failed to create debugfs flow dir with err %ld\n",
> - PTR_ERR(flows_dentry));
> - goto out_free_root;
> - }
> 
>   debugfs_create_file("build-info", S_IRUGO, debugfs_root,
>   NULL, _debugfs_buildinfo_ops);
> - return;
> -
> -out_free_root:
> - debugfs_remove_recursive(debugfs_root);
> -out_clear_root:
> - debugfs_root = NULL;
> }
> 
> void usnic_debugfs_exit(void)
> {
> - if (!debugfs_root)
> - return;
> -
>   debugfs_remove_recursive(debugfs_root);
>   debugfs_root = NULL;
> }
> 
> void usnic_debugfs_flow_add(struct usnic_ib_qp_grp_flow *qp_flow)
> {
> - if (IS_ERR_OR_NULL(flows_dentry))
> - return;
> -
>   scnprintf(qp_flow->dentry_name, sizeof(qp_flow->dentry_name),
>   "%u", qp_flow->flow->flow_id);
>   qp_flow->dbgfs_dentry = debugfs_create_file(qp_flow->dentry_name,
> @@ -156,11 +135,6 @@ void usnic_debugfs_flow_add(struct usnic_ib_qp_grp_flow 
> *qp_flow)
>   flows_dentry,
>   qp_flow,
>   _ops);
> - if (IS_ERR_OR_NULL(qp_flow->dbgfs_dentry)) {
> - usnic_err("Failed to create dbg fs entry for flow %u with error 
> %ld\n",
> - qp_flow->flow->flow_id,
> - PTR_ERR(qp_flow->dbgfs_dentry));
> - }
> }
> 
> void usnic_debugfs_flow_remove(struct usnic_ib_qp_grp_flow *qp_flow)
> -- 
> 2.20.1
>

Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes

2019-01-22 Thread Bjorn Andersson

On Tue 22 Jan 15:46 PST 2019, Doug Anderson wrote:

> Hi,
> 
> On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson wrote:
> >
> > Add the ADSP and CDSP nodes for PAS-based remoteproc, supporting booting
> > these cores on e.g. the MTP, and enable the same for the MTP.
> >
> > Signed-off-by: Bjorn Andersson 
> > ---
> >
> > Changes since v2:
> > - New patch
> >
> >  arch/arm64/boot/dts/qcom/sdm845-mtp.dts |  8 
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi| 58 +
> >  2 files changed, 66 insertions(+)
> 
> It's a bit of a nit of mine that if it's not totally obvious what
> acronyms mean that they should be spelled out in places that use them.
> 
> In this case I believe ADSP is the Audio DSP.  Is CDSP the Camera DSP?  ...or ?
> 

C as in Compute. I'll spell these out as I respin the series.

> 
> > +   adsp_pas: remoteproc-adsp {
> > +   compatible = "qcom,sdm845-adsp-pas";
> > +
> > +   interrupts-extended = < GIC_SPI 162 
> > IRQ_TYPE_EDGE_RISING>,
> > + <_smp2p_in 0 
> > IRQ_TYPE_EDGE_RISING>,
> > + <_smp2p_in 1 
> > IRQ_TYPE_EDGE_RISING>,
> > + <_smp2p_in 2 
> > IRQ_TYPE_EDGE_RISING>,
> > + <_smp2p_in 3 
> > IRQ_TYPE_EDGE_RISING>;
> > +   interrupt-names = "wdog", "fatal", "ready",
> > + "handover", "stop-ack";
> > +
> > +   clocks = <_board>;
> > +   clock-names = "xo";
> 
> I've found that nearly all the places that refer to xo_board are wrong
> and should actually point to '< RPMH_CXO_CLK>'.  Maybe yours
> should too?
> 

Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the
cxo (or cxo2) pad of the SoC. So you're definitely right in that this
should be referencing the actual 19.2MHz clock.

We've kept referring to this as xo_board for two reasons: we don't handle
probe deferral when gcc probes earlier than rpmcc during boot, and for
other non-clock drivers there is the fear of actually hitting 0 on the
refcount for this clock (you don't want to disable the cxo while the
system is running).

I'll give it a spin with the appropriate reference and see what happens. I
think this should either be changed or documented in the commit message.

Thanks,
Bjorn

Re: [PATCH v8 11/11] media: imx.rst: Update doc to reflect fixes to interlaced capture

2019-01-22 Thread Steve Longerbeam





On 1/22/19 11:51 AM, Tim Harvey wrote:

On Mon, Jan 21, 2019 at 12:24 PM Tim Harvey  wrote:

On Tue, Jan 15, 2019 at 3:54 PM Steve Longerbeam  wrote:

Hi Tim,

On 1/15/19 1:58 PM, Tim Harvey wrote:

On Wed, Jan 9, 2019 at 10:30 AM Steve Longerbeam  wrote:

Also add an example pipeline for unconverted capture with interweave
on SabreAuto.

Cleanup some language in various places in the process.

Signed-off-by: Steve Longerbeam 
Reviewed-by: Philipp Zabel 
---
Changes since v4:
- Make clear that it is IDMAC channel that does pixel reordering and
interweave, not the CSI. Caught by Philipp Zabel.
Changes since v3:
- none.
Changes since v2:
- expand on idmac interweave behavior in CSI subdev.
- switch second SabreAuto pipeline example to PAL to give
both NTSC and PAL examples.
- Cleanup some language in various places.
---
   Documentation/media/v4l-drivers/imx.rst | 103 +++-
   1 file changed, 66 insertions(+), 37 deletions(-)




   Capture Pipelines
   -----------------
@@ -516,10 +522,33 @@ On the SabreAuto, an on-board ADV7180 SD decoder is 
connected to the
   parallel bus input on the internal video mux to IPU1 CSI0.

   The following example configures a pipeline to capture from the ADV7180
-video decoder, assuming NTSC 720x480 input signals, with Motion
-Compensated de-interlacing. Pad field types assume the adv7180 outputs
-"interlaced". $outputfmt can be any format supported by the ipu1_ic_prpvf
-entity at its output pad:
+video decoder, assuming NTSC 720x480 input signals, using simple
+interweave (unconverted and without motion compensation). The adv7180
+must output sequential or alternating fields (field type 'seq-bt' for
+NTSC, or 'alternate'):
+
+.. code-block:: none
+
+   # Setup links
+   media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
+   media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+   media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+   # Configure pads
+   media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
+   media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
+   media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
+   # Configure "ipu1_csi0 capture" interface (assumed at /dev/video4)
+   v4l2-ctl -d4 --set-fmt-video=field=interlaced_bt
+
+Streaming can then begin on /dev/video4. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video4.
+

Hi Steve,

I'm testing 4.20 with this patchset on top.

I'm on a GW5104 which has an IMX6Q with the adv7180 on ipu1_csi0 like
the SabreAuto example above, but I can't get the simple interweave example
to work:

media-ctl -r # reset all links
# Setup links (ADV7180 IPU1_CSI0)
media-ctl -l '"adv7180 2-0020":0 -> "ipu1_csi0_mux":1[1]'
media-ctl -l '"ipu1_csi0_mux":2 -> "ipu1_csi0":0[1]'
media-ctl -l '"ipu1_csi0":2 -> "ipu1_csi0 capture":0[1]' # /dev/video4
# Configure pads
media-ctl -V "'adv7180 2-0020':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x480]"

This is the reason. The adv7180 only allows alternate field mode to be
configured, and thus it reports the field height on the mbus, not the
full frame height. Imx deals with alternate field mode by capturing a
full frame, so the CSI entity sets the output pad height to double the
field height.

So the CSI input pad needs to be configured with the field height:

media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x240]"

It should work for you after doing that. And better yet, don't bother
configuring the input pad, because media-ctl will propagate formats from
source to sink pads for you, so it's better to rely on the propagation,
and set the CSI output pad format instead (full frame height at output pad):

media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
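The height relationship described above can be sketched in C. This is only
an illustration of the rule as stated in this thread, not the imx-media
driver code; the enum and csi_source_pad_height() are hypothetical names,
and the only assumption modelled is that an 'alternate' sink reports the
field height, which the CSI doubles at its output pad.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two field types discussed above. */
enum sink_field {
	SINK_FIELD_SEQ_BT,
	SINK_FIELD_ALTERNATE,
};

/*
 * Height at the CSI output (source) pad for a given input (sink) pad
 * format. With 'alternate', the sink height is a single field, so the
 * full captured frame is twice as tall. Otherwise the sink height is
 * already the full frame height.
 */
unsigned int csi_source_pad_height(enum sink_field field,
				   unsigned int sink_height)
{
	assert(sink_height > 0);

	if (field == SINK_FIELD_ALTERNATE)
		return sink_height * 2;
	return sink_height;
}
```

For NTSC this gives a 720x240 sink pad and a 720x480 source pad in
alternate mode, which matches the two media-ctl commands above.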


Steve,

Thanks - that makes sense.

I also noticed that if I set up one of the vdic pipelines first, then
went back after a 'media-ctl -r' and set up the example that failed, it
no longer failed. I'm thinking that this is because 'media-ctl -r'
may reset all the links but does not reset all the V4L2 formats on
pads?


Final note: the imx.rst doc is technically correct even though it is
showing full frame heights being configured at the pads, because it is
expecting the adv7180 has accepted 'seq-bt'. But even the example given
in that doc works for alternate field mode, because the pad heights are
forced to the correct field height for alternate mode.


hmmm... I don't quite follow this statement. It sounds like the
example would only be correct if you were setting 'field:alternate'
but the example sets 'field:seq-bt' instead.

I wonder if you should add some verbiage explaining the difference in
format (resolution specifically) between the input and output pads
and/or change the example to set the output pad format so people don't
run into what I did trying to follow the example.


Steve,

I'm able to link a sensor->mux->csi->vdic->ic_prp->ic_prpenc but not a

Re: [PATCH] fail_function: no need to check return value of debugfs_create functions

2019-01-22 Thread Masami Hiramatsu

On Tue, 22 Jan 2019 16:21:44 +0100
Greg Kroah-Hartman  wrote:

> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.

Ah, OK. It simplifies the code. But I have a question below,

> 
> Cc: Masami Hiramatsu 
> Cc: Kees Cook 
> Cc: Josef Bacik 
> Cc: Thomas Gleixner 
> Cc: "Naveen N. Rao" 
> Cc: zhong jiang 
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  kernel/fail_function.c | 23 +--
>  1 file changed, 5 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/fail_function.c b/kernel/fail_function.c
> index 17f75b545f66..afc779be5ebb 100644
> --- a/kernel/fail_function.c
> +++ b/kernel/fail_function.c
> @@ -152,20 +152,13 @@ static int fei_retval_get(void *data, u64 *val)
>  DEFINE_DEBUGFS_ATTRIBUTE(fei_retval_ops, fei_retval_get, fei_retval_set,
>"%llx\n");
>  
> -static int fei_debugfs_add_attr(struct fei_attr *attr)
> +static void fei_debugfs_add_attr(struct fei_attr *attr)
>  {
>   struct dentry *dir;
>  
>   dir = debugfs_create_dir(attr->kp.symbol_name, fei_debugfs_dir);
> - if (!dir)
> - return -ENOMEM;
> -
> - if (!debugfs_create_file("retval", 0600, dir, attr, _retval_ops)) {
> - debugfs_remove_recursive(dir);
> - return -ENOMEM;
> - }
>  
> - return 0;

Don't we need to check dir here? If the debugfs_create_dir() call above
returns NULL, it seems we will create "retval" under the root directory of
debugfs.

Thank you,

> + debugfs_create_file("retval", 0600, dir, attr, _retval_ops);
>  }
>  
>  static void fei_debugfs_remove_attr(struct fei_attr *attr)
> @@ -306,7 +299,7 @@ static ssize_t fei_write(struct file *file, const char 
> __user *buffer,
>  
>   ret = register_kprobe(>kp);
>   if (!ret)
> - ret = fei_debugfs_add_attr(attr);
> + fei_debugfs_add_attr(attr);
>   if (ret < 0)
>   fei_attr_remove(attr);
>   else {
> @@ -337,19 +330,13 @@ static int __init fei_debugfs_init(void)
>   return PTR_ERR(dir);
>  
>   /* injectable attribute is just a symlink of error_inject/list */
> - if (!debugfs_create_symlink("injectable", dir,
> - "../error_injection/list"))
> - goto error;
> + debugfs_create_symlink("injectable", dir, "../error_injection/list");
>  
> - if (!debugfs_create_file("inject", 0600, dir, NULL, _ops))
> - goto error;
> + debugfs_create_file("inject", 0600, dir, NULL, _ops);
>  
>   fei_debugfs_dir = dir;
>  
>   return 0;
> -error:
> - debugfs_remove_recursive(dir);
> - return -ENOMEM;
>  }
>  
>  late_initcall(fei_debugfs_init);
> -- 
> 2.20.1
> 


-- 
Masami Hiramatsu
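
The concern raised above can be illustrated with a small userspace mock.
This is not the kernel debugfs API: struct mock_dentry, mock_create_dir()
and mock_create_file() are hypothetical stand-ins that model a single
assumption, namely that a NULL parent means "the root directory". Under
that assumption, a directory creation that fails by returning NULL makes
the following file land at the root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mock; none of this is the kernel debugfs API. */
struct mock_dentry {
	char path[64];
};

struct mock_dentry mock_root = { "/" };

/*
 * Mock directory creation. Returns NULL when 'fail' is set, modelling
 * the failure case asked about above. A NULL parent means the root.
 */
struct mock_dentry *mock_create_dir(const char *name,
				    struct mock_dentry *parent,
				    struct mock_dentry *out, int fail)
{
	assert(name != NULL && out != NULL);

	if (fail)
		return NULL;
	if (parent == NULL)
		parent = &mock_root;
	snprintf(out->path, sizeof(out->path), "%s%s/", parent->path, name);
	return out;
}

/* Mock file creation; writes the resulting path into 'buf'. */
void mock_create_file(const char *name, struct mock_dentry *parent,
		      char *buf, size_t len)
{
	assert(name != NULL && buf != NULL);

	if (parent == NULL)
		parent = &mock_root;
	snprintf(buf, len, "%s%s", parent->path, name);
}
```

With 'fail' set, passing the NULL result straight to mock_create_file()
yields the path /retval instead of /my_func/retval.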

Re: [PATCH v8 11/11] media: imx.rst: Update doc to reflect fixes to interlaced capture

2019-01-22 Thread Steve Longerbeam





On 1/21/19 12:24 PM, Tim Harvey wrote:

On Tue, Jan 15, 2019 at 3:54 PM Steve Longerbeam  wrote:

Hi Tim,

On 1/15/19 1:58 PM, Tim Harvey wrote:

On Wed, Jan 9, 2019 at 10:30 AM Steve Longerbeam  wrote:

Also add an example pipeline for unconverted capture with interweave
on SabreAuto.

Cleanup some language in various places in the process.

Signed-off-by: Steve Longerbeam 
Reviewed-by: Philipp Zabel 
---
Changes since v4:
- Make clear that it is IDMAC channel that does pixel reordering and
interweave, not the CSI. Caught by Philipp Zabel.
Changes since v3:
- none.
Changes since v2:
- expand on idmac interweave behavior in CSI subdev.
- switch second SabreAuto pipeline example to PAL to give
both NTSC and PAL examples.
- Cleanup some language in various places.
---
   Documentation/media/v4l-drivers/imx.rst | 103 +++-
   1 file changed, 66 insertions(+), 37 deletions(-)




   Capture Pipelines
   -----------------
@@ -516,10 +522,33 @@ On the SabreAuto, an on-board ADV7180 SD decoder is 
connected to the
   parallel bus input on the internal video mux to IPU1 CSI0.

   The following example configures a pipeline to capture from the ADV7180
-video decoder, assuming NTSC 720x480 input signals, with Motion
-Compensated de-interlacing. Pad field types assume the adv7180 outputs
-"interlaced". $outputfmt can be any format supported by the ipu1_ic_prpvf
-entity at its output pad:
+video decoder, assuming NTSC 720x480 input signals, using simple
+interweave (unconverted and without motion compensation). The adv7180
+must output sequential or alternating fields (field type 'seq-bt' for
+NTSC, or 'alternate'):
+
+.. code-block:: none
+
+   # Setup links
+   media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
+   media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+   media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+   # Configure pads
+   media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
+   media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
+   media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
+   # Configure "ipu1_csi0 capture" interface (assumed at /dev/video4)
+   v4l2-ctl -d4 --set-fmt-video=field=interlaced_bt
+
+Streaming can then begin on /dev/video4. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video4.
+

Hi Steve,

I'm testing 4.20 with this patchset on top.

I'm on a GW5104 which has an IMX6Q with the adv7180 on ipu1_csi0 like
the SabreAuto example above, but I can't get the simple interweave example
to work:

media-ctl -r # reset all links
# Setup links (ADV7180 IPU1_CSI0)
media-ctl -l '"adv7180 2-0020":0 -> "ipu1_csi0_mux":1[1]'
media-ctl -l '"ipu1_csi0_mux":2 -> "ipu1_csi0":0[1]'
media-ctl -l '"ipu1_csi0":2 -> "ipu1_csi0 capture":0[1]' # /dev/video4
# Configure pads
media-ctl -V "'adv7180 2-0020':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x480]"

This is the reason. The adv7180 only allows alternate field mode to be
configured, and thus it reports the field height on the mbus, not the
full frame height. Imx deals with alternate field mode by capturing a
full frame, so the CSI entity sets the output pad height to double the
field height.

So the CSI input pad needs to be configured with the field height:

media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x240]"

It should work for you after doing that. And better yet, don't bother
configuring the input pad, because media-ctl will propagate formats from
source to sink pads for you, so it's better to rely on the propagation,
and set the CSI output pad format instead (full frame height at output pad):

media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"


Steve,

Thanks - that makes sense.

I also noticed that if I set up one of the vdic pipelines first, then
went back after a 'media-ctl -r' and set up the example that failed, it
no longer failed. I'm thinking that this is because 'media-ctl -r'
may reset all the links but does not reset all the V4L2 formats on
pads?


Final note: the imx.rst doc is technically correct even though it is
showing full frame heights being configured at the pads, because it is
expecting the adv7180 has accepted 'seq-bt'. But even the example given
in that doc works for alternate field mode, because the pad heights are
forced to the correct field height for alternate mode.


hmmm... I don't quite follow this statement. It sounds like the
example would only be correct if you were setting 'field:alternate'
but the example sets 'field:seq-bt' instead.


The example is consistent for a sensor that sends seq-bt. Here is the
example config from the imx.rst doc again. An (NTSC) height of 480 lines
is correct for a seq-bt source:


   # Setup links
   media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
   media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
   media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
   # Configure pads
   media-ctl

[PATCH 2/2] f2fs: sync filesystem after roll-forward recovery

2019-01-22 Thread Jaegeuk Kim

Some work done after roll-forward recovery can fail with an error, which
will release all the data structures. Let's flush them first in order to
make that clean.

One possible corruption came from:

[   90.400500] list_del corruption. prev->next should be ffed1f566208, but was (null)
[   90.675349] Call trace:
[   90.677869]  __list_del_entry_valid+0x94/0xb4
[   90.682351]  remove_dirty_inode+0xac/0x114
[   90.686563]  __f2fs_write_data_pages+0x6a8/0x6c8
[   90.691302]  f2fs_write_data_pages+0x40/0x4c
[   90.695695]  do_writepages+0x80/0xf0
[   90.699372]  __writeback_single_inode+0xdc/0x4ac
[   90.704113]  writeback_sb_inodes+0x280/0x440
[   90.708501]  wb_writeback+0x1b8/0x3d0
[   90.712267]  wb_workfn+0x1a8/0x4d4
[   90.715765]  process_one_work+0x1c0/0x3d4
[   90.719883]  worker_thread+0x224/0x344
[   90.723739]  kthread+0x120/0x130
[   90.727055]  ret_from_fork+0x10/0x18

Reported-by: Sahitya Tummala 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/checkpoint.c |  5 +++--
 fs/f2fs/node.c   |  4 +++-
 fs/f2fs/super.c  | 42 +++---
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index f955cd3e0677..f0ce2f06 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -306,8 +306,9 @@ static int f2fs_write_meta_pages(struct address_space 
*mapping,
goto skip_write;
 
/* collect a number of dirty meta pages and write together */
-   if (wbc->for_kupdate ||
-   get_pages(sbi, F2FS_DIRTY_META) < nr_pages_to_skip(sbi, META))
+   if (wbc->sync_mode != WB_SYNC_ALL &&
+   get_pages(sbi, F2FS_DIRTY_META) <
+   nr_pages_to_skip(sbi, META))
goto skip_write;
 
/* if locked failed, cp will flush dirty pages instead */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 4f450e573312..f6ff84e29749 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1920,7 +1920,9 @@ static int f2fs_write_node_pages(struct address_space 
*mapping,
f2fs_balance_fs_bg(sbi);
 
/* collect a number of dirty node pages and write together */
-   if (get_pages(sbi, F2FS_DIRTY_NODES) < nr_pages_to_skip(sbi, NODE))
+   if (wbc->sync_mode != WB_SYNC_ALL &&
+   get_pages(sbi, F2FS_DIRTY_NODES) <
+   nr_pages_to_skip(sbi, NODE))
goto skip_write;
 
if (wbc->sync_mode == WB_SYNC_ALL)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7998ff5418f2..2af0db2b738e 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1456,9 +1456,16 @@ static int f2fs_enable_quotas(struct super_block *sb);
 
 static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
 {
+   unsigned int s_flags = sbi->sb->s_flags;
struct cp_control cpc;
-   int err;
+   int err = 0;
+   int ret;
 
+   if (s_flags & SB_RDONLY) {
+   f2fs_msg(sbi->sb, KERN_ERR,
+   "checkpoint=disable on readonly fs");
+   return -EINVAL;
+   }
sbi->sb->s_flags |= SB_ACTIVE;
 
f2fs_update_time(sbi, DISABLE_TIME);
@@ -1466,18 +1473,24 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info 
*sbi)
while (!f2fs_time_over(sbi, DISABLE_TIME)) {
mutex_lock(>gc_mutex);
err = f2fs_gc(sbi, true, false, NULL_SEGNO);
-   if (err == -ENODATA)
+   if (err == -ENODATA) {
+   err = 0;
break;
+   }
if (err && err != -EAGAIN)
-   return err;
+   break;
}
 
-   err = sync_filesystem(sbi->sb);
-   if (err)
-   return err;
+   ret = sync_filesystem(sbi->sb);
+   if (ret || err) {
+   err = ret ? ret: err;
+   goto restore_flag;
+   }
 
-   if (f2fs_disable_cp_again(sbi))
-   return -EAGAIN;
+   if (f2fs_disable_cp_again(sbi)) {
+   err = -EAGAIN;
+   goto restore_flag;
+   }
 
mutex_lock(>gc_mutex);
cpc.reason = CP_PAUSE;
@@ -1486,7 +1499,9 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info 
*sbi)
 
sbi->unusable_block_count = 0;
mutex_unlock(>gc_mutex);
-   return 0;
+restore_flag:
+   sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */
+   return err;
 }
 
 static void f2fs_enable_checkpoint(struct f2fs_sb_info *sbi)
@@ -3356,7 +3371,7 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
if (test_opt(sbi, DISABLE_CHECKPOINT)) {
err = f2fs_disable_checkpoint(sbi);
if (err)
-   goto free_meta;
+   goto sync_free_meta;
} else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) {
f2fs_enable_checkpoint(sbi);
}
@@ -3369,7 +3384,7 @@
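
The checkpoint.c and node.c hunks above change when writeback of dirty
meta and node pages may be skipped: a WB_SYNC_ALL writeback is no longer
skipped just because few pages are dirty. A minimal userspace sketch of
the new condition follows. The enum and should_skip_write() are
hypothetical names used only for illustration; the real code reads
wbc->sync_mode and the per-type dirty page counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for WB_SYNC_NONE and WB_SYNC_ALL. */
enum sketch_sync_mode {
	SKETCH_SYNC_NONE,
	SKETCH_SYNC_ALL,
};

/*
 * Mirrors the condition added by the patch: skip the write only when
 * this is not a data-integrity sync AND fewer pages are dirty than the
 * batching threshold.
 */
bool should_skip_write(enum sketch_sync_mode mode, long dirty_pages,
		       long skip_threshold)
{
	assert(dirty_pages >= 0 && skip_threshold >= 0);

	return mode != SKETCH_SYNC_ALL && dirty_pages < skip_threshold;
}
```

In this sketch a sync writeback with 3 dirty pages and a threshold of 10
is written out, while background writeback in the same state is skipped
so that more pages can be collected first.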

[PATCH 10/15] habanalabs: add device reset support

2019-01-22 Thread Oded Gabbay

This patch adds support for doing various on-the-fly resets of Goya.

The driver supports two types of resets:
1. soft-reset
2. hard-reset

Soft-reset is done when the device detects a timeout of a command
submission that was given to the device. The soft-reset process only resets
the engines that are relevant for the submission of compute jobs, i.e. the
DMA channels, the TPCs and the MME. The purpose is to bring the device as
fast as possible to a working state.

Hard-reset is done in several cases:
1. After soft-reset is done but the device is not responding
2. When fatal errors occur inside the device, e.g. ECC error
3. When the driver is removed

Hard-reset performs a reset of the entire chip except for the PCI
controller and the PLLs. It is a much longer process than soft-reset but it
helps to recover the device without the need to reboot the Host.

After hard-reset, the driver will restore the max power attribute and in
case of manual power management, the frequencies that were set.

This patch also adds two entries to sysfs, which allow the root user
to initiate a soft or hard reset.
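
The choice between the two reset types described above can be summarised
in a rough userspace sketch. It is not the driver code: the enum values
and needs_hard_reset() are hypothetical names, and the mapping simply
restates the cases listed in this description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the reset causes listed in the description. */
enum reset_trigger {
	TRIGGER_CS_TIMEOUT,         /* command submission timed out */
	TRIGGER_SOFT_RESET_FAILED,  /* device not responding after soft-reset */
	TRIGGER_FATAL_ERROR,        /* e.g. ECC error inside the device */
	TRIGGER_DRIVER_REMOVE,      /* driver is being removed */
};

/* Returns true when the cause calls for a hard-reset, false for soft. */
bool needs_hard_reset(enum reset_trigger trigger)
{
	switch (trigger) {
	case TRIGGER_CS_TIMEOUT:
		return false;
	case TRIGGER_SOFT_RESET_FAILED:
	case TRIGGER_FATAL_ERROR:
	case TRIGGER_DRIVER_REMOVE:
		return true;
	}

	/* Unknown cause: fall back to the more thorough reset. */
	assert(0 && "unknown reset trigger");
	return true;
}
```

Only a command submission timeout takes the fast soft-reset path in this
sketch; every other cause escalates to a hard-reset.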

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/command_buffer.c  |  11 +-
 drivers/misc/habanalabs/device.c  | 308 +-
 drivers/misc/habanalabs/goya/goya.c   | 201 ++
 drivers/misc/habanalabs/goya/goya_hwmgr.c |  18 +-
 drivers/misc/habanalabs/habanalabs.h  |  35 +++
 drivers/misc/habanalabs/habanalabs_drv.c  |   9 +-
 drivers/misc/habanalabs/hwmon.c   |   4 +-
 drivers/misc/habanalabs/irq.c |  31 +++
 drivers/misc/habanalabs/sysfs.c   | 120 -
 9 files changed, 712 insertions(+), 25 deletions(-)

diff --git a/drivers/misc/habanalabs/command_buffer.c 
b/drivers/misc/habanalabs/command_buffer.c
index 535ed6cc5bda..700c6da01188 100644
--- a/drivers/misc/habanalabs/command_buffer.c
+++ b/drivers/misc/habanalabs/command_buffer.c
@@ -81,9 +81,10 @@ int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr 
*mgr,
bool alloc_new_cb = true;
int rc;
 
-   if (hdev->disabled) {
+   if ((hdev->disabled) || ((atomic_read(>in_reset)) &&
+   (ctx_id != HL_KERNEL_ASID_ID))) {
dev_warn_ratelimited(hdev->dev,
-   "Device is disabled !!! Can't create new CBs\n");
+   "Device is disabled or in reset !!! Can't create new 
CBs\n");
rc = -EBUSY;
goto out_err;
}
@@ -187,6 +188,12 @@ int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data)
u64 handle;
int rc;
 
+   if (hdev->hard_reset_pending) {
+   dev_crit_ratelimited(hdev->dev,
+   "Device HARD reset pending !!! Please close FD\n");
+   return -ENODEV;
+   }
+
switch (args->in.op) {
case HL_CB_OP_CREATE:
rc = hl_cb_create(hdev, >cb_mgr, args->in.cb_size,
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index ff7b610f18c4..00fde57ce823 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -188,6 +188,7 @@ static int device_early_init(struct hl_device *hdev)
 
mutex_init(>device_open);
mutex_init(>send_cpu_message_lock);
+   atomic_set(>in_reset, 0);
atomic_set(>fd_open_cnt, 0);
 
return 0;
@@ -238,6 +239,27 @@ static void set_freq_to_low_job(struct work_struct *work)
usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
 }
 
+static void hl_device_heartbeat(struct work_struct *work)
+{
+   struct hl_device *hdev = container_of(work, struct hl_device,
+   work_heartbeat.work);
+
+   if ((hdev->disabled) || (atomic_read(>in_reset)))
+   goto reschedule;
+
+   if (!hdev->asic_funcs->send_heartbeat(hdev))
+   goto reschedule;
+
+   dev_err(hdev->dev, "Device heartbeat failed !!!\n");
+   hl_device_reset(hdev, true, false);
+
+   return;
+
+reschedule:
+   schedule_delayed_work(>work_heartbeat,
+   usecs_to_jiffies(HL_HEARTBEAT_PER_USEC));
+}
+
 /**
  * device_late_init - do late stuff initialization for the habanalabs device
  *
@@ -273,6 +295,12 @@ static int device_late_init(struct hl_device *hdev)
schedule_delayed_work(>work_freq,
usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
 
+   if (hdev->heartbeat) {
+   INIT_DELAYED_WORK(>work_heartbeat, hl_device_heartbeat);
+   schedule_delayed_work(>work_heartbeat,
+   usecs_to_jiffies(HL_HEARTBEAT_PER_USEC));
+   }
+
hdev->late_init_done = true;
 
return 0;
@@ -290,6 +318,8 @@ static void device_late_fini(struct hl_device *hdev)
return;
 
cancel_delayed_work_sync(>work_freq);
+   if (hdev->heartbeat)
+

[PATCH 07/15] habanalabs: add h/w queues module

2019-01-22 Thread Oded Gabbay

This patch adds the H/W queues module and the code to initialize Goya's
various compute and DMA engines and their queues.

Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each
channel/engine, there is a H/W queue logic which is used to pass commands
from the user to the H/W. That logic is called QMAN.

There are two types of QMANs: external and internal. The DMA QMANs are
considered external while the TPC and MME QMANs are considered internal.
For each external queue there is a completion queue, which is located on
the Host memory.

The differences between external and internal QMANs are:

1. The location of the queue's memory. External QMANs are located on the
   Host memory while internal QMANs are located on the on-chip memory.

2. The external QMAN writes an entry to a completion queue and sends an
   MSI-X interrupt upon completion of a command buffer that was given to
   it. The internal QMAN doesn't do that.
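
The two differences above can be written down as a small lookup. This is
a userspace sketch for illustration only; struct qman_traits and
qman_traits_for() are hypothetical names and do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Engine kinds named in the description above. */
enum engine_kind {
	ENGINE_DMA,
	ENGINE_TPC,
	ENGINE_MME,
};

/* Hypothetical summary of the external/internal QMAN differences. */
struct qman_traits {
	bool external;            /* DMA QMANs are external, TPC/MME internal */
	bool queue_in_host_mem;   /* otherwise the queue is in on-chip memory */
	bool signals_completion;  /* completion queue entry + MSI-X interrupt */
};

struct qman_traits qman_traits_for(enum engine_kind kind)
{
	struct qman_traits t;

	assert(kind == ENGINE_DMA || kind == ENGINE_TPC || kind == ENGINE_MME);

	t.external = (kind == ENGINE_DMA);
	t.queue_in_host_mem = t.external;
	t.signals_completion = t.external;
	return t;
}
```

Both properties follow from the single external/internal distinction,
which is why the sketch derives them from one flag.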

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/Makefile  |2 +-
 drivers/misc/habanalabs/device.c  |   74 +-
 drivers/misc/habanalabs/goya/goya.c   | 1518 +++--
 drivers/misc/habanalabs/goya/goyaP.h  |6 +
 drivers/misc/habanalabs/habanalabs.h  |  176 +-
 drivers/misc/habanalabs/habanalabs_drv.c  |6 +
 drivers/misc/habanalabs/hw_queue.c|  404 +
 .../habanalabs/include/goya/goya_packets.h|  234 +++
 .../habanalabs/include/habanalabs_device_if.h |  272 +++
 drivers/misc/habanalabs/irq.c |  150 ++
 10 files changed, 2721 insertions(+), 121 deletions(-)
 create mode 100644 drivers/misc/habanalabs/hw_queue.c
 create mode 100644 drivers/misc/habanalabs/include/goya/goya_packets.h
 create mode 100644 drivers/misc/habanalabs/irq.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index 2530c9b78ca4..c07f3ccb57dc 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -5,7 +5,7 @@
 obj-m  := habanalabs.o
 
 habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \
-   command_buffer.o
+   command_buffer.o hw_queue.o irq.o
 
 include $(src)/goya/Makefile
 habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 9fc7218a973c..98220628a467 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -170,13 +170,22 @@ static int device_early_init(struct hl_device *hdev)
if (rc)
goto early_fini;
 
+   hdev->cq_wq = alloc_workqueue("hl-free-jobs", WQ_UNBOUND, 0);
+   if (hdev->cq_wq == NULL) {
+   dev_err(hdev->dev, "Failed to allocate CQ workqueue\n");
+   goto asid_fini;
+   }
+
hl_cb_mgr_init(>kernel_cb_mgr);
 
mutex_init(>device_open);
+   mutex_init(>send_cpu_message_lock);
atomic_set(>fd_open_cnt, 0);
 
return 0;
 
+asid_fini:
+   hl_asid_fini(hdev);
 early_fini:
if (hdev->asic_funcs->early_fini)
hdev->asic_funcs->early_fini(hdev);
@@ -192,9 +201,12 @@ static int device_early_init(struct hl_device *hdev)
  */
 static void device_early_fini(struct hl_device *hdev)
 {
+   mutex_destroy(>send_cpu_message_lock);
 
hl_cb_mgr_fini(hdev, >kernel_cb_mgr);
 
+   destroy_workqueue(hdev->cq_wq);
+
hl_asid_fini(hdev);
 
if (hdev->asic_funcs->early_fini)
@@ -273,7 +285,7 @@ int hl_device_resume(struct hl_device *hdev)
  */
 int hl_device_init(struct hl_device *hdev, struct class *hclass)
 {
-   int rc;
+   int i, rc, cq_ready_cnt;
 
/* Create device */
rc = device_setup_cdev(hdev, hclass, hdev->id, _ops);
@@ -294,11 +306,48 @@ int hl_device_init(struct hl_device *hdev, struct class 
*hclass)
if (rc)
goto early_fini;
 
+   /*
+* Initialize the H/W queues. Must be done before hw_init, because
+* there the addresses of the kernel queue are being written to the
+* registers of the device
+*/
+   rc = hl_hw_queues_create(hdev);
+   if (rc) {
+   dev_err(hdev->dev, "failed to initialize kernel queues\n");
+   goto sw_fini;
+   }
+
+   /*
+* Initialize the completion queues. Must be done before hw_init,
+* because there the addresses of the completion queues are being
+* passed as arguments to request_irq
+*/
+   hdev->completion_queue =
+   kcalloc(hdev->asic_prop.completion_queues_count,
+   sizeof(*hdev->completion_queue), GFP_KERNEL);
+
+   if (!hdev->completion_queue) {
+   dev_err(hdev->dev, "failed to allocate completion queues\n");
+   rc = -ENOMEM;
+   goto hw_queues_destroy;
+   }
+
+   for (i = 0, cq_ready_cnt = 0;
+   i <

[PATCH 04/15] habanalabs: add context and ASID modules

2019-01-22 Thread Oded Gabbay

This patch adds two modules - ASID and context.

Each user process that opens a device's file must have at least one context
before it is able to "work" with the device. Each context has its own
device address-space and contains information about its runtime state (its
active command submissions).

To have address-space separation between contexts, each context is assigned
a unique ASID, which stands for "address-space id". Goya supports up to
1024 ASIDs.

Currently, the driver doesn't support multiple contexts. Therefore, the
user doesn't need to actively create a context. A "primary context" is
created automatically when the user opens the device's file.
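
As a rough illustration of the ASID scheme described above, here is a minimal user-space sketch of a bitmap allocator. It assumes the 1024-ASID limit quoted for Goya and the convention that ASID 0 is reserved for the kernel driver and doubles as the "no free ASID" return value. The names asid_pool, asid_pool_init, asid_alloc and asid_free are made up for this sketch, and locking is left out, so it should not be read as the driver's actual code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_ASID      1024UL   /* Goya limit quoted in the commit message */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define ASID_WORDS    ((MAX_ASID + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical pool type: one bit per ASID, bit set means "in use". */
struct asid_pool {
	unsigned long bitmap[ASID_WORDS];
};

static void asid_pool_init(struct asid_pool *p)
{
	assert(p != NULL);
	for (size_t i = 0; i < ASID_WORDS; i++)
		p->bitmap[i] = 0;
	p->bitmap[0] |= 1UL;   /* ASID 0 is reserved for the kernel driver */
}

/* Returns the lowest free ASID, or 0 when every ASID is taken. */
static unsigned long asid_alloc(struct asid_pool *p)
{
	for (unsigned long id = 0; id < MAX_ASID; id++) {
		unsigned long mask = 1UL << (id % BITS_PER_WORD);

		if (!(p->bitmap[id / BITS_PER_WORD] & mask)) {
			p->bitmap[id / BITS_PER_WORD] |= mask;
			return id;
		}
	}
	return 0;
}

/* Returns 0 on success, -1 for the reserved or an out-of-range ASID. */
static int asid_free(struct asid_pool *p, unsigned long id)
{
	if (id == 0 || id >= MAX_ASID)
		return -1;
	p->bitmap[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
	return 0;
}
```

Because ASID 0 is never handed out, a caller can treat a return value of 0 from asid_alloc as "allocation failed" without a separate error channel.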

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/Makefile |   2 +-
 drivers/misc/habanalabs/asid.c   |  58 +
 drivers/misc/habanalabs/context.c| 155 +++
 drivers/misc/habanalabs/device.c |  47 +++
 drivers/misc/habanalabs/habanalabs.h |  70 ++
 drivers/misc/habanalabs/habanalabs_drv.c |  46 ++-
 6 files changed, 375 insertions(+), 3 deletions(-)
 create mode 100644 drivers/misc/habanalabs/asid.c
 create mode 100644 drivers/misc/habanalabs/context.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index 6f1ead69bd77..3ffbadc2ca01 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -4,7 +4,7 @@
 
 obj-m  := habanalabs.o
 
-habanalabs-y := habanalabs_drv.o device.o
+habanalabs-y := habanalabs_drv.o device.o context.o asid.o
 
 include $(src)/goya/Makefile
 habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/asid.c b/drivers/misc/habanalabs/asid.c
new file mode 100644
index ..0ce84c8f5a47
--- /dev/null
+++ b/drivers/misc/habanalabs/asid.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2016-2018 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ */
+
+#include "habanalabs.h"
+
+#include 
+#include 
+
+int hl_asid_init(struct hl_device *hdev)
+{
+   hdev->asid_bitmap = kcalloc(BITS_TO_LONGS(hdev->asic_prop.max_asid),
+   sizeof(*hdev->asid_bitmap), GFP_KERNEL);
+   if (!hdev->asid_bitmap)
+   return -ENOMEM;
+
+   mutex_init(&hdev->asid_mutex);
+
+   /* ASID 0 is reserved for KMD */
+   set_bit(0, hdev->asid_bitmap);
+
+   return 0;
+}
+
+void hl_asid_fini(struct hl_device *hdev)
+{
+   mutex_destroy(&hdev->asid_mutex);
+   kfree(hdev->asid_bitmap);
+}
+
+unsigned long hl_asid_alloc(struct hl_device *hdev)
+{
+   unsigned long found;
+
+   mutex_lock(&hdev->asid_mutex);
+
+   found = find_first_zero_bit(hdev->asid_bitmap,
+   hdev->asic_prop.max_asid);
+   if (found == hdev->asic_prop.max_asid)
+   found = 0;
+   else
+   set_bit(found, hdev->asid_bitmap);
+
+   mutex_unlock(&hdev->asid_mutex);
+
+   return found;
+}
+
+void hl_asid_free(struct hl_device *hdev, unsigned long asid)
+{
+   if (WARN((asid == 0 || asid >= hdev->asic_prop.max_asid),
+   "Invalid ASID %lu", asid))
+   return;
+   clear_bit(asid, hdev->asid_bitmap);
+}
diff --git a/drivers/misc/habanalabs/context.c b/drivers/misc/habanalabs/context.c
new file mode 100644
index ..cdcad077e5cf
--- /dev/null
+++ b/drivers/misc/habanalabs/context.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2016-2018 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ */
+
+#include "habanalabs.h"
+
+#include 
+#include 
+
+static void hl_ctx_fini(struct hl_ctx *ctx)
+{
+   struct hl_device *hdev = ctx->hdev;
+
+   if (ctx->asid != HL_KERNEL_ASID_ID)
+   hl_asid_free(hdev, ctx->asid);
+}
+
+void hl_ctx_do_release(struct kref *ref)
+{
+   struct hl_ctx *ctx;
+
+   ctx = container_of(ref, struct hl_ctx, refcount);
+
+   dev_dbg(ctx->hdev->dev, "Now really releasing context %d\n", ctx->asid);
+
+   hl_ctx_fini(ctx);
+
+   if (ctx->hpriv)
+   hl_hpriv_put(ctx->hpriv);
+
+   kfree(ctx);
+}
+
+int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv)
+{
+   struct hl_ctx_mgr *mgr = &hpriv->ctx_mgr;
+   struct hl_ctx *ctx;
+   int rc;
+
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   if (!ctx) {
+   rc = -ENOMEM;
+   goto out_err;
+   }
+
+   rc = hl_ctx_init(hdev, ctx, false);
+   if (rc)
+   goto free_ctx;
+
+   hl_hpriv_get(hpriv);
+   ctx->hpriv = hpriv;
+
+   /* TODO: remove for multiple contexts */
+   hpriv->ctx = ctx;
+   hdev->user_ctx = ctx;
+
+   mutex_lock(&mgr->ctx_lock);
+   rc = idr_alloc(&mgr->ctx_handles, ctx, 1, 0, GFP_KERNEL);
+   mutex_unlock(&mgr->ctx_lock);
+
+   if (rc < 0) {
+   dev_err(hdev->dev, "Failed to allocate IDR for a new CTX\n");
+   hl_ctx_free(hdev, ctx);
+   goto

[PATCH 1/2] f2fs: run discard jobs when put_super

2019-01-22 Thread Jaegeuk Kim

When we umount f2fs, we need to avoid a long delay caused by discard commands,
which can take tens of seconds if the storage is very slow on UNMAP. So, this
patch puts a timeout on that work.

By default, let me give 5 seconds for discard.
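
To make the timeout idea concrete, here is a small user-space sketch, not the f2fs code itself. It assumes a simulated device where each discard costs a fixed number of seconds; struct fake_dev, fake_discard and issue_discards are names invented for this illustration. The deadline is checked before each command, so a command already started is allowed to finish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simulated device: a clock plus a fixed cost per discard. */
struct fake_dev {
	unsigned long now;    /* simulated clock, in seconds */
	unsigned long cost;   /* simulated seconds spent per discard command */
};

static void fake_discard(struct fake_dev *dev)
{
	dev->now += dev->cost;
}

/*
 * Issue up to 'pending' discards, but stop starting new ones once
 * 'timeout' seconds have elapsed. Returns how many were issued.
 */
static size_t issue_discards(struct fake_dev *dev, size_t pending,
			     unsigned long timeout)
{
	assert(dev != NULL);

	unsigned long deadline = dev->now + timeout;
	size_t issued = 0;

	while (issued < pending) {
		if (dev->now >= deadline)
			break;          /* timeout hit, leave the rest */
		fake_discard(dev);
		issued++;
	}
	return issued;
}
```

With a 5 second timeout and a device that needs 2 seconds per discard, this sketch issues three commands and gives up on the remainder, which in the patch corresponds to the discards that are dropped at umount.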

Signed-off-by: Jaegeuk Kim 
---
 Documentation/ABI/testing/sysfs-fs-f2fs |  7 +++
 fs/f2fs/f2fs.h  |  5 -
 fs/f2fs/segment.c   | 11 ++-
 fs/f2fs/super.c | 17 -
 fs/f2fs/sysfs.c |  3 +++
 5 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index a7ce33199457..91822ce25831 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -86,6 +86,13 @@ Description:
The unit size is one block, now only support configuring in range
of [1, 512].
 
+What:  /sys/fs/f2fs//umount_discard_timeout
+Date:  January 2019
+Contact:   "Jaegeuk Kim" 
+Description:
+   Set timeout to issue discard commands during umount.
+   Default: 5 secs
+
 What:  /sys/fs/f2fs//max_victim_search
 Date:  January 2014
 Contact:   "Jaegeuk Kim" 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 0f564883e078..6b6ec5600089 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -191,6 +191,7 @@ enum {
 #define DEF_CP_INTERVAL60  /* 60 secs */
 #define DEF_IDLE_INTERVAL  5   /* 5 secs */
 #define DEF_DISABLE_INTERVAL   5   /* 5 secs */
+#define DEF_UMOUNT_DISCARD_TIMEOUT 5   /* 5 secs */
 
 struct cp_control {
int reason;
@@ -310,6 +311,7 @@ struct discard_policy {
bool sync;  /* submit discard with REQ_SYNC flag */
bool ordered;   /* issue discard by lba order */
unsigned int granularity;   /* discard granularity */
+   int timeout;/* discard timeout for put_super */
 };
 
 struct discard_cmd_control {
@@ -1110,6 +1112,7 @@ enum {
DISCARD_TIME,
GC_TIME,
DISABLE_TIME,
+   UMOUNT_DISCARD_TIMEOUT,
MAX_TIME,
 };
 
@@ -3006,7 +3009,7 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr);
 bool f2fs_is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr);
 void f2fs_drop_discard_cmd(struct f2fs_sb_info *sbi);
 void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi);
-bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi);
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi);
 void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
struct cp_control *cpc);
 void f2fs_dirty_to_prefree(struct f2fs_sb_info *sbi);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 9b79056d705d..97e0faf09ebf 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1037,6 +1037,7 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
 
dpolicy->max_requests = DEF_MAX_DISCARD_REQUEST;
dpolicy->io_aware_gran = MAX_PLIST_NUM;
+   dpolicy->timeout = MAX_TIME;
 
if (discard_type == DPOLICY_BG) {
dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
@@ -1424,7 +1425,14 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
int i, issued = 0;
bool io_interrupted = false;
 
+   if (dpolicy->timeout != MAX_TIME)
+   f2fs_update_time(sbi, dpolicy->timeout);
+
for (i = MAX_PLIST_NUM - 1; i >= 0; i--) {
+   if (dpolicy->timeout != MAX_TIME &&
+   f2fs_time_over(sbi, dpolicy->timeout))
+   break;
+
if (i + 1 < dpolicy->granularity)
break;
 
@@ -1611,7 +1619,7 @@ void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi)
 }
 
 /* This comes from f2fs_put_super */
-bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi)
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi)
 {
struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
struct discard_policy dpolicy;
@@ -1619,6 +1627,7 @@ bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi)
 
__init_discard_policy(sbi, &dpolicy, DPOLICY_UMOUNT,
dcc->discard_granularity);
+   dpolicy.timeout = UMOUNT_DISCARD_TIMEOUT;
__issue_discard_cmd(sbi, &dpolicy);
dropped = __drop_discard_cmd(sbi);
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index ea514acede36..7998ff5418f2 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1029,6 +1029,9 @@ static void f2fs_put_super(struct super_block *sb)
int i;
bool dropped;
 
+   /* be sure to wait for any on-going discard commands */
+   dropped = f2fs_issue_discard_timeout(sbi);
+
f2fs_quota_off_umount(sb);
 
/* prevent remaining shrinker jobs

Re: [PATCH v3 07/10] remoteproc: q6v5-mss: Vote for rpmh power domains

2019-01-22 Thread Doug Anderson

Hi,

On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson
 wrote:
> @@ -1333,7 +1431,7 @@ static int q6v5_probe(struct platform_device *pdev)
> ret = qcom_q6v5_init(&qproc->q6v5, pdev, rproc, MPSS_CRASH_REASON_SMEM,
>  qcom_msa_handover);
> if (ret)
> -   goto free_rproc;
> +   goto detach_proxy_pds;
>
> qproc->mpss_perm = BIT(QCOM_SCM_VMID_HLOS);
> qproc->mba_perm = BIT(QCOM_SCM_VMID_HLOS);
> @@ -1344,10 +1442,12 @@ static int q6v5_probe(struct platform_device *pdev)
>
> ret = rproc_add(rproc);
> if (ret)
> -   goto free_rproc;
> +   goto detach_proxy_pds;

I can't comment on the patch overall since I haven't spent any time in
remoteproc, but as previously pointed out by Sibi, now that you've
landed commit 027045a6e2b7 ("remoteproc: qcom: Add shutdown-ack irq"),
you need to adjust the "goto"s in your patch.  Specifically
(whitespace damaged) atop your whole series:

@@ -1488,7 +1488,7 @@ static int q6v5_probe(struct platform_device *pdev)
qproc->sysmon = qcom_add_sysmon_subdev(rproc, "modem", 0x12);
if (IS_ERR(qproc->sysmon)) {
ret = PTR_ERR(qproc->sysmon);
-   goto free_rproc;
+   goto detach_proxy_pds;
}

-Doug
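
For readers less familiar with the idiom under discussion, the following stand-alone sketch shows why a newly added setup step forces later error paths to jump to a newer label. The resource names a, b and c, the fail_at parameter and acquire_all are all hypothetical and have nothing to do with the actual q6v5 probe code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resources: 1 means "currently held". */
struct res {
	int a, b, c;
};

/*
 * Acquire a, then b, then c. 'fail_at' simulates a failure at step 1, 2
 * or 3 (0 means no failure). Each error path jumps to the label that
 * undoes everything acquired so far, in reverse order.
 */
static int acquire_all(struct res *r, int fail_at)
{
	assert(r != NULL);
	r->a = r->b = r->c = 0;

	if (fail_at == 1)
		return -1;          /* nothing to undo yet */
	r->a = 1;

	if (fail_at == 2)
		goto release_a;     /* only a is held */
	r->b = 1;

	if (fail_at == 3)
		goto release_b;     /* a and b are held */
	r->c = 1;

	return 0;

release_b:
	r->b = 0;
release_a:
	r->a = 0;
	return -1;
}
```

If a later failure kept jumping to release_a after b had been introduced, b would leak. That is the same class of problem as a goto that still targets free_rproc once the proxy power domains have been attached.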

[PATCH 14/15] habanalabs: add debugfs support

2019-01-22 Thread Oded Gabbay

This patch adds debugfs support to the driver. It allows user-space to
display information that is contained in the internal structures of the
driver, such as:
- active command submissions
- active user virtual memory mappings
- number of allocated command buffers

It also enables the user to perform reads and writes through Goya's PCI
bars.

Signed-off-by: Oded Gabbay 
---
 .../ABI/testing/debugfs-driver-habanalabs |  127 ++
 drivers/misc/habanalabs/Makefile  |2 +
 drivers/misc/habanalabs/command_buffer.c  |4 +
 drivers/misc/habanalabs/command_submission.c  |   12 +
 drivers/misc/habanalabs/debugfs.c | 1069 +
 drivers/misc/habanalabs/device.c  |6 +
 drivers/misc/habanalabs/goya/goya.c   |  108 ++
 drivers/misc/habanalabs/goya/goyaP.h  |5 +
 drivers/misc/habanalabs/habanalabs.h  |  191 +++
 drivers/misc/habanalabs/habanalabs_drv.c  |   16 +-
 drivers/misc/habanalabs/memory.c  |8 +
 11 files changed, 1546 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-driver-habanalabs
 create mode 100644 drivers/misc/habanalabs/debugfs.c

diff --git a/Documentation/ABI/testing/debugfs-driver-habanalabs b/Documentation/ABI/testing/debugfs-driver-habanalabs
new file mode 100644
index ..2b606c84938c
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-driver-habanalabs
@@ -0,0 +1,127 @@
+What:   /sys/kernel/debug/habanalabs/hl/addr
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Sets the device address to be used for read or write through
+PCI bar. The acceptable value is a string that starts with "0x"
+
+What:   /sys/kernel/debug/habanalabs/hl/command_buffers
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays a list with information about the currently allocated
+command buffers
+
+What:   /sys/kernel/debug/habanalabs/hl/command_submission
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays a list with information about the currently active
+command submissions
+
+What:   /sys/kernel/debug/habanalabs/hl/command_submission_jobs
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays a list with detailed information about each JOB (CB) of
+each active command submission
+
+What:   /sys/kernel/debug/habanalabs/hl/data32
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Allows the root user to read or write directly through the
+device's PCI bar. Writing to this file generates a write
+transaction while reading from the file generates a read
+transaction. This custom interface is needed (instead of using
+the generic Linux user-space PCI mapping) because the DDR bar
+is very small compared to the DDR memory and only the driver can
+move the bar before and after the transaction
+
+What:   /sys/kernel/debug/habanalabs/hl/device
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Enables the root user to set the device to specific state.
+Valid values are "disable", "enable", "suspend", "resume".
+User can read this property to see the valid values
+
+What:   /sys/kernel/debug/habanalabs/hl/i2c_addr
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Sets I2C device address for I2C transaction that is generated
+by the device's CPU
+
+What:   /sys/kernel/debug/habanalabs/hl/i2c_bus
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Sets I2C bus address for I2C transaction that is generated by
+the device's CPU
+
+What:   /sys/kernel/debug/habanalabs/hl/i2c_data
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Triggers an I2C transaction that is generated by the device's
+CPU. Writing to this file generates a write transaction while
+reading from the file generates a read transaction
+
+What:   /sys/kernel/debug/habanalabs/hl/i2c_reg
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Sets I2C register id for I2C transaction that is generated by
+the device's CPU
+
+What:   /sys/kernel/debug/habanalabs/hl/led0
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Sets the state of the first S/W led on the device
+
+What:

[PATCH 15/15] Update MAINTAINERS and CREDITS with habanalabs info

2019-01-22 Thread Oded Gabbay

The habanalabs driver was written from scratch from the very first days
of Habana and is maintained by Oded Gabbay.

Signed-off-by: Oded Gabbay 
---
 CREDITS | 2 +-
 MAINTAINERS | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/CREDITS b/CREDITS
index e818eb6a3e71..03f3d67126fc 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1222,7 +1222,7 @@ S: Brazil
 
 N: Oded Gabbay
 E: oded.gab...@gmail.com
-D: AMD KFD maintainer
+D: HabanaLabs and AMD KFD maintainer
 S: 12 Shraga Raphaeli
 S: Petah-Tikva, 4906418
 S: Israel
diff --git a/MAINTAINERS b/MAINTAINERS
index 51029a425dbe..93e047336cab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6641,6 +6641,15 @@ F:   drivers/clocksource/h8300_*.c
 F: drivers/clk/h8300/
 F: drivers/irqchip/irq-renesas-h8*.c
 
+HABANALABS PCI DRIVER
+M: Oded Gabbay 
+T: git https://github.com/HabanaAI/linux.git
+S: Supported
+F: drivers/misc/habanalabs/
+F: include/uapi/misc/habanalabs.h
+F: Documentation/ABI/testing/sysfs-driver-habanalabs
+F: Documentation/ABI/testing/debugfs-driver-habanalabs
+
 HACKRF MEDIA DRIVER
 M: Antti Palosaari 
 L: linux-me...@vger.kernel.org
-- 
2.17.1

[PATCH 12/15] habanalabs: add virtual memory and MMU modules

2019-01-22 Thread Oded Gabbay

From: Omer Shpigelman 

This patch adds the Virtual Memory and MMU modules.

Goya has an internal MMU which provides process isolation on the internal
DDR. The internal MMU also performs translations for transactions that go
from Goya to the Host.

The driver is responsible for allocating and freeing memory on the DDR
upon user request. It also provides an interface to map and unmap DDR and
Host memory to the device address space.

Signed-off-by: Omer Shpigelman 
Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/Makefile  |2 +-
 drivers/misc/habanalabs/context.c |   19 +-
 drivers/misc/habanalabs/device.c  |   20 +-
 drivers/misc/habanalabs/goya/goya.c   |  391 +
 drivers/misc/habanalabs/habanalabs.h  |  195 +++
 drivers/misc/habanalabs/habanalabs_drv.c  |2 +-
 drivers/misc/habanalabs/habanalabs_ioctl.c|3 +-
 drivers/misc/habanalabs/include/goya/goya.h   |6 +-
 .../include/hw_ip/mmu/mmu_general.h   |   45 +
 .../habanalabs/include/hw_ip/mmu/mmu_v1_0.h   |   15 +
 drivers/misc/habanalabs/memory.c  | 1506 +
 drivers/misc/habanalabs/mmu.c |  604 +++
 include/uapi/misc/habanalabs.h|  122 +-
 13 files changed, 2922 insertions(+), 8 deletions(-)
 create mode 100644 drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
 create mode 100644 drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h
 create mode 100644 drivers/misc/habanalabs/mmu.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index d2fd0e18b1eb..fd46f8b48bab 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -6,7 +6,7 @@ obj-m   := habanalabs.o
 
 habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \
command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o memory.o \
-   command_submission.o
+   command_submission.o mmu.o
 
 include $(src)/goya/Makefile
 habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/context.c b/drivers/misc/habanalabs/context.c
index 2da672113e7a..dc0800a0ac9c 100644
--- a/drivers/misc/habanalabs/context.c
+++ b/drivers/misc/habanalabs/context.c
@@ -26,8 +26,10 @@ static void hl_ctx_fini(struct hl_ctx *ctx)
for (i = 0 ; i < HL_MAX_PENDING_CS ; i++)
dma_fence_put(ctx->cs_pending[i]);
 
-   if (ctx->asid != HL_KERNEL_ASID_ID)
+   if (ctx->asid != HL_KERNEL_ASID_ID) {
+   hl_vm_ctx_fini(ctx);
hl_asid_free(hdev, ctx->asid);
+   }
 }
 
 void hl_ctx_do_release(struct kref *ref)
@@ -97,6 +99,8 @@ void hl_ctx_free(struct hl_device *hdev, struct hl_ctx *ctx)
 
 int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
 {
+   int rc = 0;
+
ctx->hdev = hdev;
 
kref_init(&ctx->refcount);
@@ -114,9 +118,22 @@ int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
dev_err(hdev->dev, "No free ASID, failed to create context\n");
return -ENOMEM;
}
+
+   rc = hl_vm_ctx_init(ctx);
+   if (rc) {
+   dev_err(hdev->dev, "Failed to init mem ctx module\n");
+   rc = -ENOMEM;
+   goto mem_ctx_err;
+   }
}
 
return 0;
+
+mem_ctx_err:
+   if (ctx->asid != HL_KERNEL_ASID_ID)
+   hl_asid_free(hdev, ctx->asid);
+
+   return rc;
 }
 
 void hl_ctx_get(struct hl_device *hdev, struct hl_ctx *ctx)
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index a47e00fe5ccf..1f7340551386 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -585,8 +585,10 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
/* Reset the H/W. It will be in idle state after this returns */
hdev->asic_funcs->hw_fini(hdev, hard_reset);
 
-   if (hard_reset)
+   if (hard_reset) {
+   hl_vm_fini(hdev);
hl_eq_reset(hdev, &hdev->event_queue);
+   }
 
/* Re-initialize PI,CI to 0 in all queues (hw queue, cq) */
hl_hw_queue_reset(hdev, hard_reset);
@@ -647,6 +649,13 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
goto out_err;
}
 
+   rc = hl_vm_init(hdev);
+   if (rc) {
+   dev_err(hdev->dev,
+   "Failed to init memory module after hard reset\n");
+   goto out_err;
+   }
+
hl_set_max_power(hdev, hdev->max_power);
 
hdev->hard_reset_pending = false;
@@ -828,6 +837,13 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
hdev->asic_name,
hdev->asic_prop.dram_size / 1024 / 1024 / 1024);
 
+   rc =

[PATCH 08/15] habanalabs: add event queue and interrupts

2019-01-22 Thread Oded Gabbay

This patch adds support for receiving events from Goya's control CPU and
for receiving MSI-X interrupts from Goya's DMA engines and CPU.

Goya's PCI controller supports up to 8 MSI-X interrupts, of which only 6 are
currently used. The first 5 interrupts are dedicated to Goya's DMA engine
queues. The 6th interrupt is dedicated to Goya's control CPU.

The DMA queue will signal its MSI-X entry upon each completion of a command
buffer that was placed on its primary queue. The driver will then mark that
CB as completed and free the related resources. It will also update the
command submission object to which that CB belongs.

There is a dedicated event queue (EQ) between the driver and Goya's control
CPU. The EQ is located on the Host memory. The control CPU writes a new
entry to the EQ for various reasons, such as ECC error, MMU page fault, Hot
temperature. After writing the new entry to the EQ, the control CPU will
trigger its dedicated MSI-X entry to signal the driver that there is a new
entry in the EQ. The driver will then read the entry and act accordingly.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/device.c|  35 +-
 drivers/misc/habanalabs/goya/goya.c | 522 +++-
 drivers/misc/habanalabs/goya/goyaP.h|   1 +
 drivers/misc/habanalabs/habanalabs.h|  37 ++
 drivers/misc/habanalabs/include/goya/goya.h |   1 -
 drivers/misc/habanalabs/irq.c   | 144 ++
 6 files changed, 729 insertions(+), 11 deletions(-)

diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 98220628a467..9199e070e79e 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -173,9 +173,17 @@ static int device_early_init(struct hl_device *hdev)
hdev->cq_wq = alloc_workqueue("hl-free-jobs", WQ_UNBOUND, 0);
if (hdev->cq_wq == NULL) {
dev_err(hdev->dev, "Failed to allocate CQ workqueue\n");
+   rc = -ENOMEM;
goto asid_fini;
}
 
+   hdev->eq_wq = alloc_workqueue("hl-events", WQ_UNBOUND, 0);
+   if (hdev->eq_wq == NULL) {
+   dev_err(hdev->dev, "Failed to allocate EQ workqueue\n");
+   rc = -ENOMEM;
+   goto free_cq_wq;
+   }
+
hl_cb_mgr_init(&hdev->kernel_cb_mgr);
 
mutex_init(&hdev->device_open);
@@ -184,6 +192,8 @@ static int device_early_init(struct hl_device *hdev)
 
return 0;
 
+free_cq_wq:
+   destroy_workqueue(hdev->cq_wq);
 asid_fini:
hl_asid_fini(hdev);
 early_fini:
@@ -205,6 +215,7 @@ static void device_early_fini(struct hl_device *hdev)
 
hl_cb_mgr_fini(hdev, &hdev->kernel_cb_mgr);
 
+   destroy_workqueue(hdev->eq_wq);
destroy_workqueue(hdev->cq_wq);
 
hl_asid_fini(hdev);
@@ -343,11 +354,22 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
}
}
 
+   /*
+* Initialize the event queue. Must be done before hw_init,
+* because there the address of the event queue is being
+* passed as argument to request_irq
+*/
+   rc = hl_eq_init(hdev, &hdev->event_queue);
+   if (rc) {
+   dev_err(hdev->dev, "failed to initialize event queue\n");
+   goto cq_fini;
+   }
+
/* Allocate the kernel context */
hdev->kernel_ctx = kzalloc(sizeof(*hdev->kernel_ctx), GFP_KERNEL);
if (!hdev->kernel_ctx) {
rc = -ENOMEM;
-   goto cq_fini;
+   goto eq_fini;
}
 
hdev->user_ctx = NULL;
@@ -392,6 +414,8 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
"kernel ctx is still alive on initialization failure\n");
 free_ctx:
kfree(hdev->kernel_ctx);
+eq_fini:
+   hl_eq_fini(hdev, &hdev->event_queue);
 cq_fini:
for (i = 0 ; i < cq_ready_cnt ; i++)
hl_cq_fini(hdev, &hdev->completion_queue[i]);
@@ -433,6 +457,13 @@ void hl_device_fini(struct hl_device *hdev)
/* Mark device as disabled */
hdev->disabled = true;
 
+   /*
+* Halt the engines and disable interrupts so we won't get any more
+* completions from H/W and we won't have any accesses from the
+* H/W to the host machine
+*/
+   hdev->asic_funcs->halt_engines(hdev, true);
+
hl_cb_pool_fini(hdev);
 
/* Release kernel context */
@@ -442,6 +473,8 @@ void hl_device_fini(struct hl_device *hdev)
/* Reset the H/W. It will be in idle state after this returns */
hdev->asic_funcs->hw_fini(hdev, true);
 
+   hl_eq_fini(hdev, &hdev->event_queue);
+
for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++)
hl_cq_fini(hdev, &hdev->completion_queue[i]);
kfree(hdev->completion_queue);
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 08d5227eaf1d..6c04277ae0fa 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++

[PATCH 09/15] habanalabs: add sysfs and hwmon support

2019-01-22 Thread Oded Gabbay

This patch adds the sysfs and hwmon entries that are exposed by the driver.

Goya has several sensors, from various categories such as temperature,
voltage, current, etc. The driver exposes those sensors in the standard
hwmon mechanism.

In addition, the driver exposes a couple of interfaces in sysfs, both for
configuration and for providing status of the device or driver.

The configuration attributes are for Power Management:
- Automatic or manual
- Frequency value when moving to high frequency mode
- Maximum power the device is allowed to consume

The rest of the attributes are read-only and provide the following
information:
- Versions of the various firmwares running on the device
- Contents of the device's EEPROM
- The device type (currently only Goya is supported)
- PCI address of the device (to allow user-space to map /dev/hlX to its
  PCI address)
- Status of the device (operational, malfunction, in_reset)
- How many processes are open on the device's file

Signed-off-by: Oded Gabbay 
---
 .../ABI/testing/sysfs-driver-habanalabs   | 190 ++
 drivers/misc/habanalabs/Makefile  |   2 +-
 drivers/misc/habanalabs/device.c  | 146 +
 drivers/misc/habanalabs/goya/Makefile |   2 +-
 drivers/misc/habanalabs/goya/goya.c   | 230 +++
 drivers/misc/habanalabs/goya/goyaP.h  |  21 +
 drivers/misc/habanalabs/goya/goya_hwmgr.c | 306 +
 drivers/misc/habanalabs/habanalabs.h  |  97 +++
 drivers/misc/habanalabs/habanalabs_drv.c  |   7 +
 drivers/misc/habanalabs/hwmon.c   | 449 +
 drivers/misc/habanalabs/sysfs.c   | 588 ++
 11 files changed, 2036 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-habanalabs
 create mode 100644 drivers/misc/habanalabs/goya/goya_hwmgr.c
 create mode 100644 drivers/misc/habanalabs/hwmon.c
 create mode 100644 drivers/misc/habanalabs/sysfs.c

diff --git a/Documentation/ABI/testing/sysfs-driver-habanalabs b/Documentation/ABI/testing/sysfs-driver-habanalabs
new file mode 100644
index ..19edd4da87c1
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-habanalabs
@@ -0,0 +1,190 @@
+What:   /sys/class/habanalabs/hl/armcp_kernel_ver
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Version of the Linux kernel running on the device's CPU
+
+What:   /sys/class/habanalabs/hl/armcp_ver
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Version of the application running on the device's CPU
+
+What:   /sys/class/habanalabs/hl/cpld_ver
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Version of the Device's CPLD F/W
+
+What:   /sys/class/habanalabs/hl/device_type
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays the code name of the device according to its type.
+The supported values are: "GOYA"
+
+What:   /sys/class/habanalabs/hl/eeprom
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:A binary file attribute that contains the contents of the
+on-board EEPROM
+
+What:   /sys/class/habanalabs/hl/fuse_ver
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays the device's version from the eFuse
+
+What:   /sys/class/habanalabs/hl/hard_reset
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Interface to trigger a hard-reset operation for the device.
+Hard-reset will reset ALL internal components of the device
+except for the PCI interface and the internal PLLs
+
+What:   /sys/class/habanalabs/hl/hard_reset_cnt
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Displays how many times the device has undergone a hard-reset
+operation
+
+What:   /sys/class/habanalabs/hl/high_pll
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Allows the user to set the maximum clock frequency for MME, TPC
+and IC when the power management profile is set to "automatic".
+
+What:   /sys/class/habanalabs/hl/ic_clk
+Date:   Jan 2019
+KernelVersion:  5.1
+Contact:oded.gab...@gmail.com
+Description:Allows the user to set the maximum clock frequency of the
+Interconnect fabric. Writes to this parameter affect the device
+only when the power management profile is set to "manual" mode.
+The device IC clock might be set to lower value than the
+maximum. The user should read the ic_clk_curr to see

[PATCH 11/15] habanalabs: add command submission module

2019-01-22 Thread Oded Gabbay

This patch adds the main flow for the user to submit work to the device.

Each unit of work is described by a command submission object (CS). The CS
contains 3 arrays of command buffers: one for execution, and two for
context-switch (store and restore).

For each CB, the user specifies on which queue to put that CB. In case of
an internal queue, the entry doesn't contain a pointer to the CB but the
address in the on-chip memory that the CB resides at.

The driver parses some of the CBs to enforce security restrictions.

The user receives a sequence number that represents the CS object. The user
can then query the driver regarding the status of the CS, using that
sequence number.

In case the CS doesn't finish before the timeout expires, the driver will
perform a soft-reset of the device.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/Makefile |3 +-
 drivers/misc/habanalabs/command_submission.c |  787 +
 drivers/misc/habanalabs/context.c|   52 +-
 drivers/misc/habanalabs/device.c |   16 +
 drivers/misc/habanalabs/goya/goya.c  | 1082 ++
 drivers/misc/habanalabs/habanalabs.h |  274 +
 drivers/misc/habanalabs/habanalabs_drv.c |   23 +
 drivers/misc/habanalabs/habanalabs_ioctl.c   |4 +-
 drivers/misc/habanalabs/hw_queue.c   |  250 
 drivers/misc/habanalabs/memory.c |  200 
 include/uapi/misc/habanalabs.h   |  158 ++-
 11 files changed, 2842 insertions(+), 7 deletions(-)
 create mode 100644 drivers/misc/habanalabs/command_submission.c
 create mode 100644 drivers/misc/habanalabs/memory.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index b5607233d216..d2fd0e18b1eb 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -5,7 +5,8 @@
 obj-m  := habanalabs.o
 
 habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \
-   command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o
+   command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o memory.o \
+   command_submission.o
 
 include $(src)/goya/Makefile
 habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
new file mode 100644
index ..0116c2262f17
--- /dev/null
+++ b/drivers/misc/habanalabs/command_submission.c
@@ -0,0 +1,787 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2016-2018 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ */
+
+#include 
+#include "habanalabs.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void job_wq_completion(struct work_struct *work);
+static long _hl_cs_wait_ioctl(struct hl_device *hdev,
+   struct hl_ctx *ctx, u64 timeout_us, u64 seq);
+static void cs_do_release(struct kref *ref);
+
+static const char *hl_fence_get_driver_name(struct dma_fence *fence)
+{
+   return "HabanaLabs";
+}
+
+static const char *hl_fence_get_timeline_name(struct dma_fence *fence)
+{
+   struct hl_dma_fence *hl_fence =
+   container_of(fence, struct hl_dma_fence, base_fence);
+
+   return dev_name(hl_fence->hdev->dev);
+}
+
+static bool hl_fence_enable_signaling(struct dma_fence *fence)
+{
+   return true;
+}
+
+static void hl_fence_release(struct dma_fence *fence)
+{
+   struct hl_dma_fence *hl_fence =
+   container_of(fence, struct hl_dma_fence, base_fence);
+
+   kfree_rcu(hl_fence, base_fence.rcu);
+}
+
+static const struct dma_fence_ops hl_fence_ops = {
+   .get_driver_name = hl_fence_get_driver_name,
+   .get_timeline_name = hl_fence_get_timeline_name,
+   .enable_signaling = hl_fence_enable_signaling,
+   .wait = dma_fence_default_wait,
+   .release = hl_fence_release
+};
+
+static void cs_get(struct hl_cs *cs)
+{
+   kref_get(&cs->refcount);
+}
+
+static int cs_get_unless_zero(struct hl_cs *cs)
+{
+   return kref_get_unless_zero(&cs->refcount);
+}
+
+static void cs_put(struct hl_cs *cs)
+{
+   kref_put(&cs->refcount, cs_do_release);
+}
+
+/**
+ * cs_parser - parse the user command submission
+ *
+ * @hpriv  : pointer to the private data of the fd
+ * @job: pointer to the job that holds the command submission info
+ *
+ * The function parses the command submission of the user. It calls the
+ * ASIC specific parser, which returns a list of memory blocks to send
+ * to the device as different command buffers
+ *
+ */
+static int cs_parser(struct hl_fpriv *hpriv, struct hl_cs_job *job)
+{
+   struct hl_device *hdev = hpriv->hdev;
+   struct hl_cs_parser parser;
+   int rc;
+
+   parser.ctx_id = job->cs->ctx->asid;
+   parser.cs_sequence = job->cs->sequence;
+   parser.job_id = job->id;
+
+   parser.hw_queue_id = job->hw_queue_id;
+   parser.job_userptr_list = &job->userptr_list;
+   parser.patched_cb = NULL;
+   parser.user_cb = job->user_cb;
+

[PATCH 13/15] habanalabs: implement INFO IOCTL

2019-01-22 Thread Oded Gabbay

This patch implements the INFO IOCTL. That IOCTL is used by the user to
query information that is relevant/needed by the user in order to submit
deep learning jobs to Goya.

The information is divided into several categories, such as H/W IP, Events
that happened, DDR usage and more.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/goya/goya.c|   6 +
 drivers/misc/habanalabs/habanalabs.h   |   2 +
 drivers/misc/habanalabs/habanalabs_ioctl.c | 132 +
 include/uapi/misc/habanalabs.h |  76 +++-
 4 files changed, 215 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 94ee4cb00a49..c21c6046f09b 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -6120,6 +6120,11 @@ static void goya_hw_queues_unlock(struct hl_device *hdev)
spin_unlock(&goya->hw_queues_lock);
 }
 
+static u32 goya_get_pci_id(struct hl_device *hdev)
+{
+   return hdev->pdev->device;
+}
+
 int goya_get_eeprom_data(struct hl_device *hdev, void *data, size_t max_size)
 {
struct goya_device *goya = hdev->asic_specific;
@@ -6217,6 +6222,7 @@ static const struct hl_asic_funcs goya_funcs = {
.soft_reset_late_init = goya_soft_reset_late_init,
.hw_queues_lock = goya_hw_queues_lock,
.hw_queues_unlock = goya_hw_queues_unlock,
+   .get_pci_id = goya_get_pci_id,
.get_eeprom_data = goya_get_eeprom_data,
.send_cpu_message = goya_send_cpu_message
 };
diff --git a/drivers/misc/habanalabs/habanalabs.h b/drivers/misc/habanalabs/habanalabs.h
index 1abc139d4293..6c0fe76936be 100644
--- a/drivers/misc/habanalabs/habanalabs.h
+++ b/drivers/misc/habanalabs/habanalabs.h
@@ -462,6 +462,7 @@ enum hl_pll_frequency {
  * @soft_reset_late_init: perform certain actions needed after soft reset.
  * @hw_queues_lock: acquire H/W queues lock.
  * @hw_queues_unlock: release H/W queues lock.
+ * @get_pci_id: retrieve PCI ID.
  * @get_eeprom_data: retrieve EEPROM data from F/W.
  * @send_cpu_message: send buffer to ArmCP.
  */
@@ -530,6 +531,7 @@ struct hl_asic_funcs {
int (*soft_reset_late_init)(struct hl_device *hdev);
void (*hw_queues_lock)(struct hl_device *hdev);
void (*hw_queues_unlock)(struct hl_device *hdev);
+   u32 (*get_pci_id)(struct hl_device *hdev);
int (*get_eeprom_data)(struct hl_device *hdev, void *data,
size_t max_size);
int (*send_cpu_message)(struct hl_device *hdev, u32 *msg,
diff --git a/drivers/misc/habanalabs/habanalabs_ioctl.c b/drivers/misc/habanalabs/habanalabs_ioctl.c
index 6dcad810b821..067cf640ad50 100644
--- a/drivers/misc/habanalabs/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/habanalabs_ioctl.c
@@ -12,10 +12,142 @@
 #include 
 #include 
 
+static int hw_ip_info(struct hl_device *hdev, struct hl_info_args *args)
+{
+   struct hl_info_hw_ip_info hw_ip = {0};
+   u32 size = args->return_size;
+   void __user *out = (void __user *) (uintptr_t) args->return_pointer;
+   struct asic_fixed_properties *prop = &hdev->asic_prop;
+   u64 sram_kmd_size, dram_kmd_size;
+
+   if ((!size) || (!out))
+   return -EINVAL;
+
+   sram_kmd_size = (prop->sram_user_base_address -
+   prop->sram_base_address);
+   dram_kmd_size = (prop->dram_user_base_address -
+   prop->dram_base_address);
+
+   hw_ip.device_id = hdev->asic_funcs->get_pci_id(hdev);
+   hw_ip.sram_base_address = prop->sram_user_base_address;
+   hw_ip.dram_base_address = prop->dram_user_base_address;
+   hw_ip.tpc_enabled_mask = prop->tpc_enabled_mask;
+   hw_ip.sram_size = prop->sram_size - sram_kmd_size;
+   hw_ip.dram_size = prop->dram_size - dram_kmd_size;
+   if (hw_ip.dram_size > 0)
+   hw_ip.dram_enabled = 1;
+   hw_ip.num_of_events = prop->num_of_events;
+   memcpy(hw_ip.armcp_version,
+   prop->armcp_info.armcp_version, VERSION_MAX_LEN);
+   hw_ip.armcp_cpld_version = prop->armcp_info.cpld_version;
+   hw_ip.psoc_pci_pll_nr = prop->psoc_pci_pll_nr;
+   hw_ip.psoc_pci_pll_nf = prop->psoc_pci_pll_nf;
+   hw_ip.psoc_pci_pll_od = prop->psoc_pci_pll_od;
+   hw_ip.psoc_pci_pll_div_factor = prop->psoc_pci_pll_div_factor;
+
+   return copy_to_user(out, &hw_ip,
+   min((size_t)size, sizeof(hw_ip))) ? -EFAULT : 0;
+}
+
+static int hw_events_info(struct hl_device *hdev, struct hl_info_args *args)
+{
+   u32 size, max_size = args->return_size;
+   void __user *out = (void __user *) (uintptr_t) args->return_pointer;
+   void *arr;
+
+   if ((!max_size) || (!out))
+   return -EINVAL;
+
+   arr = hdev->asic_funcs->get_events_stat(hdev, &size);
+
+   return copy_to_user(out, arr, min(max_size, size)) ? -EFAULT : 0;
+}
+
+static int dram_usage_info(struct hl_device *hdev, struct hl_info_args

[PATCH 01/15] habanalabs: add skeleton driver

2019-01-22 Thread Oded Gabbay

This patch adds the habanalabs skeleton driver. The driver does nothing at
this stage except very basic operations. It contains the minimal code to
insmod and rmmod the driver and to create a /dev/hlX file per PCI device.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/Kconfig  |   1 +
 drivers/misc/Makefile |   1 +
 drivers/misc/habanalabs/Kconfig   |  22 ++
 drivers/misc/habanalabs/Makefile  |   7 +
 drivers/misc/habanalabs/device.c  | 331 
 drivers/misc/habanalabs/habanalabs.h  | 149 +++
 drivers/misc/habanalabs/habanalabs_drv.c  | 366 ++
 .../habanalabs/include/habanalabs_device_if.h | 125 ++
 8 files changed, 1002 insertions(+)
 create mode 100644 drivers/misc/habanalabs/Kconfig
 create mode 100644 drivers/misc/habanalabs/Makefile
 create mode 100644 drivers/misc/habanalabs/device.c
 create mode 100644 drivers/misc/habanalabs/habanalabs.h
 create mode 100644 drivers/misc/habanalabs/habanalabs_drv.c
 create mode 100644 drivers/misc/habanalabs/include/habanalabs_device_if.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f417b06e11c5..fecab53c4f21 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -535,4 +535,5 @@ source "drivers/misc/echo/Kconfig"
 source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
+source "drivers/misc/habanalabs/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index e39ccbbc1b3a..ae77dfd790a4 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -59,3 +59,4 @@ obj-$(CONFIG_PCI_ENDPOINT_TEST)   += pci_endpoint_test.o
 obj-$(CONFIG_OCXL) += ocxl/
 obj-y  += cardreader/
 obj-$(CONFIG_PVPANIC)  += pvpanic.o
+obj-$(CONFIG_HABANA_AI)+= habanalabs/
diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig
new file mode 100644
index ..b7f38a14caf5
--- /dev/null
+++ b/drivers/misc/habanalabs/Kconfig
@@ -0,0 +1,22 @@
+#
+# HabanaLabs AI accelerators driver
+#
+
+config HABANA_AI
+   tristate "HabanaAI accelerators (habanalabs)"
+   depends on PCI
+   select FRAME_VECTOR
+   help
+ Enables PCIe card driver for Habana's AI Processors (AIP) that are
+ designed to accelerate Deep Learning inference and training workloads.
+
+ The driver manages the PCIe devices and provides IOCTL interface for
+ the user to submit workloads to the devices.
+
+ The user-space interface is described in
+ include/uapi/misc/habanalabs.h
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called habanalabs.
diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
new file mode 100644
index ..b41433a09e02
--- /dev/null
+++ b/drivers/misc/habanalabs/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for HabanaLabs AI accelerators driver
+#
+
+obj-m  := habanalabs.o
+
+habanalabs-y := habanalabs_drv.o device.o
\ No newline at end of file
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
new file mode 100644
index ..376b55eb73d4
--- /dev/null
+++ b/drivers/misc/habanalabs/device.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2016-2018 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ */
+
+#include "habanalabs.h"
+
+#include 
+#include 
+#include 
+
+static void hpriv_release(struct kref *ref)
+{
+   struct hl_fpriv *hpriv;
+   struct hl_device *hdev;
+
+   hpriv = container_of(ref, struct hl_fpriv, refcount);
+
+   hdev = hpriv->hdev;
+
+   put_pid(hpriv->taskpid);
+
+   kfree(hpriv);
+}
+
+void hl_hpriv_get(struct hl_fpriv *hpriv)
+{
+   kref_get(&hpriv->refcount);
+}
+
+void hl_hpriv_put(struct hl_fpriv *hpriv)
+{
+   kref_put(&hpriv->refcount, hpriv_release);
+}
+
+/**
+ * hl_device_release - release function for habanalabs device
+ *
+ * @inode: pointer to inode structure
+ * @filp: pointer to file structure
+ *
+ * Called when process closes an habanalabs device
+ */
+static int hl_device_release(struct inode *inode, struct file *filp)
+{
+   struct hl_fpriv *hpriv = filp->private_data;
+
+   filp->private_data = NULL;
+
+   hl_hpriv_put(hpriv);
+
+   return 0;
+}
+
+static const struct file_operations hl_ops = {
+   .owner = THIS_MODULE,
+   .open = hl_device_open,
+   .release = hl_device_release
+};
+
+/**
+ * device_setup_cdev - setup cdev and device for habanalabs device
+ *
+ * @hdev: pointer to habanalabs device structure
+ * @hclass: pointer to the class object of the device
+ * @minor: minor number of the specific device
+ * @fpos : file operations to install for this device
+ *
+ * Create a cdev and a Linux device for habanalabs's device. Need to be
+ * called at the end of the

[PATCH 03/15] habanalabs: add basic Goya support

2019-01-22 Thread Oded Gabbay

This patch adds basic support for the Goya device. The code initializes
the device's PCI controller and PCI bars. It also initializes various S/W
structures and adds some basic helper functions.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/Makefile|   5 +-
 drivers/misc/habanalabs/device.c|  71 +++
 drivers/misc/habanalabs/goya/Makefile   |   3 +
 drivers/misc/habanalabs/goya/goya.c | 633 
 drivers/misc/habanalabs/goya/goyaP.h| 125 
 drivers/misc/habanalabs/habanalabs.h| 131 
 drivers/misc/habanalabs/habanalabs_drv.c|   3 +
 drivers/misc/habanalabs/include/goya/goya.h | 115 
 8 files changed, 1085 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/habanalabs/goya/Makefile
 create mode 100644 drivers/misc/habanalabs/goya/goya.c
 create mode 100644 drivers/misc/habanalabs/goya/goyaP.h
 create mode 100644 drivers/misc/habanalabs/include/goya/goya.h

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index b41433a09e02..6f1ead69bd77 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -4,4 +4,7 @@
 
 obj-m  := habanalabs.o
 
-habanalabs-y := habanalabs_drv.o device.o
\ No newline at end of file
+habanalabs-y := habanalabs_drv.o device.o
+
+include $(src)/goya/Makefile
+habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 376b55eb73d4..a4276ef559b3 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -116,8 +116,11 @@ static int device_setup_cdev(struct hl_device *hdev, struct class *hclass,
  */
 static int device_early_init(struct hl_device *hdev)
 {
+   int rc;
+
switch (hdev->asic_type) {
case ASIC_GOYA:
+   goya_set_asic_funcs(hdev);
sprintf(hdev->asic_name, "GOYA");
break;
default:
@@ -126,6 +129,10 @@ static int device_early_init(struct hl_device *hdev)
return -EINVAL;
}
 
+   rc = hdev->asic_funcs->early_init(hdev);
+   if (rc)
+   return rc;
+
return 0;
 }
 
@@ -137,6 +144,10 @@ static int device_early_init(struct hl_device *hdev)
  */
 static void device_early_fini(struct hl_device *hdev)
 {
+
+   if (hdev->asic_funcs->early_fini)
+   hdev->asic_funcs->early_fini(hdev);
+
 }
 
 /**
@@ -150,8 +161,15 @@ static void device_early_fini(struct hl_device *hdev)
  */
 int hl_device_suspend(struct hl_device *hdev)
 {
+   int rc;
+
pci_save_state(hdev->pdev);
 
+   rc = hdev->asic_funcs->suspend(hdev);
+   if (rc)
+   dev_err(hdev->dev,
+   "Failed to disable PCI access of device CPU\n");
+
/* Shut down the device */
pci_disable_device(hdev->pdev);
pci_set_power_state(hdev->pdev, PCI_D3hot);
@@ -181,6 +199,13 @@ int hl_device_resume(struct hl_device *hdev)
return rc;
}
 
+   rc = hdev->asic_funcs->resume(hdev);
+   if (rc) {
+   dev_err(hdev->dev,
+   "Failed to enable PCI access from device CPU\n");
+   return rc;
+   }
+
return 0;
 }
 
@@ -208,11 +233,21 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
if (rc)
goto release_device;
 
+   /*
+* Start calling ASIC initialization. First S/W then H/W and finally
+* late init
+*/
+   rc = hdev->asic_funcs->sw_init(hdev);
+   if (rc)
+   goto early_fini;
+
dev_notice(hdev->dev,
"Successfully added device to habanalabs driver\n");
 
return 0;
 
+early_fini:
+   device_early_fini(hdev);
 release_device:
device_destroy(hclass, hdev->dev->devt);
cdev_del(&hdev->cdev);
@@ -243,6 +278,9 @@ void hl_device_fini(struct hl_device *hdev)
/* Mark device as disabled */
hdev->disabled = true;
 
+   /* Call ASIC S/W finalize function */
+   hdev->asic_funcs->sw_fini(hdev);
+
device_early_fini(hdev);
 
/* Hide device from user */
@@ -329,3 +367,36 @@ int hl_poll_timeout_device_memory(struct hl_device *hdev, void __iomem *addr,
 
return (*val ? 0 : -ETIMEDOUT);
 }
+
+/*
+ * MMIO register access helper functions.
+ */
+
+/**
+ * hl_rreg - Read an MMIO register
+ *
+ * @hdev: pointer to habanalabs device structure
+ * @reg: MMIO register offset (in bytes)
+ *
+ * Returns the value of the MMIO register we are asked to read
+ *
+ */
+inline u32 hl_rreg(struct hl_device *hdev, u32 reg)
+{
+   return readl(hdev->rmmio + reg);
+}
+
+/**
+ * hl_wreg - Write to an MMIO register
+ *
+ * @hdev: pointer to habanalabs device structure
+ * @reg: MMIO register offset (in bytes)
+ * @val: 32-bit value
+ *
+ * Writes the 32-bit value into the MMIO register
+ *
+ */
+inline void hl_wreg(struct hl_device *hdev, u32 reg, u32 val)
+{
+

[PATCH 00/15] Habana Labs kernel driver

2019-01-22 Thread Oded Gabbay

Hello,

For those who don't know me, my name is Oded Gabbay (kernel maintainer
for AMD's amdkfd driver, previously in Red Hat's Desktop group) and I have
worked at Habana Labs since its inception two and a half years ago.

Habana is a leading startup in the emerging AI processor space and we have
already started production of our first Goya inference processor PCIe card
and delivered it to customers. The Goya processor silicon has been tested
since June of 2018 and is production-qualified by now. The Gaudi training
processor solution is slated to sample in the second quarter of 2019.

This patch-set contains the kernel driver for Habana's AI Processors 
(AIP) that are designed to accelerate Deep Learning inference and training
workloads. The current version supports only the Goya processor;
support for Gaudi will be upstreamed once the ASIC is available to
customers.

The Goya processor has been designed from the ground up for deep learning
inference workloads. It comprises a cluster of eight fully programmable
Tensor Processing Cores (TPC). The TPC core is a VLIW SIMD vector
processor with ISA and hardware that was tailored to serve deep learning
workloads efficiently. 

In addition, Goya contains software-managed, on-die memory along with five
separate DMA channels, a PCIe Gen4 x16 system interface and 4/8/16GB of
DDR4 memory.

Goya has 3 PCI bars (64-bit), which are not exposed to user-space. They
map the on-chip memory and configuration space (bar 0-1), MSI-X table 
(bar 2-3) and DDR4 memory (bar 4-5).

Each TPC engine and DMA channel has a H/W queue attached to it, called
QMAN. The S/W provides command buffers to the H/W queues (through the
kernel driver) and the H/W consumes the command buffers. To prevent
malicious users from stealing data from other users through the Host or
Device memory, Goya has an internal MMU and a security protection scheme.
In addition, the kernel driver parses the command buffer and rejects it if
it contains disallowed commands.

The QMANs are triggered by a write to a PI (producer index) register. The
QMAN H/W logic maintains a CI (consumer index) register. When PI==CI, the
queue is empty. When PI+1==CI, the queue is full (note the queue is
cyclic). Each entry in the H/W queue is 16 bytes and contains
a pointer and length of a variable-size command buffer, which the user
fills with specific commands that the H/W logic can read and execute.
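
To illustrate the PI/CI rule described above, here is a minimal user-space
sketch in C. It is not taken from the driver: QUEUE_LEN and the helper
names are hypothetical, and the sketch assumes the queue length is a power
of two so that wrap-around can be done with a mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of entries; assumed to be a power of two. */
#define QUEUE_LEN 8u

/* Advance an index by one entry, wrapping at the end of the cyclic queue. */
static uint32_t queue_next(uint32_t idx)
{
	return (idx + 1) & (QUEUE_LEN - 1);
}

/* The queue is empty when the producer index equals the consumer index. */
static bool queue_is_empty(uint32_t pi, uint32_t ci)
{
	return pi == ci;
}

/* The queue is full when advancing PI by one entry would reach CI. */
static bool queue_is_full(uint32_t pi, uint32_t ci)
{
	return queue_next(pi) == ci;
}
```

One consequence of this scheme is that one slot always stays unused: a
queue of QUEUE_LEN entries holds at most QUEUE_LEN - 1 pending entries,
which is what lets "empty" and "full" be told apart from the two indexes
alone.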

For each DMA QMAN, there is a completion queue that the QMAN writes to
when it finishes the execution of the command buffer. The QMAN also
sends an MSI-X interrupt after writing the completion entry.

Inference workloads running on Goya are associated with an address space
through the ASID (address-space ID) property. Goya supports up to 1024
ASIDs. The ASID value is updated by the kernel driver in the relevant
registers before scheduling a workload.

During its initialization, the driver registers itself to the PCI
subsystem. For each Habana PCI device found, a char device node (/dev/hlX)
is created.

The driver currently exposes a total of five IOCTLs. One IOCTL allows
the application to submit workloads to the device, and another to wait on
completion of submitted workloads. The other three IOCTLs are used for
memory management, command buffer creation and information/status
retrieval.

In addition, the driver exposes several sensors through the hwmon
subsystem and provides various system-level information in sysfs for
system administrators.

The first step for an application process is to open the correct hlX
device it wants to work with. Calls to open create a new "context" for
that application in the driver's internal structures and a unique ASID
is assigned to that context. The context object lives until the process
releases the file descriptor AND its command submissions have finished
executing on the device.
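
The context lifetime rule above (alive until the file descriptor is
released AND all command submissions have finished) is the usual
reference-counting pattern. The following is only a simplified user-space
sketch of that idea; struct ctx_sketch and its helpers are hypothetical
names, and the plain int counter stands in for the atomic kref that the
patches use for similar objects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-process context. */
struct ctx_sketch {
	int refcount;   /* one reference per holder (fd, in-flight submission) */
	bool released;  /* set once the last reference has been dropped */
};

/* Called on open: the file descriptor holds the first reference. */
static void ctx_init(struct ctx_sketch *ctx)
{
	ctx->refcount = 1;
	ctx->released = false;
}

/* Called when a command submission starts using the context. */
static void ctx_get(struct ctx_sketch *ctx)
{
	ctx->refcount++;
}

/* Called on fd release or submission completion; frees on the last put. */
static void ctx_put(struct ctx_sketch *ctx)
{
	if (--ctx->refcount == 0)
		ctx->released = true;
}
```

With this scheme, closing the file descriptor while a submission is still
in flight only drops one reference; the context is torn down when the
submission later drops the last one.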

The next step is for the application to request information about the
device, such as the amount of DDR4 memory. The application can then go on
to create command buffers for its command submissions and allocate and map
device or host memory (host memory can only be mapped) to the device's
internal MMU subsystem.

At this point the application can load various deep learning
topologies to the device DDR memory. After that, it can start to submit
inference workloads using those topologies. For each workload, the
application receives a sequence number that represents the workload.
The application can then query the driver regarding the status of the
workload using that sequence number.

If a workload does not finish executing within 5 seconds (configurable
using a kernel module parameter) of the time it was scheduled to run, a
TDR (timeout detection & recovery) event occurs in the driver. The driver
will then mark that workload as "timed out", perform a minimal reset of
the device (DMA and compute units only) and abort all other workloads of
that context that were already submitted to the H/W queues.

I would appreciate any
