Re: [PATCH kernel] vfio-pci/nvlink2: Fix ancient gcc warnings
Hi Geert,

The below patch comes about from the build regressions and improvements
list you've sent out, but it doesn't add up that we'd be testing with a
compiler so old that initialization with { 0 } generates a "missing
braces around initialization" warning. Is this really the case or are we
missing something here? There's no harm that I can see with Alexey's
fix, but are these really just false positives from a compiler bug that
we should selectively ignore if the "fix" is less clean?

Thanks,
Alex

On Wed, 23 Jan 2019 15:07:11 +1100 Alexey Kardashevskiy wrote:
> Using the {0} construct as a generic initializer is perfectly fine in C,
> however due to a bug in old gcc there is a warning:
>
> + /kisskb/src/drivers/vfio/pci/vfio_pci_nvlink2.c: warning: (near
> initialization for 'cap.header') [-Wmissing-braces]: => 181:9
>
> Since for whatever reason we still want to compile the modern kernel
> with such an old gcc without warnings, this changes the capabilities
> initialization.
>
> The gcc bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
>
> Signed-off-by: Alexey Kardashevskiy
> ---
>  drivers/vfio/pci/vfio_pci_nvlink2.c | 30 ++---
>  1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
> index 054a2cf..91d945b 100644
> --- a/drivers/vfio/pci/vfio_pci_nvlink2.c
> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
> @@ -178,11 +178,11 @@ static int vfio_pci_nvgpu_add_capability(struct vfio_pci_device *vdev,
>  		struct vfio_pci_region *region, struct vfio_info_cap *caps)
>  {
>  	struct vfio_pci_nvgpu_data *data = region->data;
> -	struct vfio_region_info_cap_nvlink2_ssatgt cap = { 0 };
> -
> -	cap.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT;
> -	cap.header.version = 1;
> -	cap.tgt = data->gpu_tgt;
> +	struct vfio_region_info_cap_nvlink2_ssatgt cap = {
> +		.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT,
> +		.header.version = 1,
> +		.tgt = data->gpu_tgt
> +	};
>
>  	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
>  }
> @@ -365,18 +365,18 @@ static int vfio_pci_npu2_add_capability(struct vfio_pci_device *vdev,
>  		struct vfio_pci_region *region, struct vfio_info_cap *caps)
>  {
>  	struct vfio_pci_npu2_data *data = region->data;
> -	struct vfio_region_info_cap_nvlink2_ssatgt captgt = { 0 };
> -	struct vfio_region_info_cap_nvlink2_lnkspd capspd = { 0 };
> +	struct vfio_region_info_cap_nvlink2_ssatgt captgt = {
> +		.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT,
> +		.header.version = 1,
> +		.tgt = data->gpu_tgt
> +	};
> +	struct vfio_region_info_cap_nvlink2_lnkspd capspd = {
> +		.header.id = VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD,
> +		.header.version = 1,
> +		.link_speed = data->link_speed
> +	};
>  	int ret;
>
> -	captgt.header.id = VFIO_REGION_INFO_CAP_NVLINK2_SSATGT;
> -	captgt.header.version = 1;
> -	captgt.tgt = data->gpu_tgt;
> -
> -	capspd.header.id = VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD;
> -	capspd.header.version = 1;
> -	capspd.link_speed = data->link_speed;
> -
>  	ret = vfio_info_add_capability(caps, &captgt.header, sizeof(captgt));
>  	if (ret)
>  		return ret;
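
The quoted patch swaps `{ 0 }` plus field-by-field assignment for a designated initializer. A minimal sketch of why the two forms should end up with the same contents, assuming hypothetical stand-in structs (`cap_header`, `cap_ssatgt`) rather than the real VFIO uapi definitions: members that a designated initializer does not name are zero-initialized, so nothing is left indeterminate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the capability structs; not the real uapi layout. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

struct cap_ssatgt {
	struct cap_header header;
	uint64_t tgt;
};

/* Old style: zero everything with { 0 }, then assign the fields one by one. */
static struct cap_ssatgt make_cap_assign(uint16_t id, uint64_t tgt)
{
	struct cap_ssatgt cap = { 0 };

	cap.header.id = id;
	cap.header.version = 1;
	cap.tgt = tgt;
	return cap;
}

/* New style: designated initializer; the unnamed member header.next is zeroed. */
static struct cap_ssatgt make_cap_designated(uint16_t id, uint64_t tgt)
{
	struct cap_ssatgt cap = {
		.header.id = id,
		.header.version = 1,
		.tgt = tgt
	};

	return cap;
}

/* Compare member by member; returns 1 when both structs hold the same values. */
static int cap_equal(const struct cap_ssatgt *a, const struct cap_ssatgt *b)
{
	return a->header.id == b->header.id &&
	       a->header.version == b->header.version &&
	       a->header.next == b->header.next &&
	       a->tgt == b->tgt;
}
```

Calling both helpers with the same `id` and `tgt` and passing the results to `cap_equal()` is one way to check the equivalence; whether a given old gcc warns on the `{ 0 }` form depends on the compiler version and flags.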
Re: [PATCH] ath: move spin_lock_bh to spin_lock in tasklet
姜智伟 writes:

> Will do, thanks!

Also don't send HTML mail :) Mailing lists drop those automatically.

--
Kalle Valo
Re: linux-next: Fixes tag needs some work in the cpufreq-arm tree
Hi Viresh,

On Fri, 18 Jan 2019 11:08:02 +0530 Viresh Kumar wrote:
>
> I missed looking into that. You must be running some sort of sanity
> checks on the branch itself, can I know what exactly are you doing so
> that I can try the same.

I have attached my current script. I run this on the range of new
commits for each tree each day. Suggestions welcome! :-)

--
Cheers,
Stephen Rothwell

Attachment: check_fixes (application/shellscript)
Re: [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM
On Wed, Jan 23, 2019 at 01:03:46AM +0800, Tiwei Bie wrote:
> This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
> When this feature is negotiated, driver will use the barriers
> suitable for hardware devices.
>
> Signed-off-by: Tiwei Bie

Could you pls add a bit more explanation in the commit log?
E.g. which configurations are broken without this patch?
How severe is the problem?

I'm trying to decide whether this belongs in 5.0 or 5.1.

> ---
>  drivers/virtio/virtio_ring.c       | 8
>  include/uapi/linux/virtio_config.h | 6 ++
>  2 files changed, 14 insertions(+)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755484e3..27d3f057493e 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1609,6 +1609,9 @@ static struct virtqueue *vring_create_virtqueue_packed(
>  		!context;
>  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>
> +	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> +		vq->weak_barriers = false;
> +
>  	vq->packed.ring_dma_addr = ring_dma_addr;
>  	vq->packed.driver_event_dma_addr = driver_event_dma_addr;
>  	vq->packed.device_event_dma_addr = device_event_dma_addr;
> @@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>  		!context;
>  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>
> +	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> +		vq->weak_barriers = false;
> +
>  	vq->split.queue_dma_addr = 0;
>  	vq->split.queue_size_in_bytes = 0;
>
> @@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device *vdev)
>  			break;
>  		case VIRTIO_F_RING_PACKED:
>  			break;
> +		case VIRTIO_F_ORDER_PLATFORM:
> +			break;
>  		default:
>  			/* We don't understand this bit. */
>  			__virtio_clear_bit(vdev, i);
> diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> index 1196e1c1d4f6..ff8e7dc9d4dd 100644
> --- a/include/uapi/linux/virtio_config.h
> +++ b/include/uapi/linux/virtio_config.h
> @@ -78,6 +78,12 @@
>  /* This feature indicates support for the packed virtqueue layout. */
>  #define VIRTIO_F_RING_PACKED		34
>
> +/*
> + * This feature indicates that memory accesses by the driver and the
> + * device are ordered in a way described by the platform.
> + */
> +#define VIRTIO_F_ORDER_PLATFORM		36
> +
>  /*
>   * Does the device support Single Root I/O Virtualization?
>   */
> --
> 2.17.1
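
The decision the two virtio_ring.c hunks add can be sketched in user space as below. This is only an illustration: `has_feature()` and `pick_weak_barriers()` are hypothetical helpers, not kernel APIs, and the idea that the ring starts from a transport-requested setting is an assumption. Only the bit number 36 and the "feature negotiated means weak_barriers = false" rule come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ORDER_PLATFORM 36 /* bit number taken from the patch */

/* Hypothetical helper: test one bit in a 64-bit negotiated-feature mask. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features >> bit) & 1;
}

/*
 * Hypothetical helper mirroring the added hunk: keep the requested setting
 * unless VIRTIO_F_ORDER_PLATFORM was negotiated, in which case the weaker
 * barriers are switched off.
 */
static bool pick_weak_barriers(bool requested_weak, uint64_t features)
{
	if (has_feature(features, VIRTIO_F_ORDER_PLATFORM))
		return false;
	return requested_weak;
}
```

With no features negotiated the requested value passes through unchanged; with bit 36 set the result is always false, which is the only behavior change the patch makes in the ring code.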
Re: question about head_64.S
On 1/22/19 9:08 PM, Kirill A. Shutemov wrote:
> On Tue, Jan 22, 2019 at 03:31:25PM +0800, Cao jin wrote:
>> Hi, Kirill,
>>
>>> 2.
>>> Why gdt64 has following definition?:
>>>
>>> gdt64:
>>> 	.word	gdt_end - gdt
>>> 	.long	0
>>> 	.word	0
>>> 	.quad	0
>>>
>>> obviously, gdt64 stores the GDTR content under x86_64, which is 10 bytes
>>> long, so why not just:
>>>
>>> gdt64:
>>> 	.word	gdt_end - gdt
>>> 	.quad	0
>>>
>>> With above modification, it can boot.
>>>
>>
>> Seems you introduced gdt64 code in commit beebaccd50, could you help
>> with this question?
>
> Looks like you are right. I've got confused at some point.
>
> Could you prepare a patch?

Sure.

>
>> And it also remind me of another question about adjust_got which is also
>> introduced by you. Because I failed to construct a test environment with
>> ld version less than 2.24 until now, so I wanna do a quick ask here:
>> does it make sense to adjust GOT from the 4th entry of it? Because as I
>> know, the first 3 entries are special one, which (I guess) will be not used.
>
> No.
>
> These 3 entries are reserved for a special symbols (like entry 0 for
> _DYNAMIC). It means linker should not use these entries for normal
> symbols, but it doesn't mean that they don't need to be adjusted during
> the load.
>

Thanks for your info! BTW, could I know how you construct the test
environment? I tried centos6, the GCC version is too old to compile;
then tried fedora28 with binutils-2.20.51.0.2-5.48.el6.x86_64.rpm from
centos6, ld reported errors; and then tried compiling binutils source
with tag 2.23, stopped at configure phase:(

--
Sincerely,
Cao jin
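
The size argument in the quoted question can be checked with a small C sketch. The struct name `gdtr64` and the two enum constants are made up for illustration, and `__attribute__((packed))` is a GCC/Clang extension; the sketch only adds up the directive sizes quoted above (.word = 2, .long = 4, .quad = 8 bytes) and compares them with a 16-bit limit followed by a 64-bit base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative 64-bit GDTR image: 16-bit limit followed by 64-bit base. */
struct gdtr64 {
	uint16_t limit;
	uint64_t base;
} __attribute__((packed));

/* Bytes emitted by the existing definition: .word, .long, .word, .quad */
enum { GDT64_ORIGINAL_BYTES = 2 + 4 + 2 + 8 };

/* Bytes emitted by the proposed definition: .word, .quad */
enum { GDT64_PROPOSED_BYTES = 2 + 8 };

/* The packed descriptor is 10 bytes, with the base right after the limit. */
_Static_assert(sizeof(struct gdtr64) == 10, "GDTR image should be 10 bytes");
_Static_assert(offsetof(struct gdtr64, base) == 2, "base follows the limit");
_Static_assert(GDT64_PROPOSED_BYTES == sizeof(struct gdtr64),
	       "proposed layout matches the descriptor size");
```

Under these assumptions the existing definition reserves 16 bytes while the proposed one reserves exactly the 10 bytes the descriptor needs, which matches the observation in the thread.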
Re: [virtio-dev] [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM
On Wed, Jan 23, 2019 at 11:08:04AM +0800, Jason Wang wrote:
>
> On 2019/1/23 1:03 AM, Tiwei Bie wrote:
> > This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
> > When this feature is negotiated, driver will use the barriers
> > suitable for hardware devices.
> >
> > Signed-off-by: Tiwei Bie
> > ---
> >  drivers/virtio/virtio_ring.c       | 8
> >  include/uapi/linux/virtio_config.h | 6 ++
> >  2 files changed, 14 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..27d3f057493e 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -1609,6 +1609,9 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >  		!context;
> >  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > +	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> > +		vq->weak_barriers = false;
> > +
> >  	vq->packed.ring_dma_addr = ring_dma_addr;
> >  	vq->packed.driver_event_dma_addr = driver_event_dma_addr;
> >  	vq->packed.device_event_dma_addr = device_event_dma_addr;
> > @@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >  		!context;
> >  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > +	if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
> > +		vq->weak_barriers = false;
> > +
> >  	vq->split.queue_dma_addr = 0;
> >  	vq->split.queue_size_in_bytes = 0;
> > @@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device *vdev)
> >  			break;
> >  		case VIRTIO_F_RING_PACKED:
> >  			break;
> > +		case VIRTIO_F_ORDER_PLATFORM:
> > +			break;
> >  		default:
> >  			/* We don't understand this bit. */
> >  			__virtio_clear_bit(vdev, i);
> > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > index 1196e1c1d4f6..ff8e7dc9d4dd 100644
> > --- a/include/uapi/linux/virtio_config.h
> > +++ b/include/uapi/linux/virtio_config.h
> > @@ -78,6 +78,12 @@
> >  /* This feature indicates support for the packed virtqueue layout. */
> >  #define VIRTIO_F_RING_PACKED		34
> > +/*
> > + * This feature indicates that memory accesses by the driver and the
> > + * device are ordered in a way described by the platform.
> > + */
> > +#define VIRTIO_F_ORDER_PLATFORM		36
> > +
> >  /*
> >   * Does the device support Single Root I/O Virtualization?
> >   */
>
>
> I wonder whether or not this is sufficient. Is dma barrier implies a mmio
> barrier? Looks not.

IIUC we don't need an mmio barrier because we are using a serializing
API. Documentation/memory-barriers.txt says:

	Note that, when using writel(), a prior wmb() is not needed to
	guarantee that the cache coherent memory writes have completed
	before writing to the MMIO region.

> See ia64/include/asm/barrier.h:
>
>  * Note: "mb()" and its variants cannot be used as a fence to order
>  * accesses to memory mapped I/O registers. For that, mf.a needs to
>  * be used. However, we don't want to always use mf.a because (a)
>  * it's (presumably) much slower than mf and (b) mf.a is supported for
>  * sequential memory pages only.
>  */
> #define mb()		ia64_mf()
> #define rmb()		mb()
> #define wmb()		mb()
>
> #define dma_rmb()	mb()
> #define dma_wmb()	mb()
>
> Thanks

Frankly no idea about ia64. Sorry. Are any less esoteric platforms
affected?

--
MST
[PATCH v2 3/4] locking/qspinlock_stat: Separate out the PV specific stat counts
Some of the statistics counts are for PV qspinlocks only and are not
applicable if PARAVIRT_SPINLOCKS isn't configured. So make those counts
dependent on the PARAVIRT_SPINLOCKS config option now.

Signed-off-by: Waiman Long
---
 kernel/locking/qspinlock_stat.h | 129 +---
 1 file changed, 81 insertions(+), 48 deletions(-)

diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index 31728f6..7a0a848 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -49,6 +49,7 @@
  * There may be slight difference between pv_kick_wake and pv_kick_unlock.
  */
 enum qlock_stats {
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
 	qstat_pv_hash_hops,
 	qstat_pv_kick_unlock,
 	qstat_pv_kick_wake,
@@ -60,6 +61,7 @@ enum qlock_stats {
 	qstat_pv_wait_early,
 	qstat_pv_wait_head,
 	qstat_pv_wait_node,
+#endif
 	qstat_lock_pending,
 	qstat_lock_slowpath,
 	qstat_lock_use_node2,
@@ -80,6 +82,7 @@ enum qlock_stats {
 #include

 static const char * const qstat_names[qstat_num + 1] = {
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
 	[qstat_pv_hash_hops] = "pv_hash_hops",
 	[qstat_pv_kick_unlock] = "pv_kick_unlock",
 	[qstat_pv_kick_wake] = "pv_kick_wake",
@@ -91,6 +94,7 @@ enum qlock_stats {
 	[qstat_pv_wait_early] = "pv_wait_early",
 	[qstat_pv_wait_head] = "pv_wait_head",
 	[qstat_pv_wait_node] = "pv_wait_node",
+#endif
 	[qstat_lock_pending] = "lock_pending",
 	[qstat_lock_slowpath] = "lock_slowpath",
 	[qstat_lock_use_node2] = "lock_use_node2",
@@ -104,6 +108,20 @@ enum qlock_stats {
  * Per-cpu counters
  */
 static DEFINE_PER_CPU(unsigned long, qstats[qstat_num]);
+
+/*
+ * Increment the PV qspinlock statistical counters
+ */
+static inline void qstat_inc(enum qlock_stats stat, bool cond)
+{
+	if (cond)
+		this_cpu_inc(qstats[stat]);
+}
+
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+/*
+ * PV specific per-cpu counters
+ */
 static DEFINE_PER_CPU(u64, pv_kick_time);

 /*
@@ -178,6 +196,69 @@ static ssize_t qstat_read(struct file *file, char __user *user_buf,
 }

 /*
+ * PV hash hop count
+ */
+static inline void qstat_hop(int hopcnt)
+{
+	this_cpu_add(qstats[qstat_pv_hash_hops], hopcnt);
+}
+
+/*
+ * Replacement function for pv_kick()
+ */
+static inline void __pv_kick(int cpu)
+{
+	u64 start = sched_clock();
+
+	per_cpu(pv_kick_time, cpu) = start;
+	pv_kick(cpu);
+	this_cpu_add(qstats[qstat_pv_latency_kick], sched_clock() - start);
+}
+
+/*
+ * Replacement function for pv_wait()
+ */
+static inline void __pv_wait(u8 *ptr, u8 val)
+{
+	u64 *pkick_time = this_cpu_ptr(&pv_kick_time);
+
+	*pkick_time = 0;
+	pv_wait(ptr, val);
+	if (*pkick_time) {
+		this_cpu_add(qstats[qstat_pv_latency_wake],
+			     sched_clock() - *pkick_time);
+		qstat_inc(qstat_pv_kick_wake, true);
+	}
+}
+
+#define pv_kick(c)	__pv_kick(c)
+#define pv_wait(p, v)	__pv_wait(p, v)
+
+#else /* CONFIG_PARAVIRT_SPINLOCKS */
+static ssize_t qstat_read(struct file *file, char __user *user_buf,
+			  size_t count, loff_t *ppos)
+{
+	char buf[64];
+	int cpu, counter, len;
+	u64 stat = 0;
+
+	/*
+	 * Get the counter ID stored in file->f_inode->i_private
+	 */
+	counter = (long)file_inode(file)->i_private;
+
+	if (counter >= qstat_num)
+		return -EBADF;
+
+	for_each_possible_cpu(cpu)
+		stat += per_cpu(qstats[counter], cpu);
+	len = snprintf(buf, sizeof(buf) - 1, "%llu\n", stat);
+
+	return simple_read_from_buffer(user_buf, count, ppos, buf, len);
+}
+#endif /* CONFIG_PARAVIRT_SPINLOCKS */
+
+/*
  * Function to handle write request
  *
  * When counter = reset_cnts, reset all the counter values.
@@ -250,54 +331,6 @@ static int __init init_qspinlock_stat(void)
 }
 fs_initcall(init_qspinlock_stat);

-/*
- * Increment the PV qspinlock statistical counters
- */
-static inline void qstat_inc(enum qlock_stats stat, bool cond)
-{
-	if (cond)
-		this_cpu_inc(qstats[stat]);
-}
-
-/*
- * PV hash hop count
- */
-static inline void qstat_hop(int hopcnt)
-{
-	this_cpu_add(qstats[qstat_pv_hash_hops], hopcnt);
-}
-
-/*
- * Replacement function for pv_kick()
- */
-static inline void __pv_kick(int cpu)
-{
-	u64 start = sched_clock();
-
-	per_cpu(pv_kick_time, cpu) = start;
-	pv_kick(cpu);
-	this_cpu_add(qstats[qstat_pv_latency_kick], sched_clock() - start);
-}
-
-/*
- * Replacement function for pv_wait()
- */
-static inline void __pv_wait(u8 *ptr, u8 val)
-{
-	u64 *pkick_time = this_cpu_ptr(&pv_kick_time);
-
-	*pkick_time = 0;
-	pv_wait(ptr, val);
-	if (*pkick_time) {
-		this_cpu_add(qstats[qstat_pv_latency_wake],
-
[PATCH v2 0/4] locking/qspinlock: Handle > 4 nesting levels
v2:
 - Use the simple trylock loop as suggested by PeterZ.

The current code allows up to 4 levels of nested slowpath spinlock
calls. That should be enough for the process, soft irq, hard irq, and
nmi. With the unfortunate event of nested NMIs happening with a slowpath
spinlock call in each of the previous levels, we are going to run out of
usable MCS nodes for queuing.

In this case, we fall back to a simple TAS lock and spin on the lock
cacheline until the lock is free. This is not the most elegant solution
but is simple enough.

Patch 1 implements the TAS loop when all the existing MCS nodes are
occupied.

Patches 2-4 enhance the locking statistics code to track the new code
as well as enabling it on other architectures such as ARM64.

By setting MAX_NODES to 1, we can have some usage of the new code path
during the booting process as demonstrated by the stat counter values
shown below on a 1-socket 22-core 44-thread x86-64 system after booting
up the new kernel.

  lock_no_node=20
  lock_pending=29660
  lock_slowpath=172714

Waiman Long (4):
  locking/qspinlock: Handle > 4 slowpath nesting levels
  locking/qspinlock_stat: Track the no MCS node available case
  locking/qspinlock_stat: Separate out the PV specific stat counts
  locking/qspinlock_stat: Allow QUEUED_LOCK_STAT for all archs

 arch/Kconfig                    |   7 ++
 arch/x86/Kconfig                |   8 ---
 kernel/locking/qspinlock.c      |  18 -
 kernel/locking/qspinlock_stat.h | 150 +---
 4 files changed, 120 insertions(+), 63 deletions(-)

--
1.8.3.1
[PATCH v2 4/4] locking/qspinlock_stat: Allow QUEUED_LOCK_STAT for all archs
The QUEUED_LOCK_STAT option to report queued spinlocks statistics was
previously allowed only on the x86 architecture. Now that queued
spinlocks are used in multiple architectures, we allow QUEUED_LOCK_STAT
to be enabled for all those architectures that use queued spinlocks.
This option is listed as part of the general architecture-dependent
options.

Signed-off-by: Waiman Long
---
 arch/Kconfig     | 7 +++
 arch/x86/Kconfig | 8
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 4cfb6de..c82e32f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -885,6 +885,13 @@ config HAVE_ARCH_PREL32_RELOCATIONS
 	  architectures, and don't require runtime relocation on relocatable
 	  kernels.

+config QUEUED_LOCK_STAT
+	bool "Queued spinlock statistics"
+	depends on QUEUED_SPINLOCKS && DEBUG_FS
+	---help---
+	  Enable the collection of statistical data on the slowpath
+	  behavior of queued spinlocks and report them on debugfs.
+
 source "kernel/gcov/Kconfig"

 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b4a7f3..872e681 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -784,14 +784,6 @@ config PARAVIRT_SPINLOCKS

 	  If you are unsure how to answer this question, answer Y.

-config QUEUED_LOCK_STAT
-	bool "Paravirt queued spinlock statistics"
-	depends on PARAVIRT_SPINLOCKS && DEBUG_FS
-	---help---
-	  Enable the collection of statistical data on the slowpath
-	  behavior of paravirtualized queued spinlocks and report
-	  them on debugfs.
-
 source "arch/x86/xen/Kconfig"

 config KVM_GUEST
--
1.8.3.1
[PATCH v2 2/4] locking/qspinlock_stat: Track the no MCS node available case
Track the number of slowpath locking operations that are being done
without any MCS node available as well as renaming lock_index[123] to
make them more descriptive.

Using these stat counters is one way to find out if a code path is
being exercised.

Signed-off-by: Waiman Long
---
 kernel/locking/qspinlock.c      |  3 ++-
 kernel/locking/qspinlock_stat.h | 21 +++--
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 0875053..21ee51b 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -422,6 +422,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * simple enough.
 	 */
 	if (unlikely(idx >= MAX_NODES)) {
+		qstat_inc(qstat_lock_no_node, true);
 		while (!queued_spin_trylock(lock))
 			cpu_relax();
 		goto release;
@@ -432,7 +433,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * Keep counts of non-zero index values:
 	 */
-	qstat_inc(qstat_lock_idx1 + idx - 1, idx);
+	qstat_inc(qstat_lock_use_node2 + idx - 1, idx);

 	/*
 	 * Ensure that we increment the head node->count before initialising
diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index 42d3d8d..31728f6 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -30,6 +30,13 @@
  *   pv_wait_node	- # of vCPU wait's at a non-head queue node
  *   lock_pending	- # of locking operations via pending code
  *   lock_slowpath	- # of locking operations via MCS lock queue
+ *   lock_use_node2	- # of locking operations that use 2nd percpu node
+ *   lock_use_node3	- # of locking operations that use 3rd percpu node
+ *   lock_use_node4	- # of locking operations that use 4th percpu node
+ *   lock_no_node	- # of locking operations without using percpu node
+ *
+ * Subtracting lock_use_node[234] from lock_slowpath will give you
+ * lock_use_node1.
  *
  * Writing to the "reset_counters" file will reset all the above counter
  * values.
@@ -55,9 +62,10 @@ enum qlock_stats {
 	qstat_pv_wait_node,
 	qstat_lock_pending,
 	qstat_lock_slowpath,
-	qstat_lock_idx1,
-	qstat_lock_idx2,
-	qstat_lock_idx3,
+	qstat_lock_use_node2,
+	qstat_lock_use_node3,
+	qstat_lock_use_node4,
+	qstat_lock_no_node,
 	qstat_num,	/* Total number of statistical counters */
 	qstat_reset_cnts = qstat_num,
 };
@@ -85,9 +93,10 @@ enum qlock_stats {
 	[qstat_pv_wait_node] = "pv_wait_node",
 	[qstat_lock_pending] = "lock_pending",
 	[qstat_lock_slowpath] = "lock_slowpath",
-	[qstat_lock_idx1] = "lock_index1",
-	[qstat_lock_idx2] = "lock_index2",
-	[qstat_lock_idx3] = "lock_index3",
+	[qstat_lock_use_node2] = "lock_use_node2",
+	[qstat_lock_use_node3] = "lock_use_node3",
+	[qstat_lock_use_node4] = "lock_use_node4",
+	[qstat_lock_no_node] = "lock_no_node",
 	[qstat_reset_cnts] = "reset_counters",
 };

--
1.8.3.1
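
The index arithmetic in `qstat_inc(qstat_lock_use_node2 + idx - 1, idx)` and the "subtract lock_use_node[234]" note can be illustrated with a small user-space sketch. Everything here is a stand-in: the `sk_` names, the plain global array instead of per-cpu counters, and the assumption that each call counts exactly one slowpath operation with idx in 0..3 (the no-node case is not modelled).

```c
#include <assert.h>

/* Stand-in counter IDs; the ordering mirrors the relevant part of the enum. */
enum sk_stats {
	sk_lock_slowpath,
	sk_lock_use_node2,
	sk_lock_use_node3,
	sk_lock_use_node4,
	sk_num,
};

/* Plain global array standing in for the per-cpu counters. */
static unsigned long sk_counters[sk_num];

/* Same shape as qstat_inc(): bump the counter only when cond is non-zero. */
static void sk_inc(int stat, int cond)
{
	if (cond)
		sk_counters[stat]++;
}

/* Record one slowpath acquisition that used per-cpu node index idx (0..3). */
static void sk_record_slowpath(int idx)
{
	assert(idx >= 0 && idx < 4);
	sk_inc(sk_lock_slowpath, 1);
	/* idx 0 is not counted; idx 1..3 map to use_node2..use_node4. */
	sk_inc(sk_lock_use_node2 + idx - 1, idx);
}

/* Derive the first-node usage by subtraction, as the comment suggests. */
static unsigned long sk_lock_use_node1(void)
{
	return sk_counters[sk_lock_slowpath] -
	       sk_counters[sk_lock_use_node2] -
	       sk_counters[sk_lock_use_node3] -
	       sk_counters[sk_lock_use_node4];
}
```

Because idx doubles as the condition, the first node never needs its own counter, which is why its usage has to be derived from the others.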
[PATCH v2 1/4] locking/qspinlock: Handle > 4 slowpath nesting levels
Four queue nodes per cpu are allocated to enable up to 4 nesting levels
using the per-cpu nodes. Nested NMIs are possible in some architectures.
Still it is very unlikely that we will ever hit more than 4 nested
levels with contention in the slowpath.

When that rare condition happens, however, it is likely that the system
will hang or crash shortly after that. It is not good and we need to
handle this exception case.

This is done by spinning directly on the lock using repeated trylock.
This alternative code path should only be used when there are nested
NMIs. Assuming that the locks used by those NMI handlers will not be
heavily contended, a simple TAS locking should work out.

Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
 kernel/locking/qspinlock.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 8a8c3c2..0875053 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -412,6 +412,21 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	idx = node->count++;
 	tail = encode_tail(smp_processor_id(), idx);

+	/*
+	 * 4 nodes are allocated based on the assumption that there will
+	 * not be nested NMIs taking spinlocks. That may not be true in
+	 * some architectures even though the chance of needing more than
+	 * 4 nodes will still be extremely unlikely. When that happens,
+	 * we fall back to spinning on the lock directly without using
+	 * any MCS node. This is not the most elegant solution, but is
+	 * simple enough.
+	 */
+	if (unlikely(idx >= MAX_NODES)) {
+		while (!queued_spin_trylock(lock))
+			cpu_relax();
+		goto release;
+	}
+
 	node = grab_mcs_node(node, idx);

 	/*
--
1.8.3.1
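
The fallback described above can be modelled in user space with C11 atomics. This is a rough sketch under stated assumptions, not the kernel implementation: `toy_lock` and the `toy_` helpers are hypothetical, the lock is a single 0/1 word rather than the real qspinlock layout, and the MCS queueing path is replaced by a placeholder spin so that only the `idx >= MAX_NODES` branch reflects the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_NODES 4 /* per-cpu node count named in the patch */

/* Hypothetical toy lock: 0 = free, 1 = held. */
struct toy_lock {
	atomic_int locked;
};

static void toy_lock_init(struct toy_lock *lock)
{
	atomic_init(&lock->locked, 0);
}

/* One test-and-set style attempt; returns true when the lock was taken. */
static bool toy_trylock(struct toy_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&lock->locked, &expected, 1);
}

static void toy_unlock(struct toy_lock *lock)
{
	atomic_store(&lock->locked, 0);
}

/*
 * Acquire the lock for nesting level idx. Returns true when the TAS
 * fallback was used because no queue node was left for this level.
 */
static bool toy_lock_slowpath(struct toy_lock *lock, int idx)
{
	assert(idx >= 0);
	if (idx >= MAX_NODES) {
		/* Fallback: spin on trylock; the kernel calls cpu_relax() here. */
		while (!toy_trylock(lock))
			;
		return true;
	}
	/* Placeholder for the MCS queueing path, which this sketch omits. */
	while (!toy_trylock(lock))
		;
	return false;
}
```

The point of the design is that the fallback needs no per-cpu state at all, so it still works at a fifth nesting level, at the cost of every waiter hammering the same cacheline.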
Re: [PATCH] capabilities:: annotate implicit fall through
On Mon, 14 Jan 2019, Mathieu Malaterre wrote:

> There is a plan to build the kernel with -Wimplicit-fallthrough and
> this place in the code produced a warning (W=1).
>
> In this particular case change put the fall through comment on a single
> line so as to match the regular expression expected by GCC.
>
> This commit remove the following warning:
>
>   kernel/capability.c:95:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
>
> Signed-off-by: Mathieu Malaterre

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general

--
James Morris
[v6 2/3] drm/msm/dpu: Integrate interconnect API in MDSS
The interconnect framework is designed to provide a standard kernel
interface to control the settings of the interconnects on a SoC.

The interconnect API uses a consumer/provider-based model, where the
providers are the interconnect buses and the consumers could be various
drivers.

MDSS is one of the interconnect consumers which uses the interconnect
APIs to get the path between endpoints and set its bandwidth
requirement for the given interconnected path.

Changes in v2:
	- Remove error log and unnecessary check (Jordan Crouse)

Changes in v3:
	- Code clean involving variable name change, removal
	  of extra parentheses and variables (Matthias Kaehlcke)

Changes in v4:
	- Add comments, spacings, tabs, proper port name
	  and icc macro (Georgi Djakov)

Changes in v5:
	- Commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
	- Change to new icc_set API's (Doug Anderson)

Signed-off-by: Sravanthi Kollukuduru1
Signed-off-by: Jayant Shekhar
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 49 +---
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index 38576f8..38daf8a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -4,11 +4,15 @@
  */

 #include "dpu_kms.h"
+#include <linux/interconnect.h>

 #define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)

 #define HW_INTR_STATUS			0x0010

+/* Max BW defined in KBps */
+#define MAX_BW				680
+
 struct dpu_mdss {
 	struct msm_mdss base;
 	void __iomem *mmio;
@@ -16,8 +20,30 @@ struct dpu_mdss {
 	u32 hwversion;
 	struct dss_module_power mp;
 	struct dpu_irq_controller irq_controller;
+	struct icc_path *path[2];
+	u32 num_paths;
 };

+static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev,
+					     struct dpu_mdss *dpu_mdss)
+{
+	struct icc_path *path0 = of_icc_get(dev->dev, "mdp0-mem");
+	struct icc_path *path1 = of_icc_get(dev->dev, "mdp1-mem");
+
+	if (IS_ERR(path0))
+		return PTR_ERR(path0);
+
+	dpu_mdss->path[0] = path0;
+	dpu_mdss->num_paths = 1;
+
+	if (!IS_ERR(path1)) {
+		dpu_mdss->path[1] = path1;
+		dpu_mdss->num_paths++;
+	}
+
+	return 0;
+}
+
 static irqreturn_t dpu_mdss_irq(int irq, void *arg)
 {
 	struct dpu_mdss *dpu_mdss = arg;
@@ -127,7 +153,11 @@ static int dpu_mdss_enable(struct msm_mdss *mdss)
 {
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
-	int ret;
+	int ret, i;
+	u64 avg_bw = dpu_mdss->num_paths ? MAX_BW / dpu_mdss->num_paths : 0;
+
+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_set_bw(dpu_mdss->path[i], avg_bw, kBps_to_icc(MAX_BW));

 	ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true);
 	if (ret)
@@ -140,12 +170,15 @@ static int dpu_mdss_disable(struct msm_mdss *mdss)
 {
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
-	int ret;
+	int ret, i;

 	ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, false);
 	if (ret)
 		DPU_ERROR("clock disable failed, ret:%d\n", ret);

+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_set_bw(dpu_mdss->path[i], 0, 0);
+
 	return ret;
 }

@@ -155,6 +188,7 @@ static void dpu_mdss_destroy(struct drm_device *dev)
 	struct msm_drm_private *priv = dev->dev_private;
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
+	int i;

 	pm_runtime_disable(dev->dev);
 	_dpu_mdss_irq_domain_fini(dpu_mdss);
@@ -162,6 +196,9 @@ static void dpu_mdss_destroy(struct drm_device *dev)
 	msm_dss_put_clk(mp->clk_config, mp->num_clk);
 	devm_kfree(&pdev->dev, mp->clk_config);

+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_put(dpu_mdss->path[i]);
+
 	if (dpu_mdss->mmio)
 		devm_iounmap(&pdev->dev, dpu_mdss->mmio);
 	dpu_mdss->mmio = NULL;
@@ -200,6 +237,10 @@ int dpu_mdss_init(struct drm_device *dev)
 	}
 	dpu_mdss->mmio_len = resource_size(res);

+	ret = dpu_mdss_parse_data_bus_icc_path(dev, dpu_mdss);
+	if (ret)
+		return ret;
+
 	mp = &dpu_mdss->mp;
 	ret = msm_dss_parse_clock(pdev, mp);
 	if (ret) {
@@ -221,14 +262,14 @@ int dpu_mdss_init(struct drm_device *dev)
 		goto irq_error;
 	}

+	priv->mdss = &dpu_mdss->base;
+
 	pm_runtime_enable(dev->dev);

 	pm_runtime_get_sync(dev->dev);
 	dpu_mdss->hwversion = readl_relaxed(dpu_mdss->mmio);
 	pm_runtime_put_sync(dev->dev);

-	priv->mdss = &dpu_mdss->base;
-
Re: [PATCH v2 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
On Tue, 2019-01-22 at 21:41 -0500, Martin K. Petersen wrote:
> Ching,
>
> > This patch series are against to mkp's 5.1/scsi-queue.
>
> Applied to 5.1/scsi-queue. Thank you.
>
> PS. Your file permissions are odd. I always have to change your diffs
> from 755 to 644 before applying.
>

Thanks for Martin's and Dan's help.
The file permission problem also confuses me.
I used Evolution mail on CentOS 6.x to submit the patches.
The mail content format is "Plain text, preformatted".
I inserted the diff text file into the mail; the diff file listing is below.

-rw-r--r--. 1 root root 1663 Jan 16 04:11 p1.txt

I don't know why or when its permissions changed from 644 to 755.
[v6 3/3] dt-bindings: msm/disp: Introduce interconnect bindings for MDSS on SDM845
Add interconnect properties such as the interconnect provider specifier
and the edge source and destination ports, which are required by the
interconnect API to configure the interconnect path for MDSS.

Changes in v2:
	- None

Changes in v3:
	- Remove common property definitions (Rob Herring)

Changes in v4:
	- Use port macros and change port string names (Georgi Djakov)

Changes in v5:
	- None

Changes in v6:
	- None

Signed-off-by: Sravanthi Kollukuduru
Signed-off-by: Jayant Shekhar
---
 Documentation/devicetree/bindings/display/msm/dpu.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt b/Documentation/devicetree/bindings/display/msm/dpu.txt
index ad2e883..a61dd40 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -28,6 +28,11 @@ Required properties:
 - #address-cells: number of address cells for the MDSS children. Should be 1.
 - #size-cells: Should be 1.
 - ranges: parent bus address space is the same as the child bus address space.
+- interconnects : interconnect path specifier for MDSS according to
+  Documentation/devicetree/bindings/interconnect/interconnect.txt. Should be
+  2 paths corresponding to 2 AXI ports.
+- interconnect-names : MDSS will have 2 port names to differentiate between the
+  2 interconnect paths defined with interconnect specifier.

 Optional properties:
 - assigned-clocks: list of clock specifiers for clocks needing rate assignment
@@ -86,6 +91,11 @@ Example:
 		interrupt-controller;
 		#interrupt-cells = <1>;

+		interconnects = <_hlos MASTER_MDP0 _hlos SLAVE_EBI1>,
+				<_hlos MASTER_MDP1 _hlos SLAVE_EBI1>;
+
+		interconnect-names = "mdp0-mem", "mdp1-mem";
+
 		iommus = <_iommu 0>;

 		#address-cells = <2>;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[v6 0/3] Use interconnect API in MDSS on SDM845
The interconnect API provides an interface for consumer drivers to express their bandwidth needs in the SoC. This data is aggregated and the on-chip interconnect hardware is configured to the appropriate power/performance profile.

MDSS is one of the interconnect consumers. It uses the interconnect API to get the path between endpoints and to set its bandwidth requirements for that path.

The series also contains a cleanup patch that removes all references to the DPU custom bus scaling. A corresponding DT patch, with the source and destination ports defined for the display driver, will be sent separately.

Changes in v2:
 - Remove error log and unnecessary check (Jordan Crouse)
 - Fixed build error due to partial clean up

Changes in v3:
 - Remove common property definitions (Rob Herring)
 - Code clean up involving variable name change, removal of extra parentheses and variables (Matthias Kaehlcke)
 - Condense multiple lines into a single line (Sean Paul)

Changes in v4:
 - Add comments, spacings, tabs, proper port name and icc macro
 - Use port macros and change port string names (Georgi Djakov)

Changes in v5:
 - Updated commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
 - Change icc_set to icc_set_bw (Doug Anderson)

Jayant Shekhar (3):
  drm/msm/dpu: clean up references of DPU custom bus scaling
  drm/msm/dpu: Integrate interconnect API in MDSS
  dt-bindings: msm/disp: Introduce interconnect bindings for MDSS on SDM845

 .../devicetree/bindings/display/msm/dpu.txt      |  10 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c    | 174 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h    |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c         |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c         |  49 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.c |  47 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.h |  68
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h        |  22 +--
 8 files changed, 144 insertions(+), 243 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
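As a rough illustration of the aggregation step mentioned in this cover letter, the userspace sketch below models how bandwidth votes from several consumers might be combined for one interconnect node. It assumes a policy of summing the average bandwidth and taking the maximum of the peak bandwidth; whether the SDM845 provider aggregates exactly this way is an assumption. The names `bw_vote` and `aggregate_votes` are made up for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one consumer's bandwidth request, in kBps. */
struct bw_vote {
	uint64_t avg_kbps;
	uint64_t peak_kbps;
};

/*
 * Combine the votes placed on one node (assumed policy):
 * averages are summed, peaks are reduced with max().
 */
static struct bw_vote aggregate_votes(const struct bw_vote *votes, size_t n)
{
	struct bw_vote agg = { 0, 0 };
	size_t i;

	/* A NULL array is only acceptable when there is nothing to read. */
	assert(votes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		agg.avg_kbps += votes[i].avg_kbps;
		if (votes[i].peak_kbps > agg.peak_kbps)
			agg.peak_kbps = votes[i].peak_kbps;
	}
	return agg;
}
```

With two consumers voting (100, 400) and (250, 300) kBps, this model yields an aggregate of 350 kBps average and 400 kBps peak.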
[v6 1/3] drm/msm/dpu: clean up references of DPU custom bus scaling
Now that the interconnect bus framework has landed upstream, the existing references to custom bus scaling need to be cleaned up.

Changes in v2:
 - Fixed build error due to partial clean up

Changes in v3:
 - Condense multiple lines into a single line (Sean Paul)

Changes in v4:
 - None

Changes in v5:
 - None

Changes in v6:
 - None

Signed-off-by: Sravanthi Kollukuduru
Signed-off-by: Jayant Shekhar
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c    | 174 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h    |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c         |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.c |  47 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_power_handle.h |  68 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h        |  22 +--
 6 files changed, 89 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
index 22e84b3..c75536e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
@@ -84,7 +84,6 @@ static void _dpu_core_perf_calc_crtc(struct dpu_kms *kms,
 		struct dpu_core_perf_params *perf)
 {
 	struct dpu_crtc_state *dpu_cstate;
-	int i;
 
 	if (!kms || !kms->catalog || !crtc || !state || !perf) {
 		DPU_ERROR("invalid parameters\n");
@@ -95,35 +94,24 @@ static void _dpu_core_perf_calc_crtc(struct dpu_kms *kms,
 	memset(perf, 0, sizeof(struct dpu_core_perf_params));
 
 	if (!dpu_cstate->bw_control) {
-		for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-			perf->bw_ctl[i] = kms->catalog->perf.max_bw_high *
+		perf->bw_ctl = kms->catalog->perf.max_bw_high *
 					1000ULL;
-			perf->max_per_pipe_ib[i] = perf->bw_ctl[i];
-		}
+		perf->max_per_pipe_ib = perf->bw_ctl;
 		perf->core_clk_rate = kms->perf.max_core_clk_rate;
 	} else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_MINIMUM) {
-		for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-			perf->bw_ctl[i] = 0;
-			perf->max_per_pipe_ib[i] = 0;
-		}
+		perf->bw_ctl = 0;
+		perf->max_per_pipe_ib = 0;
 		perf->core_clk_rate = 0;
 	} else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_FIXED) {
-		for (i = 0; i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-			perf->bw_ctl[i] = kms->perf.fix_core_ab_vote;
-			perf->max_per_pipe_ib[i] = kms->perf.fix_core_ib_vote;
-		}
+		perf->bw_ctl = kms->perf.fix_core_ab_vote;
+		perf->max_per_pipe_ib = kms->perf.fix_core_ib_vote;
 		perf->core_clk_rate = kms->perf.fix_core_clk_rate;
 	}
 
 	DPU_DEBUG(
-		"crtc=%d clk_rate=%llu core_ib=%llu core_ab=%llu llcc_ib=%llu llcc_ab=%llu mem_ib=%llu mem_ab=%llu\n",
+		"crtc=%d clk_rate=%llu core_ib=%llu core_ab=%llu\n",
 			crtc->base.id, perf->core_clk_rate,
-			perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_MNOC],
-			perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_MNOC],
-			perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_LLCC],
-			perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_LLCC],
-			perf->max_per_pipe_ib[DPU_POWER_HANDLE_DBUS_ID_EBI],
-			perf->bw_ctl[DPU_POWER_HANDLE_DBUS_ID_EBI]);
+			perf->max_per_pipe_ib, perf->bw_ctl);
 }
 
 int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
@@ -136,7 +124,6 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 	struct dpu_crtc_state *dpu_cstate;
 	struct drm_crtc *tmp_crtc;
 	struct dpu_kms *kms;
-	int i;
 
 	if (!crtc || !state) {
 		DPU_ERROR("invalid crtc\n");
@@ -158,31 +145,25 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 	/* obtain new values */
 	_dpu_core_perf_calc_crtc(kms, crtc, state, &dpu_cstate->new_perf);
 
-	for (i = DPU_POWER_HANDLE_DBUS_ID_MNOC;
-			i < DPU_POWER_HANDLE_DBUS_ID_MAX; i++) {
-		bw_sum_of_intfs = dpu_cstate->new_perf.bw_ctl[i];
-		curr_client_type = dpu_crtc_get_client_type(crtc);
+	bw_sum_of_intfs = dpu_cstate->new_perf.bw_ctl;
+	curr_client_type = dpu_crtc_get_client_type(crtc);
 
-		drm_for_each_crtc(tmp_crtc, crtc->dev) {
-			if (_dpu_core_perf_crtc_is_power_on(tmp_crtc) &&
-			    (dpu_crtc_get_client_type(tmp_crtc) ==
-					curr_client_type) &&
-			    (tmp_crtc != crtc)) {
-				struct dpu_crtc_state *tmp_cstate =
Re: LTP case read_all_proc fails on qemux86-64 since 5.0-rc1
On Jan 22, 2019, at 8:13 PM, He Zhe wrote: > > > LTP case read_all_proc(read_all -d /proc -q -r 10) often, but not every time, > fails with the following call traces, since 600335205b8d "ide: convert to > blk-mq"(5.0-rc1) till now(5.0-rc3). > > qemu-system-x86_64 -drive file=rootfs.ext4,if=virtio,format=raw -object > rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 > -nographic -m 16192 -smp cpus=12 -cpu core2duo -enable-kvm -serial mon:stdio > -serial null -kernel bzImage -append 'root=/dev/vda rw highres=off > console=ttyS0 mem=16192M' > > tst_test.c:1085: INFO: Timeout per run is 0h 05m 00s > [ 47.080156] Warning: /proc/ide/hd?/settings interface is obsolete, and > will be removed soon! > [ 47.085330] [ cut here ] > [ 47.085810] kernel BUG at block/blk-mq.c:767! > [ 47.086498] invalid opcode: [#1] PREEMPT SMP PTI > [ 47.087022] CPU: 5 PID: 146 Comm: kworker/5:1H Not tainted 5.0.0-rc3 #1 > [ 47.087858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 > [ 47.088992] Workqueue: kblockd blk_mq_run_work_fn > [ 47.089469] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0 > [ 47.090035] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b > 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008 > [ 47.091930] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002 > [ 47.092458] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: > 0006 > [ 47.093181] RDX: RSI: 0001 RDI: > 9e1ea13c > [ 47.093906] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: > > [ 47.094626] R10: 0001 R11: 0012 R12: > 9e1ea1033a40 > [ 47.095347] R13: 9e1ea13a8d00 R14: 9e1ea13a9000 R15: > 0046 > [ 47.096071] FS: () GS:9e1ea4b4() > knlGS: > [ 47.096898] CS: 0010 DS: ES: CR0: 80050033 > [ 47.097477] CR2: 003fda41fda0 CR3: 0003d8e6a000 CR4: > 06e0 > [ 47.098203] DR0: DR1: DR2: > > [ 47.098929] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 47.099650] Call Trace: > [ 47.099910] > [ 47.100128] blk_mq_requeue_request+0x58/0x60 > [ 47.100576] 
ide_requeue_and_plug+0x20/0x50 > [ 47.101014] ide_intr+0x21a/0x230 > [ 47.101362] ? idecd_open+0xc0/0xc0 > [ 47.101735] __handle_irq_event_percpu+0x43/0x1e0 > [ 47.102214] handle_irq_event_percpu+0x32/0x80 > [ 47.102668] handle_irq_event+0x39/0x60 > [ 47.103074] handle_edge_irq+0xe8/0x1c0 > [ 47.103470] handle_irq+0x20/0x30 > [ 47.103819] do_IRQ+0x46/0xe0 > [ 47.104128] common_interrupt+0xf/0xf > [ 47.104505] > [ 47.104731] RIP: 0010:ide_output_data+0xbc/0x100 > [ 47.105201] Code: 74 22 8d 41 ff 85 c9 74 24 49 8d 54 40 02 41 0f b7 00 66 > 41 89 01 49 83 c0 02 49 39 d0 75 ef 5b 41 5c 5d c3 4c 89 c6 445 > [ 47.107092] RSP: 0018:bd508059bb18 EFLAGS: 00010246 ORIG_RAX: > ffdd > [ 47.107862] RAX: 9e1ea13a8800 RBX: 9e1ea13a9000 RCX: > > [ 47.108581] RDX: 0170 RSI: 9e1ea13c012c RDI: > > [ 47.109293] RBP: bd508059bb28 R08: 9e1ea13c0120 R09: > 0170 > [ 47.110016] R10: 000d R11: 000c R12: > 9e1ea13a8800 > [ 47.110731] R13: 000c R14: 9e1ea13c R15: > 7530 > [ 47.111446] ide_transfer_pc+0x216/0x310 > [ 47.111848] ? __const_udelay+0x3d/0x40 > [ 47.112236] ? ide_execute_command+0x85/0xb0 > [ 47.112668] ? ide_pc_intr+0x3f0/0x3f0 > [ 47.113051] ? ide_check_atapi_device+0x110/0x110 > [ 47.113524] ide_issue_pc+0x178/0x240 > [ 47.113901] ide_cd_do_request+0x15c/0x350 > [ 47.114314] ide_queue_rq+0x180/0x6b0 > [ 47.114686] ? blk_mq_get_driver_tag+0xa1/0x110 > [ 47.115153] blk_mq_dispatch_rq_list+0x90/0x550 > [ 47.115606] ? __queue_delayed_work+0x63/0x90 > [ 47.116054] ? deadline_fifo_request+0x41/0x90 > [ 47.116506] blk_mq_do_dispatch_sched+0x80/0x100 > [ 47.116976] blk_mq_sched_dispatch_requests+0xfc/0x170 > [ 47.117491] __blk_mq_run_hw_queue+0x6f/0xd0 > [ 47.117941] blk_mq_run_work_fn+0x1b/0x20 > [ 47.118342] process_one_work+0x14c/0x450 > [ 47.118747] worker_thread+0x4a/0x440 > [ 47.119125] kthread+0x105/0x140 > [ 47.119456] ? process_one_work+0x450/0x450 > [ 47.119880] ? 
kthread_park+0x90/0x90 > [ 47.120251] ret_from_fork+0x35/0x40 > [ 47.120619] Modules linked in: > [ 47.120952] ---[ end trace 4562f716e88fdefe ]--- > [ 47.121423] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0 > [ 47.121981] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b > 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008 > [ 47.123851] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002 > [
[v6 2/3] drm/msm/dpu: Integrate interconnect API in MDSS
The interconnect framework is designed to provide a standard kernel interface to control the settings of the interconnects on a SoC. The interconnect API uses a consumer/provider-based model, where the providers are the interconnect buses and the consumers could be various drivers.

MDSS is one of the interconnect consumers. It uses the interconnect API to get the path between endpoints and to set its bandwidth requirement for that path.

Changes in v2:
 - Remove error log and unnecessary check (Jordan Crouse)

Changes in v3:
 - Code clean up involving variable name change, removal of extra parentheses and variables (Matthias Kaehlcke)

Changes in v4:
 - Add comments, spacings, tabs, proper port name and icc macro (Georgi Djakov)

Changes in v5:
 - Commit text and parenthesis alignment (Georgi Djakov)

Changes in v6:
 - Change to new icc_set APIs (Doug Anderson)

Signed-off-by: Sravanthi Kollukuduru
Signed-off-by: Jayant Shekhar
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 49 +---
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index 38576f8..38daf8a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -4,11 +4,15 @@
  */
 
 #include "dpu_kms.h"
+#include <linux/interconnect.h>
 
 #define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)
 
 #define HW_INTR_STATUS 0x0010
 
+/* Max BW defined in KBps */
+#define MAX_BW 680
+
 struct dpu_mdss {
 	struct msm_mdss base;
 	void __iomem *mmio;
@@ -16,8 +20,30 @@ struct dpu_mdss {
 	u32 hwversion;
 	struct dss_module_power mp;
 	struct dpu_irq_controller irq_controller;
+	struct icc_path *path[2];
+	u32 num_paths;
 };
 
+static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev,
+					    struct dpu_mdss *dpu_mdss)
+{
+	struct icc_path *path0 = of_icc_get(dev->dev, "mdp0-mem");
+	struct icc_path *path1 = of_icc_get(dev->dev, "mdp1-mem");
+
+	if (IS_ERR(path0))
+		return PTR_ERR(path0);
+
+	dpu_mdss->path[0] = path0;
+	dpu_mdss->num_paths = 1;
+
+	if (!IS_ERR(path1)) {
+		dpu_mdss->path[1] = path1;
+		dpu_mdss->num_paths++;
+	}
+
+	return 0;
+}
+
 static irqreturn_t dpu_mdss_irq(int irq, void *arg)
 {
 	struct dpu_mdss *dpu_mdss = arg;
@@ -127,7 +153,11 @@ static int dpu_mdss_enable(struct msm_mdss *mdss)
 {
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
-	int ret;
+	int ret, i;
+	u64 avg_bw = dpu_mdss->num_paths ? MAX_BW / dpu_mdss->num_paths : 0;
+
+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_set_bw(dpu_mdss->path[i], avg_bw, kBps_to_icc(MAX_BW));
 
 	ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true);
 	if (ret)
@@ -140,12 +170,15 @@ static int dpu_mdss_disable(struct msm_mdss *mdss)
 {
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
-	int ret;
+	int ret, i;
 
 	ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, false);
 	if (ret)
 		DPU_ERROR("clock disable failed, ret:%d\n", ret);
 
+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_set_bw(dpu_mdss->path[i], 0, 0);
+
 	return ret;
 }
 
@@ -155,6 +188,7 @@ static void dpu_mdss_destroy(struct drm_device *dev)
 	struct msm_drm_private *priv = dev->dev_private;
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
+	int i;
 
 	pm_runtime_disable(dev->dev);
 	_dpu_mdss_irq_domain_fini(dpu_mdss);
@@ -162,6 +196,9 @@ static void dpu_mdss_destroy(struct drm_device *dev)
 	msm_dss_put_clk(mp->clk_config, mp->num_clk);
 	devm_kfree(&pdev->dev, mp->clk_config);
 
+	for (i = 0; i < dpu_mdss->num_paths; i++)
+		icc_put(dpu_mdss->path[i]);
+
 	if (dpu_mdss->mmio)
 		devm_iounmap(&pdev->dev, dpu_mdss->mmio);
 	dpu_mdss->mmio = NULL;
@@ -200,6 +237,10 @@ int dpu_mdss_init(struct drm_device *dev)
 	}
 	dpu_mdss->mmio_len = resource_size(res);
 
+	ret = dpu_mdss_parse_data_bus_icc_path(dev, dpu_mdss);
+	if (ret)
+		return ret;
+
 	mp = &dpu_mdss->mp;
 	ret = msm_dss_parse_clock(pdev, mp);
 	if (ret) {
@@ -221,14 +262,14 @@ int dpu_mdss_init(struct drm_device *dev)
 		goto irq_error;
 	}
 
+	priv->mdss = &dpu_mdss->base;
+
 	pm_runtime_enable(dev->dev);
 
 	pm_runtime_get_sync(dev->dev);
 	dpu_mdss->hwversion = readl_relaxed(dpu_mdss->mmio);
 	pm_runtime_put_sync(dev->dev);
 
-	priv->mdss = &dpu_mdss->base;
-
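The probing and voting logic in this patch can be modelled outside the kernel. The sketch below is a userspace approximation only: `lookup_path()`, `struct mdss_paths` and the string table standing in for the device tree are hypothetical substitutes for `of_icc_get()` and `struct dpu_mdss`, and a NULL return stands in for an error pointer. It shows the first path being mandatory, the second being optional, and the average vote being split evenly across whichever paths were found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PATHS 2

/* Hypothetical stand-in for the path bookkeeping in struct dpu_mdss. */
struct mdss_paths {
	const char *path[MAX_PATHS];
	uint32_t num_paths;
};

/* Stand-in for of_icc_get(): returns the name if listed, else NULL. */
static const char *lookup_path(const char *const *names, size_t n,
			       const char *want)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(names[i], want) == 0)
			return names[i];
	return NULL;
}

/* Returns 0 on success, -1 if the mandatory "mdp0-mem" path is missing. */
static int parse_paths(const char *const *names, size_t n,
		       struct mdss_paths *p)
{
	const char *path0 = lookup_path(names, n, "mdp0-mem");
	const char *path1 = lookup_path(names, n, "mdp1-mem");

	assert(p != NULL);

	p->num_paths = 0;
	if (!path0)
		return -1;

	p->path[p->num_paths++] = path0;
	if (path1)	/* the second path is optional */
		p->path[p->num_paths++] = path1;
	return 0;
}

/* Per-path average vote, guarded against division by zero as in the patch. */
static uint64_t avg_bw_per_path(uint64_t max_bw, uint32_t num_paths)
{
	return num_paths ? max_bw / num_paths : 0;
}
```

In this model a table containing both names yields two paths and half of the maximum bandwidth per path, while a table with only "mdp1-mem" fails the parse, mirroring the early return on `IS_ERR(path0)`.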
[v6 3/3] dt-bindings: msm/disp: Introduce interconnect bindings for MDSS on SDM845
Add interconnect properties, such as the interconnect provider specifier and the edge source and destination ports, which are required by the interconnect API to configure the interconnect path for MDSS.

Changes in v2:
 - None

Changes in v3:
 - Remove common property definitions (Rob Herring)

Changes in v4:
 - Use port macros and change port string names (Georgi Djakov)

Changes in v5:
 - None

Changes in v6:
 - None

Signed-off-by: Sravanthi Kollukuduru
Signed-off-by: Jayant Shekhar
---
 Documentation/devicetree/bindings/display/msm/dpu.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt b/Documentation/devicetree/bindings/display/msm/dpu.txt
index ad2e883..a61dd40 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -28,6 +28,11 @@ Required properties:
 - #address-cells: number of address cells for the MDSS children. Should be 1.
 - #size-cells: Should be 1.
 - ranges: parent bus address space is the same as the child bus address space.
+- interconnects : interconnect path specifier for MDSS according to
+  Documentation/devicetree/bindings/interconnect/interconnect.txt. Should be
+  2 paths corresponding to 2 AXI ports.
+- interconnect-names : MDSS will have 2 port names to differentiate between the
+  2 interconnect paths defined with interconnect specifier.
 
 Optional properties:
 - assigned-clocks: list of clock specifiers for clocks needing rate assignment
@@ -86,6 +91,11 @@ Example:
 		interrupt-controller;
 		#interrupt-cells = <1>;
 
+		interconnects = <_hlos MASTER_MDP0 _hlos SLAVE_EBI1>,
+				<_hlos MASTER_MDP1 _hlos SLAVE_EBI1>;
+
+		interconnect-names = "mdp0-mem", "mdp1-mem";
+
 		iommus = <_iommu 0>;
 
 		#address-cells = <2>;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
Re: rcutorture: meaning of "End of test: RCU_HOTPLUG"
On Tue, Jan 22, 2019 at 04:42:19PM +0800, Su Yue wrote: > Thanks for your quick reply! Paul > > On 1/22/19 12:01 PM, Paul E. McKenney wrote: > >On Tue, Jan 22, 2019 at 11:40:53AM +0800, Su Yue wrote: > >>Hi, guys > >> While running rcutorture tests with "onoff_interval", some tests > >>failed and results show like: > >> > >>= > >>[ 316.354501] srcud-torture:--- End of test: RCU_HOTPLUG: > >>nreaders=1 nfakewriters=4 stat_interval=60 verbose=2 > >>test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fq\ > >>s_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 > >>test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 > >>stall_cpu=0 stall_cpu_holdoff=10 stall_cpu_irqsoff=0 n_ba\ > >>rrier_cbs=0 onoff_interval=3 onoff_holdoff=0 > >> > >> > >>I am wondering that meaning of "RCU_HOTPLUG". Is it expected because > >>cpu hotplug is enabled in the test? Or just represents another type of > >>failure? > > > >This says that at least one CPU hotplug operation failed, that is, > >the CPU didn't actually come online or go offline as requested. If you > >are introducing CPU hotplug to an architecture, this usually indicates > >that you have bugs in your CPU-hotplug code. Or it nmight be that > > It should hit the case since there is no RCU CPU stall warnings. > > >RCU grace periods failed to progress -- though this would normally > >also result in RCU CPU stall warnings. > > > >There should be lines containing "ver:" in your console output. What > >does one of the later one of these say? > > > > The line says: > == > [ 318.850175] busted_srcud-torture: rtc: (null) ver: > 27040 tfle: 0 rta: 27040 rtaf: 0 rtf: 27027 rtmbe: 0 rtbe: 0 rtbke: > 0 rtbre: 0 rtbf: 0 rtb: 0 \ > nt: 9497 onoff: 2639/2639:2640/5310 40,373:10,355 162868:67542 > (HZ=1000) barrier: 0/0:0 Yes, you have many more offline attempts than successes, which is why RCU_HOTPLUG was printed. 
> = > > And here are useful errors: > = > kern :info : [ 135.379693] KVM setup async PF for cpu 1 > kern :info : [ 135.381412] kvm-stealtime: cpu 1, msr 23fd16180 > kern :alert : [ 135.386897] busted_srcud-torture:torture_onoff Just so your know, busted_srcud can sometimes fail by design. Hence the "busted" in the name. But failure didn't happen this time. > task: onlined 1 > kern :alert : [ 135.408241] busted_srcud-torture:torture_onoff > task: offlining 1 > kern :info : [ 135.423310] Unregister pv shared memory for cpu 1 > kern :info : [ 135.427940] smpboot: CPU 1 is now offline > kern :alert : [ 135.430106] busted_srcud-torture:torture_onoff > task: offlined 1 > kern :alert : [ 135.436404] busted_srcud-torture:torture_onoff > task: offlining 0 > kern :alert : [ 135.446173] busted_srcud-torture:torture_onoff > task: offline 0 failed: errno -16 > kern :alert : [ 135.453076] busted_srcud-torture:torture_onoff > task: offlining 0 > kern :alert : [ 135.457461] busted_srcud-torture:torture_onoff > task: offline 0 failed: errno -16 > > > = > There are only two CPUs on the VM. Torture try to offline the last one > but -EBUSY occured. > > I spent time to understand kernel/torture.c. > There is torture_onoff(): > > 225while (!torture_must_stop()) { > 226cpu = (torture_random() >> 4) % (maxcpu + 1); > 227if (!torture_offline(cpu, > 228 _offline_attempts, > _offline_successes, > 229 _offline, _offline, > _offline)) > 230torture_online(cpu, > 231 _online_attempts, > _online_successes, > 232 _online, _online, > _online); > 233schedule_timeout_interruptible(onoff_interval); > 234} > 235 > > torture_offline() and torture_offline() don't pre judge if the current > cpu is only one usable. That does appear to be the case, and that would be a problem with the CONFIG_BOOTPARAM_HOTPLUG_CPU0 listed below. Good catch! > Our test machines are configured with CONFIG_BOOTPARAM_HOTPLUG_CPU0. 
If > there are only one oneline and hotplugable cpux, then > n_offline_successes != n_offline_attempts which caused "End of test: > RCU_HOTPLUG". > > Does I misunderstand something above? Feel free to correct me. Does the following patch help? Thanx, Paul diff --git a/kernel/torture.c b/kernel/torture.c index a03ff722352b..2b6700ca2a43 100644 --- a/kernel/torture.c +++
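One way to picture the problem discussed in this thread is a guard that refuses to offline the last online CPU, so that an expected -EBUSY is never counted as an attempt without a matching success. The following is a userspace model only. It is not the patch referred to above and not kernel code: the online bitmask, `struct onoff_stats` and `try_offline()` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters, loosely modelled on the attempts/successes pair. */
struct onoff_stats {
	unsigned long attempts;
	unsigned long successes;
};

/* Count the set bits in the modelled online mask. */
static int count_online(uint32_t online_mask)
{
	int n = 0;

	while (online_mask) {
		n += (int)(online_mask & 1u);
		online_mask >>= 1;
	}
	return n;
}

/*
 * Try to take `cpu` offline in the modelled mask. Returns true only if the
 * CPU went offline. Requests that are skipped are not counted as attempts,
 * so attempts and successes stay equal in this model.
 */
static bool try_offline(uint32_t *online_mask, unsigned int cpu,
			struct onoff_stats *st)
{
	assert(cpu < 32);

	if (!(*online_mask & (1u << cpu)))
		return false;	/* already offline: nothing to attempt */
	if (count_online(*online_mask) <= 1)
		return false;	/* last online CPU: skip rather than fail */

	st->attempts++;
	*online_mask &= ~(1u << cpu);
	st->successes++;
	return true;
}
```

On a two-CPU model, offlining CPU 1 succeeds and a following request to offline CPU 0 is skipped, leaving one attempt and one success recorded.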
[PATCH V2 4/6] misc/pvpanic: add pvpanic acpi driver
Move the pvpanic ACPI driver into a separate file and adapt the code to the framework.

Signed-off-by: Peng Hao
---
 drivers/misc/pvpanic/Kconfig        |  9 +
 drivers/misc/pvpanic/Makefile       |  1 +
 drivers/misc/pvpanic/pvpanic-acpi.c | 77 +
 3 files changed, 87 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic-acpi.c

diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
index 3e612c6..d274130 100644
--- a/drivers/misc/pvpanic/Kconfig
+++ b/drivers/misc/pvpanic/Kconfig
@@ -5,3 +5,12 @@ config PVPANIC
 	  This driver provides support for the pvpanic device. pvpanic is
 	  a paravirtualized device provided by QEMU; it lets a virtual machine
 	  (guest) communicate panic events to the host.
+
+if PVPANIC
+
+config PVPANIC_ACPI
+	tristate "pvpanic acpi driver"
+	depends on ACPI
+	default PVPANIC
+
+endif
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
index 6394224..c5b73ca 100644
--- a/drivers/misc/pvpanic/Makefile
+++ b/drivers/misc/pvpanic/Makefile
@@ -3,3 +3,4 @@
 # Copyright (c) 2018 ZTE Ltd.
 
 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
+obj-$(CONFIG_PVPANIC_ACPI)	+= pvpanic-acpi.o
diff --git a/drivers/misc/pvpanic/pvpanic-acpi.c b/drivers/misc/pvpanic/pvpanic-acpi.c
new file mode 100644
index 000..a6153fa
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic-acpi.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * pvpanic acpi driver.
+ *
+ * Copyright (C) 2019 ZTE Ltd.
+ * Author: Peng Hao
+ */
+#include
+#include
+#include
+#include
+#include "pvpanic.h"
+
+static int pvpanic_add(struct acpi_device *device);
+static int pvpanic_remove(struct acpi_device *device);
+
+static const struct acpi_device_id pvpanic_device_ids[] = {
+	{ "QEMU0001", 0 },
+	{ "", 0 }
+};
+MODULE_DEVICE_TABLE(acpi, pvpanic_device_ids);
+
+static struct acpi_driver pvpanic_driver = {
+	.name = "pvpanic",
+	.class = "QEMU",
+	.ids = pvpanic_device_ids,
+	.ops = {
+		.add = pvpanic_add,
+		.remove = pvpanic_remove,
+	},
+	.owner = THIS_MODULE,
+};
+
+static acpi_status
+pvpanic_walk_resources(struct acpi_resource *res, void *context)
+{
+	struct resource r;
+	int ret = 0;
+	struct device *dev = context;
+
+	memset(&r, 0, sizeof(r));
+	if (acpi_dev_resource_io(res, &r) || acpi_dev_resource_memory(res, &r))
+		ret = pvpanic_add_device(dev, &r);
+
+	if (!ret)
+		return AE_OK;
+
+	return AE_ERROR;
+}
+static int pvpanic_add(struct acpi_device *device)
+{
+	int ret;
+	acpi_status status;
+
+	ret = acpi_bus_get_status(device);
+	if (ret < 0)
+		return ret;
+
+	if (!device->status.enabled || !device->status.functional)
+		return -ENODEV;
+
+	status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
+				     pvpanic_walk_resources, &device->dev);
+
+	if (ACPI_FAILURE(status))
+		return -ENODEV;
+
+	return 0;
+}
+
+static int pvpanic_remove(struct acpi_device *device)
+{
+	pvpanic_remove_device();
+	return 0;
+}
+
+module_acpi_driver(pvpanic_driver);
-- 
1.8.3.1
[PATCH V2 1/6] misc/pvpanic: preparing for pvpanic driver framework
Preparing for pvpanic driver framework. Create a pvpanic driver directory and move current driver file to new directory. Signed-off-by: Peng Hao --- drivers/misc/Kconfig | 9 + drivers/misc/Makefile| 2 +- drivers/misc/pvpanic/Kconfig | 7 +++ drivers/misc/pvpanic/Makefile| 5 + drivers/misc/{ => pvpanic}/pvpanic.c | 0 5 files changed, 14 insertions(+), 9 deletions(-) create mode 100644 drivers/misc/pvpanic/Kconfig create mode 100644 drivers/misc/pvpanic/Makefile rename drivers/misc/{ => pvpanic}/pvpanic.c (100%) diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index f417b06..aa3a805 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -513,14 +513,7 @@ config MISC_RTSX tristate default MISC_RTSX_PCI || MISC_RTSX_USB -config PVPANIC - tristate "pvpanic device support" - depends on HAS_IOMEM && (ACPI || OF) - help - This driver provides support for the pvpanic device. pvpanic is - a paravirtualized device provided by QEMU; it lets a virtual machine - (guest) communicate panic events to the host. - +source "drivers/misc/pvpanic/Kconfig" source "drivers/misc/c2port/Kconfig" source "drivers/misc/eeprom/Kconfig" source "drivers/misc/cb710/Kconfig" diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index e39ccbb..cfe20b3 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -58,4 +58,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP)+= aspeed-lpc-snoop.o obj-$(CONFIG_PCI_ENDPOINT_TEST)+= pci_endpoint_test.o obj-$(CONFIG_OCXL) += ocxl/ obj-y += cardreader/ -obj-$(CONFIG_PVPANIC) += pvpanic.o +obj-$(CONFIG_PVPANIC) += pvpanic/ diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig new file mode 100644 index 000..3e612c6 --- /dev/null +++ b/drivers/misc/pvpanic/Kconfig @@ -0,0 +1,7 @@ +config PVPANIC + tristate "pvpanic device support" + depends on HAS_IOMEM && (ACPI || OF) + help + This driver provides support for the pvpanic device. 
pvpanic is + a paravirtualized device provided by QEMU; it lets a virtual machine + (guest) communicate panic events to the host. diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile new file mode 100644 index 000..6394224 --- /dev/null +++ b/drivers/misc/pvpanic/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0-or-later +# +# Copyright (c) 2018 ZTE Ltd. + +obj-$(CONFIG_PVPANIC)+= pvpanic.o diff --git a/drivers/misc/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c similarity index 100% rename from drivers/misc/pvpanic.c rename to drivers/misc/pvpanic/pvpanic.c -- 1.8.3.1
[PATCH V2 5/6] misc/pvpanic: add pvpanic mmio driver
Move the pvpanic MMIO driver into a separate file and adapt the code to the framework.

Signed-off-by: Peng Hao
---
 drivers/misc/pvpanic/Kconfig      |  4 +++
 drivers/misc/pvpanic/Makefile     |  1 +
 drivers/misc/pvpanic/pvpanic-of.c | 53 +++
 3 files changed, 58 insertions(+)
 create mode 100644 drivers/misc/pvpanic/pvpanic-of.c

diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig
index d274130..47f8709 100644
--- a/drivers/misc/pvpanic/Kconfig
+++ b/drivers/misc/pvpanic/Kconfig
@@ -13,4 +13,8 @@ config PVPANIC_ACPI
 	depends on ACPI
 	default PVPANIC
 
+config PVPANIC_OF
+	tristate "pvpanic mmio driver"
+	depends on OF
+
 endif
diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile
index c5b73ca..63ef0db 100644
--- a/drivers/misc/pvpanic/Makefile
+++ b/drivers/misc/pvpanic/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
 obj-$(CONFIG_PVPANIC_ACPI)	+= pvpanic-acpi.o
+obj-$(CONFIG_PVPANIC_OF)	+= pvpanic-of.o
diff --git a/drivers/misc/pvpanic/pvpanic-of.c b/drivers/misc/pvpanic/pvpanic-of.c
new file mode 100644
index 000..73ca5f3
--- /dev/null
+++ b/drivers/misc/pvpanic/pvpanic-of.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * pvpanic of driver.
+ *
+ * Copyright (C) 2019 ZTE Ltd.
+ * Author: Peng Hao
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "pvpanic.h"
+
+static int pvpanic_mmio_probe(struct platform_device *pdev)
+{
+	struct resource *res;
+	int ret;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res)
+		return -EINVAL;
+
+	ret = pvpanic_add_device(&pdev->dev, res);
+	if (ret)
+		return -ENODEV;
+
+	return 0;
+}
+
+static int pvpanic_mmio_remove(struct platform_device *pdev)
+{
+	pvpanic_remove_device();
+	return 0;
+}
+
+static const struct of_device_id pvpanic_mmio_match[] = {
+	{ .compatible = "qemu,pvpanic-mmio", },
+	{}
+};
+
+static struct platform_driver pvpanic_mmio_driver = {
+	.driver = {
+		.name = "pvpanic-mmio",
+		.of_match_table = pvpanic_mmio_match,
+	},
+	.probe = pvpanic_mmio_probe,
+	.remove = pvpanic_mmio_remove,
+};
+
+module_platform_driver(pvpanic_mmio_driver);
-- 
1.8.3.1
[PATCH V2 0/6] add pvpanic driver framework
The QEMU community wants an additional PCI device to expose pvpanic, so that some architectures do not have to occupy precious memory space below 4G. Previously, I added the PCI driver directly to the original driver, which made the whole driver file look a bit cluttered. Andy Shevchenko therefore suggested: "I would recommend to split it in a way how it's done for ChipIdea USB driver, for example. (drivers/usb/chipidea if I'm not mistaken)".

Peng Hao (6):
  misc/pvpanic: preparing for pvpanic driver framework
  misc/pvpanic: Add pvpanic driver framework
  misc/pvpanic: add API for pvpanic driver framework
  misc/pvpanic: add pvpanic acpi driver
  misc/pvpanic: add pvpanic mmio driver
  misc/pvpanic: add pvpanic pci driver

 drivers/misc/Kconfig                |   9 +-
 drivers/misc/Makefile               |   2 +-
 drivers/misc/pvpanic.c              | 192
 drivers/misc/pvpanic/Kconfig        |  25 +
 drivers/misc/pvpanic/Makefile       |   8 ++
 drivers/misc/pvpanic/pvpanic-acpi.c |  77 +++
 drivers/misc/pvpanic/pvpanic-of.c   |  53 ++
 drivers/misc/pvpanic/pvpanic-pci.c  |  56 +++
 drivers/misc/pvpanic/pvpanic.c      | 131
 drivers/misc/pvpanic/pvpanic.h      |  14 +++
 10 files changed, 366 insertions(+), 201 deletions(-)
 delete mode 100644 drivers/misc/pvpanic.c
 create mode 100644 drivers/misc/pvpanic/Kconfig
 create mode 100644 drivers/misc/pvpanic/Makefile
 create mode 100644 drivers/misc/pvpanic/pvpanic-acpi.c
 create mode 100644 drivers/misc/pvpanic/pvpanic-of.c
 create mode 100644 drivers/misc/pvpanic/pvpanic-pci.c
 create mode 100644 drivers/misc/pvpanic/pvpanic.c
 create mode 100644 drivers/misc/pvpanic/pvpanic.h
-- 1.8.3.1
[PATCH V2 6/6] misc/pvpanic: add new pvpanic pci driver
Add new pvpanic pci driver to pvpanic driver framework. Signed-off-by: Peng Hao --- drivers/misc/pvpanic/Kconfig | 5 drivers/misc/pvpanic/Makefile | 1 + drivers/misc/pvpanic/pvpanic-pci.c | 56 ++ 3 files changed, 62 insertions(+) create mode 100644 drivers/misc/pvpanic/pvpanic-pci.c diff --git a/drivers/misc/pvpanic/Kconfig b/drivers/misc/pvpanic/Kconfig index 47f8709..46b6e05 100644 --- a/drivers/misc/pvpanic/Kconfig +++ b/drivers/misc/pvpanic/Kconfig @@ -17,4 +17,9 @@ config PVPANIC_OF tristate "pvpanic mmio driver" depends on OF +config PVPANIC_PCI + tristate "pvpanic pci driver" + depends on PCI + default PVPANIC + endif diff --git a/drivers/misc/pvpanic/Makefile b/drivers/misc/pvpanic/Makefile index 63ef0db..7c71f85 100644 --- a/drivers/misc/pvpanic/Makefile +++ b/drivers/misc/pvpanic/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_PVPANIC)+= pvpanic.o obj-$(CONFIG_PVPANIC_ACPI) += pvpanic-acpi.o obj-$(CONFIG_PVPANIC_OF)+= pvpanic-of.o +obj-$(CONFIG_PVPANIC_PCI) += pvpanic-pci.o diff --git a/drivers/misc/pvpanic/pvpanic-pci.c b/drivers/misc/pvpanic/pvpanic-pci.c new file mode 100644 index 000..b4f453b --- /dev/null +++ b/drivers/misc/pvpanic/pvpanic-pci.c @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * pvpanic acpi driver. + * + * Copyright (C) 2019 ZTE Ltd. 
+ * Author: Peng Hao + */ + +#include +#include +#include +#include +#include "pvpanic.h" + +#define PCI_VENDOR_ID_REDHAT 0x1b36 +#define PCI_DEVICE_ID_REDHAT_PVPANIC 0x0101 + +static const struct pci_device_id pvpanic_pci_id_tbl[] = { + { PCI_DEVICE(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_PVPANIC),}, + {} +}; + +static int pvpanic_pci_probe(struct pci_dev *pdev, +const struct pci_device_id *ent) +{ + int ret; + struct resource res; + + ret = pcim_enable_device(pdev); + if (ret < 0) + return ret; + + memset(&res, 0, sizeof(res)); + res.start = pci_resource_start(pdev, 0); + res.end = pci_resource_end(pdev, 0); + res.flags = IORESOURCE_MEM; + ret = pvpanic_add_device(&pdev->dev, &res); + if (ret) + return ret; + + return 0; +} + +static void pvpanic_pci_remove(struct pci_dev *pdev) +{ + pvpanic_remove_device(); +} + +static struct pci_driver pvpanic_pci_driver = { + .name = "pvpanic-pci", + .id_table = pvpanic_pci_id_tbl, + .probe = pvpanic_pci_probe, + .remove = pvpanic_pci_remove, +}; + +module_pci_driver(pvpanic_pci_driver); -- 1.8.3.1
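The ID table in the PCI driver ends with an empty `{}` entry, which marks where the table stops. A minimal userspace sketch of that sentinel convention, assuming matching on vendor and device only; `struct demo_pci_id` and `demo_match_id()` are hypothetical stand-ins and not the kernel's `struct pci_device_id` or its matching code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct pci_device_id. */
struct demo_pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* ID values taken from the patch above. */
#define DEMO_VENDOR_ID_REDHAT		0x1b36
#define DEMO_DEVICE_ID_REDHAT_PVPANIC	0x0101

static const struct demo_pci_id demo_id_tbl[] = {
	{ DEMO_VENDOR_ID_REDHAT, DEMO_DEVICE_ID_REDHAT_PVPANIC },
	{ 0, 0 }	/* all-zero sentinel terminates the table */
};

/*
 * Walk the table until the sentinel and return the matching entry,
 * or NULL if the device is not listed.
 */
static const struct demo_pci_id *
demo_match_id(const struct demo_pci_id *tbl, uint16_t vendor, uint16_t device)
{
	assert(tbl != NULL);

	for (; tbl->vendor || tbl->device; tbl++) {
		if (tbl->vendor == vendor && tbl->device == device)
			return tbl;
	}
	return NULL;
}
```

Calling `demo_match_id(demo_id_tbl, 0x1b36, 0x0101)` returns the first table entry, while any other vendor/device pair falls through to the sentinel and yields NULL.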
[PATCH V2 2/6] misc/pvpanic: Add pvpanic driver framework
Add the pvpanic driver framework. The original pvpanic acpi/of driver is split into two separate files, and the code is adapted to the framework in follow-up patches. Signed-off-by: Peng Hao --- drivers/misc/pvpanic/pvpanic.c | 171 ++--- 1 file changed, 39 insertions(+), 132 deletions(-) diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c index 595ac06..6380540 100644 --- a/drivers/misc/pvpanic/pvpanic.c +++ b/drivers/misc/pvpanic/pvpanic.c @@ -8,15 +8,20 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt -#include +#include #include #include -#include -#include #include #include -static void __iomem *base; +static struct { + struct platform_device *pdev; + void __iomem *base; + bool is_ioport; +} pvpanic_data = { + .pdev = NULL, + .is_ioport = false, +}; #define PVPANIC_PANICKED (1 << 0) @@ -27,7 +32,7 @@ static void pvpanic_send_event(unsigned int event) { - iowrite8(event, base); + iowrite8(event, pvpanic_data.base); } static int @@ -43,150 +48,52 @@ .priority = 1, /* let this called before broken drm_fb_helper */ }; -#ifdef CONFIG_ACPI -static int pvpanic_add(struct acpi_device *device); -static int pvpanic_remove(struct acpi_device *device); - -static const struct acpi_device_id pvpanic_device_ids[] = { - { "QEMU0001", 0 }, - { "", 0 } -}; -MODULE_DEVICE_TABLE(acpi, pvpanic_device_ids); - -static struct acpi_driver pvpanic_driver = { - .name = "pvpanic", - .class = "QEMU", - .ids = pvpanic_device_ids, - .ops = { - .add = pvpanic_add, - .remove = pvpanic_remove, - }, - .owner = THIS_MODULE, -}; - -static acpi_status -pvpanic_walk_resources(struct acpi_resource *res, void *context) +static int pvpanic_platform_probe(struct platform_device *pdev) { - struct resource r; - - if (acpi_dev_resource_io(res, &r)) { - base = ioport_map(r.start, resource_size(&r)); - return AE_OK; - } else if (acpi_dev_resource_memory(res, &r)) { - base = ioremap(r.start, resource_size(&r)); - return AE_OK; + struct device *dev = &pdev->dev; + struct resource *res; + void __iomem *base; +
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (res) { + base = devm_ioremap_resource(dev, res); + if (IS_ERR(base)) + return -ENODEV; + } else { + res = platform_get_resource(pdev, IORESOURCE_IO, 0); + if (!res) + return -ENODEV; + + base = ioport_map(res->start, resource_size(res)); + if (!base) + return -ENODEV; + pvpanic_data.is_ioport = true; } - return AE_ERROR; -} - -static int pvpanic_add(struct acpi_device *device) -{ - int ret; - - ret = acpi_bus_get_status(device); - if (ret < 0) - return ret; - - if (!device->status.enabled || !device->status.functional) - return -ENODEV; - - acpi_walk_resources(device->handle, METHOD_NAME__CRS, - pvpanic_walk_resources, NULL); - - if (!base) - return -ENODEV; - + pvpanic_data.base = base; atomic_notifier_chain_register(&panic_notifier_list, &pvpanic_panic_nb); return 0; } -static int pvpanic_remove(struct acpi_device *device) +static int pvpanic_platform_remove(struct platform_device *pdev) { - atomic_notifier_chain_unregister(&panic_notifier_list, &pvpanic_panic_nb); - iounmap(base); - - return 0; -} - -static int pvpanic_register_acpi_driver(void) -{ - return acpi_bus_register_driver(&pvpanic_driver); -} - -static void pvpanic_unregister_acpi_driver(void) -{ - acpi_bus_unregister_driver(&pvpanic_driver); -} -#else -static int pvpanic_register_acpi_driver(void) -{ - return -ENODEV; -} -static void pvpanic_unregister_acpi_driver(void) {} -#endif - -static int pvpanic_mmio_probe(struct platform_device *pdev) -{ - struct resource *mem; - - mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); - if (!mem) - return -EINVAL; - - base = devm_ioremap_resource(&pdev->dev, mem); - if (IS_ERR(base)) - return PTR_ERR(base); - - atomic_notifier_chain_register(&panic_notifier_list, - &pvpanic_panic_nb); - - return 0; -} - -static int pvpanic_mmio_remove(struct platform_device *pdev) -{ - - atomic_notifier_chain_unregister(&panic_notifier_list, - &pvpanic_panic_nb); + if (pvpanic_data.is_ioport) + iounmap(pvpanic_data.base); return 0; } -static const
[PATCH V2 3/6] misc/pvpanic: add API for pvpanic driver framework
Add the pvpanic_add_device()/pvpanic_remove_device() API. Follow-up patches use it to add bus-specific drivers to the framework and to remove them again. Signed-off-by: Peng Hao --- drivers/misc/pvpanic/pvpanic.c | 32 drivers/misc/pvpanic/pvpanic.h | 14 ++ 2 files changed, 46 insertions(+) create mode 100644 drivers/misc/pvpanic/pvpanic.h diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c index 227ab4e..f842ee4 100644 --- a/drivers/misc/pvpanic/pvpanic.c +++ b/drivers/misc/pvpanic/pvpanic.c @@ -48,6 +48,38 @@ .priority = 1, /* let this called before broken drm_fb_helper */ }; +int pvpanic_add_device(struct device *dev, struct resource *res) +{ + struct platform_device *pdev; + int ret; + + pdev = platform_device_alloc("pvpanic", -1); + if (!pdev) + return -ENOMEM; + + pdev->dev.parent = dev; + + ret = platform_device_add_resources(pdev, res, 1); + if (ret) + goto err; + + ret = platform_device_add(pdev); + if (ret) + goto err; + pvpanic_data.pdev = pdev; + + return 0; +err: + platform_device_put(pdev); + return -1; +} + +void pvpanic_remove_device(void) +{ + platform_device_unregister(pvpanic_data.pdev); + pvpanic_data.pdev = NULL; +} + static int pvpanic_platform_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; diff --git a/drivers/misc/pvpanic/pvpanic.h b/drivers/misc/pvpanic/pvpanic.h new file mode 100644 index 000..a72ca59 --- /dev/null +++ b/drivers/misc/pvpanic/pvpanic.h @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* pvpanic driver framework header file + * + * Copyright (C) 2019 ZTE Ltd. + * Author: Peng Hao + */ + +#ifndef __DRIVERS_MISC_PVPANIC_H +#define __DRIVERS_MISC_PVPANIC_H + +extern int pvpanic_add_device(struct device *dev, struct resource *res); +extern void pvpanic_remove_device(void); + +#endif -- 1.8.3.1
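One possible review point on `pvpanic_add_device()`: its error path returns -1, so the caller loses the code of the step that failed (on Linux, -1 also happens to equal -EPERM). A userspace sketch of the same goto-based unwinding that hands back `ret` instead; every `demo_*` name is hypothetical and the stub steps only simulate failures, they are not the platform device API:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a platform device; not the kernel structure. */
struct demo_device {
	int has_resources;
	int registered;
};

/* Stub steps; a nonzero "fail" argument simulates the step failing. */
static int demo_add_resources(struct demo_device *dev, int fail)
{
	if (fail)
		return -ENOMEM;
	dev->has_resources = 1;
	return 0;
}

static int demo_register(struct demo_device *dev, int fail)
{
	if (fail)
		return -EBUSY;
	dev->registered = 1;
	return 0;
}

/*
 * Allocate, set up and register a device. On failure, free what was
 * allocated and return the error code of the step that failed instead
 * of collapsing every failure into -1.
 */
static int demo_add_device(struct demo_device **out, int fail_res, int fail_reg)
{
	struct demo_device *dev;
	int ret;

	assert(out != NULL);

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return -ENOMEM;

	ret = demo_add_resources(dev, fail_res);
	if (ret)
		goto err;

	ret = demo_register(dev, fail_reg);
	if (ret)
		goto err;

	*out = dev;
	return 0;

err:
	free(dev);
	return ret;
}
```

With this shape a caller can tell a resource failure (-ENOMEM here) from a registration failure (-EBUSY here) and pass the code on unchanged.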
LTP case read_all_proc fails on qemux86-64 since 5.0-rc1
LTP case read_all_proc(read_all -d /proc -q -r 10) often, but not every time, fails with the following call traces, since 600335205b8d "ide: convert to blk-mq"(5.0-rc1) till now(5.0-rc3). qemu-system-x86_64 -drive file=rootfs.ext4,if=virtio,format=raw -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -nographic -m 16192 -smp cpus=12 -cpu core2duo -enable-kvm -serial mon:stdio -serial null -kernel bzImage -append 'root=/dev/vda rw highres=off console=ttyS0 mem=16192M' tst_test.c:1085: INFO: Timeout per run is 0h 05m 00s [ 47.080156] Warning: /proc/ide/hd?/settings interface is obsolete, and will be removed soon! [ 47.085330] [ cut here ] [ 47.085810] kernel BUG at block/blk-mq.c:767! [ 47.086498] invalid opcode: [#1] PREEMPT SMP PTI [ 47.087022] CPU: 5 PID: 146 Comm: kworker/5:1H Not tainted 5.0.0-rc3 #1 [ 47.087858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 47.088992] Workqueue: kblockd blk_mq_run_work_fn [ 47.089469] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0 [ 47.090035] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008 [ 47.091930] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002 [ 47.092458] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: 0006 [ 47.093181] RDX: RSI: 0001 RDI: 9e1ea13c [ 47.093906] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: [ 47.094626] R10: 0001 R11: 0012 R12: 9e1ea1033a40 [ 47.095347] R13: 9e1ea13a8d00 R14: 9e1ea13a9000 R15: 0046 [ 47.096071] FS: () GS:9e1ea4b4() knlGS: [ 47.096898] CS: 0010 DS: ES: CR0: 80050033 [ 47.097477] CR2: 003fda41fda0 CR3: 0003d8e6a000 CR4: 06e0 [ 47.098203] DR0: DR1: DR2: [ 47.098929] DR3: DR6: fffe0ff0 DR7: 0400 [ 47.099650] Call Trace: [ 47.099910] [ 47.100128] blk_mq_requeue_request+0x58/0x60 [ 47.100576] ide_requeue_and_plug+0x20/0x50 [ 47.101014] ide_intr+0x21a/0x230 [ 47.101362] ? 
idecd_open+0xc0/0xc0 [ 47.101735] __handle_irq_event_percpu+0x43/0x1e0 [ 47.102214] handle_irq_event_percpu+0x32/0x80 [ 47.102668] handle_irq_event+0x39/0x60 [ 47.103074] handle_edge_irq+0xe8/0x1c0 [ 47.103470] handle_irq+0x20/0x30 [ 47.103819] do_IRQ+0x46/0xe0 [ 47.104128] common_interrupt+0xf/0xf [ 47.104505] [ 47.104731] RIP: 0010:ide_output_data+0xbc/0x100 [ 47.105201] Code: 74 22 8d 41 ff 85 c9 74 24 49 8d 54 40 02 41 0f b7 00 66 41 89 01 49 83 c0 02 49 39 d0 75 ef 5b 41 5c 5d c3 4c 89 c6 445 [ 47.107092] RSP: 0018:bd508059bb18 EFLAGS: 00010246 ORIG_RAX: ffdd [ 47.107862] RAX: 9e1ea13a8800 RBX: 9e1ea13a9000 RCX: [ 47.108581] RDX: 0170 RSI: 9e1ea13c012c RDI: [ 47.109293] RBP: bd508059bb28 R08: 9e1ea13c0120 R09: 0170 [ 47.110016] R10: 000d R11: 000c R12: 9e1ea13a8800 [ 47.110731] R13: 000c R14: 9e1ea13c R15: 7530 [ 47.111446] ide_transfer_pc+0x216/0x310 [ 47.111848] ? __const_udelay+0x3d/0x40 [ 47.112236] ? ide_execute_command+0x85/0xb0 [ 47.112668] ? ide_pc_intr+0x3f0/0x3f0 [ 47.113051] ? ide_check_atapi_device+0x110/0x110 [ 47.113524] ide_issue_pc+0x178/0x240 [ 47.113901] ide_cd_do_request+0x15c/0x350 [ 47.114314] ide_queue_rq+0x180/0x6b0 [ 47.114686] ? blk_mq_get_driver_tag+0xa1/0x110 [ 47.115153] blk_mq_dispatch_rq_list+0x90/0x550 [ 47.115606] ? __queue_delayed_work+0x63/0x90 [ 47.116054] ? deadline_fifo_request+0x41/0x90 [ 47.116506] blk_mq_do_dispatch_sched+0x80/0x100 [ 47.116976] blk_mq_sched_dispatch_requests+0xfc/0x170 [ 47.117491] __blk_mq_run_hw_queue+0x6f/0xd0 [ 47.117941] blk_mq_run_work_fn+0x1b/0x20 [ 47.118342] process_one_work+0x14c/0x450 [ 47.118747] worker_thread+0x4a/0x440 [ 47.119125] kthread+0x105/0x140 [ 47.119456] ? process_one_work+0x450/0x450 [ 47.119880] ? 
kthread_park+0x90/0x90 [ 47.120251] ret_from_fork+0x35/0x40 [ 47.120619] Modules linked in: [ 47.120952] ---[ end trace 4562f716e88fdefe ]--- [ 47.121423] RIP: 0010:blk_mq_add_to_requeue_list+0xc1/0xd0 [ 47.121981] Code: 48 8d 53 48 49 8b 8c 24 b8 04 00 00 48 89 51 08 48 89 4b 48 49 8d 8c 24 b8 04 00 00 48 89 4b 50 49 89 94 24 b8 04 00 008 [ 47.123851] RSP: 0018:9e1ea4b43e40 EFLAGS: 00010002 [ 47.124393] RAX: 9e1ea13c0048 RBX: 9e1ea13c RCX: 0006 [ 47.125108] RDX: RSI: 0001 RDI: 9e1ea13c [ 47.125819] RBP: 9e1ea4b43e68 R08: eb5bcf630680 R09: [ 47.126539] R10:
Re: [virtio-dev] [PATCH] virtio: support VIRTIO_F_ORDER_PLATFORM
On 2019/1/23 1:03 AM, Tiwei Bie wrote: This patch introduces the support for VIRTIO_F_ORDER_PLATFORM. When this feature is negotiated, driver will use the barriers suitable for hardware devices. Signed-off-by: Tiwei Bie --- drivers/virtio/virtio_ring.c | 8 include/uapi/linux/virtio_config.h | 6 ++ 2 files changed, 14 insertions(+) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index cd7e755484e3..27d3f057493e 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -1609,6 +1609,9 @@ static struct virtqueue *vring_create_virtqueue_packed( !context; vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX); + if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM)) + vq->weak_barriers = false; + vq->packed.ring_dma_addr = ring_dma_addr; vq->packed.driver_event_dma_addr = driver_event_dma_addr; vq->packed.device_event_dma_addr = device_event_dma_addr; @@ -2079,6 +2082,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index, !context; vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX); + if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM)) + vq->weak_barriers = false; + vq->split.queue_dma_addr = 0; vq->split.queue_size_in_bytes = 0; @@ -2213,6 +2219,8 @@ void vring_transport_features(struct virtio_device *vdev) break; case VIRTIO_F_RING_PACKED: break; + case VIRTIO_F_ORDER_PLATFORM: + break; default: /* We don't understand this bit. */ __virtio_clear_bit(vdev, i); diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h index 1196e1c1d4f6..ff8e7dc9d4dd 100644 --- a/include/uapi/linux/virtio_config.h +++ b/include/uapi/linux/virtio_config.h @@ -78,6 +78,12 @@ /* This feature indicates support for the packed virtqueue layout. */ #define VIRTIO_F_RING_PACKED 34 +/* + * This feature indicates that memory accesses by the driver and the + * device are ordered in a way described by the platform.
+ */ +#define VIRTIO_F_ORDER_PLATFORM 36 + /* * Does the device support Single Root I/O Virtualization? */ I wonder whether or not this is sufficient. Does a DMA barrier imply an MMIO barrier? It looks like it does not. See ia64/include/asm/barrier.h: * Note: "mb()" and its variants cannot be used as a fence to order * accesses to memory mapped I/O registers. For that, mf.a needs to * be used. However, we don't want to always use mf.a because (a) * it's (presumably) much slower than mf and (b) mf.a is supported for * sequential memory pages only. */ #define mb() ia64_mf() #define rmb() mb() #define wmb() mb() #define dma_rmb() mb() #define dma_wmb() mb() Thanks
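For readers following the thread, the patch logic boils down to one decision: if the feature bit is negotiated, the ring stops using the weak (SMP-only) barriers. A userspace sketch of just that decision, assuming a 64-bit feature word; `struct demo_vq`, the `demo_*` helpers and the two barrier labels are hypothetical, and no real barrier is issued here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number taken from the patch above. */
#define VIRTIO_F_ORDER_PLATFORM 36

/* Hypothetical labels for the two barrier flavours a ring could use. */
enum demo_barrier_kind {
	DEMO_BARRIER_SMP,	/* enough when the "device" is another CPU */
	DEMO_BARRIER_PLATFORM	/* stronger, for real hardware devices */
};

/* Hypothetical, reduced stand-in for the ring state. */
struct demo_vq {
	bool weak_barriers;
};

static bool demo_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features >> bit) & 1;
}

/* Mirror of the patch logic: the feature bit forces weak_barriers off. */
static void demo_vq_init(struct demo_vq *vq, uint64_t features,
			 bool weak_barriers)
{
	vq->weak_barriers = weak_barriers;
	if (demo_has_feature(features, VIRTIO_F_ORDER_PLATFORM))
		vq->weak_barriers = false;
}

static enum demo_barrier_kind demo_barrier_for(const struct demo_vq *vq)
{
	return vq->weak_barriers ? DEMO_BARRIER_SMP : DEMO_BARRIER_PLATFORM;
}
```

The question raised in the reply is about what the stronger flavour must guarantee on a given architecture; the sketch only shows which flavour gets selected, not whether it is sufficient.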
Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
On Tue, Jan 22, 2019 at 11:59:31AM -0800, Stefano Stabellini wrote: > On Mon, 21 Jan 2019, Peng Fan wrote: > > on i.MX8QM, M4_1 is communicating with DomU using rpmsg with a fixed > > address as the dma mem buffer which is predefined. > > > > Without this patch, the flow is: > > vring_map_one_sg -> vring_use_dma_api > > -> dma_map_page > >-> __swiotlb_map_page > > ->swiotlb_map_page > > ->__dma_map_area(phys_to_virt(dma_to_phys(dev, > > dev_addr)), size, dir); > > However we are using per device dma area for rpmsg, phys_to_virt > > could not return a correct virtual address for virtual address in > > vmalloc area. Then kernel panic. > > > > With this patch, vring_use_dma_api will return false, and > > vring_map_one_sg will return sg_phys(sg) which is the correct phys > > address in the predefined memory region. > > vring_map_one_sg -> vring_use_dma_api > > -> sg_phys(sg) > > > > Signed-off-by: Peng Fan > > --- > > drivers/virtio/virtio_ring.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index cd7e755484e3..8993d7cb3592 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct > > virtqueue *_vq, > > > > static bool vring_use_dma_api(struct virtio_device *vdev) > > { > > + struct device *dma_dev = vdev->dev.parent; > > + > > if (!virtio_has_iommu_quirk(vdev)) > > return true; > > > > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device > > *vdev) > > * the DMA API if we're a Xen guest, which at least allows > > * all of the sensible Xen configurations to work correctly. > > */ > > - if (xen_domain()) > > + if (xen_domain() && !dma_dev->dma_mem) > > return true; > > > > return false; > > I can see you spotted a real issue, but this is not the right fix. 
We > just need something a bit more flexible than xen_domain(): there are > many kinds of Xen domains on different architectures, we basically want > to enable this (return true from vring_use_dma_api) only when the xen > swiotlb is meant to be used. Does the appended patch fix the issue you > have? > > --- > > xen: introduce xen_vring_use_dma > > From: Stefano Stabellini > > Export xen_swiotlb on arm and arm64. > > Use xen_swiotlb to determine when vring should use dma APIs to map the > ring: when xen_swiotlb is enabled the dma API is required. When it is > disabled, it is not required. > > Reported-by: Peng Fan > Signed-off-by: Stefano Stabellini > > diff --git a/arch/arm/include/asm/xen/swiotlb-xen.h > b/arch/arm/include/asm/xen/swiotlb-xen.h > new file mode 100644 > index 000..455ade5 > --- /dev/null > +++ b/arch/arm/include/asm/xen/swiotlb-xen.h > @@ -0,0 +1 @@ > +#include > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > index cb44aa2..8592863 100644 > --- a/arch/arm/xen/mm.c > +++ b/arch/arm/xen/mm.c > @@ -21,6 +21,8 @@ > #include > #include > > +int xen_swiotlb __read_mostly; > + > unsigned long xen_get_swiotlb_free_pages(unsigned int order) > { > struct memblock_region *reg; > @@ -189,6 +191,7 @@ int __init xen_mm_init(void) > struct gnttab_cache_flush cflush; > if (!xen_initial_domain()) > return 0; > + xen_swiotlb = 1; > xen_swiotlb_init(1, false); > xen_dma_ops = _swiotlb_dma_ops; > > diff --git a/arch/arm64/include/asm/xen/swiotlb-xen.h > b/arch/arm64/include/asm/xen/swiotlb-xen.h > new file mode 100644 > index 000..455ade5 > --- /dev/null > +++ b/arch/arm64/include/asm/xen/swiotlb-xen.h > @@ -0,0 +1 @@ > +#include > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index cd7e755..bf8badc 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -260,7 +260,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev) >* the DMA API if we're a Xen guest, which at least allows >* all of the 
sensible Xen configurations to work correctly. >*/ > - if (xen_domain()) > + if (xen_vring_use_dma()) > return true; > > return false; > diff --git a/include/xen/arm/swiotlb-xen.h b/include/xen/arm/swiotlb-xen.h > new file mode 100644 > index 000..2aac7c4 > --- /dev/null > +++ b/include/xen/arm/swiotlb-xen.h > @@ -0,0 +1,10 @@ > +#ifndef _ASM_ARM_XEN_SWIOTLB_XEN_H > +#define _ASM_ARM_XEN_SWIOTLB_XEN_H > + > +#ifdef CONFIG_SWIOTLB_XEN > +extern int xen_swiotlb; > +#else > +#define xen_swiotlb (0) > +#endif > + > +#endif > diff --git a/include/xen/xen.h b/include/xen/xen.h > index 0e21567..74a536d 100644 > --- a/include/xen/xen.h > +++ b/include/xen/xen.h > @@ -46,4 +46,10 @@ enum xen_domain_type { > bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, >
[GIT PULL] Thermal management updates for v5.0-rc4
Hi, Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git for-rc to receive the latest Thermal management updates for v5.0-rc4 with top-most commit 129699bb8c7572106b5bbb2407c2daee4727ccad: drivers: thermal: int340x_thermal: Fix sysfs race condition (2019-01- 18 15:23:04 +0800) on top of commit bfeffd155283772bbe78c6a05dec7c0128ee500c: Linux 5.0-rc1 (2019-01-06 17:08:20 -0800) Specifics: - Fix a race condition that sysfs could be accessed before necessary initialization in int340x thermal driver. (Aaron Hill) - Fix a NULL vs IS_ERR() check in int340x thermal driver. (Dan Carpenter) thanks, rui Aaron Hill (1): drivers: thermal: int340x_thermal: Fix sysfs race condition Dan Carpenter (1): thermal: int340x_thermal: Fix a NULL vs IS_ERR() check .../int340x_thermal/processor_thermal_device.c | 30 -- 1 file changed, 16 insertions(+), 14 deletions(-)
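As background on the "NULL vs IS_ERR()" class of fix: a function that reports failure through an error pointer never returns NULL, so a `!ptr` check lets the error through. A userspace re-creation of the idiom, as a sketch only; the `demo_*` helpers imitate the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` but are not the kernel definitions, and `demo_get_thing()` is a hypothetical function, not the one fixed in the int340x driver:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded into a pointer in this sketch. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long error)
{
	assert(error < 0 && error >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO values of the address space. */
static bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical lookup that signals failure with an error pointer, never NULL. */
static void *demo_get_thing(bool fail)
{
	static int thing;

	return fail ? demo_err_ptr(-ENOMEM) : &thing;
}
```

A caller that writes `if (!p)` after `demo_get_thing(true)` would carry on with an invalid pointer; `if (demo_is_err(p)) return demo_ptr_err(p);` is the check that matches this return convention.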
Re: [PATCH] staging: ks7010: remove unnecessary parentheses
On Tue, 2019-01-22 at 21:18 -0500, Matt McCoy wrote: > Remove unnecessary parentheses reported by checkpatch. [] > diff --git a/drivers/staging/ks7010/ks_hostif.c > b/drivers/staging/ks7010/ks_hostif.c [] > @@ -171,7 +171,7 @@ int get_current_ap(struct ks_wlan_private *priv, struct > link_ap_info *ap_info) > "- rate_set_size=%d\n", > ap->bssid[0], ap->bssid[1], ap->bssid[2], > ap->bssid[3], ap->bssid[4], ap->bssid[5], > -&(ap->ssid.body[0]), > +&ap->ssid.body[0], > ap->rate_set.body[0], ap->rate_set.body[1], > ap->rate_set.body[2], ap->rate_set.body[3], > ap->rate_set.body[4], ap->rate_set.body[5], This bit: [] netdev_dbg(priv->net_dev, "Link AP\n" "- bssid=%02X:%02X:%02X:%02X:%02X:%02X\n" [] ap->bssid[0], ap->bssid[1], ap->bssid[2], should instead use the vsprintf %pM extension "- bssid: %pM\n" [] ap->bssid,
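For context, `%pM` takes a pointer to a 6-byte address and prints it as colon-separated hex, in lowercase rather than the uppercase `%02X` used in the current code. Userspace `printf` has no such extension, so the sketch below spells out the equivalent formatting by hand; `demo_format_mac()` is a hypothetical helper and not part of any kernel or libc API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_ETH_ALEN 6	/* bytes in an Ethernet address or BSSID */

/*
 * Format a 6-byte address as "xx:xx:xx:xx:xx:xx" into out.
 * Returns the number of characters written (17), or -1 if out is
 * too small to hold them plus the terminating NUL.
 */
static int demo_format_mac(char *out, size_t out_len, const unsigned char *addr)
{
	assert(out != NULL && addr != NULL);

	if (out_len < 3 * DEMO_ETH_ALEN)	/* 17 characters plus the NUL */
		return -1;

	return snprintf(out, out_len, "%02x:%02x:%02x:%02x:%02x:%02x",
			addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
}
```

The attraction of `%pM` in the driver is that the six separate `ap->bssid[n]` arguments collapse into the single `ap->bssid` pointer.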
Re: [PATCH] dt-bindings: sdhci-omap: Add properties for using external dma
On Tue, 22 Jan 2019 at 18:17, Faiz Abbas wrote: > > Hi Chunyan, > > +Rob Herring > > On 22/01/19 2:17 PM, Chunyan Zhang wrote: > > sdhci-omap can support both external dma controller via dmaengine > > framework as well as ADMA which standard SD host controller > > provides. > > > > Signed-off-by: Chunyan Zhang > > Signed-off-by: Faiz Abbas > > --- > > Thanks for fixing this. However, this change should be part of the Right, I actually used "--in-reply-to" with the Message-ID of Rob's mail, but it does not seem to have worked: I expected the patch to show up as a reply to Rob's mail for review. > series (with a change log and a version number). I will send this as a > part of my series after I get some Acks on the driver patches. Ok, thanks. Chunyan > > Thanks, > Faiz
Re: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range
On Wed, Jan 23, 2019 at 10:17:45AM +0800, Peter Xu wrote: > On Tue, Jan 22, 2019 at 12:02:24PM -0500, Jerome Glisse wrote: > > On Tue, Jan 22, 2019 at 05:39:35PM +0800, Peter Xu wrote: > > > On Mon, Jan 21, 2019 at 09:05:35AM -0500, Jerome Glisse wrote: > > > > > > [...] > > > > > > > > + change_protection(dst_vma, start, start + len, newprot, > > > > > + !enable_wp, 0); > > > > > > > > So setting dirty_accountable bring us to that code in mprotect.c: > > > > > > > > if (dirty_accountable && pte_dirty(ptent) && > > > > (pte_soft_dirty(ptent) || > > > > !(vma->vm_flags & VM_SOFTDIRTY))) { > > > > ptent = pte_mkwrite(ptent); > > > > } > > > > > > > > My understanding is that you want to set write flag when enable_wp > > > > is false and you want to set the write flag unconditionaly, right ? > > > > > > Right. > > > > > > > > > > > If so then you should really move the change_protection() flags > > > > patch before this patch and add a flag for setting pte write flags. > > > > > > > > Otherwise the above is broken at it will only set the write flag > > > > for pte that were dirty and i am guessing so far you always were > > > > lucky because pte were all dirty (change_protection will preserve > > > > dirtyness) when you write protected them. > > > > > > > > So i believe the above is broken or at very least unclear if what > > > > you really want is to only set write flag to pte that have the > > > > dirty flag set. > > > > > > You are right, if we build the tree until this patch it won't work for > > > all the cases. It'll only work if the page was at least writable > > > before and also it's dirty (as you explained). Sorry to be unclear > > > about this, maybe I should at least mention that in the commit message > > > but I totally forgot it. 
> > > > > > All these problems are solved in later on patches, please feel free to > > > have a look at: > > > > > > mm: merge parameters for change_protection() > > > userfaultfd: wp: apply _PAGE_UFFD_WP bit > > > userfaultfd: wp: handle COW properly for uffd-wp > > > > > > Note that even in the follow up patches IMHO we can't directly change > > > the write permission since the page can be shared by other processes > > > (e.g., the zero page or COW pages). But the general idea is the same > > > as you explained. > > > > > > I tried to avoid squashing these stuff altogether as explained > > > previously. Also, this patch can be seen as a standalone patch to > > > introduce the new interface which seems to make sense too, and it is > > > indeed still working in many cases so I see the latter patches as > > > enhancement of this one. Please let me know if you still want me to > > > have all these stuff squashed, or if you'd like me to squash some of > > > them. > > > > Yeah i have look at those after looking at this one. You should just > > re-order the patch this one first and then one that add new flag, > > then ones that add the new userfaultfd feature. Otherwise you are > > adding a userfaultfd feature that is broken midway ie it is added > > broken and then you fix it. Some one bisecting thing might get hurt > > by that. It is better to add and change everything you need and then > > add the new feature so that the new feature will work as intended. > > > > So no squashing just change the order ie add the userfaultfd code > > last. > > Yes this makes sense, I'll do that in v2. Thanks for the suggestion! Note before doing a v2 i would really like to see some proof of why you need new page table flag see my reply to: userfaultfd: wp: add WP pagetable tracking to x86 As i believe you can identify COW or KSM from UFD write protect with- out a pte flag. Cheers, Jérôme
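The condition quoted from mprotect.c earlier in this thread can be written as a small truth function, which makes Jerome's point easy to check: a clean PTE never gets the write bit back through this path. A userspace sketch in which `struct demo_pte_state` is a hypothetical flattening of the three predicates the kernel code queries:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the state the condition looks at. */
struct demo_pte_state {
	bool dirty;		/* pte_dirty(ptent) */
	bool soft_dirty;	/* pte_soft_dirty(ptent) */
	bool vma_softdirty;	/* vma->vm_flags & VM_SOFTDIRTY */
};

/* Returns true if the quoted condition would call pte_mkwrite(). */
static bool demo_would_mkwrite(bool dirty_accountable,
			       const struct demo_pte_state *s)
{
	assert(s != NULL);

	return dirty_accountable && s->dirty &&
	       (s->soft_dirty || !s->vma_softdirty);
}
```

Passing `!enable_wp` as `dirty_accountable` therefore restores write permission only for PTEs that are already dirty, which is the gap the later patches in the series are said to close.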
linux-next: manual merge of the slave-dma tree with Linus' tree
Hi Vinod, Today's linux-next merge of the slave-dma tree got a conflict in: drivers/dma/imx-sdma.c between commit: 750afb08ca71 ("cross-tree: phase out dma_zalloc_coherent()") from Linus' tree and commit: ceaf52265148 ("dmaengine: imx-sdma: pass ->dev to dma_alloc_coherent() API") from the slave-dma tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc drivers/dma/imx-sdma.c index 86708fb9bda1,af14a8d6efa8.. --- a/drivers/dma/imx-sdma.c +++ b/drivers/dma/imx-sdma.c @@@ -1182,8 -1189,8 +1189,8 @@@ static int sdma_request_channel0(struc { int ret = -EBUSY; - sdma->bd0 = dma_alloc_coherent(NULL, PAGE_SIZE, &sdma->bd0_phys, - sdma->bd0 = dma_zalloc_coherent(sdma->dev, PAGE_SIZE, &sdma->bd0_phys, - GFP_NOWAIT); ++ sdma->bd0 = dma_alloc_coherent(sdma->dev, PAGE_SIZE, &sdma->bd0_phys, + GFP_NOWAIT); if (!sdma->bd0) { ret = -ENOMEM; goto out; @@@ -1205,8 -1212,8 +1212,8 @@@ static int sdma_alloc_bd(struct sdma_de u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor); int ret = 0; - desc->bd = dma_alloc_coherent(NULL, bd_size, &desc->bd_phys, - GFP_NOWAIT); - desc->bd = dma_zalloc_coherent(desc->sdmac->sdma->dev, bd_size, - &desc->bd_phys, GFP_NOWAIT); ++ desc->bd = dma_alloc_coherent(desc->sdmac->sdma->dev, bd_size, ++ &desc->bd_phys, GFP_NOWAIT); if (!desc->bd) { ret = -ENOMEM; goto out;
Re: [PATCH v2 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
Ching, > This patch series are against to mkp's 5.1/scsi-queue. Applied to 5.1/scsi-queue. Thank you. PS. Your file permissions are odd. I always have to change your diffs from 755 to 644 before applying. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times
On Wed, Jan 23, 2019 at 10:12:41AM +0800, Peter Xu wrote: > On Tue, Jan 22, 2019 at 11:53:10AM -0500, Jerome Glisse wrote: > > On Tue, Jan 22, 2019 at 04:22:38PM +0800, Peter Xu wrote: > > > On Mon, Jan 21, 2019 at 10:55:36AM -0500, Jerome Glisse wrote: > > > > On Mon, Jan 21, 2019 at 03:57:01PM +0800, Peter Xu wrote: > > > > > The idea comes from a discussion between Linus and Andrea [1]. > > > > > > > > > > Before this patch we only allow a page fault to retry once. We > > > > > achieved > > > > > this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing > > > > > handle_mm_fault() the second time. This was majorly used to avoid > > > > > unexpected starvation of the system by looping over forever to handle > > > > > the page fault on a single page. However that should hardly happen, > > > > > and > > > > > after all for each code path to return a VM_FAULT_RETRY we'll first > > > > > wait > > > > > for a condition (during which time we should possibly yield the cpu) > > > > > to > > > > > happen before VM_FAULT_RETRY is really returned. > > > > > > > > > > This patch removes the restriction by keeping the > > > > > FAULT_FLAG_ALLOW_RETRY > > > > > flag when we receive VM_FAULT_RETRY. It means that the page fault > > > > > handler now can retry the page fault for multiple times if necessary > > > > > without the need to generate another page fault event. Meanwhile we > > > > > still keep the FAULT_FLAG_TRIED flag so page fault handler can still > > > > > identify whether a page fault is the first attempt or not. > > > > > > > > So there is nothing protecting starvation after this patch ? AFAICT. > > > > Do we sufficient proof that we never have a scenario where one process > > > > might starve fault another ? > > > > > > > > For instance some page locking could starve one process. > > > > > > Hi, Jerome, > > > > > > Do you mean lock_page()? 
> > > > > > AFAIU lock_page() will only yield the process itself until the lock is > > > released, so IMHO it's not really starving the process but a natural > > > behavior. After all the process may not continue without handling the > > > page fault correctly. > > > > > > Or when you say "starvation" do you mean that we might return > > > VM_FAULT_RETRY from handle_mm_fault() continuously so we'll looping > > > over and over inside the page fault handler? > > > > That one ie every time we retry someone else is holding the lock and > > thus lock_page_or_retry() will continuously retry. Some process just > > get unlucky ;) > > > > With existing code because we remove the retry flag then on the second > > try we end up waiting for the page lock while holding the mmap_sem so > > we know that we are in line for the page lock and we will get it once > > it is our turn. > > Ah I see. :) It's indeed a valid questioning. > > Firstly note that even after this patch we can still identify whether > we're at the first attempt or not by checking against FAULT_FLAG_TRIED > (it will be applied to the fault flag in all the retries but not in > the first atttempt). So IMHO this change might suite if we want to > keep the old behavior [1]: > > diff --git a/mm/filemap.c b/mm/filemap.c > index 9f5e323e883e..44942c78bb92 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); > int __lock_page_or_retry(struct page *page, struct mm_struct *mm, > unsigned int flags) > { > - if (flags & FAULT_FLAG_ALLOW_RETRY) { > + if (!flags & FAULT_FLAG_TRIED) { > /* > * CAUTION! In this case, mmap_sem is not released > * even though return 0. I need to check how FAULT_FLAG_TRIED have been use so far, but yes it looks like this would keep the existing behavior intact. > > But at the same time I'm stepping back trying to see the whole > picture... 
My understanding is that this is really a policy that we > can decide, and a trade off between "being polite or not on the > mmap_sem", that when taking the page lock in slow path we either: > > (1) release mmap_sem before waiting, polite enough but uncertain to > finally have the lock, or, > > (2) keep mmap_sem before waiting, not polite enough but certain to > take the lock. > > We did (2) before on the reties because in existing code we only allow > to retry once, so we can't fail on the 2nd attempt. That seems to be > a good reason to being "unpolite" - we took the mmap_sem without > considering others because we've been "polite" once. I'm not that > experienced in mm development but AFAIU solution 2 is only reducing > our chance of starvation but adding that chance of starvation to other > processes that want the mmap_sem instead. So IMHO the starvation > issue always existed even before this patch, and it looks natural and > sane to me so far... And if with that in mind, I can't say that above > change at [1] would be better, and maybe, it'll be even more fair
Re: [PATCH v5 2/2] kexec, KEYS: Make use of platform keyring for signature verify
On 01/21/19 at 05:59pm, Kairui Song wrote: > This patch let kexec_file_load makes use of .platform keyring as fall > back if it failed to verify a PE signed image against secondary or > builtin key ring, make it possible to verify kernel image signed with > preboot keys as well. > > This commit adds a VERIFY_USE_PLATFORM_KEYRING similar to previous > VERIFY_USE_SECONDARY_KEYRING indicating that verify_pkcs7_signature > should verify the signature using platform keyring. Also, decrease > the error message log level when verification failed with -ENOKEY, > so that if called tried multiple time with different keyring it > won't generate extra noises. > > Signed-off-by: Kairui Song > --- > arch/x86/kernel/kexec-bzimage64.c | 13 ++--- > certs/system_keyring.c| 13 - > include/linux/verification.h | 1 + > 3 files changed, 23 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index 7d97e432cbbc..2c007abd3d40 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -534,9 +534,16 @@ static int bzImage64_cleanup(void *loader_data) > #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG > static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len) > { > - return verify_pefile_signature(kernel, kernel_len, > -VERIFY_USE_SECONDARY_KEYRING, > -VERIFYING_KEXEC_PE_SIGNATURE); > + int ret; > + ret = verify_pefile_signature(kernel, kernel_len, > + VERIFY_USE_SECONDARY_KEYRING, > + VERIFYING_KEXEC_PE_SIGNATURE); > + if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) { > + ret = verify_pefile_signature(kernel, kernel_len, > + VERIFY_USE_PLATFORM_KEYRING, > + VERIFYING_KEXEC_PE_SIGNATURE); > + } > + return ret; > } > #endif > > diff --git a/certs/system_keyring.c b/certs/system_keyring.c > index 4690ef9cda8a..7085c286f4bd 100644 > --- a/certs/system_keyring.c > +++ b/certs/system_keyring.c > @@ -240,11 +240,22 @@ int verify_pkcs7_signature(const void *data, 
size_t len, > #else > trusted_keys = builtin_trusted_keys; > #endif > + } else if (trusted_keys == VERIFY_USE_PLATFORM_KEYRING) { > +#ifdef CONFIG_INTEGRITY_PLATFORM_KEYRING > + trusted_keys = platform_trusted_keys; > +#else > + trusted_keys = NULL; > +#endif > + if (!trusted_keys) { > + ret = -ENOKEY; > + pr_devel("PKCS#7 platform keyring is not available\n"); > + goto error; > + } > } > ret = pkcs7_validate_trust(pkcs7, trusted_keys); > if (ret < 0) { > if (ret == -ENOKEY) > - pr_err("PKCS#7 signature not signed with a trusted > key\n"); > + pr_devel("PKCS#7 signature not signed with a trusted > key\n"); > goto error; > } > > diff --git a/include/linux/verification.h b/include/linux/verification.h > index cfa4730d607a..018fb5f13d44 100644 > --- a/include/linux/verification.h > +++ b/include/linux/verification.h > @@ -17,6 +17,7 @@ > * should be used. > */ > #define VERIFY_USE_SECONDARY_KEYRING ((struct key *)1UL) > +#define VERIFY_USE_PLATFORM_KEYRING ((struct key *)2UL) > > /* > * The use to which an asymmetric key is being put. > -- > 2.20.1 > For kexec_file part Acked-by: Dave Young Thanks Dave
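The fallback order in the quoted bzImage64_verify_sig() hunk can be sketched in plain userspace C. This is only an illustration of the control flow: struct sk_image, sk_verify() and sk_verify_with_fallback() are hypothetical stand-ins and not kernel APIs, and the platform_keyring_enabled argument plays the role of IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING).

```c
#include <assert.h>
#include <errno.h>

#ifndef ENOKEY
#define ENOKEY 126 /* Linux-specific errno; only needs to be distinct here */
#endif

/* Hypothetical stand-ins for the two keyring selectors in the patch. */
enum sk_keyring { SK_SECONDARY_KEYRING, SK_PLATFORM_KEYRING };

/* Hypothetical image model: which keyrings know the signer's key. */
struct sk_image {
	int trusted_by_secondary; /* key is in the secondary/builtin keyring */
	int trusted_by_platform;  /* key is in the platform keyring */
};

/* Hypothetical verifier: 0 on success, -ENOKEY if the keyring lacks the key. */
int sk_verify(const struct sk_image *img, enum sk_keyring ring)
{
	int trusted = (ring == SK_SECONDARY_KEYRING) ? img->trusted_by_secondary
						     : img->trusted_by_platform;

	return trusted ? 0 : -ENOKEY;
}

/* Try the secondary keyring first; fall back only on -ENOKEY. */
int sk_verify_with_fallback(const struct sk_image *img,
			    int platform_keyring_enabled)
{
	int ret = sk_verify(img, SK_SECONDARY_KEYRING);

	if (ret == -ENOKEY && platform_keyring_enabled)
		ret = sk_verify(img, SK_PLATFORM_KEYRING);
	return ret;
}
```

In this sketch an image signed only with a preboot key verifies when the platform keyring is enabled and still fails with -ENOKEY when it is not. Any error other than -ENOKEY from the first attempt is returned unchanged, as in the quoted hunk.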
[PATCH] staging: ks7010: remove unnecessary parentheses
Remove unnecessary parentheses reported by checkpatch. Signed-off-by: Matt McCoy --- drivers/staging/ks7010/ks_hostif.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/staging/ks7010/ks_hostif.c b/drivers/staging/ks7010/ks_hostif.c index 065bce1..d938b09 100644 --- a/drivers/staging/ks7010/ks_hostif.c +++ b/drivers/staging/ks7010/ks_hostif.c @@ -35,7 +35,7 @@ static inline u8 get_byte(struct ks_wlan_private *priv) { u8 data; - data = *(priv->rxp)++; + data = *priv->rxp++; /* length check in advance ! */ --(priv->rx_size); return data; @@ -171,7 +171,7 @@ int get_current_ap(struct ks_wlan_private *priv, struct link_ap_info *ap_info) "- rate_set_size=%d\n", ap->bssid[0], ap->bssid[1], ap->bssid[2], ap->bssid[3], ap->bssid[4], ap->bssid[5], - &(ap->ssid.body[0]), + &ap->ssid.body[0], ap->rate_set.body[0], ap->rate_set.body[1], ap->rate_set.body[2], ap->rate_set.body[3], ap->rate_set.body[4], ap->rate_set.body[5], @@ -732,7 +732,7 @@ void hostif_scan_indication(struct ks_wlan_private *priv) netdev_dbg(priv->net_dev, " scan_ind_count=%d :: aplist.size=%d\n", priv->scan_ind_count, priv->aplist.size); get_ap_information(priv, (struct ap_info *)(priv->rxp), - &(priv->aplist.ap[priv->scan_ind_count - 1])); + &priv->aplist.ap[priv->scan_ind_count - 1]); priv->aplist.size = priv->scan_ind_count; } else { netdev_dbg(priv->net_dev, " count over :: scan_ind_count=%d\n", -- 2.7.4
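The first hunk is safe because postfix ++ and -> bind tighter than unary *, so *priv->rxp++ and *(priv->rxp)++ parse the same way. A minimal userspace sketch of that equivalence follows. struct sk_priv is a hypothetical cut-down stand-in for struct ks_wlan_private, and its field types are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down stand-in for struct ks_wlan_private. */
struct sk_priv {
	const unsigned char *rxp; /* read cursor into the receive buffer */
	size_t rx_size;           /* bytes left to read */
};

/* Form before the patch: explicit parentheses. */
unsigned char sk_get_byte_parens(struct sk_priv *priv)
{
	unsigned char data = *(priv->rxp)++;

	--(priv->rx_size);
	return data;
}

/* Form after the patch: -> and postfix ++ apply before unary *. */
unsigned char sk_get_byte_plain(struct sk_priv *priv)
{
	unsigned char data = *priv->rxp++;

	--priv->rx_size;
	return data;
}
```

Both helpers return the byte under the cursor and then advance the cursor by one, which is why checkpatch can flag the parentheses as unnecessary.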
Re: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range
On Tue, Jan 22, 2019 at 12:02:24PM -0500, Jerome Glisse wrote: > On Tue, Jan 22, 2019 at 05:39:35PM +0800, Peter Xu wrote: > > On Mon, Jan 21, 2019 at 09:05:35AM -0500, Jerome Glisse wrote: > > > > [...] > > > > > > + change_protection(dst_vma, start, start + len, newprot, > > > > + !enable_wp, 0); > > > > > > So setting dirty_accountable bring us to that code in mprotect.c: > > > > > > if (dirty_accountable && pte_dirty(ptent) && > > > (pte_soft_dirty(ptent) || > > > !(vma->vm_flags & VM_SOFTDIRTY))) { > > > ptent = pte_mkwrite(ptent); > > > } > > > > > > My understanding is that you want to set write flag when enable_wp > > > is false and you want to set the write flag unconditionaly, right ? > > > > Right. > > > > > > > > If so then you should really move the change_protection() flags > > > patch before this patch and add a flag for setting pte write flags. > > > > > > Otherwise the above is broken at it will only set the write flag > > > for pte that were dirty and i am guessing so far you always were > > > lucky because pte were all dirty (change_protection will preserve > > > dirtyness) when you write protected them. > > > > > > So i believe the above is broken or at very least unclear if what > > > you really want is to only set write flag to pte that have the > > > dirty flag set. > > > > You are right, if we build the tree until this patch it won't work for > > all the cases. It'll only work if the page was at least writable > > before and also it's dirty (as you explained). Sorry to be unclear > > about this, maybe I should at least mention that in the commit message > > but I totally forgot it. 
> > > > All these problems are solved in later on patches, please feel free to > > have a look at: > > > > mm: merge parameters for change_protection() > > userfaultfd: wp: apply _PAGE_UFFD_WP bit > > userfaultfd: wp: handle COW properly for uffd-wp > > > > Note that even in the follow up patches IMHO we can't directly change > > the write permission since the page can be shared by other processes > > (e.g., the zero page or COW pages). But the general idea is the same > > as you explained. > > > > I tried to avoid squashing these stuff altogether as explained > > previously. Also, this patch can be seen as a standalone patch to > > introduce the new interface which seems to make sense too, and it is > > indeed still working in many cases so I see the latter patches as > > enhancement of this one. Please let me know if you still want me to > > have all these stuff squashed, or if you'd like me to squash some of > > them. > > Yeah i have look at those after looking at this one. You should just > re-order the patch this one first and then one that add new flag, > then ones that add the new userfaultfd feature. Otherwise you are > adding a userfaultfd feature that is broken midway ie it is added > broken and then you fix it. Some one bisecting thing might get hurt > by that. It is better to add and change everything you need and then > add the new feature so that the new feature will work as intended. > > So no squashing just change the order ie add the userfaultfd code > last. Yes this makes sense, I'll do that in v2. Thanks for the suggestion! -- Peter Xu
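Jerome's point about the mprotect.c condition can be checked with a small userspace model. This is a sketch only: struct sk_pte, sk_maybe_mkwrite() and the SK_VM_SOFTDIRTY value are hypothetical and merely mirror the shape of the quoted condition, not the kernel's pte helpers.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_VM_SOFTDIRTY 0x1ul /* illustrative value only */

/* Hypothetical model of the pte bits the quoted condition looks at. */
struct sk_pte {
	bool dirty;
	bool soft_dirty;
	bool write;
};

/* Mirrors the quoted test: the write bit is set only for dirty ptes. */
struct sk_pte sk_maybe_mkwrite(struct sk_pte pte, bool dirty_accountable,
			       unsigned long vm_flags)
{
	if (dirty_accountable && pte.dirty &&
	    (pte.soft_dirty || !(vm_flags & SK_VM_SOFTDIRTY)))
		pte.write = true;
	return pte;
}
```

Under this model a clean pte never gains the write bit, even with dirty_accountable set, which matches the observation that the patch on its own only works for pages that were already dirty.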
Re: [PATCH] kprobes: no need to check return value of debugfs_create functions
On Tue, 22 Jan 2019 16:21:46 +0100 Greg Kroah-Hartman wrote: > When calling debugfs functions, there is no need to ever check the > return value. The function can work or not, but the code logic should > never do something different based on this. > > Cc: "Naveen N. Rao" > Cc: Anil S Keshavamurthy > Cc: "David S. Miller" > Cc: Masami Hiramatsu > Signed-off-by: Greg Kroah-Hartman > --- > kernel/kprobes.c | 25 ++--- > 1 file changed, 6 insertions(+), 19 deletions(-) > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > index f4ddfdd2d07e..7287e7de2350 100644 > --- a/kernel/kprobes.c > +++ b/kernel/kprobes.c > @@ -2566,33 +2566,20 @@ static const struct file_operations fops_kp = { > > static int __init debugfs_kprobe_init(void) > { > - struct dentry *dir, *file; > + struct dentry *dir; > unsigned int value = 1; > > dir = debugfs_create_dir("kprobes", NULL); > - if (!dir) > - return -ENOMEM; Here, I think IS_ERR(dir) is OK for debugfs_create_file(), but dir == NULL has different meaning. I think we'd better keep this check. (I see, -ENOMEM will be no good...) Thank you, > > - file = debugfs_create_file("list", 0400, dir, NULL, > - _kprobes_operations); > - if (!file) > - goto error; > + debugfs_create_file("list", 0400, dir, NULL, > + _kprobes_operations); > > - file = debugfs_create_file("enabled", 0600, dir, > - , _kp); > - if (!file) > - goto error; > + debugfs_create_file("enabled", 0600, dir, , _kp); > > - file = debugfs_create_file("blacklist", 0400, dir, NULL, > - _kprobe_blacklist_ops); > - if (!file) > - goto error; > + debugfs_create_file("blacklist", 0400, dir, NULL, > + _kprobe_blacklist_ops); > > return 0; > - > -error: > - debugfs_remove(dir); > - return -ENOMEM; > } > > late_initcall(debugfs_kprobe_init); > -- > 2.20.1 > -- Masami Hiramatsu
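Masami's distinction between an error pointer and a NULL parent can be sketched in userspace C. The helpers below are simplified re-creations in the spirit of the kernel's ERR_PTR()/IS_ERR(), and the sk_ names and the classification enum are hypothetical. The reading that a NULL parent means "create in the debugfs root" is taken from the debugfs_create_file() documentation as I understand it, so treat it as an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, in the style of ERR_PTR(). */
void *sk_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for the top SK_MAX_ERRNO addresses, in the style of IS_ERR(). */
int sk_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-SK_MAX_ERRNO);
}

enum sk_parent_kind { SK_PARENT_IS_ERROR, SK_PARENT_IS_ROOT, SK_PARENT_IS_DIR };

/* Classify what a "parent" argument would mean to a file-creation call. */
enum sk_parent_kind sk_classify_parent(const void *dir)
{
	if (sk_is_err(dir))
		return SK_PARENT_IS_ERROR;
	if (dir == NULL)
		return SK_PARENT_IS_ROOT; /* NULL is not an error pointer */
	return SK_PARENT_IS_DIR;
}
```

The sketch shows why the two cases differ: an error pointer can be recognized and rejected by the callee, while NULL passes an IS_ERR-style check and would be treated as a valid request with a different meaning.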
Re: [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times
On Tue, Jan 22, 2019 at 11:53:10AM -0500, Jerome Glisse wrote: > On Tue, Jan 22, 2019 at 04:22:38PM +0800, Peter Xu wrote: > > On Mon, Jan 21, 2019 at 10:55:36AM -0500, Jerome Glisse wrote: > > > On Mon, Jan 21, 2019 at 03:57:01PM +0800, Peter Xu wrote: > > > > The idea comes from a discussion between Linus and Andrea [1]. > > > > > > > > Before this patch we only allow a page fault to retry once. We achieved > > > > this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing > > > > handle_mm_fault() the second time. This was majorly used to avoid > > > > unexpected starvation of the system by looping over forever to handle > > > > the page fault on a single page. However that should hardly happen, and > > > > after all for each code path to return a VM_FAULT_RETRY we'll first wait > > > > for a condition (during which time we should possibly yield the cpu) to > > > > happen before VM_FAULT_RETRY is really returned. > > > > > > > > This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY > > > > flag when we receive VM_FAULT_RETRY. It means that the page fault > > > > handler now can retry the page fault for multiple times if necessary > > > > without the need to generate another page fault event. Meanwhile we > > > > still keep the FAULT_FLAG_TRIED flag so page fault handler can still > > > > identify whether a page fault is the first attempt or not. > > > > > > So there is nothing protecting starvation after this patch ? AFAICT. > > > Do we sufficient proof that we never have a scenario where one process > > > might starve fault another ? > > > > > > For instance some page locking could starve one process. > > > > Hi, Jerome, > > > > Do you mean lock_page()? > > > > AFAIU lock_page() will only yield the process itself until the lock is > > released, so IMHO it's not really starving the process but a natural > > behavior. After all the process may not continue without handling the > > page fault correctly. 
> > > > Or when you say "starvation" do you mean that we might return > > VM_FAULT_RETRY from handle_mm_fault() continuously so we'll looping > > over and over inside the page fault handler? > > That one ie every time we retry someone else is holding the lock and > thus lock_page_or_retry() will continuously retry. Some process just > get unlucky ;) > > With existing code because we remove the retry flag then on the second > try we end up waiting for the page lock while holding the mmap_sem so > we know that we are in line for the page lock and we will get it once > it is our turn. Ah I see. :) It's indeed a valid questioning. Firstly note that even after this patch we can still identify whether we're at the first attempt or not by checking against FAULT_FLAG_TRIED (it will be applied to the fault flag in all the retries but not in the first atttempt). So IMHO this change might suite if we want to keep the old behavior [1]: diff --git a/mm/filemap.c b/mm/filemap.c index 9f5e323e883e..44942c78bb92 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1351,7 +1351,7 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); int __lock_page_or_retry(struct page *page, struct mm_struct *mm, unsigned int flags) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { + if (!flags & FAULT_FLAG_TRIED) { /* * CAUTION! In this case, mmap_sem is not released * even though return 0. But at the same time I'm stepping back trying to see the whole picture... My understanding is that this is really a policy that we can decide, and a trade off between "being polite or not on the mmap_sem", that when taking the page lock in slow path we either: (1) release mmap_sem before waiting, polite enough but uncertain to finally have the lock, or, (2) keep mmap_sem before waiting, not polite enough but certain to take the lock. We did (2) before on the reties because in existing code we only allow to retry once, so we can't fail on the 2nd attempt. 
That seems to be a good reason for being "impolite": we took the mmap_sem without considering others because we had already been "polite" once. I'm not that experienced in mm development, but AFAIU solution (2) only reduces our own chance of starvation by shifting that chance to other processes that want the mmap_sem. So IMHO the starvation issue existed even before this patch, and it looks natural and sane to me so far... With that in mind, I can't say that the change at [1] above would be better; maybe it would even be fairer to always release the mmap_sem first in this case (assuming that we will eventually get that lock, though we might pay for it with more retries)? Or is there a way to constantly starve the process that handles the page fault that I've totally missed? Thanks, -- Peter Xu
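One detail in the snippet at [1] is worth double-checking before reuse: in C, unary ! binds tighter than binary &, so "!flags & FAULT_FLAG_TRIED" parses as "(!flags) & FAULT_FLAG_TRIED". The sketch below compares that parse with the parenthesized form, on the assumption that the intent was "this is the first attempt". The SK_ flag values are illustrative only and are not taken from the kernel headers.

```c
#include <assert.h>

/* Illustrative bit values only; the real ones live in the kernel headers. */
#define SK_FAULT_FLAG_ALLOW_RETRY 0x04u
#define SK_FAULT_FLAG_TRIED       0x20u

/* How the compiler reads "!flags & FAULT_FLAG_TRIED". */
int sk_as_written(unsigned int flags)
{
	/* !flags is 0 or 1, so masking with a higher bit always yields 0. */
	return ((!flags) & SK_FAULT_FLAG_TRIED) != 0;
}

/* Assumed intent: true while the TRIED bit has not been set yet. */
int sk_first_attempt(unsigned int flags)
{
	return !(flags & SK_FAULT_FLAG_TRIED);
}
```

With these values sk_as_written() is 0 for every input, while sk_first_attempt() is 1 on the first attempt and 0 on retries. If the proposal is ever turned into a real patch, the parenthesized form is probably what is wanted.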
Re: [PATCH v3 1/1] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
Hi Amit, On Mon, Jan 21, 2019 at 11:38:34PM +0530, Amit Kucheria wrote: > Since all cpus in the big and little clusters, respectively, are in the > same frequency domain, use all of them for mitigation in the > cooling-map. We end up with two cooling devices - one each for the big > and little clusters. > > We throttle lightly at the first trip point, just removing the boost > frequency. At the next trip point we allow ourselves to be throttled to > any extent. > > Signed-off-by: Amit Kucheria > --- > arch/arm64/boot/dts/qcom/sdm845.dtsi | 225 +-- > 1 file changed, 209 insertions(+), 16 deletions(-) > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi > b/arch/arm64/boot/dts/qcom/sdm845.dtsi > index c27cbd3bcb0a..878f661d16eb 100644 > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include > > / { > interrupt-parent = <>; > @@ -99,6 +100,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x0>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_0>; > L2_0: l2-cache { > compatible = "cache"; > @@ -114,6 +116,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x100>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_100>; > L2_100: l2-cache { > compatible = "cache"; > @@ -126,6 +129,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x200>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_200>; > L2_200: l2-cache { > compatible = "cache"; > @@ -138,6 +142,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x300>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_300>; > L2_300: l2-cache { > compatible = "cache"; > @@ -150,6 +155,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x400>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_400>; > L2_400: l2-cache { > compatible = "cache"; > @@ -162,6 +168,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x500>; 
> enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_500>; > L2_500: l2-cache { > compatible = "cache"; > @@ -174,6 +181,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x600>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_600>; > L2_600: l2-cache { > compatible = "cache"; > @@ -186,6 +194,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x700>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <_700>; > L2_700: l2-cache { > compatible = "cache"; > @@ -1691,18 +1700,41 @@ > thermal-sensors = < 1>; > > trips { > - cpu_alert0: trip0 { > + cpu0_alert1: trip-point@0 { > temperature = <75000>; In my observations a 'switch on/threshold' temperature of 75 degrees leads to aggressive throttling with IPA when the temperature is above this threshold: [ 716.760804] cpu_cooling_ratelimit: 31 callbacks suppressed [ 716.760836] cpu cpu4: Cooling state set to 10. New max freq = 192 [ 716.773390] power_allocator_ratelimit: 15 callbacks suppressed [ 716.773405] thermal thermal_zone5: Controlling power: control_temp=95000 last_temp=73500, curr_temp=75200 total_requested_power=39025 total_granted_power=18654 [ 749.609336] cpu_cooling_ratelimit: 45 callbacks suppressed [ 749.609371] cpu cpu4: Cooling state set to 11. New max freq = 1843200 [ 749.624300] power_allocator_ratelimit: 24 callbacks suppressed [ 749.624323] thermal thermal_zone5: Controlling power: control_temp=95000 last_temp=70800, curr_temp=77200
Re: [PATCH] workqueue: Try to catch flush_work() without INIT_WORK().
On Wed, Jan 23, 2019 at 09:44:12AM +0900, Tetsuo Handa wrote: > Daniel Jordan wrote: > > On Sat, Jan 19, 2019 at 11:41:22AM +0900, Tetsuo Handa wrote: > > > On 2019/01/19 4:48, Daniel Jordan wrote: > > > > On Sat, Jan 19, 2019 at 02:04:58AM +0900, Tetsuo Handa wrote: > > > > __queue_work has a sanity check already for work, but using list_empty. > > > > Seems > > > > slightly better to be consistent? > > > > > > > > > > list_empty() won't work, for "struct work_struct" is embedded into a > > > struct > > > which is allocated by kzalloc(). > > > > Please check list_empty's definition again, it compares the address of the > > node > > to its next pointer, so it should work for a zeroed node. I'll reiterate > > that > > it seems slightly better to be consistent in "is work_struct initialized?" > > checks, but it's not a big deal and I'm fine either way. > > You are talking about > > if (WARN_ON(!list_empty(>entry))) { > spin_unlock(>pool->lock); > return; > } > > part in __queue_work(), aren't you? But since flush_work() is used for > waiting for > a work to complete, that work can be either queued state (list_empty() == > false) or > not queued state (list_empty() == true). Thus, I don't think that > flush_work() can > use list_empty() for checking whether that work was initialized. Oh, you're right, sorry for the noise! > [PATCH v2] workqueue: Try to catch flush_work() without INIT_WORK(). > > syzbot found a flush_work() caller who forgot to call INIT_WORK() > because that work_struct was allocated by kzalloc() [1]. But the message > > INFO: trying to register non-static key. > the code is fine but needs lockdep annotation. > turning off the locking correctness validator. > > by lock_map_acquire() is failing to tell that INIT_WORK() is missing. > > Since flush_work() without INIT_WORK() is a bug, and INIT_WORK() should > set ->func field to non-zero, let's warn if ->func field is zero. 
> > [1] > https://syzkaller.appspot.com/bug?id=a5954455fcfa51c29ca2ab55b203076337e1c770 > > Signed-off-by: Tetsuo Handa > --- > kernel/workqueue.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > index 392be4b..a503ad9 100644 > --- a/kernel/workqueue.c > +++ b/kernel/workqueue.c > @@ -2908,6 +2908,9 @@ static bool __flush_work(struct work_struct *work, bool > from_cancel) > if (WARN_ON(!wq_online)) > return false; > > + if (WARN_ON(!work->func)) > + return false; > + > if (!from_cancel) { > lock_map_acquire(>lockdep_map); > lock_map_release(>lockdep_map); Thanks for updating the changelog. FWIW, you can add Reviewed-by: Daniel Jordan
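The list_empty() versus ->func discussion can be reproduced with a tiny userspace model. The sk_ types below are hypothetical simplifications of struct list_head and struct work_struct, and memset() stands in for kzalloc(). The sketch only illustrates why a non-empty entry cannot tell "never initialized" apart from "queued", while a NULL func can.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplifications of list_head and work_struct. */
struct sk_list_head {
	struct sk_list_head *next, *prev;
};

struct sk_work {
	struct sk_list_head entry;
	void (*func)(struct sk_work *);
};

/* Same idea as list_empty(): a node is empty when it points at itself. */
int sk_list_empty(const struct sk_list_head *head)
{
	return head->next == head;
}

/* Rough INIT_WORK() analogue: self-linked entry plus a handler. */
void sk_init_work(struct sk_work *w, void (*fn)(struct sk_work *))
{
	w->entry.next = &w->entry;
	w->entry.prev = &w->entry;
	w->func = fn;
}

/* Append the work item to the tail of a queue list. */
void sk_queue_work(struct sk_list_head *queue, struct sk_work *w)
{
	w->entry.prev = queue->prev;
	w->entry.next = queue;
	queue->prev->next = &w->entry;
	queue->prev = &w->entry;
}

/* The check the v2 patch relies on: an initialized work has a handler. */
int sk_looks_initialized(const struct sk_work *w)
{
	return w->func != NULL;
}

void sk_noop(struct sk_work *w)
{
	(void)w;
}
```

A zeroed item and a queued item both report a non-empty entry, so a flush path cannot use that test. Only the func check separates them, which is the reasoning both participants converge on above.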
Re: possible deadlock in __do_page_fault
Joel Fernandes wrote: > > Why do we need to call fallocate() synchronously with ashmem_mutex held? > > Why can't we call fallocate() asynchronously from WQ_MEM_RECLAIM workqueue > > context so that we can call fallocate() with ashmem_mutex not held? > > > > I don't know how ashmem works, but as far as I can guess, offloading is > > possible as long as other operations which depend on the completion of > > fallocate() operation (e.g. read()/mmap(), querying/changing pinned status) > > wait for completion of asynchronous fallocate() operation (like a draft > > patch shown below is doing). > > This adds a bit of complexity, I am worried if it will introduce more > bugs especially because ashmem is going away in the long term, in favor of > memfd - and if its worth adding more complexity / maintenance burden to it. I don't care migrating to memfd. I care when bugs are fixed. > > I am wondering if we can do this synchronously, without using a workqueue. > All you would need is a temporary list of areas to punch. In > ashmem_shrink_scan, you would create this list under mutex and then once you > release the mutex, you can go through this list and do the fallocate followed > by the wake up of waiters on the wait queue, right? If you can do it this > way, then it would be better IMO. Are you sure that none of locks held before doing GFP_KERNEL allocation interferes lock dependency used by fallocate() ? If yes, we can do without a workqueue context (like a draft patch shown below). Since I don't understand what locks are potentially involved, I offloaded to a clean workqueue context. Anyway, I need your checks regarding whether this approach is waiting for completion at all locations which need to wait for completion. 
--- drivers/staging/android/ashmem.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index 90a8a9f1ac7d..6a267563cb66 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -75,6 +75,9 @@ struct ashmem_range { /* LRU list of unpinned pages, protected by ashmem_mutex */ static LIST_HEAD(ashmem_lru_list); +static atomic_t ashmem_shrink_inflight = ATOMIC_INIT(0); +static DECLARE_WAIT_QUEUE_HEAD(ashmem_shrink_wait); + /* * long lru_count - The count of pages on our LRU list. * @@ -292,6 +295,7 @@ static ssize_t ashmem_read_iter(struct kiocb *iocb, struct iov_iter *iter) int ret = 0; mutex_lock(&ashmem_mutex); + wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight)); /* If size is not set, or set to 0, always return EOF. */ if (asma->size == 0) @@ -359,6 +363,7 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) int ret = 0; mutex_lock(&ashmem_mutex); + wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight)); /* user needs to SET_SIZE before mapping */ if (!asma->size) { @@ -438,7 +443,6 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) static unsigned long ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct ashmem_range *range, *next; unsigned long freed = 0; /* We might recurse into filesystem code, so bail out if necessary */ @@ -448,17 +452,27 @@ ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) if (!mutex_trylock(&ashmem_mutex)) return -1; - list_for_each_entry_safe(range, next, &ashmem_lru_list, lru) { + while (!list_empty(&ashmem_lru_list)) { + struct ashmem_range *range = + list_first_entry(&ashmem_lru_list, typeof(*range), lru); loff_t start = range->pgstart * PAGE_SIZE; loff_t end = (range->pgend + 1) * PAGE_SIZE; + struct file *f = range->asma->file; - range->asma->file->f_op->fallocate(range->asma->file, - FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, - start, end - start);
+ get_file(f); + atomic_inc(&ashmem_shrink_inflight); range->purged = ASHMEM_WAS_PURGED; lru_del(range); freed += range_size(range); + mutex_unlock(&ashmem_mutex); + f->f_op->fallocate(f, + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, + start, end - start); + fput(f); + if (atomic_dec_and_test(&ashmem_shrink_inflight)) + wake_up_all(&ashmem_shrink_wait); + mutex_lock(&ashmem_mutex); if (--sc->nr_to_scan <= 0) break; } @@ -713,6 +727,7 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, return -EFAULT; mutex_lock(&ashmem_mutex); + wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight));
Re: [alsa-devel] [PATCH] ASoC: soc-core: Fix null pointer dereference in soc_find_component
On 1/22/19 7:36 PM, Curtis Malainey wrote: On Wed, Jan 23, 2019 at 4:11 AM Pierre-Louis Bossart wrote: The issue was that we were seeing a memory corruption bug on AMD Chromebooks with that function already (not observed on Intel). I was testing some SOF integrations and was seeing this in the kernel logs. I had Dylan verify my logic before I sent the patch because it took so long to identify the bug and it was traced to the patch that introduced soc_init_platform. [ 10.922112] cz-da7219-max98357a AMD7219:00: ASoC: CPU DAI designware-i2s.1.auto not registered [ 10.922122] cz-da7219-max98357a AMD7219:00: devm_snd_soc_register_card(acpd7219m98357) failed: -517 [ 11.001411] cz-da7219-max98357a AMD7219:00: ASoC: Both platform name/of_node are set for amd-max98357-play [ 11.001423] cz-da7219-max98357a AMD7219:00: ASoC: failed to init link amd-max98357-play [ 11.001431] cz-da7219-max98357a AMD7219:00: devm_snd_soc_register_card(acpd7219m98357) failed: -22 [ 11.001577] cz-da7219-max98357a: probe of AMD7219:00 failed with error -22 of_node was never getting set but the pointer was becoming populated (outside of the probe call), which traced to the soc_init_platform function, which was not reallocating memory on an EPROBE_DEFER even though it was getting freed by devm. I am not very familiar with devm but my local maintainers say that it should be freeing the memory even on a PROBE_DEFER. The patch should mirror the memory behaviour in snd_soc_init_multicodec which also reallocates its memory on every probe. I'm not sure how the patch is causing you to defer, is your component list corrupt? Sorry for the duplicate spam, forgot to send via plain text mode, re-sending for the mailing list so it gets accepted.
There is no defer issue with the intel stuff, but we call this routine multiple times snd_soc_register_card --soc_init_dai_link snd_soc_init_platform -- soc_soc_bind_card snd_soc_instantiate_card -- soc_check_tplg_fes snd_soc_init_platform << ALLOC1 soc_init_dai_link --snd_soc_init_platform << ALLOC2 Ah that explains it, in my testing I didn't have the patch that brought in the call from within tplg_fes Initially dai_link->legacy_platform is 0, so gets set after the first first devm_kzalloc (ALLOC1) and after that we always allocate new memory (ALLOC2). The end result is that whatever we set in soc_check_tplg_fes is lost with the new/unnecessary alloc. I would guess your solution is also a work-around, if devm_ effectively freed the memory then the pointer would become NULL. Or may that's the issue is that no one actually resets it. Yes, its a work around to fix the memory issue. If you set the platform in the machine driver the code will ignore it and not reset it. That being said that is not a full proof workaround and a better solution is definitely needed. We could go and clean up the pointers in soc_instantiate_card based on the flag being set. That way we only relocate on a NULL pointer like we used to but still don't affect statically allocated memory. I will draft a patch, test it on the AMD device, reply to this thread later with it, Pierre can you test it as well? I am curious why soc_check_tplg_fes is calling snd_soc_init_platform. It should have already been called earlier, in soc_init_dai_link at the beginning of snd_soc_register_card so the memory should already be initialized. Unless I am missing somewhere where links are getting added between the calls. This is actually a second order problem, the main issue i have is that the very first call to init_dai_link fails with the new DEFER_PROBE handling. I don't quite understand what Linaro/AMD folks are doing but I trust their changes are legitimate. 
To move forward, maybe it's not worth spending too much time on a grand unification of string theory, there are simpler solutions: the Intel machine drivers already do get the platform driver name as an platform_data argument, so we could modify the dailinks platform names before even registering the card. I tested with the attached proof-of-concept patch, it adds 2 lines of code per machine driver if we use a common helper (after the transition to the "modern" dailink representation that's needed anyways) so maybe it's better in the end? the override we care about is really the automatic handling of all the hard-coded front-ends, the platform-name override isn't really a battle i want to pick or spend time on. >From 5680c64b09964b134e20bf96142d1ce5dcf0f77f Mon Sep 17 00:00:00 2001 From: Pierre-Louis Bossart Date: Tue, 22 Jan 2019 18:53:43 -0600 Subject: [PATCH] ASoC: add helper to change platform name for all dailinks To reuse the same machine drivers with Atom/SST, Skylake and SOF, we need to change the default platform_name (or platforms->name in the "modern" representation). So far, this override was done with an automatic
Re: [LSF/MM TOPIC] Page flags, can we free up space ?
On 1/22/19 12:17 PM, Jerome Glisse wrote: > So lately i have been looking at page flags and we are using 6 flags > for memory reclaim and compaction: > > PG_referenced > PG_lru > PG_active > PG_workingset > PG_reclaim > PG_unevictable > > On top of which you can add the page anonymous flag (anonymous or > shared memory) > PG_anon // does not exist, lower bit of page->mapping > > And also the movable flag (which aliases with KSM) > PG_movable // does not exist, lower bit of page->mapping > > > So i would like to explore if there is a way to express the same amount > of information with fewer bits. My methodology is to exhaustively list > all the possible states (valid combinations of above flags) and then to > see how we change from one state to another (what event triggers the change > like mlock(), page being referenced, ...) and under which rules (ie do we > hold the page lock, zone lock, ...). > > My hope is that there might be some way to use fewer bits to express the > same thing. I am doing this because for my work on generic page write > protection (ie KSM for file-backed pages) which i talked about last year and > want to talk about again ;) I will need to unalias the movable bit from > KSM bit. > > > Right now this is more of a tentative effort ie i do not know if i will succeed, > in any case i can report on failure or success and discuss my findings to > get people's opinions on the matter. > > > I think everyone interested in mm will be interested in this topic :) Explicitly adding Matthew on Cc as I am pretty sure he has been working in this area. -- Mike Kravetz
[PATCH] acpi_pm: Reduce PMTMR counter read contention
On a large system with many CPUs, using PMTMR as the clock source can have a significant impact on the overall system performance because of the following reasons: 1) There is a single PMTMR counter shared by all the CPUs. 2) PMTMR counter reading is a very slow operation. Using PMTMR as the default clock source may happen when, for example, the TSC clock calibration exceeds the allowable tolerance and HPET disabled by nohpet on kernel command line. Sometimes the performance slowdown can be so severe that the system may crash because of a NMI watchdog soft lockup, logs: [ 20.181521] clocksource: acpi_pm: mask: 0xff max_cycles: 0xff, max_idle_ns: 2085701024 ns [ 44.273786] BUG: soft lockup - CPU#48 stuck for 23s! [swapper/48:0] [ 44.279992] BUG: soft lockup - CPU#49 stuck for 23s! [migration/49:307] [ 44.285169] BUG: soft lockup - CPU#50 stuck for 23s! [migration/50:313] Commit f99fd22e4d4b ("x86/hpet: Reduce HPET counter read contention") fixed a similar issue for HPET, this patch adapts that design to PMTMR. Signed-off-by: Zhenzhong Duan Tested-by: Kin Cho Cc: Daniel Lezcano Cc: Thomas Gleixner Cc: Waiman Long Cc: Srinivas Eeda --- drivers/clocksource/acpi_pm.c | 101 +- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c index 1961e35..8b522eb 100644 --- a/drivers/clocksource/acpi_pm.c +++ b/drivers/clocksource/acpi_pm.c @@ -32,12 +32,111 @@ */ u32 pmtmr_ioport __read_mostly; -static inline u32 read_pmtmr(void) +static inline u32 pmtmr_readl(void) { /* mask the output to 24 bits */ return inl(pmtmr_ioport) & ACPI_PM_MASK; } +#if defined(CONFIG_SMP) && defined(CONFIG_64BIT) +/* + * Reading the PMTMR counter is a very slow operation. If a large number of + * CPUs are trying to access the PMTMR counter simultaneously, it can cause + * massive delay and slow down system performance dramatically. This may + * happen when PMTMR is the default clock source instead of TSC. 
For a + * really large system with hundreds of CPUs, the slowdown may be so + * severe that it may actually crash the system because of a NMI watchdog + * soft lockup, for example. + * + * If multiple CPUs are trying to access the PMTMR counter at the same time, + * we don't actually need to read the counter multiple times. Instead, the + * other CPUs can use the counter value read by the first CPU in the group. + * + * This special feature is only enabled on x86-64 systems. It is unlikely + * that 32-bit x86 systems will have enough CPUs to require this feature + * with its associated locking overhead. And we also need 64-bit atomic + * read. + * + * The lock and the pmtmr value are stored together and can be read in a + * single atomic 64-bit read. It is explicitly assumed that arch_spinlock_t + * is 32 bits in size. + */ +union pmtmr_lock { + struct { + arch_spinlock_t lock; + u32 value; + }; + u64 lockval; +}; + +static union pmtmr_lock pmtmr __cacheline_aligned = { + { .lock = __ARCH_SPIN_LOCK_UNLOCKED, }, +}; + +static u32 read_pmtmr(void) +{ + unsigned long flags; + union pmtmr_lock old, new; + + BUILD_BUG_ON(sizeof(union pmtmr_lock) != 8); + + /* +* Read PMTMR directly if in NMI. +*/ + if (in_nmi()) + return (u64)pmtmr_readl(); + + /* +* Read the current state of the lock and PMTMR value atomically. +*/ + old.lockval = READ_ONCE(pmtmr.lockval); + + if (arch_spin_is_locked(&old.lock)) + goto contended; + + local_irq_save(flags); + if (arch_spin_trylock(&pmtmr.lock)) { + new.value = pmtmr_readl(); + /* +* Use WRITE_ONCE() to prevent store tearing. +*/ + WRITE_ONCE(pmtmr.value, new.value); + arch_spin_unlock(&pmtmr.lock); + local_irq_restore(flags); + return (u64)new.value; + } + local_irq_restore(flags); + +contended: + /* +* Contended case +* -- +* Wait until the PMTMR value change or the lock is free to indicate +* its value is up-to-date. +* +* It is possible that old.value has already contained the latest +* PMTMR value while the lock holder was in the process of releasing +* the lock. 
Checking for lock state change will enable us to return +* the value immediately instead of waiting for the next PMTMR reader +* to come along. +*/ + do { + cpu_relax(); + new.lockval = READ_ONCE(pmtmr.lockval); + } while ((new.value == old.value) && arch_spin_is_locked(&new.lock)); + + return (u64)new.value; +} +#else +/* + * For UP or 32-bit. + */ +static inline u32 read_pmtmr(void) +{ + return pmtmr_readl(); +} +#endif + u32 acpi_pm_read_verified(void) { u32 v1 =
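As a rough userspace illustration of the idea described in the patch comment (one reader performs the slow read, concurrent readers reuse its result), here is a sketch using C11 atomics. It is not the patch's code: it packs a lock bit and the last value into a single 64-bit atomic word instead of using arch_spinlock_t, it publishes the value and releases the lock in one store, and slow_counter_read() with its fake_hw_counter is a made-up stand-in for the port read. All names below are assumptions for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define LOCK_BIT 1ULL	/* bit 0: some reader is refreshing the value */

/* Bits 32..63 hold the last value read; bit 0 is the lock flag. */
static _Atomic uint64_t shared_word;

/* Hypothetical stand-in for the slow hardware read (24-bit counter). */
static uint32_t fake_hw_counter;

static uint32_t slow_counter_read(void)
{
	return ++fake_hw_counter & 0xffffff;
}

static uint32_t shared_counter_read(void)
{
	uint64_t old = atomic_load(&shared_word);
	uint64_t now;

	if (!(old & LOCK_BIT) &&
	    atomic_compare_exchange_strong(&shared_word, &old, old | LOCK_BIT)) {
		/* We own the lock: do the one slow read for everybody. */
		uint32_t value = slow_counter_read();

		/* Publish the new value and drop the lock in a single store. */
		atomic_store(&shared_word, (uint64_t)value << 32);
		return value;
	}

	/* Contended: wait until the value changes or the lock is released. */
	do {
		now = atomic_load(&shared_word);
	} while ((now >> 32) == (old >> 32) && (now & LOCK_BIT));

	return (uint32_t)(now >> 32);
}
```

As in the patch, a waiter leaves the loop either when it sees a new value or when the lock is no longer held, so it does not hang if the refreshed value happens to equal the old one.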
[PATCH -next] perf: xgene: Remove set but not used variable 'config'
Fixes gcc '-Wunused-but-set-variable' warning: drivers/perf/xgene_pmu.c: In function 'xgene_perf_stop': drivers/perf/xgene_pmu.c:1055:6: warning: variable 'config' set but not used [-Wunused-but-set-variable] It has never been used since its introduction. Signed-off-by: YueHaibing --- drivers/perf/xgene_pmu.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c index d4ec048..27574e8 100644 --- a/drivers/perf/xgene_pmu.c +++ b/drivers/perf/xgene_pmu.c @@ -1052,7 +1052,6 @@ static void xgene_perf_start(struct perf_event *event, int flags) static void xgene_perf_stop(struct perf_event *event, int flags) { struct hw_perf_event *hw = &event->hw; - u64 config; if (hw->state & PERF_HES_UPTODATE) return; @@ -1064,7 +1063,6 @@ static void xgene_perf_stop(struct perf_event *event, int flags) if (hw->state & PERF_HES_UPTODATE) return; - config = hw->config; xgene_perf_read(event); hw->state |= PERF_HES_UPTODATE; }
Re: [PATCH net-next 1/1] net: stmmac: implement the SIOCGHWTSTAMP ioctl
From: Artem Panfilov Date: Sun, 20 Jan 2019 19:05:15 +0300 > This patch adds support for the SIOCGHWTSTAMP ioctl which enables user > processes to read the current hwtstamp_config settings > non-destructively. > > Signed-off-by: Artem Panfilov Applied, thanks.
Re: [alsa-devel] [PATCH] ASoC: soc-core: Fix null pointer dereference in soc_find_component
Curtis Malainey | Software Engineer | cujomalai...@google.com | 650-898-3849 On Wed, Jan 23, 2019 at 4:11 AM Pierre-Louis Bossart wrote: > > > > The issue was that we were seeing a memory corruption bug on an AMD > > chromebooks with that function already (not observed on Intel). I was > > testing some SOF integrations and was seeing this in the kernel logs. > > I had Dylan verify my logic before I sent the patch because it took so > > long to identify the bug and it was traced to the patch that introduce > > soc_init_platform. > > > > [ 10.922112] cz-da7219-max98357a AMD7219:00: ASoC: CPU DAI > > designware-i2s.1.auto not registered > > [ 10.922122] cz-da7219-max98357a AMD7219:00: > > devm_snd_soc_register_card(acpd7219m98357) failed: -517 > > [ 11.001411] cz-da7219-max98357a AMD7219:00: ASoC: Both platform > > name/of_node are set for amd-max98357-play > > [ 11.001423] cz-da7219-max98357a AMD7219:00: ASoC: failed to init > > link amd-max98357-play > > [ 11.001431] cz-da7219-max98357a AMD7219:00: > > devm_snd_soc_register_card(acpd7219m98357) failed: -22 > > [ 11.001577] cz-da7219-max98357a: probe of AMD7219:00 failed with error > > -22 > > > > of_node was never getting set but the pointer was becoming populated > > (outside of the probe call) which traced to soc_init_platform function > > which was not reallocating memory on a EPROBE_DEFER even though it was > > getting freed by devm. I am not very familiar with devm but my local > > maintainers say that it should be freeing the memory even on a > > PROBE_DEFER. > > The patch should mirror the memory behaviour in > > snd_soc_init_multicodec which also reallocates its memory on every > > probe. I'm not sure how the patch is causing you to defer, is your > > component list corrupt? > > > > Sorry for the duplicate spam, forgot to send via plain text mode, > > re-sending for the mailing list so it gets accepted. 
> > There is no defer issue with the intel stuff, but we call this routine > multiple times > > snd_soc_register_card > > --soc_init_dai_link > > snd_soc_init_platform > > -- soc_soc_bind_card > > snd_soc_instantiate_card > > -- soc_check_tplg_fes > > snd_soc_init_platform << ALLOC1 > > soc_init_dai_link > > --snd_soc_init_platform << ALLOC2 > Ah, that explains it; in my testing I didn't have the patch that brought in the call from within tplg_fes. > > Initially dai_link->legacy_platform is 0, so gets set after the first > devm_kzalloc (ALLOC1) and after that we always allocate new memory > (ALLOC2). The end result is that whatever we set in soc_check_tplg_fes > is lost with the new/unnecessary alloc. > > I would guess your solution is also a work-around, if devm_ effectively > freed the memory then the pointer would become NULL. Or maybe the > issue is that no one actually resets it. > > Yes, it's a workaround to fix the memory issue. If you set the platform in the machine driver the code will ignore it and not reset it. That being said, it is not a foolproof workaround and a better solution is definitely needed. We could go and clean up the pointers in soc_instantiate_card based on the flag being set. That way we only reallocate on a NULL pointer like we used to but still don't affect statically allocated memory. I will draft a patch, test it on the AMD device, and reply to this thread with it later. Pierre, can you test it as well? I am curious why soc_check_tplg_fes is calling snd_soc_init_platform. It should have already been called earlier, in soc_init_dai_link at the beginning of snd_soc_register_card, so the memory should already be initialized. Unless I am missing somewhere where links are getting added between the calls.
Re: [PATCH 0/5] mips: cleanup debugfs usage
Hello, Greg Kroah-Hartman wrote: > When calling debugfs code, there is no need to ever check the return > value of the call, as no logic should ever change if a call works > properly or not. Fix up a bunch of x86-specific code to not care about > the results of debugfs. > > Greg Kroah-Hartman (5): > mips: cavium: no need to check return value of debugfs_create > functions > mips: ralink: no need to check return value of debugfs_create > functions > mips: mm: no need to check return value of debugfs_create functions > mips: math-emu: no need to check return value of debugfs_create > functions > mips: kernel: no need to check return value of debugfs_create > functions > > arch/mips/cavium-octeon/oct_ilm.c | 31 --- > arch/mips/kernel/mips-r2-to-r6-emul.c | 21 -- > arch/mips/kernel/segment.c| 15 +++-- > arch/mips/kernel/setup.c | 7 +- > arch/mips/kernel/spinlock_test.c | 21 -- > arch/mips/kernel/unaligned.c | 16 -- > arch/mips/math-emu/me-debugfs.c | 23 > arch/mips/mm/sc-debugfs.c | 15 +++-- > arch/mips/ralink/bootrom.c| 8 +-- > 9 files changed, 28 insertions(+), 129 deletions(-) Series applied to mips-next. Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v1 00/11] MIPS: ath79: move towards proper OF support
Hello, Oleksij Rempel wrote: > This patches are take from OpenWRT, rebased and tested with kernel > v5.0-rt1 on DPTechnics DPT-Module (Atheros AR9331) by me. > > Since one dt-bindings header is touched, I added DT maintainers to the > TO/CC. > > Felix Fietkau (6): > MIPS: ath79: add helpers for setting clocks and expose the ref clock > MIPS: ath79: move legacy "wdt" and "uart" clock aliases out of soc > init > MIPS: ath79: pass PLL base to clock init functions > MIPS: ath79: make specifying the reference clock in DT optional > MIPS: ath79: support setting up clock via DT on all SoC types > MIPS: ath79: export switch MDIO reference clock > > John Crispin (5): > MIPS: ath79: drop legacy IRQ code > MIPS: ath79: drop machfiles > MIPS: ath79: drop legacy pci code > MIPS: ath79: drop platform device registration code > MIPS: ath79: drop !OF clock code > > arch/mips/Kconfig| 1 - > arch/mips/ath79/Kconfig | 73 - > arch/mips/ath79/Makefile | 23 +- > arch/mips/ath79/clock.c | 342 ++- > arch/mips/ath79/common.h | 5 - > arch/mips/ath79/dev-common.c | 159 --- > arch/mips/ath79/dev-common.h | 18 -- > arch/mips/ath79/dev-gpio-buttons.c | 56 > arch/mips/ath79/dev-gpio-buttons.h | 23 -- > arch/mips/ath79/dev-leds-gpio.c | 54 > arch/mips/ath79/dev-leds-gpio.h | 21 -- > arch/mips/ath79/dev-spi.c| 38 --- > arch/mips/ath79/dev-spi.h| 22 -- > arch/mips/ath79/dev-usb.c| 242 > arch/mips/ath79/dev-usb.h| 17 -- > arch/mips/ath79/dev-wmac.c | 155 -- > arch/mips/ath79/dev-wmac.h | 17 -- > arch/mips/ath79/irq.c| 169 --- > arch/mips/ath79/mach-ap121.c | 92 -- > arch/mips/ath79/mach-ap136.c | 156 --- > arch/mips/ath79/mach-ap81.c | 100 --- > arch/mips/ath79/mach-db120.c | 136 - > arch/mips/ath79/mach-pb44.c | 128 - > arch/mips/ath79/mach-ubnt-xm.c | 126 - > arch/mips/ath79/machtypes.h | 28 -- > arch/mips/ath79/pci.c| 273 -- > arch/mips/ath79/pci.h| 35 --- > arch/mips/ath79/setup.c | 78 +- > arch/mips/include/asm/mach-ath79/ath79.h | 4 - > arch/mips/pci/Makefile | 1 + > 
arch/mips/pci/fixup-ath79.c | 21 ++ > include/dt-bindings/clock/ath79-clk.h| 4 +- > 32 files changed, 185 insertions(+), 2432 deletions(-) > delete mode 100644 arch/mips/ath79/dev-common.c > delete mode 100644 arch/mips/ath79/dev-common.h > delete mode 100644 arch/mips/ath79/dev-gpio-buttons.c > delete mode 100644 arch/mips/ath79/dev-gpio-buttons.h > delete mode 100644 arch/mips/ath79/dev-leds-gpio.c > delete mode 100644 arch/mips/ath79/dev-leds-gpio.h > delete mode 100644 arch/mips/ath79/dev-spi.c > delete mode 100644 arch/mips/ath79/dev-spi.h > delete mode 100644 arch/mips/ath79/dev-usb.c > delete mode 100644 arch/mips/ath79/dev-usb.h > delete mode 100644 arch/mips/ath79/dev-wmac.c > delete mode 100644 arch/mips/ath79/dev-wmac.h > delete mode 100644 arch/mips/ath79/irq.c > delete mode 100644 arch/mips/ath79/mach-ap121.c > delete mode 100644 arch/mips/ath79/mach-ap136.c > delete mode 100644 arch/mips/ath79/mach-ap81.c > delete mode 100644 arch/mips/ath79/mach-db120.c > delete mode 100644 arch/mips/ath79/mach-pb44.c > delete mode 100644 arch/mips/ath79/mach-ubnt-xm.c > delete mode 100644 arch/mips/ath79/machtypes.h > delete mode 100644 arch/mips/ath79/pci.c > delete mode 100644 arch/mips/ath79/pci.h > create mode 100644 arch/mips/pci/fixup-ath79.c Series applied to mips-next. Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH net-next v2 0/4] bridge: implement Multicast Router Discovery (RFC4286)
From: Linus Lüssing Date: Mon, 21 Jan 2019 07:26:24 +0100 > This patchset adds initial Multicast Router Discovery support to > the Linux bridge (RFC4286). With MRD it is possible to detect multicast > routers and mark bridge ports and forward multicast packets to such routers > accordingly. > > So far, multicast routers are detected via IGMP/MLD queries and PIM > messages in the Linux bridge. As there is only one active, selected > querier at a time RFC4541 ("Considerations for Internet Group Management > Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping > Switches") section 2.1.1.a) recommends snooping Multicast Router > Advertisements as provided by MRD (RFC4286). > > > The first two patches are refactoring some existing code which is reused > for parsing the Multicast Router Advertisements later in the fourth > patch. The third patch lets the bridge join the all-snoopers multicast > address to be able to reliably receive the Multicast Router > Advertisements. ... Series applied, thanks!
Re: [PATCH v3 10/10] arm64: dts: qcom: sdm845: Add Q6V5 MSS node
On Tue 22 Jan 16:28 PST 2019, Doug Anderson wrote: > Hi, > > On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson > wrote: > > > > From: Sibi Sankar > > > > This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs. > > > > Signed-off-by: Sibi Sankar > > Reviewed-by: Douglas Anderson > > Signed-off-by: Bjorn Andersson > > --- > > > > Changes since v2: > > - Picked up Sibi's patch > > - Fixed reg to work with address/size-cells as 2 > > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 > > 1 file changed, 58 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index 5cc2615461da..78df5f1bce2d 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -1617,6 +1617,64 @@ > > clock-names = "xo"; > > }; > > > > + mss_pil: remoteproc@408 { > > + compatible = "qcom,sdm845-mss-pil"; > > + reg = <0 0x0408 0 0x408>, <0 0x0418 0 0x48>; > > + reg-names = "qdsp6", "rmb"; > > + > > + interrupts-extended = > > + < GIC_SPI 266 IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 0 IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 1 IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 2 IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 3 IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 7 IRQ_TYPE_EDGE_RISING>; > > + interrupt-names = "wdog", "fatal", "ready", > > + "handover", "stop-ack", > > + "shutdown-ack"; > > + > > + clocks = < GCC_MSS_CFG_AHB_CLK>, > > +< GCC_MSS_Q6_MEMNOC_AXI_CLK>, > > +< GCC_BOOT_ROM_AHB_CLK>, > > +< GCC_MSS_GPLL0_DIV_CLK_SRC>, > > +< GCC_MSS_SNOC_AXI_CLK>, > > +< GCC_MSS_MFAB_AXIS_CLK>, > > +< GCC_PRNG_AHB_CLK>, > > +< RPMH_CXO_CLK>; > > + clock-names = "iface", "bus", "mem", "gpll0_mss", > > + "snoc_axi", "mnoc_axi", "prng", "xo"; > > + > > + qcom,smem-states = <_smp2p_out 0>; > > + qcom,smem-state-names = "stop"; > > + > > + resets = <_reset AOSS_CC_MSS_RESTART>, > > +<_reset PDC_MODEM_SYNC_RESET>; > > + reset-names = "mss_restart", "pdc_reset"; > > + > > + qcom,halt-regs = <_mutex_regs 0x23000 0x25000 > 
> 0x24000>; > > + > > + power-domains = <_qmp AOSS_QMP_LS_MODEM>, > > + < SDM845_CX>, > > + < SDM845_MX>, > > + < SDM845_MSS>; > > + power-domain-names = "load_state", "cx", "mx", > > "mss"; > > + > > + mba { > > + memory-region = <_region>; > > + }; > > + > > + mpss { > > + memory-region = <_region>; > > + }; > > + > > + glink-edge { > > + interrupts = > IRQ_TYPE_EDGE_RISING>; > > + label = "modem"; > > + qcom,remote-pid = <1>; > > + mboxes = <_shared 12>; > > + }; > > + }; > > + > > sdhc_2: sdhci@8804000 { > > Can you please sort by unit address now that you have a device tree > that has more stuff? > Of course, sorry for missing that. Regards, Bjorn
Re: [PATCH V2] livepatch: fix non-static warnings
On Tue, Jan 22, 2019 at 11:30:30AM -0500, Joe Lawrence wrote: > On 12/18/18 10:18 AM, Joe Lawrence wrote: > >On 12/18/2018 03:49 AM, Miroslav Benes wrote: > >>On Mon, 17 Dec 2018, Joe Lawrence wrote: > >> > >>>I'm just being picky about its documentation and how we should note its > >>>usage in the v3 patch. Think that s/__noclone/used/g of the v2 commit > >>>message would be sufficient? > >> > >>We could rephrase it. After all it is not only about symbol names in the > >>symbol table. The traceable/patchable code has to be present... > >> > >>"Sparse reported warnings about non-static symbols. For the variables > >>a simple static attribute is fine - for the functions referenced by > >>livepatch via klp_func the symbol-names must be unmodified in the > >>symbol table and the patchable code has to be emitted. > >> > >>Attach __used attribute to the shared statically declared functions." > >> > >>? > > > >That works for me. > > > Hi Nicholas, > > Did you still want to post a v3 for this fix? I think there were only a few > v3 suggestions (link tag, tag order, __used attribute, and commit msg > phrasing.) > yup - will go cleanup and repost. thx! hofrat
Re: [GIT PULL] clk fixes for v5.0-rc3
The pull request you sent on Tue, 22 Jan 2019 14:37:29 -0800: > https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git > tags/clk-fixes-for-linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/0b0d4be6b4880c7199d41afe4d9a3f20f47fd9bb Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes
On Tue 22 Jan 16:40 PST 2019, Doug Anderson wrote: > Hi, > > On Tue, Jan 22, 2019 at 4:26 PM Bjorn Andersson > wrote: > > > > + clocks = <_board>; > > > > + clock-names = "xo"; > > > > > > I've found that nearly all the places that refer to xo_board are wrong > > > and should actually point to '< RPMH_CXO_CLK>'. Maybe yours > > > should too? > > > > > > > Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the > > cxo (or cxo2) pad of the SoC. So you're definitely right in that this > > should be referencing the actual 19.2MHz clock. > > > > We've kept referring to this as xo_board, as we don't handle probe > > deferral when gcc will probe earlier than rpmcc in the boot and for > > other non-clock drivers the fear of actually hitting 0 on the refcounter > > for this (you don't want to disable the cxo while running the system). > > Note that, as defined in the device tree, "xo_board" is actually 38.4. > IIUC that is not actually a fake/bogus clock but represents the actual > crystal on the board. There's a divide by 2 in the CPU though so most > peripherals consider "xo" as 19.2. > There's the 38.4MHz XO connected to the PMIC, but the signal going into the CXO_IN pad of the SoC is supposed to come from LNBBCLK1 and be 19.2MHz. > ...OK, confirmed. The actual RF_XO_CLK pin on the board is truly > connected to 38.4. > And the three RF clocks from the PMIC are all ticking at 38.4MHz. The "xo" I need here is the LNBBCLK1 (RPMH_CXO_CLK in clk-rpmh), for the purpose of preventing the root clock to be turned off if apps goes to suspend while the modem is booting, before it has had a chance to tell RPM(h) that it needs it to be on. Regards, Bjorn
[PATCH] i2c: imx: fix inconsistent IS_ERR and PTR_ERR
Fix inconsistent IS_ERR and PTR_ERR in i2c_imx_dma_request. The proper pointer to be passed as argument is dma->chan_tx. This bug was detected with the help of Coccinelle. Fixes: 5b3a23a3cc94 ("i2c: imx: notify about real errors on dma i2c_imx_dma_request") Signed-off-by: Gustavo A. R. Silva --- drivers/i2c/busses/i2c-imx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 09b124547669..42fed40198a0 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -287,7 +287,7 @@ static int i2c_imx_dma_request(struct imx_i2c_struct *i2c_imx, dma->chan_tx = dma_request_chan(dev, "tx"); if (IS_ERR(dma->chan_tx)) { - ret = PTR_ERR(dma->chan_rx); + ret = PTR_ERR(dma->chan_tx); if (ret != -ENODEV && ret != -EPROBE_DEFER) dev_err(dev, "can't request DMA tx channel (%d)\n", ret); goto fail_al; -- 2.20.1
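For readers unfamiliar with the pattern behind this fix, here is a small userspace sketch. The ERR_PTR(), PTR_ERR() and IS_ERR() helpers below are simplified stand-ins written for this note, not the kernel's definitions, and struct chans and request_channels() are hypothetical names. The point is only that the pointer passed to PTR_ERR() must be the same one that was just tested with IS_ERR(); decoding a different pointer (as the old code did with chan_rx, before it had been assigned) yields an unrelated value instead of the real error.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins: encode a negative errno in a pointer value. */
static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical pair of channels, loosely mirroring chan_tx/chan_rx. */
struct chans {
	void *tx;
	void *rx;
};

static int request_channels(struct chans *c, void *tx, void *rx)
{
	c->tx = tx;
	if (IS_ERR(c->tx))
		return (int)PTR_ERR(c->tx);	/* decode the pointer just tested */

	c->rx = rx;
	if (IS_ERR(c->rx))
		return (int)PTR_ERR(c->rx);

	return 0;
}
```

Keeping the IS_ERR() test and the PTR_ERR() decode on adjacent lines with the same expression makes this class of copy-and-paste slip easy to spot in review.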
Re: [PATCH reset-next 2/2] reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t
On 1/22/19 4:33 PM, Florian Fainelli wrote: > On 32-bit architectures defining resource_size_t as 64-bit (because of > PAE), we can run into a linker failure because of the modulo and the > division against resource_size(), replace the two problematic operations > with an alignment check on the register resource (instead of modulo), > and the division with DIV_ROUND_CLOSEST_ULL(). > > Reported-by: Randy Dunlap > Fixes: c196cdc7659d ("reset: Add Broadcom STB SW_INIT reset controller > driver") > Signed-off-by: Florian Fainelli > --- > drivers/reset/reset-brcmstb.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/reset/reset-brcmstb.c b/drivers/reset/reset-brcmstb.c > index 01ab1f71518b..c4cab8b5052d 100644 > --- a/drivers/reset/reset-brcmstb.c > +++ b/drivers/reset/reset-brcmstb.c > @@ -91,7 +91,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev) > return -ENOMEM; > > res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > - if (resource_size(res) % SW_INIT_BANK_SIZE) { > + if (!IS_ALIGNED(res->start, SW_INIT_BANK_SIZE) || > + !IS_AGLINED(resource_size(res), SW_INIT_BANK_SIZE)) { > dev_err(kdev, "incorrect register range\n"); > return -EINVAL; > } > @@ -103,7 +104,8 @@ static int brcmstb_reset_probe(struct platform_device > *pdev) > dev_set_drvdata(kdev, priv); > > priv->rcdev.owner = THIS_MODULE; > - priv->rcdev.nr_resets = (resource_size(res) / SW_INIT_BANK_SIZE) * 32; > + priv->rcdev.nr_resets = DIV_ROUND_CLOSEST_ULL(resource_size(res), > + SW_INIT_BANK_SIZE) * 32; > priv->rcdev.ops = _reset_ops; > priv->rcdev.of_node = kdev->of_node; > /* Use defaults: 1 cell and simple xlate function */ > Hi Florian, This gives me: CC drivers/reset/reset-brcmstb.o ../drivers/reset/reset-brcmstb.c: In function ‘brcmstb_reset_probe’: ../drivers/reset/reset-brcmstb.c:95:6: error: implicit declaration of function ‘IS_AGLINED’ [-Werror=implicit-function-declaration] !IS_AGLINED(resource_size(res), SW_INIT_BANK_SIZE)) { ^ but if the typo 
is fixed, it is fine :) Then you can add: Acked-by: Randy Dunlap Thanks. -- ~Randy
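Some background on the two replacements, as a hedged userspace sketch. On 32-bit builds a plain '/' or '%' on a 64-bit value is typically compiled into a call to a libgcc helper that the kernel does not link, hence the linker failure described in the patch. The functions below only mirror the arithmetic of a mask-based alignment test and a round-to-nearest division; the names is_aligned_pow2() and div_round_closest_u64() are made up for this note, and the numbers in the usage note are arbitrary, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mask-based alignment test. This is only equivalent to
 * (x % align) == 0 when align is a power of two.
 */
static bool is_aligned_pow2(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * Unsigned division rounded to the nearest integer: add half the
 * divisor before dividing.
 */
static uint64_t div_round_closest_u64(uint64_t x, uint32_t divisor)
{
	return (x + divisor / 2) / divisor;
}
```

For example, is_aligned_pow2(0x1000, 0x100) is true and is_aligned_pow2(0x1010, 0x100) is false, matching the modulo result. With a size that is not a power of two the mask test diverges from the modulo: is_aligned_pow2(48, 24) is false even though 48 % 24 == 0, so the mask form is worth double-checking whenever the alignment constant is not a power of two. div_round_closest_u64(100, 24) gives 4.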
Re: [PATCH 01/15] habanalabs: add skeleton driver
On Wed, 2019-01-23 at 02:00 +0200, Oded Gabbay wrote: > This patch adds the habanalabs skeleton driver. The driver does nothing at > this stage except very basic operations. It contains the minimal code to > insmod and rmmod the driver and to create a /dev/hlX file per PCI device. trivial notes: > > diff --git a/drivers/misc/habanalabs/Makefile > b/drivers/misc/habanalabs/Makefile [] > \ No newline at end of file You should fix these. There are at least a couple of them. > diff --git a/drivers/misc/habanalabs/device.c > b/drivers/misc/habanalabs/device.c [] > @@ -0,0 +1,331 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* > + * Copyright 2016-2018 HabanaLabs, Ltd. > + * All Rights Reserved. > + */ Add #define pr_fmt(fmt) "habanalabs: " fmt > + > +#include "habanalabs.h" or add it in this file > +static int device_setup_cdev(struct hl_device *hdev, struct class *hclass, > + int minor, const struct file_operations *fops) > +{ > + int err, devno = MKDEV(hdev->major, minor); > + struct cdev *hdev_cdev = &hdev->cdev; > + char name[8]; > + > + sprintf(name, "hl%d", hdev->id); Might overflow name one day > + > + cdev_init(hdev_cdev, fops); > + hdev_cdev->owner = THIS_MODULE; > + err = cdev_add(hdev_cdev, devno, 1); > + if (err) { > + pr_err("habanalabs: Failed to add char device %s", name); So #define pr_fmt can auto prefix these and this would be pr_err("Failed to add char device %s\n", name); missing terminating '\n' btw > + goto err_cdev_add; > + } > + > + hdev->dev = device_create(hclass, NULL, devno, NULL, "%s", name); > + if (IS_ERR(hdev->dev)) { > + pr_err("habanalabs: Failed to create device %s\n", name); And this would be: pr_err("Failed to create device %s\n", name); etc... 
> +static int device_early_init(struct hl_device *hdev) > +{ > + switch (hdev->asic_type) { > + case ASIC_GOYA: > + sprintf(hdev->asic_name, "GOYA"); strcpy or perhaps better still as strlcpy > +int hl_device_init(struct hl_device *hdev, struct class *hclass) > +{ [] > + dev_notice(hdev->dev, > + "Successfully added device to habanalabs driver\n"); This is mostly aligned to open parenthesis, but perhaps it could check with scripts/checkpatch.pl --strict and see if you agree with anything it bleats. > +int hl_poll_timeout_memory(struct hl_device *hdev, u64 addr, > + u32 timeout_us, u32 *val) > +{ > + /* > + * pReturnVal is defined as volatile because it points to HOST memory, > + * which is being written to by the device. Therefore, we can't use > + * locks to synchronize it and it is not a memory-mapped register space > + */ > + volatile u32 *pReturnVal = (volatile u32 *) addr; It'd be nice to avoid hungarian and camelcase > + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); > + > + might_sleep(); > + > + for (;;) { > + *val = *pReturnVal; > + if (*val) > + break; > + if (ktime_compare(ktime_get(), timeout) > 0) { > + *val = *pReturnVal; > + break; > + } > + usleep_range((100 >> 2) + 1, 100); > + } > + > + return (*val ? 0 : -ETIMEDOUT); Unnecessary parentheses > diff --git a/drivers/misc/habanalabs/habanalabs_drv.c > b/drivers/misc/habanalabs/habanalabs_drv.c [] > +static struct pci_device_id ids[] = { > + { PCI_DEVICE(PCI_VENDOR_ID_HABANALABS, PCI_IDS_GOYA), }, > + { 0, } > +}; static const? > diff --git a/drivers/misc/habanalabs/include/habanalabs_device_if.h > b/drivers/misc/habanalabs/include/habanalabs_device_if.h [] > +struct hl_bd { > + __u64 ptr; > + __u32 len; > + union { > + struct { > + __u32 repeat:16; > + __u32 res1:8; > + __u32 repeat_valid:1; > + __u32 res2:7; > + }; > + __u32 ctl; > + }; > +}; Maybe use the appropriate bit-endian __le instead of __u with whatever cpu_to_le / le_to_cpu bits are necessary.
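To illustrate the "might overflow name one day" remark about sprintf() into an 8-byte buffer, here is a minimal userspace sketch of a bounded alternative. format_dev_name() is a hypothetical helper written for this note, not something from the driver; it relies only on the standard snprintf() behaviour of never writing more than the given size and returning the length the full string would have needed.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "hl<id>" into buf without ever writing past len bytes.
 * Returns 0 on success, -1 if the name did not fit (buf then holds
 * a truncated, still NUL-terminated string).
 */
static int format_dev_name(char *buf, size_t len, int id)
{
	int n = snprintf(buf, len, "hl%d", id);

	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

With char name[8], an id of 7 fits ("hl7"), while an id of 1234567 would need 10 bytes including the terminator, so the helper reports failure instead of overrunning the buffer as an unbounded sprintf() would.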
Re: [PATCH] KVM: VMX: Fix vm entry failure caused by invalid vmexit controls
On Tue, Jan 22, 2019 at 09:00:45AM -0800, Sean Christopherson wrote: > On Tue, Jan 22, 2019 at 11:29:52PM +0800, Changbin Du wrote: > > The commit c73da3f ("KVM: VMX: Properly handle dynamic VM Entry/Exit > > controls") has a typo that cause invalid vmexit controls. The > > VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL is against _vmentry_control. > > > > KVM: entry failed, hardware error 0x7 > > EAX= EBX= ECX= EDX=000206c2 > > ESI= EDI= EBP= ESP= > > EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0 > > ES = 9300 > > CS =f000 9b00 > > SS = 9300 > > DS = 9300 > > FS = 9300 > > GS = 9300 > > LDT= 8200 > > TR = 8b00 > > GDT= > > IDT= > > CR0=6010 CR2= CR3= CR4= > > DR0= DR1= DR2= > > DR3= DR6=0ff0 DR7=0400 > > EFER= > > > > Fixes: c73da3f ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls") > > Signed-off-by: Changbin Du > > Patch already submitted[1]. > > Paolo/Radim, the VM-Exit fix needs to be queued asap. The fix for the > objtool warning[2] should also go into v5.0. > echo. This bug breaks kvm on some old machines! > [1] https://patchwork.kernel.org/patch/10763351/ > [2] https://patchwork.kernel.org/patch/10765309/ > > > > > --- > > arch/x86/kvm/vmx/vmx.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index f6915f10e584..0762fcab8fc9 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -2344,7 +2344,7 @@ static __init int setup_vmcs_config(struct > > vmcs_config *vmcs_conf, > > case 37: /* AAT100 */ > > case 44: /* BC86,AAY89,BD102 */ > > case 46: /* BA97 */ > > - _vmexit_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; > > + _vmentry_control &= > > ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; > > _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL; > > pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL " > > "does not work properly. Using > > workaround\n"); > > -- > > 2.19.1 > > -- Cheers, Changbin Du
Re: [PATCH v3 02/10] arm64: dts: qcom: sdm845: Define rmtfs memory
On Tue 22 Jan 15:26 PST 2019, Doug Anderson wrote: > Hi, > > On Mon, Jan 21, 2019 at 9:51 PM Bjorn Andersson > wrote: > > > > Define the rmtfs memory node, as described in version 10 of the memory > > map. > > > > Signed-off-by: Bjorn Andersson > > --- > > > > Changes since v2: > > - New patch > > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 + > > 1 file changed, 9 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index cdcac3704c13..64f57cc5c61a 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -72,6 +72,15 @@ > > #size-cells = <2>; > > ranges; > > > > + rmtfs@85d0 { > > + compatible = "qcom,rmtfs-mem"; > > + reg = <0 0x85d0 0 0x20>; > > + no-map; > > + > > + qcom,client-id = <1>; > > + qcom,vmid = <15>; > > + }; > > Ah, I saw this after I posted my comments to patch #1. I guess this > is the same as this node we have in our cheza board file downstream > (need to get that posted upstream soon): > > rmtfs@88f0 { > compatible = "qcom,rmtfs-mem"; > reg = <0x0 0x88f0 0x0 0x80>; > no-map; > > qcom,client-id = <1>; > }; > > That brings up a few things: > > 1. You should add a node label here. This allows us to act on the > node more easily from board files, like setting it to disabled or > changing it. > I'll make sure to label it. > 2. In https://crrev.com/c/1119572, the argument was made that the size > of this carveout is board-specific. That makes it hard to put it in > sdm845.dts. > I don't think I've seen a modern platform where this isn't 2MB, so I think it's safe to add it to the platform. But a label sounds good, if someone out there has a custom modem firmware with some odd changes in this area. I'll label it, to make it possible to move, resize or reclaim in boards. Regards, Bjorn
Re: [PATCH 1/2] thermal/int340x_thermal: Add additional UUIDs
Hello Rui, On Tue, 4 Dec 2018 at 02:12, Zhang Rui wrote: > On 三, 2018-10-10 at 01:30 -0700, Matthew Garrett wrote: > > Platforms support more DPTF policies than the driver currently > > exposes. > > Add them. This effectively reverts > > 31908f45a583e8f21db37f402b6e8d5739945afd which removed several UUIDs > > without explaining why. > > > I'm going to apply this patch series, just with two minor changes, > 1. 31908f45a583e8f21db37f402b6e8d5739945afd does not follow the git > commit description style 'commit <12+ chars of sha1> ("")' > 2. the UUIDs were removed previously because these policies were not > used. Which tree did this series get applied to? Cheers, Joel > > thanks, > rui > > > Signed-off-by: Matthew Garrett > > Cc: Zhang Rui > > Cc: Nisha Aram > > --- > > drivers/thermal/int340x_thermal/int3400_thermal.c | 14 > > ++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/thermal/int340x_thermal/int3400_thermal.c > > b/drivers/thermal/int340x_thermal/int3400_thermal.c > > index e26b01c05e82..51c9097eaf7a 100644 > > --- a/drivers/thermal/int340x_thermal/int3400_thermal.c > > +++ b/drivers/thermal/int340x_thermal/int3400_thermal.c > > @@ -22,6 +22,13 @@ enum int3400_thermal_uuid { > > INT3400_THERMAL_PASSIVE_1, > > INT3400_THERMAL_ACTIVE, > > INT3400_THERMAL_CRITICAL, > > + INT3400_THERMAL_ADAPTIVE_PERFORMANCE, > > + INT3400_THERMAL_EMERGENCY_CALL_MODE, > > + INT3400_THERMAL_PASSIVE_2, > > + INT3400_THERMAL_POWER_BOSS, > > + INT3400_THERMAL_VIRTUAL_SENSOR, > > + INT3400_THERMAL_COOLING_MODE, > > + INT3400_THERMAL_HARDWARE_DUTY_CYCLING, > > INT3400_THERMAL_MAXIMUM_UUID, > > }; > > > > @@ -29,6 +36,13 @@ static char > > *int3400_thermal_uuids[INT3400_THERMAL_MAXIMUM_UUID] = { > > "42A441D6-AE6A-462b-A84B-4A8CE79027D3", > > "3A95C389-E4B8-4629-A526-C52C88626BAE", > > "97C68AE7-15FA-499c-B8C9-5DA81D606E0A", > > + "63BE270F-1C11-48FD-A6F7-3AF253FF3E2D", > > + "5349962F-71E6-431D-9AE8-0A635B710AEE", > > + "9E04115A-AE87-4D1C-9500-0F3E340BFE75", > > + 
"F5A35014-C209-46A4-993A-EB56DE7530A1", > > + "6ED722A7-9240-48A5-B479-31EEF723D7CF", > > + "16CAF1B7-DD38-40ED-B1C1-1B8A1913D531", > > + "BE84BABF-C4D4-403D-B495-3128FD44dAC1", > > }; > > > > struct int3400_thermal_priv {
Re: [PATCH] workqueue: Try to catch flush_work() without INIT_WORK().
Daniel Jordan wrote:
> On Sat, Jan 19, 2019 at 11:41:22AM +0900, Tetsuo Handa wrote:
> > On 2019/01/19 4:48, Daniel Jordan wrote:
> > > On Sat, Jan 19, 2019 at 02:04:58AM +0900, Tetsuo Handa wrote:
> > > __queue_work has a sanity check already for work, but using list_empty.
> > > Seems slightly better to be consistent?
> > >
> >
> > list_empty() won't work, for "struct work_struct" is embedded into a struct
> > which is allocated by kzalloc().
>
> Please check list_empty's definition again, it compares the address of the node
> to its next pointer, so it should work for a zeroed node. I'll reiterate that
> it seems slightly better to be consistent in "is work_struct initialized?"
> checks, but it's not a big deal and I'm fine either way.

You are talking about

	if (WARN_ON(!list_empty(>entry))) {
		spin_unlock(>pool->lock);
		return;
	}

part in __queue_work(), aren't you? But since flush_work() is used for waiting
for a work to complete, that work can be in either the queued state
(list_empty() == false) or the not queued state (list_empty() == true). Thus,
I don't think that flush_work() can use list_empty() for checking whether
that work was initialized.

[PATCH v2] workqueue: Try to catch flush_work() without INIT_WORK().

syzbot found a flush_work() caller who forgot to call INIT_WORK()
because that work_struct was allocated by kzalloc() [1]. But the message

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.

by lock_map_acquire() is failing to tell that INIT_WORK() is missing.

Since flush_work() without INIT_WORK() is a bug, and INIT_WORK() should
set ->func field to non-zero, let's warn if ->func field is zero.

[1] https://syzkaller.appspot.com/bug?id=a5954455fcfa51c29ca2ab55b203076337e1c770

Signed-off-by: Tetsuo Handa
---
 kernel/workqueue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 392be4b..a503ad9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2908,6 +2908,9 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	if (WARN_ON(!wq_online))
 		return false;

+	if (WARN_ON(!work->func))
+		return false;
+
 	if (!from_cancel) {
 		lock_map_acquire(>lockdep_map);
 		lock_map_release(>lockdep_map);
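The two candidate checks discussed in this thread can be compared in a small user-space sketch. The types `list_node` and `work_item` and all helper names below are hypothetical stand-ins, not the kernel's definitions. The sketch assumes only what the thread states: that list_empty() compares a node's next pointer with the node's own address, and that initialization leaves a non-NULL callback behind.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for a work item: a callback plus a list entry. */
struct work_item {
	void (*func)(struct work_item *);
	struct list_node entry;
};

/* A node counts as "empty" when its next pointer points back at itself. */
static bool node_is_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Stand-in for initialization: self-linked entry, non-NULL callback. */
static void work_item_init(struct work_item *w, void (*fn)(struct work_item *))
{
	w->entry.next = &w->entry;
	w->entry.prev = &w->entry;
	w->func = fn;
}

/* Stand-in for queueing: link the entry right after a queue head. */
static void work_item_queue(struct work_item *w, struct list_node *head)
{
	w->entry.next = head->next;
	w->entry.prev = head;
	head->next->prev = &w->entry;
	head->next = &w->entry;
}

/* The style of check the patch adds: no callback means never initialized. */
static bool work_item_looks_initialized(const struct work_item *w)
{
	return w->func != NULL;
}

/* Trivial callback used only to have something non-NULL to install. */
static void noop(struct work_item *w)
{
	(void)w;
}
```

With these stand-ins, a zero-filled item and an initialized item that has been queued both report a non-empty entry, so the emptiness test cannot separate "queued" from "never initialized" at flush time, while the callback test can.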
Re: [PATCH v3 01/10] arm64: dts: qcom: sdm845: Update PIL region memory map
On Tue 22 Jan 15:16 PST 2019, Doug Anderson wrote: > Hi, > > On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson > wrote: > > > > Update existing and add all missing PIL regions to the reserved memory > > map, as described in version 10. > > > > Signed-off-by: Bjorn Andersson > > --- > > > > Changes since v2: > > - New patch > > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 61 ++-- > > 1 file changed, 58 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index 0ec827394e92..cdcac3704c13 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -89,12 +89,47 @@ > > }; > > > > memory@8620 { > > - reg = <0 0x8620 0 0x2d0>; > > + reg = <0 0x8620 0 0x10>; > > no-map; > > }; > > > > - wlan_msa_mem: memory@9670 { > > - reg = <0 0x9670 0 0x10>; > > + memory@8630 { > > + reg = <0 0x8630 0 0x480>; > > + no-map; > > + }; > > I know it's not a problem upstream (yet), but downstream this collides > with a memory region in the cheza board. We have: > > rmtfs@88f0 { > compatible = "qcom,rmtfs-mem"; > reg = <0x0 0x88f0 0x0 0x80>; > no-map; > > qcom,client-id = <1>; > }; > > ...and the above region overlays it since it goes till 0x8ab0 > Digging through the table again I see that there's another level here, so it seems only the first 44MB of these 78MB are reserved for non-APSS things. So this should actually be 0x2c0 long. I will update this and we'll have one conflict less. 
> > > + > > + memory@8ab0 { > > + reg = <0 0x8ab0 0 0x140>; > > + no-map; > > + }; > > + > > + memory@8bf0 { > > + reg = <0 0x8bf0 0 0x50>; > > + no-map; > > + }; > > + > > + ipa_fw_mem: memory@8c40 { > > + reg = <0 0x8c40 0 0x1>; > > + no-map; > > + }; > > + > > + ipa_gsi_mem: memory@8c41 { > > + reg = <0 0x8c41 0 0x5000>; > > + no-map; > > + }; > > + > > + memory@8c415000 { > > + reg = <0 0x8c415000 0 0x2000>; > > + no-map; > > + }; > > + > > + adsp_mem: memory@8c50 { > > + reg = <0 0x8c50 0 0x1a0>; > > + no-map; > > + }; > > + > > + wlan_msa_mem: memory@8df0 { > > Your patch moves 'wlan_msa_mem' from 0x9670 to 0x8df0. Is > that OK? I haven't been involved in all of the previous discussions > but if everything is all OK w/ the device tree just moving this chunk > around (without any other coordination w/ firmware) it seems really > weird that we even need to specify it in the device tree. ...but > maybe I shouldn't open this can of worms. You can pretend I didn't > say anything. > 0x9670 seems to be reserved for the sensor core, so either WiFi wasn't actually tested before, or more likely its firmware is position independent. Most (all?) firmware is position independent, but the security configuration prevents us from relocating it. One such example is that the ADSP in the newer firmware versions are not allowed to execute from the old memory region. Regards, Bjorn
Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes
Hi,

On Tue, Jan 22, 2019 at 4:26 PM Bjorn Andersson wrote:
> > > + clocks = <_board>;
> > > + clock-names = "xo";
> >
> > I've found that nearly all the places that refer to xo_board are wrong
> > and should actually point to '< RPMH_CXO_CLK>'. Maybe yours
> > should too?
> >
>
> Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the
> cxo (or cxo2) pad of the SoC. So you're definitely right in that this
> should be referencing the actual 19.2MHz clock.
>
> We've kept referring to this as xo_board, as we don't handle probe
> deferral when gcc will probe earlier than rpmcc in the boot and for
> other non-clock drivers the fear of actually hitting 0 on the refcounter
> for this (you don't want to disable the cxo while running the system).

Note that, as defined in the device tree, "xo_board" is actually 38.4 MHz.
IIUC that is not actually a fake/bogus clock but represents the actual
crystal on the board. There's a divide by 2 in the CPU though, so most
peripherals consider "xo" as 19.2 MHz.

...OK, confirmed. The actual RF_XO_CLK pin on the board is truly
connected to 38.4 MHz.

-Doug
[PATCH reset-next 1/2] reset: brcmstb: Make it tristate
The driver can be built as a module just fine, so let's make it
selectable as such.

Reported-by: Paul Gortmaker
Signed-off-by: Florian Fainelli
---
 drivers/reset/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/reset/Kconfig b/drivers/reset/Kconfig
index 1ca03c57e049..d9a02b7f90cf 100644
--- a/drivers/reset/Kconfig
+++ b/drivers/reset/Kconfig
@@ -41,7 +41,8 @@ config RESET_BERLIN
 	  This enables the reset controller driver for Marvell Berlin SoCs.

 config RESET_BRCMSTB
-	bool "Broadcom STB reset controller" if COMPILE_TEST
+	tristate "Broadcom STB reset controller"
+	depends on ARCH_BRCMSTB || COMPILE_TEST
 	default ARCH_BRCMSTB
 	help
 	  This enables the reset controller driver for Broadcom STB SoCs using
--
2.17.1
[PATCH reset-next 2/2] reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t
On 32-bit architectures defining resource_size_t as 64-bit (because of
PAE), we can run into a linker failure because of the modulo and the
division against resource_size(). Replace the two problematic
operations: use an alignment check on the register resource instead of
the modulo, and DIV_ROUND_CLOSEST_ULL() instead of the division.

Reported-by: Randy Dunlap
Fixes: c196cdc7659d ("reset: Add Broadcom STB SW_INIT reset controller driver")
Signed-off-by: Florian Fainelli
---
 drivers/reset/reset-brcmstb.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/reset/reset-brcmstb.c b/drivers/reset/reset-brcmstb.c
index 01ab1f71518b..c4cab8b5052d 100644
--- a/drivers/reset/reset-brcmstb.c
+++ b/drivers/reset/reset-brcmstb.c
@@ -91,7 +91,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev)
 		return -ENOMEM;

 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (resource_size(res) % SW_INIT_BANK_SIZE) {
+	if (!IS_ALIGNED(res->start, SW_INIT_BANK_SIZE) ||
+	    !IS_ALIGNED(resource_size(res), SW_INIT_BANK_SIZE)) {
 		dev_err(kdev, "incorrect register range\n");
 		return -EINVAL;
 	}
@@ -103,7 +104,8 @@ static int brcmstb_reset_probe(struct platform_device *pdev)
 	dev_set_drvdata(kdev, priv);

 	priv->rcdev.owner = THIS_MODULE;
-	priv->rcdev.nr_resets = (resource_size(res) / SW_INIT_BANK_SIZE) * 32;
+	priv->rcdev.nr_resets = DIV_ROUND_CLOSEST_ULL(resource_size(res),
+						      SW_INIT_BANK_SIZE) * 32;
 	priv->rcdev.ops = _reset_ops;
 	priv->rcdev.of_node = kdev->of_node;
 	/* Use defaults: 1 cell and simple xlate function */
--
2.17.1
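A user-space sketch of the two replacements may help when reviewing this change. The helper names are hypothetical. `is_aligned_mask()` assumes the alignment macro is the usual mask-based test, which agrees with a modulo only when the alignment is a power of two, and `div_round_closest_u64()` rounds to the nearest integer where the original `/` truncated. The sizes used with them are made-up values, not the driver's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask-based alignment test; only meaningful when 'a' is a power of two. */
static bool is_aligned_mask(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/* Reference test using a 64-bit modulo, valid for any non-zero 'a'. */
static bool is_multiple_of(uint64_t x, uint64_t a)
{
	return x % a == 0;
}

/* Unsigned division rounded to the closest integer instead of truncated. */
static uint64_t div_round_closest_u64(uint64_t x, uint64_t d)
{
	return (x + d / 2) / d;
}
```

For a bank size that is not a power of two the two tests can disagree: 0x30 is a multiple of 0x18 but fails the mask test. It is therefore worth confirming that SW_INIT_BANK_SIZE is a power of two before relying on the mask form, and that rounding to closest (rather than down) is acceptable for nr_resets.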
[PATCH reset-next 0/2] reset: brcmstb: Misc fixes
Hi Philipp,

These two patches fix some recent issues brought up by Paul and Randy.
Feel free to squash them into c196cdc7659d ("reset: Add Broadcom STB
SW_INIT reset controller driver"), since this is only in reset/next and
linux-next so far.

Thank you!

Florian Fainelli (2):
  reset: brcmstb: Make it tristate
  reset: brcmstb: Fix 32-bit build with 64-bit resource_size_t

 drivers/reset/Kconfig         | 3 ++-
 drivers/reset/reset-brcmstb.c | 6 ++++--
 2 files changed, 6 insertions(+), 3 deletions(-)

--
2.17.1
Re: [PATCH] block: aoe: no need to check return value of debugfs_create functions
On Tue, Jan 22, 2019 at 6:29 PM Omar Sandoval wrote:
...
> Now entry is uninitialized here when we assign it to d->debugfs.

Thanks for noticing that.
Re: [PATCH v3 01/10] arm64: dts: qcom: sdm845: Update PIL region memory map
On Tue 22 Jan 15:10 PST 2019, Doug Anderson wrote: > Hi, > > On Tue, Jan 22, 2019 at 11:24 AM Bjorn Andersson > wrote: > > > > On Tue 22 Jan 10:58 PST 2019, Stephen Boyd wrote: > > > > > Quoting Bjorn Andersson (2019-01-21 21:51:03) > > > > @@ -103,10 +138,30 @@ > > > > no-map; > > > > }; > > > > > > > > + venus_mem: memory@9580 { > > > > + reg = <0 0x9580 0 0x50>; > > > > + no-map; > > > > + }; > > > > + > > > > + cdsp_mem: memory@95d0 { > > > > + reg = <0 0x95d0 0 0x80>; > > > > + no-map; > > > > + }; > > > > + > > > > mba_region: memory@9650 { > > > > reg = <0 0x9650 0 0x20>; > > > > no-map; > > > > }; > > > > + > > > > + slpi_mem: memory@9670 { > > > > + reg = <0 0x9670 0 0x140>; > > > > + no-map; > > > > + }; > > > > + > > > > + spss_mem: memory@97b0 { > > > > + reg = <0 0x97b0 0 0x10>; > > > > + no-map; > > > > + }; > > > > }; > > > > > > > > > > What's the plan if certain configurations don't use all these carveouts? > > > Can we mark the reservation nodes as status = "disabled", or the reverse > > > and mark them as status = "ok" in all boards, and then reclaim the > > > memory for peripherals we don't care to use? > > > > > > > The code path that picks these up does look for "status", so I suggest > > that we leave them all enabled in the platform dtsi and then let the > > device's reclaim them as needed. > > Does that mean we should add labels for all of the sub-nodes so that > boards can easily mark them "disabled"? > That sounds reasonable, I'll dig up some labels for the unlabeled nodes as well. Thanks, Bjorn
Re: [PATCH v3 10/10] arm64: dts: qcom: sdm845: Add Q6V5 MSS node
Hi, On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson wrote: > > From: Sibi Sankar > > This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs. > > Signed-off-by: Sibi Sankar > Reviewed-by: Douglas Anderson > Signed-off-by: Bjorn Andersson > --- > > Changes since v2: > - Picked up Sibi's patch > - Fixed reg to work with address/size-cells as 2 > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 > 1 file changed, 58 insertions(+) > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi > b/arch/arm64/boot/dts/qcom/sdm845.dtsi > index 5cc2615461da..78df5f1bce2d 100644 > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > @@ -1617,6 +1617,64 @@ > clock-names = "xo"; > }; > > + mss_pil: remoteproc@408 { > + compatible = "qcom,sdm845-mss-pil"; > + reg = <0 0x0408 0 0x408>, <0 0x0418 0 0x48>; > + reg-names = "qdsp6", "rmb"; > + > + interrupts-extended = > + < GIC_SPI 266 IRQ_TYPE_EDGE_RISING>, > + <_smp2p_in 0 IRQ_TYPE_EDGE_RISING>, > + <_smp2p_in 1 IRQ_TYPE_EDGE_RISING>, > + <_smp2p_in 2 IRQ_TYPE_EDGE_RISING>, > + <_smp2p_in 3 IRQ_TYPE_EDGE_RISING>, > + <_smp2p_in 7 IRQ_TYPE_EDGE_RISING>; > + interrupt-names = "wdog", "fatal", "ready", > + "handover", "stop-ack", > + "shutdown-ack"; > + > + clocks = < GCC_MSS_CFG_AHB_CLK>, > +< GCC_MSS_Q6_MEMNOC_AXI_CLK>, > +< GCC_BOOT_ROM_AHB_CLK>, > +< GCC_MSS_GPLL0_DIV_CLK_SRC>, > +< GCC_MSS_SNOC_AXI_CLK>, > +< GCC_MSS_MFAB_AXIS_CLK>, > +< GCC_PRNG_AHB_CLK>, > +< RPMH_CXO_CLK>; > + clock-names = "iface", "bus", "mem", "gpll0_mss", > + "snoc_axi", "mnoc_axi", "prng", "xo"; > + > + qcom,smem-states = <_smp2p_out 0>; > + qcom,smem-state-names = "stop"; > + > + resets = <_reset AOSS_CC_MSS_RESTART>, > +<_reset PDC_MODEM_SYNC_RESET>; > + reset-names = "mss_restart", "pdc_reset"; > + > + qcom,halt-regs = <_mutex_regs 0x23000 0x25000 > 0x24000>; > + > + power-domains = <_qmp AOSS_QMP_LS_MODEM>, > + < SDM845_CX>, > + < SDM845_MX>, > + < SDM845_MSS>; > + power-domain-names = "load_state", "cx", "mx", "mss"; > + > + 
mba { > + memory-region = <_region>; > + }; > + > + mpss { > + memory-region = <_region>; > + }; > + > + glink-edge { > + interrupts = IRQ_TYPE_EDGE_RISING>; > + label = "modem"; > + qcom,remote-pid = <1>; > + mboxes = <_shared 12>; > + }; > + }; > + > sdhc_2: sdhci@8804000 { Can you please sort by unit address now that you have a device tree that has more stuff? -Doug
[PATCH] drm/vmwgfx: Replace PTR_RET with PTR_ERR_OR_ZERO
PTR_RET is deprecated and will be removed soon. Use PTR_ERR_OR_ZERO
instead.

Notice that these are the last instances of PTR_RET in the whole
codebase.

Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index f2d13a72c05d..137cb1a4d6b0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2521,7 +2521,7 @@ static int vmw_cmd_dx_clear_rendertarget_view(struct vmw_private *dev_priv,
 		SVGA3dCmdDXClearRenderTargetView body;
 	} *cmd = container_of(header, typeof(*cmd), header);

-	return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_rt,
+	return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_rt,
 					   cmd->body.renderTargetViewId));
 }

@@ -2542,7 +2542,7 @@ static int vmw_cmd_dx_clear_depthstencil_view(struct vmw_private *dev_priv,
 		SVGA3dCmdDXClearDepthStencilView body;
 	} *cmd = container_of(header, typeof(*cmd), header);

-	return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_ds,
+	return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_ds,
 					   cmd->body.depthStencilViewId));
 }

@@ -2916,7 +2916,7 @@ static int vmw_cmd_dx_genmips(struct vmw_private *dev_priv,
 		SVGA3dCmdDXGenMips body;
 	} *cmd = container_of(header, typeof(*cmd), header);

-	return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_sr,
+	return PTR_ERR_OR_ZERO(vmw_view_id_val_add(sw_context, vmw_view_sr,
 					   cmd->body.shaderResourceViewId));
 }

--
2.20.1
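For readers unfamiliar with the helper, the behaviour relied on here can be sketched in user-space C. The functions below are hypothetical re-implementations written for illustration, assuming the common convention that the top 4095 values of the address space encode negative errno codes. They are not copied from the kernel headers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the address range reserved for encoded errno values. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True when the pointer falls in the range reserved for error codes. */
static bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Decode the errno value carried by an error pointer. */
static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Return the encoded error for an error pointer, 0 for anything else. */
static int sketch_ptr_err_or_zero(const void *ptr)
{
	return sketch_is_err(ptr) ? (int)sketch_ptr_err(ptr) : 0;
}
```

In this sketch a valid pointer and NULL both map to 0, while an encoded error maps to its negative errno, which is the contract the three call sites in the patch depend on when they return the result directly as an int.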
Re: [PATCH 5/8 v4] dma: k3dma: Add support for dma-channel-mask
On Thu, Jan 17, 2019 at 9:14 AM Manivannan Sadhasivam wrote:
>
> /* Skip the channels which are masked */
> if ((d->dma_channel_mask) & BIT(pch))
>         continue;

Per the discussion w/ Vinod and Rob, I think I'll leave this bit be, so
we use the channels in the bitmask.

> PS: Use BIT() macro where applicable.

But this suggestion I've integrated for v5.

Thanks so much for the review!
-john
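The semantics John settles on, where a set bit marks a channel the driver may use, can be sketched as follows. `SKETCH_BIT()`, `channel_usable()` and `count_usable_channels()` are hypothetical names chosen for illustration, and the loop is not taken from the k3dma driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single-bit mask for bit position n (n must be below 32). */
#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* A channel is usable when its bit is set in the mask. */
static bool channel_usable(uint32_t mask, unsigned int ch)
{
	return (mask & SKETCH_BIT(ch)) != 0;
}

/* Count usable channels among the first nr_channels (at most 32). */
static unsigned int count_usable_channels(uint32_t mask, unsigned int nr_channels)
{
	unsigned int ch, usable = 0;

	for (ch = 0; ch < nr_channels && ch < 32; ch++) {
		/* Skip the channels that are not present in the mask. */
		if (!channel_usable(mask, ch))
			continue;
		usable++;
	}
	return usable;
}
```

Under this reading a mask of 0x0b on a four-channel controller leaves channels 0, 1 and 3 available and skips channel 2. The quoted suggestion would invert the test and skip the channels whose bits are set.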
Re: [PATCH 7/8] infiniband: usnic: no need to check return value of debugfs_create functions
usnic driver was tested with this change. Acked-by: Parvi Kaustubhi > On Jan 22, 2019, at 7:17 AM, Greg Kroah-Hartman > wrote: > > When calling debugfs functions, there is no need to ever check the > return value. The function can work or not, but the code logic should > never do something different based on this. > > Cc: Christian Benvenuti > Cc: Nelson Escobar > Cc: Parvi Kaustubhi > Cc: Doug Ledford > Cc: Jason Gunthorpe > Cc: linux-r...@vger.kernel.org > Signed-off-by: Greg Kroah-Hartman > --- > drivers/infiniband/hw/usnic/usnic_debugfs.c | 26 - > 1 file changed, 26 deletions(-) > > diff --git a/drivers/infiniband/hw/usnic/usnic_debugfs.c > b/drivers/infiniband/hw/usnic/usnic_debugfs.c > index a3115709fb03..e5a3f02fb078 100644 > --- a/drivers/infiniband/hw/usnic/usnic_debugfs.c > +++ b/drivers/infiniband/hw/usnic/usnic_debugfs.c > @@ -113,42 +113,21 @@ static const struct file_operations flowinfo_ops = { > void usnic_debugfs_init(void) > { > debugfs_root = debugfs_create_dir(DRV_NAME, NULL); > - if (IS_ERR(debugfs_root)) { > - usnic_err("Failed to create debugfs root dir, check if debugfs > is enabled in kernel configuration\n"); > - goto out_clear_root; > - } > > flows_dentry = debugfs_create_dir("flows", debugfs_root); > - if (IS_ERR_OR_NULL(flows_dentry)) { > - usnic_err("Failed to create debugfs flow dir with err %ld\n", > - PTR_ERR(flows_dentry)); > - goto out_free_root; > - } > > debugfs_create_file("build-info", S_IRUGO, debugfs_root, > NULL, _debugfs_buildinfo_ops); > - return; > - > -out_free_root: > - debugfs_remove_recursive(debugfs_root); > -out_clear_root: > - debugfs_root = NULL; > } > > void usnic_debugfs_exit(void) > { > - if (!debugfs_root) > - return; > - > debugfs_remove_recursive(debugfs_root); > debugfs_root = NULL; > } > > void usnic_debugfs_flow_add(struct usnic_ib_qp_grp_flow *qp_flow) > { > - if (IS_ERR_OR_NULL(flows_dentry)) > - return; > - > scnprintf(qp_flow->dentry_name, sizeof(qp_flow->dentry_name), > "%u", qp_flow->flow->flow_id); 
> qp_flow->dbgfs_dentry = debugfs_create_file(qp_flow->dentry_name, > @@ -156,11 +135,6 @@ void usnic_debugfs_flow_add(struct usnic_ib_qp_grp_flow > *qp_flow) > flows_dentry, > qp_flow, > _ops); > - if (IS_ERR_OR_NULL(qp_flow->dbgfs_dentry)) { > - usnic_err("Failed to create dbg fs entry for flow %u with error > %ld\n", > - qp_flow->flow->flow_id, > - PTR_ERR(qp_flow->dbgfs_dentry)); > - } > } > > void usnic_debugfs_flow_remove(struct usnic_ib_qp_grp_flow *qp_flow) > -- > 2.20.1 >
Re: [PATCH v3 03/10] arm64: dts: sdm845: Introduce ADSP and CDSP PAS nodes
On Tue 22 Jan 15:46 PST 2019, Doug Anderson wrote: > Hi, > > On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson > wrote: > > > > Add the ADSP and CDSP nodes for PAS-based remoteproc, supporting booting > > these cores on e.g. the MTP, and enable the same for the MTP. > > > > Signed-off-by: Bjorn Andersson > > --- > > > > Changes since v2: > > - New patch > > > > arch/arm64/boot/dts/qcom/sdm845-mtp.dts | 8 > > arch/arm64/boot/dts/qcom/sdm845.dtsi| 58 + > > 2 files changed, 66 insertions(+) > > It's a bit of a nit of mine that if it's not totally obvious what > acronyms mean that they should be spelled out in places that use them. > > In this case I believe ADSP is the Audio DSP. Is CDSP the Camera DSP? ...or > ? > C as in Compute. I'll spell these out as I respin the series. > > > + adsp_pas: remoteproc-adsp { > > + compatible = "qcom,sdm845-adsp-pas"; > > + > > + interrupts-extended = < GIC_SPI 162 > > IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 0 > > IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 1 > > IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 2 > > IRQ_TYPE_EDGE_RISING>, > > + <_smp2p_in 3 > > IRQ_TYPE_EDGE_RISING>; > > + interrupt-names = "wdog", "fatal", "ready", > > + "handover", "stop-ack"; > > + > > + clocks = <_board>; > > + clock-names = "xo"; > > I've found that nearly all the places that refer to xo_board are wrong > and should actually point to '< RPMH_CXO_CLK>'. Maybe yours > should too? > Yes, xo_board is a fake clock representing the 19.2MHz clock feeding the cxo (or cxo2) pad of the SoC. So you're definitely right in that this should be referencing the actual 19.2MHz clock. We've kept referring to this as xo_board, as we don't handle probe deferral when gcc will probe earlier than rpmcc in the boot and for other non-clock drivers the fear of actually hitting 0 on the refcounter for this (you don't want to disable the cxo while running the system). 
I'll give it a spin with appropriate reference and see what happens, I think this should either be changed or documented in the commit message. Thanks, Bjorn
Re: [PATCH v8 11/11] media: imx.rst: Update doc to reflect fixes to interlaced capture
On 1/22/19 11:51 AM, Tim Harvey wrote: On Mon, Jan 21, 2019 at 12:24 PM Tim Harvey wrote: On Tue, Jan 15, 2019 at 3:54 PM Steve Longerbeam wrote: Hi Tim, On 1/15/19 1:58 PM, Tim Harvey wrote: On Wed, Jan 9, 2019 at 10:30 AM Steve Longerbeam wrote: Also add an example pipeline for unconverted capture with interweave on SabreAuto. Cleanup some language in various places in the process. Signed-off-by: Steve Longerbeam Reviewed-by: Philipp Zabel --- Changes since v4: - Make clear that it is IDMAC channel that does pixel reordering and interweave, not the CSI. Caught by Philipp Zabel. Changes since v3: - none. Changes since v2: - expand on idmac interweave behavior in CSI subdev. - switch second SabreAuto pipeline example to PAL to give both NTSC and PAL examples. - Cleanup some language in various places. --- Documentation/media/v4l-drivers/imx.rst | 103 +++- 1 file changed, 66 insertions(+), 37 deletions(-) Capture Pipelines - @@ -516,10 +522,33 @@ On the SabreAuto, an on-board ADV7180 SD decoder is connected to the parallel bus input on the internal video mux to IPU1 CSI0. The following example configures a pipeline to capture from the ADV7180 -video decoder, assuming NTSC 720x480 input signals, with Motion -Compensated de-interlacing. Pad field types assume the adv7180 outputs -"interlaced". $outputfmt can be any format supported by the ipu1_ic_prpvf -entity at its output pad: +video decoder, assuming NTSC 720x480 input signals, using simple +interweave (unconverted and without motion compensation). The adv7180 +must output sequential or alternating fields (field type 'seq-bt' for +NTSC, or 'alternate'): + +.. 
code-block:: none + + # Setup links + media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]" + media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]" + media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]" + # Configure pads + media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]" + media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]" + media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]" + # Configure "ipu1_csi0 capture" interface (assumed at /dev/video4) + v4l2-ctl -d4 --set-fmt-video=field=interlaced_bt + +Streaming can then begin on /dev/video4. The v4l2-ctl tool can also be +used to select any supported YUV pixelformat on /dev/video4. + Hi Steve, I'm testing 4.20 with this patchset on top. I'm on a GW5104 which has an IMX6Q with the adv7180 on ipu1_csi0 like the SabeAuto example above I can't get the simple interveave example to work: media-ctl -r # reset all links # Setup links (ADV7180 IPU1_CSI0) media-ctl -l '"adv7180 2-0020":0 -> "ipu1_csi0_mux":1[1]' media-ctl -l '"ipu1_csi0_mux":2 -> "ipu1_csi0":0[1]' media-ctl -l '"ipu1_csi0":2 -> "ipu1_csi0 capture":0[1]' # /dev/video4 # Configure pads media-ctl -V "'adv7180 2-0020':0 [fmt:UYVY2X8/720x480 field:seq-bt]" media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]" media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x480]" This is the reason. The adv7180 is only allowing to configure alternate field mode, and thus it reports the field height on the mbus, not the full frame height. Imx deals with alternate field mode by capturing a full frame, so the CSI entity sets the output pad height to double the height. So the CSI input pad needs to be configured with the field height: media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x240]" It should work for you after doing that. 
And better yet, don't bother configuring the input pad, because media-ctl will propagate formats from source to sink pads for you, so it's better to rely on the propagation, and set the CSI output pad format instead (full frame height at output pad): media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]" Steve, Thanks - that makes sense. I also noticed that if I setup one of the vdic pipelines first then went back after a 'media-ctl -r' and setup the example that failed it no longer failed. I'm thinking that this is because 'media-ctl -r' make reset all the links but does not reset all the V4L2 formats on pads? Final note: the imx.rst doc is technically correct even though it is showing full frame heights being configured at the pads, because it is expecting the adv7180 has accepted 'seq-bt'. But even the example given in that doc works for alternate field mode, because the pad heights are forced to the correct field height for alternate mode. hmmm... I don't quite follow this statement. It sounds like the example would only be correct if you were setting 'field:alternate' but the example sets 'field:seq-bt' instead. I wonder if you should add some verbiage explaining the difference in format (resolution specifically) between the input and output pads and/or change the example to set the output pad format so people don't run into what I did trying to follow the example. Steve, I'm able to link a sensor->mux->csi->vdic->ic_prp->ic_prpenc but not a
Re: [PATCH] fail_function: no need to check return value of debugfs_create functions
On Tue, 22 Jan 2019 16:21:44 +0100 Greg Kroah-Hartman wrote: > When calling debugfs functions, there is no need to ever check the > return value. The function can work or not, but the code logic should > never do something different based on this. Ah, OK. It simplifies the code. But I have a question below, > > Cc: Masami Hiramatsu > Cc: Kees Cook > Cc: Josef Bacik > Cc: Thomas Gleixner > Cc: "Naveen N. Rao" > Cc: zhong jiang > Signed-off-by: Greg Kroah-Hartman > --- > kernel/fail_function.c | 23 +-- > 1 file changed, 5 insertions(+), 18 deletions(-) > > diff --git a/kernel/fail_function.c b/kernel/fail_function.c > index 17f75b545f66..afc779be5ebb 100644 > --- a/kernel/fail_function.c > +++ b/kernel/fail_function.c > @@ -152,20 +152,13 @@ static int fei_retval_get(void *data, u64 *val) > DEFINE_DEBUGFS_ATTRIBUTE(fei_retval_ops, fei_retval_get, fei_retval_set, >"%llx\n"); > > -static int fei_debugfs_add_attr(struct fei_attr *attr) > +static void fei_debugfs_add_attr(struct fei_attr *attr) > { > struct dentry *dir; > > dir = debugfs_create_dir(attr->kp.symbol_name, fei_debugfs_dir); > - if (!dir) > - return -ENOMEM; > - > - if (!debugfs_create_file("retval", 0600, dir, attr, _retval_ops)) { > - debugfs_remove_recursive(dir); > - return -ENOMEM; > - } > > - return 0; Don't we need to check dir here? If above debugfs_create_dir() returns NULL, it seems we will create "retval" under root directory of debugfs. 
Thank you, > + debugfs_create_file("retval", 0600, dir, attr, _retval_ops); > } > > static void fei_debugfs_remove_attr(struct fei_attr *attr) > @@ -306,7 +299,7 @@ static ssize_t fei_write(struct file *file, const char > __user *buffer, > > ret = register_kprobe(>kp); > if (!ret) > - ret = fei_debugfs_add_attr(attr); > + fei_debugfs_add_attr(attr); > if (ret < 0) > fei_attr_remove(attr); > else { > @@ -337,19 +330,13 @@ static int __init fei_debugfs_init(void) > return PTR_ERR(dir); > > /* injectable attribute is just a symlink of error_inject/list */ > - if (!debugfs_create_symlink("injectable", dir, > - "../error_injection/list")) > - goto error; > + debugfs_create_symlink("injectable", dir, "../error_injection/list"); > > - if (!debugfs_create_file("inject", 0600, dir, NULL, _ops)) > - goto error; > + debugfs_create_file("inject", 0600, dir, NULL, _ops); > > fei_debugfs_dir = dir; > > return 0; > -error: > - debugfs_remove_recursive(dir); > - return -ENOMEM; > } > > late_initcall(fei_debugfs_init); > -- > 2.20.1 > -- Masami Hiramatsu
Re: [PATCH v8 11/11] media: imx.rst: Update doc to reflect fixes to interlaced capture
On 1/21/19 12:24 PM, Tim Harvey wrote:
> On Tue, Jan 15, 2019 at 3:54 PM Steve Longerbeam wrote:
>> Hi Tim,
>>
>> On 1/15/19 1:58 PM, Tim Harvey wrote:
>>> On Wed, Jan 9, 2019 at 10:30 AM Steve Longerbeam wrote:
>>>> Also add an example pipeline for unconverted capture with interweave
>>>> on SabreAuto.
>>>>
>>>> Cleanup some language in various places in the process.
>>>>
>>>> Signed-off-by: Steve Longerbeam
>>>> Reviewed-by: Philipp Zabel
>>>> ---
>>>> Changes since v4:
>>>> - Make clear that it is IDMAC channel that does pixel reordering and
>>>>   interweave, not the CSI. Caught by Philipp Zabel.
>>>> Changes since v3:
>>>> - none.
>>>> Changes since v2:
>>>> - expand on idmac interweave behavior in CSI subdev.
>>>> - switch second SabreAuto pipeline example to PAL to give
>>>>   both NTSC and PAL examples.
>>>> - Cleanup some language in various places.
>>>> ---
>>>>  Documentation/media/v4l-drivers/imx.rst | 103 +++-
>>>>  1 file changed, 66 insertions(+), 37 deletions(-)
>>>>
>>>>  Capture Pipelines -
>>>> @@ -516,10 +522,33 @@ On the SabreAuto, an on-board ADV7180 SD decoder is connected to the
>>>>  parallel bus input on the internal video mux to IPU1 CSI0.
>>>>
>>>>  The following example configures a pipeline to capture from the ADV7180
>>>> -video decoder, assuming NTSC 720x480 input signals, with Motion
>>>> -Compensated de-interlacing. Pad field types assume the adv7180 outputs
>>>> -"interlaced". $outputfmt can be any format supported by the ipu1_ic_prpvf
>>>> -entity at its output pad:
>>>> +video decoder, assuming NTSC 720x480 input signals, using simple
>>>> +interweave (unconverted and without motion compensation). The adv7180
>>>> +must output sequential or alternating fields (field type 'seq-bt' for
>>>> +NTSC, or 'alternate'):
>>>> +
>>>> +.. code-block:: none
>>>> +
>>>> +   # Setup links
>>>> +   media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
>>>> +   media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
>>>> +   media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
>>>> +   # Configure pads
>>>> +   media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
>>>> +   media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
>>>> +   media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
>>>> +   # Configure "ipu1_csi0 capture" interface (assumed at /dev/video4)
>>>> +   v4l2-ctl -d4 --set-fmt-video=field=interlaced_bt
>>>> +
>>>> +Streaming can then begin on /dev/video4. The v4l2-ctl tool can also be
>>>> +used to select any supported YUV pixelformat on /dev/video4.
>>>> +
>>>
>>> Hi Steve,
>>>
>>> I'm testing 4.20 with this patchset on top.
>>>
>>> I'm on a GW5104 which has an IMX6Q with the adv7180 on ipu1_csi0, like
>>> the SabreAuto example above. I can't get the simple interweave example
>>> to work:
>>>
>>> media-ctl -r # reset all links
>>> # Setup links (ADV7180 IPU1_CSI0)
>>> media-ctl -l '"adv7180 2-0020":0 -> "ipu1_csi0_mux":1[1]'
>>> media-ctl -l '"ipu1_csi0_mux":2 -> "ipu1_csi0":0[1]'
>>> media-ctl -l '"ipu1_csi0":2 -> "ipu1_csi0 capture":0[1]' # /dev/video4
>>> # Configure pads
>>> media-ctl -V "'adv7180 2-0020':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
>>> media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
>>> media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x480]"
>>
>> This is the reason. The adv7180 only allows alternate field mode to be
>> configured, and thus it reports the field height on the mbus, not the
>> full frame height. Imx deals with alternate field mode by capturing a
>> full frame, so the CSI entity sets the output pad height to double the
>> input pad height.
>>
>> So the CSI input pad needs to be configured with the field height:
>>
>> media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/720x240]"
>>
>> It should work for you after doing that. And better yet, don't bother
>> configuring the input pad, because media-ctl will propagate formats from
>> source to sink pads for you, so it's better to rely on the propagation
>> and set the CSI output pad format instead (full frame height at the
>> output pad):
>>
>> media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
>
> Steve,
>
> Thanks - that makes sense.
>
> I also noticed that if I set up one of the vdic pipelines first, then went
> back after a 'media-ctl -r' and set up the example that failed, it no
> longer failed. I'm thinking that this is because 'media-ctl -r' may
> reset all the links but does not reset all the V4L2 formats on pads?
>
>> Final note: the imx.rst doc is technically correct even though it is
>> showing full frame heights being configured at the pads, because it is
>> expecting the adv7180 has accepted 'seq-bt'. But even the example given
>> in that doc works for alternate field mode, because the pad heights are
>> forced to the correct field height for alternate mode.
>
> hmmm... I don't quite follow this statement. It sounds like the example
> would only be correct if you were setting 'field:alternate', but the
> example sets 'field:seq-bt' instead.

The example is consistent for a sensor that sends seq-bt. Here is the
example config from the imx.rst doc again; an (NTSC) height of 480 lines is
correct for a seq-bt source:

# Setup links
media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
# Configure pads
media-ctl
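The height rule discussed in this thread (a source in alternate field mode reports the field height, and the CSI doubles it at its output pad) can be sketched in a few lines of C. This is only an illustration of the arithmetic: the enum and the function name are hypothetical and are not taken from the imx-media driver.

```c
#include <assert.h>

/* Hypothetical, reduced set of field types; the real V4L2 enum is larger. */
enum field_type {
	FIELD_SEQ_BT,    /* both fields in one buffer: height is the frame height */
	FIELD_ALTERNATE, /* one field per buffer: height is the field height */
};

/*
 * Height to expect at the CSI output (source) pad, given the field type and
 * height seen at the CSI input (sink) pad. Assumption for this sketch: only
 * alternate mode needs the doubling described in the thread.
 */
unsigned int csi_output_height(enum field_type sink_field,
			       unsigned int sink_height)
{
	assert(sink_height > 0);

	if (sink_field == FIELD_ALTERNATE)
		return sink_height * 2;

	return sink_height;
}
```

With an NTSC source, a 240-line input pad in alternate mode and a 480-line input pad in seq-bt mode both give a 480-line output pad, which is why setting only the output pad format and relying on propagation works in either case.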
[PATCH 2/2] f2fs: sync filesystem after roll-forward recovery
Some works after roll-forward recovery can get an error which will release all the data structures. Let's flush them in order to make it clean. One possible corruption came from: [ 90.400500] list_del corruption. prev->next should be ffed1f566208, but was (null) [ 90.675349] Call trace: [ 90.677869] __list_del_entry_valid+0x94/0xb4 [ 90.682351] remove_dirty_inode+0xac/0x114 [ 90.686563] __f2fs_write_data_pages+0x6a8/0x6c8 [ 90.691302] f2fs_write_data_pages+0x40/0x4c [ 90.695695] do_writepages+0x80/0xf0 [ 90.699372] __writeback_single_inode+0xdc/0x4ac [ 90.704113] writeback_sb_inodes+0x280/0x440 [ 90.708501] wb_writeback+0x1b8/0x3d0 [ 90.712267] wb_workfn+0x1a8/0x4d4 [ 90.715765] process_one_work+0x1c0/0x3d4 [ 90.719883] worker_thread+0x224/0x344 [ 90.723739] kthread+0x120/0x130 [ 90.727055] ret_from_fork+0x10/0x18 Reported-by: Sahitya Tummala Signed-off-by: Jaegeuk Kim --- fs/f2fs/checkpoint.c | 5 +++-- fs/f2fs/node.c | 4 +++- fs/f2fs/super.c | 42 +++--- 3 files changed, 37 insertions(+), 14 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index f955cd3e0677..f0ce2f06 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -306,8 +306,9 @@ static int f2fs_write_meta_pages(struct address_space *mapping, goto skip_write; /* collect a number of dirty meta pages and write together */ - if (wbc->for_kupdate || - get_pages(sbi, F2FS_DIRTY_META) < nr_pages_to_skip(sbi, META)) + if (wbc->sync_mode != WB_SYNC_ALL && + get_pages(sbi, F2FS_DIRTY_META) < + nr_pages_to_skip(sbi, META)) goto skip_write; /* if locked failed, cp will flush dirty pages instead */ diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 4f450e573312..f6ff84e29749 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -1920,7 +1920,9 @@ static int f2fs_write_node_pages(struct address_space *mapping, f2fs_balance_fs_bg(sbi); /* collect a number of dirty node pages and write together */ - if (get_pages(sbi, F2FS_DIRTY_NODES) < nr_pages_to_skip(sbi, NODE)) + if (wbc->sync_mode != 
WB_SYNC_ALL && + get_pages(sbi, F2FS_DIRTY_NODES) < + nr_pages_to_skip(sbi, NODE)) goto skip_write; if (wbc->sync_mode == WB_SYNC_ALL) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 7998ff5418f2..2af0db2b738e 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -1456,9 +1456,16 @@ static int f2fs_enable_quotas(struct super_block *sb); static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi) { + unsigned int s_flags = sbi->sb->s_flags; struct cp_control cpc; - int err; + int err = 0; + int ret; + if (s_flags & SB_RDONLY) { + f2fs_msg(sbi->sb, KERN_ERR, + "checkpoint=disable on readonly fs"); + return -EINVAL; + } sbi->sb->s_flags |= SB_ACTIVE; f2fs_update_time(sbi, DISABLE_TIME); @@ -1466,18 +1473,24 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi) while (!f2fs_time_over(sbi, DISABLE_TIME)) { mutex_lock(>gc_mutex); err = f2fs_gc(sbi, true, false, NULL_SEGNO); - if (err == -ENODATA) + if (err == -ENODATA) { + err = 0; break; + } if (err && err != -EAGAIN) - return err; + break; } - err = sync_filesystem(sbi->sb); - if (err) - return err; + ret = sync_filesystem(sbi->sb); + if (ret || err) { + err = ret ? ret: err; + goto restore_flag; + } - if (f2fs_disable_cp_again(sbi)) - return -EAGAIN; + if (f2fs_disable_cp_again(sbi)) { + err = -EAGAIN; + goto restore_flag; + } mutex_lock(>gc_mutex); cpc.reason = CP_PAUSE; @@ -1486,7 +1499,9 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi) sbi->unusable_block_count = 0; mutex_unlock(>gc_mutex); - return 0; +restore_flag: + sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */ + return err; } static void f2fs_enable_checkpoint(struct f2fs_sb_info *sbi) @@ -3356,7 +3371,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent) if (test_opt(sbi, DISABLE_CHECKPOINT)) { err = f2fs_disable_checkpoint(sbi); if (err) - goto free_meta; + goto sync_free_meta; } else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) { f2fs_enable_checkpoint(sbi); } @@ -3369,7 +3384,7 @@
[PATCH 10/15] habanalabs: add device reset support
This patch adds support for doing various on-the-fly resets of Goya.

The driver supports two types of resets:
1. soft-reset
2. hard-reset

Soft-reset is done when the device detects a timeout of a command
submission that was given to the device. The soft-reset process only resets
the engines that are relevant for the submission of compute jobs, i.e. the
DMA channels, the TPCs and the MME. The purpose is to bring the device as
fast as possible to a working state.

Hard-reset is done in several cases:
1. After soft-reset is done but the device is not responding
2. When fatal errors occur inside the device, e.g. ECC error
3. When the driver is removed

Hard-reset performs a reset of the entire chip except for the PCI
controller and the PLLs. It is a much longer process than soft-reset but it
helps to recover the device without the need to reboot the Host.

After hard-reset, the driver will restore the max power attribute and, in
case of manual power management, the frequencies that were set.

This patch also adds two entries to sysfs, which allow the root user to
initiate a soft or hard reset.
Signed-off-by: Oded Gabbay --- drivers/misc/habanalabs/command_buffer.c | 11 +- drivers/misc/habanalabs/device.c | 308 +- drivers/misc/habanalabs/goya/goya.c | 201 ++ drivers/misc/habanalabs/goya/goya_hwmgr.c | 18 +- drivers/misc/habanalabs/habanalabs.h | 35 +++ drivers/misc/habanalabs/habanalabs_drv.c | 9 +- drivers/misc/habanalabs/hwmon.c | 4 +- drivers/misc/habanalabs/irq.c | 31 +++ drivers/misc/habanalabs/sysfs.c | 120 - 9 files changed, 712 insertions(+), 25 deletions(-) diff --git a/drivers/misc/habanalabs/command_buffer.c b/drivers/misc/habanalabs/command_buffer.c index 535ed6cc5bda..700c6da01188 100644 --- a/drivers/misc/habanalabs/command_buffer.c +++ b/drivers/misc/habanalabs/command_buffer.c @@ -81,9 +81,10 @@ int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr *mgr, bool alloc_new_cb = true; int rc; - if (hdev->disabled) { + if ((hdev->disabled) || ((atomic_read(>in_reset)) && + (ctx_id != HL_KERNEL_ASID_ID))) { dev_warn_ratelimited(hdev->dev, - "Device is disabled !!! Can't create new CBs\n"); + "Device is disabled or in reset !!! Can't create new CBs\n"); rc = -EBUSY; goto out_err; } @@ -187,6 +188,12 @@ int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data) u64 handle; int rc; + if (hdev->hard_reset_pending) { + dev_crit_ratelimited(hdev->dev, + "Device HARD reset pending !!! 
Please close FD\n"); + return -ENODEV; + } + switch (args->in.op) { case HL_CB_OP_CREATE: rc = hl_cb_create(hdev, >cb_mgr, args->in.cb_size, diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c index ff7b610f18c4..00fde57ce823 100644 --- a/drivers/misc/habanalabs/device.c +++ b/drivers/misc/habanalabs/device.c @@ -188,6 +188,7 @@ static int device_early_init(struct hl_device *hdev) mutex_init(>device_open); mutex_init(>send_cpu_message_lock); + atomic_set(>in_reset, 0); atomic_set(>fd_open_cnt, 0); return 0; @@ -238,6 +239,27 @@ static void set_freq_to_low_job(struct work_struct *work) usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC)); } +static void hl_device_heartbeat(struct work_struct *work) +{ + struct hl_device *hdev = container_of(work, struct hl_device, + work_heartbeat.work); + + if ((hdev->disabled) || (atomic_read(>in_reset))) + goto reschedule; + + if (!hdev->asic_funcs->send_heartbeat(hdev)) + goto reschedule; + + dev_err(hdev->dev, "Device heartbeat failed !!!\n"); + hl_device_reset(hdev, true, false); + + return; + +reschedule: + schedule_delayed_work(>work_heartbeat, + usecs_to_jiffies(HL_HEARTBEAT_PER_USEC)); +} + /** * device_late_init - do late stuff initialization for the habanalabs device * @@ -273,6 +295,12 @@ static int device_late_init(struct hl_device *hdev) schedule_delayed_work(>work_freq, usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC)); + if (hdev->heartbeat) { + INIT_DELAYED_WORK(>work_heartbeat, hl_device_heartbeat); + schedule_delayed_work(>work_heartbeat, + usecs_to_jiffies(HL_HEARTBEAT_PER_USEC)); + } + hdev->late_init_done = true; return 0; @@ -290,6 +318,8 @@ static void device_late_fini(struct hl_device *hdev) return; cancel_delayed_work_sync(>work_freq); + if (hdev->heartbeat) +
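The reset policy from the commit message (soft-reset on a command submission timeout, hard-reset on fatal errors or driver removal, and escalation to hard-reset when soft-reset does not revive the device) can be sketched as plain C. This is a userspace illustration under stated assumptions: every name below is hypothetical, and the callback stands in for whatever the driver actually does to the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not the driver's own types. */
enum reset_event { EV_CS_TIMEOUT, EV_FATAL_ERROR, EV_DRIVER_REMOVE };
enum reset_kind { RESET_SOFT, RESET_HARD };

/* Only a command submission timeout starts with the cheaper soft-reset. */
enum reset_kind reset_kind_for(enum reset_event ev)
{
	return ev == EV_CS_TIMEOUT ? RESET_SOFT : RESET_HARD;
}

/* Caller-supplied: performs the reset, returns true if the device responds. */
typedef bool (*reset_fn)(enum reset_kind kind, void *ctx);

/* Returns the strongest reset that was actually performed. */
enum reset_kind recover(enum reset_event ev, reset_fn try_reset, void *ctx)
{
	enum reset_kind kind = reset_kind_for(ev);

	assert(try_reset != NULL);

	/* Case 1 of the hard-reset list: soft-reset done, device not responding. */
	if (kind == RESET_SOFT && !try_reset(RESET_SOFT, ctx))
		kind = RESET_HARD;

	if (kind == RESET_HARD)
		try_reset(RESET_HARD, ctx);

	return kind;
}

/* Example callback: soft-reset never helps; ctx counts the attempts. */
bool stub_soft_fails(enum reset_kind kind, void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return kind == RESET_HARD;
}
```

With the stub above, a timeout event costs two attempts (soft, then hard), while a fatal error goes straight to a single hard-reset.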
[PATCH 07/15] habanalabs: add h/w queues module
This patch adds the H/W queues module and the code to initialize Goya's
various compute and DMA engines and their queues.

Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each
channel/engine, there is a H/W queue logic which is used to pass commands
from the user to the H/W. That logic is called QMAN.

There are two types of QMANs: external and internal. The DMA QMANs are
considered external while the TPC and MME QMANs are considered internal.
For each external queue there is a completion queue, which is located on the
Host memory.

The differences between external and internal QMANs are:

1. The location of the queue's memory. External QMANs are located on the
   Host memory while internal QMANs are located on the on-chip memory.

2. The external QMAN writes an entry to a completion queue and sends an
   MSI-X interrupt upon completion of a command buffer that was given to
   it. The internal QMAN doesn't do that.

Signed-off-by: Oded Gabbay
---
 drivers/misc/habanalabs/Makefile |2 +-
 drivers/misc/habanalabs/device.c | 74 +-
 drivers/misc/habanalabs/goya/goya.c | 1518 +++--
 drivers/misc/habanalabs/goya/goyaP.h |6 +
 drivers/misc/habanalabs/habanalabs.h | 176 +-
 drivers/misc/habanalabs/habanalabs_drv.c |6 +
 drivers/misc/habanalabs/hw_queue.c| 404 +
 .../habanalabs/include/goya/goya_packets.h| 234 +++
 .../habanalabs/include/habanalabs_device_if.h | 272 +++
 drivers/misc/habanalabs/irq.c | 150 ++
 10 files changed, 2721 insertions(+), 121 deletions(-)
 create mode 100644 drivers/misc/habanalabs/hw_queue.c
 create mode 100644 drivers/misc/habanalabs/include/goya/goya_packets.h
 create mode 100644 drivers/misc/habanalabs/irq.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index 2530c9b78ca4..c07f3ccb57dc 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -5,7 +5,7 @@
 obj-m := habanalabs.o

 habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \
-	command_buffer.o
+
command_buffer.o hw_queue.o irq.o include $(src)/goya/Makefile habanalabs-y += $(HL_GOYA_FILES) diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c index 9fc7218a973c..98220628a467 100644 --- a/drivers/misc/habanalabs/device.c +++ b/drivers/misc/habanalabs/device.c @@ -170,13 +170,22 @@ static int device_early_init(struct hl_device *hdev) if (rc) goto early_fini; + hdev->cq_wq = alloc_workqueue("hl-free-jobs", WQ_UNBOUND, 0); + if (hdev->cq_wq == NULL) { + dev_err(hdev->dev, "Failed to allocate CQ workqueue\n"); + goto asid_fini; + } + hl_cb_mgr_init(>kernel_cb_mgr); mutex_init(>device_open); + mutex_init(>send_cpu_message_lock); atomic_set(>fd_open_cnt, 0); return 0; +asid_fini: + hl_asid_fini(hdev); early_fini: if (hdev->asic_funcs->early_fini) hdev->asic_funcs->early_fini(hdev); @@ -192,9 +201,12 @@ static int device_early_init(struct hl_device *hdev) */ static void device_early_fini(struct hl_device *hdev) { + mutex_destroy(>send_cpu_message_lock); hl_cb_mgr_fini(hdev, >kernel_cb_mgr); + destroy_workqueue(hdev->cq_wq); + hl_asid_fini(hdev); if (hdev->asic_funcs->early_fini) @@ -273,7 +285,7 @@ int hl_device_resume(struct hl_device *hdev) */ int hl_device_init(struct hl_device *hdev, struct class *hclass) { - int rc; + int i, rc, cq_ready_cnt; /* Create device */ rc = device_setup_cdev(hdev, hclass, hdev->id, _ops); @@ -294,11 +306,48 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass) if (rc) goto early_fini; + /* +* Initialize the H/W queues. Must be done before hw_init, because +* there the addresses of the kernel queue are being written to the +* registers of the device +*/ + rc = hl_hw_queues_create(hdev); + if (rc) { + dev_err(hdev->dev, "failed to initialize kernel queues\n"); + goto sw_fini; + } + + /* +* Initialize the completion queues. 
Must be done before hw_init, +* because there the addresses of the completion queues are being +* passed as arguments to request_irq +*/ + hdev->completion_queue = + kcalloc(hdev->asic_prop.completion_queues_count, + sizeof(*hdev->completion_queue), GFP_KERNEL); + + if (!hdev->completion_queue) { + dev_err(hdev->dev, "failed to allocate completion queues\n"); + rc = -ENOMEM; + goto hw_queues_destroy; + } + + for (i = 0, cq_ready_cnt = 0; + i <
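To make the external/internal split from the commit message concrete, here is a small table-driven sketch in C. It only encodes the numbers stated in the message (5 DMA channels, 8 TPC engines, 1 MME engine) and the rule that each external queue has a completion queue. The struct, enum and function names are hypothetical, and the real driver may define more queues than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
enum qman_type { QMAN_EXTERNAL, QMAN_INTERNAL };

struct engine_group {
	const char *name;
	unsigned int count;   /* number of engines/channels in the group */
	enum qman_type type;  /* where the queue memory lives */
};

/* Numbers as stated in the commit message. */
static const struct engine_group goya_groups[] = {
	{ "DMA", 5, QMAN_EXTERNAL }, /* queue in Host memory, has a CQ + MSI-X */
	{ "TPC", 8, QMAN_INTERNAL }, /* queue in on-chip memory, no CQ */
	{ "MME", 1, QMAN_INTERNAL },
};

/* Sum the engines whose QMAN is of the given type. */
unsigned int count_queues(enum qman_type type)
{
	unsigned int total = 0;
	size_t i;

	assert(type == QMAN_EXTERNAL || type == QMAN_INTERNAL);

	for (i = 0; i < sizeof(goya_groups) / sizeof(goya_groups[0]); i++)
		if (goya_groups[i].type == type)
			total += goya_groups[i].count;

	return total;
}

/* One completion queue per external queue; internal queues need none. */
unsigned int completion_queues_needed(void)
{
	return count_queues(QMAN_EXTERNAL);
}
```

Under these assumptions the host has to allocate five completion queues, one per DMA channel, while the nine internal QMANs need no host-side completion tracking.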
[PATCH 04/15] habanalabs: add context and ASID modules
This patch adds two modules - ASID and context.

Each user process that opens a device's file must have at least one context
before it is able to "work" with the device. Each context has its own
device address-space and contains information about its runtime state (its
active command submissions).

To have address-space separation between contexts, each context is assigned
a unique ASID, which stands for "address-space id". Goya supports up to 1024
ASIDs.

Currently, the driver doesn't support multiple contexts. Therefore, the
user doesn't need to actively create a context. A "primary context" is
created automatically when the user opens the device's file.

Signed-off-by: Oded Gabbay
---
 drivers/misc/habanalabs/Makefile | 2 +-
 drivers/misc/habanalabs/asid.c | 58 +
 drivers/misc/habanalabs/context.c| 155 +++
 drivers/misc/habanalabs/device.c | 47 +++
 drivers/misc/habanalabs/habanalabs.h | 70 ++
 drivers/misc/habanalabs/habanalabs_drv.c | 46 ++-
 6 files changed, 375 insertions(+), 3 deletions(-)
 create mode 100644 drivers/misc/habanalabs/asid.c
 create mode 100644 drivers/misc/habanalabs/context.c

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index 6f1ead69bd77..3ffbadc2ca01 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -4,7 +4,7 @@

 obj-m := habanalabs.o

-habanalabs-y := habanalabs_drv.o device.o
+habanalabs-y := habanalabs_drv.o device.o context.o asid.o

 include $(src)/goya/Makefile
 habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/asid.c b/drivers/misc/habanalabs/asid.c
new file mode 100644
index ..0ce84c8f5a47
--- /dev/null
+++ b/drivers/misc/habanalabs/asid.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2016-2018 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ */ + +#include "habanalabs.h" + +#include +#include + +int hl_asid_init(struct hl_device *hdev) +{ + hdev->asid_bitmap = kcalloc(BITS_TO_LONGS(hdev->asic_prop.max_asid), + sizeof(*hdev->asid_bitmap), GFP_KERNEL); + if (!hdev->asid_bitmap) + return -ENOMEM; + + mutex_init(>asid_mutex); + + /* ASID 0 is reserved for KMD */ + set_bit(0, hdev->asid_bitmap); + + return 0; +} + +void hl_asid_fini(struct hl_device *hdev) +{ + mutex_destroy(>asid_mutex); + kfree(hdev->asid_bitmap); +} + +unsigned long hl_asid_alloc(struct hl_device *hdev) +{ + unsigned long found; + + mutex_lock(>asid_mutex); + + found = find_first_zero_bit(hdev->asid_bitmap, + hdev->asic_prop.max_asid); + if (found == hdev->asic_prop.max_asid) + found = 0; + else + set_bit(found, hdev->asid_bitmap); + + mutex_unlock(>asid_mutex); + + return found; +} + +void hl_asid_free(struct hl_device *hdev, unsigned long asid) +{ + if (WARN((asid == 0 || asid >= hdev->asic_prop.max_asid), + "Invalid ASID %lu", asid)) + return; + clear_bit(asid, hdev->asid_bitmap); +} diff --git a/drivers/misc/habanalabs/context.c b/drivers/misc/habanalabs/context.c new file mode 100644 index ..cdcad077e5cf --- /dev/null +++ b/drivers/misc/habanalabs/context.c @@ -0,0 +1,155 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright 2016-2018 HabanaLabs, Ltd. + * All Rights Reserved. 
+ */ + +#include "habanalabs.h" + +#include +#include + +static void hl_ctx_fini(struct hl_ctx *ctx) +{ + struct hl_device *hdev = ctx->hdev; + + if (ctx->asid != HL_KERNEL_ASID_ID) + hl_asid_free(hdev, ctx->asid); +} + +void hl_ctx_do_release(struct kref *ref) +{ + struct hl_ctx *ctx; + + ctx = container_of(ref, struct hl_ctx, refcount); + + dev_dbg(ctx->hdev->dev, "Now really releasing context %d\n", ctx->asid); + + hl_ctx_fini(ctx); + + if (ctx->hpriv) + hl_hpriv_put(ctx->hpriv); + + kfree(ctx); +} + +int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv) +{ + struct hl_ctx_mgr *mgr = >ctx_mgr; + struct hl_ctx *ctx; + int rc; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) { + rc = -ENOMEM; + goto out_err; + } + + rc = hl_ctx_init(hdev, ctx, false); + if (rc) + goto free_ctx; + + hl_hpriv_get(hpriv); + ctx->hpriv = hpriv; + + /* TODO: remove for multiple contexts */ + hpriv->ctx = ctx; + hdev->user_ctx = ctx; + + mutex_lock(>ctx_lock); + rc = idr_alloc(>ctx_handles, ctx, 1, 0, GFP_KERNEL); + mutex_unlock(>ctx_lock); + + if (rc < 0) { + dev_err(hdev->dev, "Failed to allocate IDR for a new CTX\n"); + hl_ctx_free(hdev, ctx); + goto
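The ASID allocation scheme in asid.c (a bitmap with bit 0 reserved for the kernel driver, first-zero-bit search, 0 returned when nothing is free) can be tried out in userspace with the sketch below. It is an approximation under stated assumptions: the struct and function names are hypothetical, the kernel bitmap helpers are replaced by open-coded bit operations, and the mutex is left out, so this version is not safe for concurrent callers.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace stand-in for the per-device ASID state. */
struct asid_pool {
	unsigned long *bitmap;  /* one bit per ASID, set means "in use" */
	unsigned long max_asid; /* number of ASIDs, e.g. 1024 on Goya */
};

int asid_pool_init(struct asid_pool *p, unsigned long max_asid)
{
	size_t words = (max_asid + BITS_PER_WORD - 1) / BITS_PER_WORD;

	assert(max_asid > 0);

	p->bitmap = calloc(words, sizeof(*p->bitmap));
	if (!p->bitmap)
		return -1;

	p->max_asid = max_asid;
	p->bitmap[0] |= 1UL; /* ASID 0 is reserved for the kernel driver */
	return 0;
}

void asid_pool_fini(struct asid_pool *p)
{
	free(p->bitmap);
	p->bitmap = NULL;
}

/* Returns the lowest free ASID, or 0 if none is left (0 is never handed out). */
unsigned long asid_alloc(struct asid_pool *p)
{
	unsigned long i;

	for (i = 0; i < p->max_asid; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (!(p->bitmap[i / BITS_PER_WORD] & mask)) {
			p->bitmap[i / BITS_PER_WORD] |= mask;
			return i;
		}
	}

	return 0;
}

/* Ignores the reserved ASID 0 and out-of-range values, like the WARN path. */
void asid_free(struct asid_pool *p, unsigned long asid)
{
	if (asid == 0 || asid >= p->max_asid)
		return;

	p->bitmap[asid / BITS_PER_WORD] &= ~(1UL << (asid % BITS_PER_WORD));
}
```

Returning 0 as the "no ASID available" value works only because ASID 0 is permanently reserved, so callers can treat it as an error without a separate status code.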
[PATCH 1/2] f2fs: run discard jobs when put_super
When we umount f2fs, we need to avoid long delay due to discard commands, which is actually taking tens of seconds, if storage is very slow on UNMAP. So, this patch introduces timeout-based work on it. By default, let me give 5 seconds for discard. Signed-off-by: Jaegeuk Kim --- Documentation/ABI/testing/sysfs-fs-f2fs | 7 +++ fs/f2fs/f2fs.h | 5 - fs/f2fs/segment.c | 11 ++- fs/f2fs/super.c | 17 - fs/f2fs/sysfs.c | 3 +++ 5 files changed, 32 insertions(+), 11 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index a7ce33199457..91822ce25831 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -86,6 +86,13 @@ Description: The unit size is one block, now only support configuring in range of [1, 512]. +What: /sys/fs/f2fs//umount_discard_timeout +Date: January 2019 +Contact: "Jaegeuk Kim" +Description: + Set timeout to issue discard commands during umount. + Default: 5 secs + What: /sys/fs/f2fs//max_victim_search Date: January 2014 Contact: "Jaegeuk Kim" diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 0f564883e078..6b6ec5600089 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -191,6 +191,7 @@ enum { #define DEF_CP_INTERVAL60 /* 60 secs */ #define DEF_IDLE_INTERVAL 5 /* 5 secs */ #define DEF_DISABLE_INTERVAL 5 /* 5 secs */ +#define DEF_UMOUNT_DISCARD_TIMEOUT 5 /* 5 secs */ struct cp_control { int reason; @@ -310,6 +311,7 @@ struct discard_policy { bool sync; /* submit discard with REQ_SYNC flag */ bool ordered; /* issue discard by lba order */ unsigned int granularity; /* discard granularity */ + int timeout;/* discard timeout for put_super */ }; struct discard_cmd_control { @@ -1110,6 +1112,7 @@ enum { DISCARD_TIME, GC_TIME, DISABLE_TIME, + UMOUNT_DISCARD_TIMEOUT, MAX_TIME, }; @@ -3006,7 +3009,7 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr); bool f2fs_is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr); void 
f2fs_drop_discard_cmd(struct f2fs_sb_info *sbi); void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi); -bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi); +bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi); void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc); void f2fs_dirty_to_prefree(struct f2fs_sb_info *sbi); diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 9b79056d705d..97e0faf09ebf 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1037,6 +1037,7 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi, dpolicy->max_requests = DEF_MAX_DISCARD_REQUEST; dpolicy->io_aware_gran = MAX_PLIST_NUM; + dpolicy->timeout = MAX_TIME; if (discard_type == DPOLICY_BG) { dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME; @@ -1424,7 +1425,14 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, int i, issued = 0; bool io_interrupted = false; + if (dpolicy->timeout != MAX_TIME) + f2fs_update_time(sbi, dpolicy->timeout); + for (i = MAX_PLIST_NUM - 1; i >= 0; i--) { + if (dpolicy->timeout != MAX_TIME && + f2fs_time_over(sbi, dpolicy->timeout)) + break; + if (i + 1 < dpolicy->granularity) break; @@ -1611,7 +1619,7 @@ void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi) } /* This comes from f2fs_put_super */ -bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi) +bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi) { struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; struct discard_policy dpolicy; @@ -1619,6 +1627,7 @@ bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi) __init_discard_policy(sbi, , DPOLICY_UMOUNT, dcc->discard_granularity); + dpolicy.timeout = UMOUNT_DISCARD_TIMEOUT; __issue_discard_cmd(sbi, ); dropped = __drop_discard_cmd(sbi); diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index ea514acede36..7998ff5418f2 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -1029,6 +1029,9 @@ static void f2fs_put_super(struct super_block *sb) int i; bool dropped; + /* be sure to wait for any 
on-going discard commands */ + dropped = f2fs_issue_discard_timeout(sbi); + f2fs_quota_off_umount(sb); /* prevent remaining shrinker jobs
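The idea behind the umount timeout, stated in the commit message as giving discard a bounded time budget (5 seconds by default), can be sketched as a loop that checks a deadline before each batch of work. The code below is a userspace illustration, not the f2fs implementation: the names are hypothetical, the clock is injected so the behaviour can be tested without sleeping, and the "work" per bucket is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Injected clock: returns the current time in seconds. */
typedef unsigned long (*clock_fn)(void *ctx);

/* Fake clock for testing: advances by 'step' seconds on every read. */
struct fake_clock {
	unsigned long now;
	unsigned long step;
};

unsigned long fake_clock_read(void *ctx)
{
	struct fake_clock *c = ctx;
	unsigned long t = c->now;

	c->now += c->step;
	return t;
}

/*
 * Walk the buckets from the highest index down, as the patch does for its
 * pending lists, and stop as soon as 'timeout' seconds have elapsed.
 * Returns how many buckets were processed.
 */
int issue_with_deadline(int nr_buckets, unsigned long timeout,
			clock_fn now, void *clock_ctx)
{
	unsigned long start;
	int issued = 0;
	int i;

	assert(now != NULL);

	start = now(clock_ctx); /* record the starting time once */

	for (i = nr_buckets - 1; i >= 0; i--) {
		if (now(clock_ctx) - start >= timeout)
			break; /* budget used up: leave the rest undone */

		issued++; /* stand-in for issuing the discards of bucket i */
	}

	return issued;
}
```

The design trade-off is the same as in the patch: work left over when the deadline hits is simply dropped, which is acceptable for discards because they are an optimization hint to the storage device rather than data that must reach it.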
Re: [PATCH v3 07/10] remoteproc: q6v5-mss: Vote for rpmh power domains
Hi,

On Mon, Jan 21, 2019 at 9:52 PM Bjorn Andersson wrote:
> @@ -1333,7 +1431,7 @@ static int q6v5_probe(struct platform_device *pdev)
>  	ret = qcom_q6v5_init(>q6v5, pdev, rproc,
>  			     MPSS_CRASH_REASON_SMEM,
>  			     qcom_msa_handover);
>  	if (ret)
> -		goto free_rproc;
> +		goto detach_proxy_pds;
>
>  	qproc->mpss_perm = BIT(QCOM_SCM_VMID_HLOS);
>  	qproc->mba_perm = BIT(QCOM_SCM_VMID_HLOS);
> @@ -1344,10 +1442,12 @@ static int q6v5_probe(struct platform_device *pdev)
>
>  	ret = rproc_add(rproc);
>  	if (ret)
> -		goto free_rproc;
> +		goto detach_proxy_pds;

I can't comment on the patch overall since I haven't spent any time in
remoteproc, but as previously pointed out by Sibi, now that you've
landed commit 027045a6e2b7 ("remoteproc: qcom: Add shutdown-ack irq"),
you need to adjust the "goto"s in your patch. Specifically
(whitespace damaged) atop your whole series:

@@ -1488,7 +1488,7 @@ static int q6v5_probe(struct platform_device *pdev)
 	qproc->sysmon = qcom_add_sysmon_subdev(rproc, "modem", 0x12);
 	if (IS_ERR(qproc->sysmon)) {
 		ret = PTR_ERR(qproc->sysmon);
-		goto free_rproc;
+		goto detach_proxy_pds;
 	}

-Doug
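The reason every later failure has to jump to the newest label can be shown with a small model of the goto unwinding ladder. The sketch below is hypothetical throughout: the resource flags, the helper and demo_probe() only imitate the shape of q6v5_probe(), with a bitmask standing in for the resources that are currently held.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical resource flags, acquired in this order. */
#define RES_RPROC     0x1u
#define RES_PROXY_PDS 0x2u
#define RES_SYSMON    0x4u

struct probe_state {
	unsigned int live; /* bitmask of resources currently held */
};

/* Stand-in for an acquisition step that can be made to fail. */
static int acquire(struct probe_state *st, unsigned int res, int fail)
{
	if (fail)
		return -ENOMEM;

	st->live |= res;
	return 0;
}

/* fail_at selects which step fails (1..3); 0 means every step succeeds. */
int demo_probe(struct probe_state *st, int fail_at)
{
	int ret;

	assert(st != NULL);
	st->live = 0;

	ret = acquire(st, RES_RPROC, fail_at == 1);
	if (ret)
		return ret;

	ret = acquire(st, RES_PROXY_PDS, fail_at == 2);
	if (ret)
		goto free_rproc;

	/*
	 * Every step added after the proxy PDs are attached must unwind
	 * through detach_proxy_pds. Jumping to free_rproc here would leave
	 * RES_PROXY_PDS set, which is the kind of leak pointed out above.
	 */
	ret = acquire(st, RES_SYSMON, fail_at == 3);
	if (ret)
		goto detach_proxy_pds;

	return 0;

detach_proxy_pds:
	st->live &= ~RES_PROXY_PDS;
free_rproc:
	st->live &= ~RES_RPROC;
	return ret;
}
```

Because the labels fall through in reverse acquisition order, a failure at any step leaves the bitmask empty, which is a cheap invariant to check when a new step is added in the middle of an existing probe function.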
[PATCH 14/15] habanalabs: add debugfs support
This patch adds debugfs support to the driver. It allows the user-space to display information that is contained in the internal structures of the driver, such as: - active command submissions - active user virtual memory mappings - number of allocated command buffers It also enables the user to perform reads and writes through Goya's PCI bars. Signed-off-by: Oded Gabbay --- .../ABI/testing/debugfs-driver-habanalabs | 127 ++ drivers/misc/habanalabs/Makefile |2 + drivers/misc/habanalabs/command_buffer.c |4 + drivers/misc/habanalabs/command_submission.c | 12 + drivers/misc/habanalabs/debugfs.c | 1069 + drivers/misc/habanalabs/device.c |6 + drivers/misc/habanalabs/goya/goya.c | 108 ++ drivers/misc/habanalabs/goya/goyaP.h |5 + drivers/misc/habanalabs/habanalabs.h | 191 +++ drivers/misc/habanalabs/habanalabs_drv.c | 16 +- drivers/misc/habanalabs/memory.c |8 + 11 files changed, 1546 insertions(+), 2 deletions(-) create mode 100644 Documentation/ABI/testing/debugfs-driver-habanalabs create mode 100644 drivers/misc/habanalabs/debugfs.c diff --git a/Documentation/ABI/testing/debugfs-driver-habanalabs b/Documentation/ABI/testing/debugfs-driver-habanalabs new file mode 100644 index ..2b606c84938c --- /dev/null +++ b/Documentation/ABI/testing/debugfs-driver-habanalabs @@ -0,0 +1,127 @@ +What: /sys/kernel/debug/habanalabs/hl/addr +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Sets the device address to be used for read or write through +PCI bar. 
The acceptable value is a string that starts with "0x" + +What: /sys/kernel/debug/habanalabs/hl/command_buffers +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays a list with information about the currently allocated +command buffers + +What: /sys/kernel/debug/habanalabs/hl/command_submission +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays a list with information about the currently active +command submissions + +What: /sys/kernel/debug/habanalabs/hl/command_submission_jobs +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays a list with detailed information about each JOB (CB) of +each active command submission + +What: /sys/kernel/debug/habanalabs/hl/data32 +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Allows the root user to read or write directly through the +device's PCI bar. Writing to this file generates a write +transaction while reading from the file generates a read +transcation. This custom interface is needed (instead of using +the generic Linux user-space PCI mapping) because the DDR bar +is very small compared to the DDR memory and only the driver can +move the bar before and after the transaction + +What: /sys/kernel/debug/habanalabs/hl/device +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Enables the root user to set the device to specific state. +Valid values are "disable", "enable", "suspend", "resume". 
+User can read this property to see the valid values + +What: /sys/kernel/debug/habanalabs/hl/i2c_addr +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Sets I2C device address for I2C transaction that is generated +by the device's CPU + +What: /sys/kernel/debug/habanalabs/hl/i2c_bus +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Sets I2C bus address for I2C transaction that is generated by +the device's CPU + +What: /sys/kernel/debug/habanalabs/hl/i2c_data +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Triggers an I2C transaction that is generated by the device's +CPU. Writing to this file generates a write transaction while +reading from the file generates a read transaction + +What: /sys/kernel/debug/habanalabs/hl/i2c_reg +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Sets I2C register id for I2C transaction that is generated by +the device's CPU + +What: /sys/kernel/debug/habanalabs/hl/led0 +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Sets the state of the first S/W led on the device + +What:
[PATCH 15/15] Update MAINTAINERS and CREDITS with habanalabs info
The habanalabs driver was written from scratch from the very first days of Habana and is maintained by Oded Gabbay. Signed-off-by: Oded Gabbay --- CREDITS | 2 +- MAINTAINERS | 9 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/CREDITS b/CREDITS index e818eb6a3e71..03f3d67126fc 100644 --- a/CREDITS +++ b/CREDITS @@ -1222,7 +1222,7 @@ S: Brazil N: Oded Gabbay E: oded.gab...@gmail.com -D: AMD KFD maintainer +D: HabanaLabs and AMD KFD maintainer S: 12 Shraga Raphaeli S: Petah-Tikva, 4906418 S: Israel diff --git a/MAINTAINERS b/MAINTAINERS index 51029a425dbe..93e047336cab 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6641,6 +6641,15 @@ F: drivers/clocksource/h8300_*.c F: drivers/clk/h8300/ F: drivers/irqchip/irq-renesas-h8*.c +HABANALABS PCI DRIVER +M: Oded Gabbay +T: git https://github.com/HabanaAI/linux.git +S: Supported +F: drivers/misc/habanalabs/ +F: include/uapi/misc/habanalabs.h +F: Documentation/ABI/testing/sysfs-driver-habanalabs +F: Documentation/ABI/testing/debugfs-driver-habanalabs + HACKRF MEDIA DRIVER M: Antti Palosaari L: linux-me...@vger.kernel.org -- 2.17.1
[PATCH 12/15] habanalabs: add virtual memory and MMU modules
From: Omer Shpigelman This patch adds the Virtual Memory and MMU modules. Goya has an internal MMU which provides process isolation on the internal DDR. The internal MMU also performs translations for transactions that go from Goya to the Host. The driver is responsible for allocating and freeing memory on the DDR upon user request. It also provides an interface to map and unmap DDR and Host memory to the device address space. Signed-off-by: Omer Shpigelman Signed-off-by: Oded Gabbay --- drivers/misc/habanalabs/Makefile |2 +- drivers/misc/habanalabs/context.c | 19 +- drivers/misc/habanalabs/device.c | 20 +- drivers/misc/habanalabs/goya/goya.c | 391 + drivers/misc/habanalabs/habanalabs.h | 195 +++ drivers/misc/habanalabs/habanalabs_drv.c |2 +- drivers/misc/habanalabs/habanalabs_ioctl.c|3 +- drivers/misc/habanalabs/include/goya/goya.h |6 +- .../include/hw_ip/mmu/mmu_general.h | 45 + .../habanalabs/include/hw_ip/mmu/mmu_v1_0.h | 15 + drivers/misc/habanalabs/memory.c | 1506 + drivers/misc/habanalabs/mmu.c | 604 +++ include/uapi/misc/habanalabs.h| 122 +- 13 files changed, 2922 insertions(+), 8 deletions(-) create mode 100644 drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h create mode 100644 drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h create mode 100644 drivers/misc/habanalabs/mmu.c diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile index d2fd0e18b1eb..fd46f8b48bab 100644 --- a/drivers/misc/habanalabs/Makefile +++ b/drivers/misc/habanalabs/Makefile @@ -6,7 +6,7 @@ obj-m := habanalabs.o habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \ command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o memory.o \ - command_submission.o + command_submission.o mmu.o include $(src)/goya/Makefile habanalabs-y += $(HL_GOYA_FILES) diff --git a/drivers/misc/habanalabs/context.c b/drivers/misc/habanalabs/context.c index 2da672113e7a..dc0800a0ac9c 100644 --- a/drivers/misc/habanalabs/context.c +++ 
b/drivers/misc/habanalabs/context.c @@ -26,8 +26,10 @@ static void hl_ctx_fini(struct hl_ctx *ctx) for (i = 0 ; i < HL_MAX_PENDING_CS ; i++) dma_fence_put(ctx->cs_pending[i]); - if (ctx->asid != HL_KERNEL_ASID_ID) + if (ctx->asid != HL_KERNEL_ASID_ID) { + hl_vm_ctx_fini(ctx); hl_asid_free(hdev, ctx->asid); + } } void hl_ctx_do_release(struct kref *ref) @@ -97,6 +99,8 @@ void hl_ctx_free(struct hl_device *hdev, struct hl_ctx *ctx) int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx) { + int rc = 0; + ctx->hdev = hdev; kref_init(>refcount); @@ -114,9 +118,22 @@ int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx) dev_err(hdev->dev, "No free ASID, failed to create context\n"); return -ENOMEM; } + + rc = hl_vm_ctx_init(ctx); + if (rc) { + dev_err(hdev->dev, "Failed to init mem ctx module\n"); + rc = -ENOMEM; + goto mem_ctx_err; + } } return 0; + +mem_ctx_err: + if (ctx->asid != HL_KERNEL_ASID_ID) + hl_asid_free(hdev, ctx->asid); + + return rc; } void hl_ctx_get(struct hl_device *hdev, struct hl_ctx *ctx) diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c index a47e00fe5ccf..1f7340551386 100644 --- a/drivers/misc/habanalabs/device.c +++ b/drivers/misc/habanalabs/device.c @@ -585,8 +585,10 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, /* Reset the H/W. 
It will be in idle state after this returns */ hdev->asic_funcs->hw_fini(hdev, hard_reset); - if (hard_reset) + if (hard_reset) { + hl_vm_fini(hdev); hl_eq_reset(hdev, >event_queue); + } /* Re-initialize PI,CI to 0 in all queues (hw queue, cq) */ hl_hw_queue_reset(hdev, hard_reset); @@ -647,6 +649,13 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, goto out_err; } + rc = hl_vm_init(hdev); + if (rc) { + dev_err(hdev->dev, + "Failed to init memory module after hard reset\n"); + goto out_err; + } + hl_set_max_power(hdev, hdev->max_power); hdev->hard_reset_pending = false; @@ -828,6 +837,13 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass) hdev->asic_name, hdev->asic_prop.dram_size / 1024 / 1024 / 1024); + rc =
[PATCH 08/15] habanalabs: add event queue and interrupts
This patch adds support for receiving events from Goya's control CPU and for receiving MSI-X interrupts from Goya's DMA engines and CPU.

Goya's PCI controller supports up to 8 MSI-X interrupts, of which only 6 are currently used. The first 5 interrupts are dedicated to Goya's DMA engine queues. The 6th interrupt is dedicated to Goya's control CPU.

A DMA queue signals its MSI-X entry each time a command buffer (CB) that was placed on its primary queue completes. The driver then marks that CB as completed and frees the related resources. It also updates the command submission object to which that CB belongs.

There is a dedicated event queue (EQ) between the driver and Goya's control CPU. The EQ is located in Host memory. The control CPU writes a new entry to the EQ for various reasons, such as an ECC error, an MMU page fault or a high temperature reading. After writing the new entry, the control CPU triggers its dedicated MSI-X entry to signal the driver that there is a new entry in the EQ. The driver then reads the entry and acts accordingly.
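To make the EQ flow described above concrete, here is a minimal user-space sketch of a producer/consumer ring, under stated assumptions. It is not the driver's code: the entry layout (`valid`, `event_type`), the ring size `EQ_LEN` and the helper names `eq_init`, `device_post_event` and `eq_drain` are all hypothetical and chosen only for illustration. A real driver would also need DMA-safe memory and read barriers between checking the valid flag and reading the payload.

```c
#include <stdint.h>
#include <stdio.h>

#define EQ_LEN 8u /* hypothetical ring size; must be a power of two */

/* Hypothetical entry layout; the real EQ entry format is not shown here. */
struct eq_entry {
	uint32_t valid;      /* set by the producer once the entry is written */
	uint32_t event_type; /* e.g. ECC error, MMU page fault */
};

struct eq {
	struct eq_entry ring[EQ_LEN];
	uint32_t ci; /* consumer index, owned by the driver side */
};

/* Clear every entry and reset the consumer index. */
static void eq_init(struct eq *q)
{
	uint32_t i;

	for (i = 0; i < EQ_LEN; i++) {
		q->ring[i].valid = 0;
		q->ring[i].event_type = 0;
	}
	q->ci = 0;
}

/* Stand-in for the control CPU writing an entry at producer index pi. */
static void device_post_event(struct eq *q, uint32_t pi, uint32_t type)
{
	struct eq_entry *e = &q->ring[pi & (EQ_LEN - 1)];

	e->event_type = type;
	e->valid = 1; /* written last, so the consumer never sees a half entry */
}

/* Stand-in for the interrupt handler: consume entries until one is not valid. */
static unsigned int eq_drain(struct eq *q)
{
	unsigned int handled = 0;

	for (;;) {
		struct eq_entry *e = &q->ring[q->ci & (EQ_LEN - 1)];

		if (!e->valid)
			break;

		printf("handling event type %u\n", (unsigned int)e->event_type);
		e->valid = 0; /* hand the slot back to the producer */
		q->ci++;
		handled++;
	}

	return handled;
}
```

Calling `eq_init()`, then posting two events with `device_post_event()` at indices 0 and 1, then calling `eq_drain()` handles both entries and leaves the consumer index at 2. Draining until the first non-valid entry, rather than handling exactly one entry per interrupt, is a common choice because several entries can be written before the handler runs.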
Signed-off-by: Oded Gabbay --- drivers/misc/habanalabs/device.c| 35 +- drivers/misc/habanalabs/goya/goya.c | 522 +++- drivers/misc/habanalabs/goya/goyaP.h| 1 + drivers/misc/habanalabs/habanalabs.h| 37 ++ drivers/misc/habanalabs/include/goya/goya.h | 1 - drivers/misc/habanalabs/irq.c | 144 ++ 6 files changed, 729 insertions(+), 11 deletions(-) diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c index 98220628a467..9199e070e79e 100644 --- a/drivers/misc/habanalabs/device.c +++ b/drivers/misc/habanalabs/device.c @@ -173,9 +173,17 @@ static int device_early_init(struct hl_device *hdev) hdev->cq_wq = alloc_workqueue("hl-free-jobs", WQ_UNBOUND, 0); if (hdev->cq_wq == NULL) { dev_err(hdev->dev, "Failed to allocate CQ workqueue\n"); + rc = -ENOMEM; goto asid_fini; } + hdev->eq_wq = alloc_workqueue("hl-events", WQ_UNBOUND, 0); + if (hdev->eq_wq == NULL) { + dev_err(hdev->dev, "Failed to allocate EQ workqueue\n"); + rc = -ENOMEM; + goto free_cq_wq; + } + hl_cb_mgr_init(>kernel_cb_mgr); mutex_init(>device_open); @@ -184,6 +192,8 @@ static int device_early_init(struct hl_device *hdev) return 0; +free_cq_wq: + destroy_workqueue(hdev->cq_wq); asid_fini: hl_asid_fini(hdev); early_fini: @@ -205,6 +215,7 @@ static void device_early_fini(struct hl_device *hdev) hl_cb_mgr_fini(hdev, >kernel_cb_mgr); + destroy_workqueue(hdev->eq_wq); destroy_workqueue(hdev->cq_wq); hl_asid_fini(hdev); @@ -343,11 +354,22 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass) } } + /* +* Initialize the event queue. 
Must be done before hw_init, +* because there the address of the event queue is being +* passed as argument to request_irq +*/ + rc = hl_eq_init(hdev, >event_queue); + if (rc) { + dev_err(hdev->dev, "failed to initialize event queue\n"); + goto cq_fini; + } + /* Allocate the kernel context */ hdev->kernel_ctx = kzalloc(sizeof(*hdev->kernel_ctx), GFP_KERNEL); if (!hdev->kernel_ctx) { rc = -ENOMEM; - goto cq_fini; + goto eq_fini; } hdev->user_ctx = NULL; @@ -392,6 +414,8 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass) "kernel ctx is still alive on initialization failure\n"); free_ctx: kfree(hdev->kernel_ctx); +eq_fini: + hl_eq_fini(hdev, >event_queue); cq_fini: for (i = 0 ; i < cq_ready_cnt ; i++) hl_cq_fini(hdev, >completion_queue[i]); @@ -433,6 +457,13 @@ void hl_device_fini(struct hl_device *hdev) /* Mark device as disabled */ hdev->disabled = true; + /* +* Halt the engines and disable interrupts so we won't get any more +* completions from H/W and we won't have any accesses from the +* H/W to the host machine +*/ + hdev->asic_funcs->halt_engines(hdev, true); + hl_cb_pool_fini(hdev); /* Release kernel context */ @@ -442,6 +473,8 @@ void hl_device_fini(struct hl_device *hdev) /* Reset the H/W. It will be in idle state after this returns */ hdev->asic_funcs->hw_fini(hdev, true); + hl_eq_fini(hdev, >event_queue); + for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++) hl_cq_fini(hdev, >completion_queue[i]); kfree(hdev->completion_queue); diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c index 08d5227eaf1d..6c04277ae0fa 100644 --- a/drivers/misc/habanalabs/goya/goya.c +++
[PATCH 09/15] habanalabs: add sysfs and hwmon support
This patch adds the sysfs and hwmon entries that are exposed by the driver.

Goya has several sensors, from various categories such as temperature, voltage, current, etc. The driver exposes those sensors in the standard hwmon mechanism.

In addition, the driver exposes a couple of interfaces in sysfs, both for configuration and for providing status of the device or driver.

The configuration attributes are for Power Management:
- Automatic or manual
- Frequency value when moving to high frequency mode
- Maximum power the device is allowed to consume

The rest of the attributes are read-only and provide the following information:
- Versions of the various firmwares running on the device
- Contents of the device's EEPROM
- The device type (currently only Goya is supported)
- PCI address of the device (to allow user-space to connect between /dev/hlX to PCI address)
- Status of the device (operational, malfunction, in_reset)
- How many processes are open on the device's file

Signed-off-by: Oded Gabbay --- .../ABI/testing/sysfs-driver-habanalabs | 190 ++ drivers/misc/habanalabs/Makefile | 2 +- drivers/misc/habanalabs/device.c | 146 + drivers/misc/habanalabs/goya/Makefile | 2 +- drivers/misc/habanalabs/goya/goya.c | 230 +++ drivers/misc/habanalabs/goya/goyaP.h | 21 + drivers/misc/habanalabs/goya/goya_hwmgr.c | 306 + drivers/misc/habanalabs/habanalabs.h | 97 +++ drivers/misc/habanalabs/habanalabs_drv.c | 7 + drivers/misc/habanalabs/hwmon.c | 449 + drivers/misc/habanalabs/sysfs.c | 588 ++ 11 files changed, 2036 insertions(+), 2 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-driver-habanalabs create mode 100644 drivers/misc/habanalabs/goya/goya_hwmgr.c create mode 100644 drivers/misc/habanalabs/hwmon.c create mode 100644 drivers/misc/habanalabs/sysfs.c diff --git a/Documentation/ABI/testing/sysfs-driver-habanalabs b/Documentation/ABI/testing/sysfs-driver-habanalabs new file mode 100644 index ..19edd4da87c1 --- /dev/null +++ 
b/Documentation/ABI/testing/sysfs-driver-habanalabs @@ -0,0 +1,190 @@ +What: /sys/class/habanalabs/hl/armcp_kernel_ver +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Version of the Linux kernel running on the device's CPU + +What: /sys/class/habanalabs/hl/armcp_ver +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Version of the application running on the device's CPU + +What: /sys/class/habanalabs/hl/cpld_ver +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Version of the Device's CPLD F/W + +What: /sys/class/habanalabs/hl/device_type +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays the code name of the device according to its type. +The supported values are: "GOYA" + +What: /sys/class/habanalabs/hl/eeprom +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:A binary file attribute that contains the contents of the +on-board EEPROM + +What: /sys/class/habanalabs/hl/fuse_ver +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays the device's version from the eFuse + +What: /sys/class/habanalabs/hl/hard_reset +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Interface to trigger a hard-reset operation for the device. +Hard-reset will reset ALL internal components of the device +except for the PCI interface and the internal PLLs + +What: /sys/class/habanalabs/hl/hard_reset_cnt +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Displays how many times the device have undergone a hard-reset +operation + +What: /sys/class/habanalabs/hl/high_pll +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Allows the user to set the maximum clock frequency for MME, TPC +and IC when the power management profile is set to "automatic". 
+ +What: /sys/class/habanalabs/hl/ic_clk +Date: Jan 2019 +KernelVersion: 5.1 +Contact:oded.gab...@gmail.com +Description:Allows the user to set the maximum clock frequency of the +Interconnect fabric. Writes to this parameter affect the device +only when the power management profile is set to "manual" mode. +The device IC clock might be set to a lower value than the +maximum. The user should read the ic_clk_curr to see
[PATCH 11/15] habanalabs: add command submission module
This patch adds the main flow for the user to submit work to the device. Each work is described by a command submission object (CS). The CS contains 3 arrays of command buffers: One for execution, and two for context-switch (store and restore). For each CB, the user specifies on which queue to put that CB. In case of an internal queue, the entry doesn't contain a pointer to the CB but the address in the on-chip memory that the CB resides at. The driver parses some of the CBs to enforce security restrictions. The user receives a sequence number that represents the CS object. The user can then query the driver regarding the status of the CS, using that sequence number. In case the CS doesn't finish before the timeout expires, the driver will perform a soft-reset of the device. Signed-off-by: Oded Gabbay --- drivers/misc/habanalabs/Makefile |3 +- drivers/misc/habanalabs/command_submission.c | 787 + drivers/misc/habanalabs/context.c| 52 +- drivers/misc/habanalabs/device.c | 16 + drivers/misc/habanalabs/goya/goya.c | 1082 ++ drivers/misc/habanalabs/habanalabs.h | 274 + drivers/misc/habanalabs/habanalabs_drv.c | 23 + drivers/misc/habanalabs/habanalabs_ioctl.c |4 +- drivers/misc/habanalabs/hw_queue.c | 250 drivers/misc/habanalabs/memory.c | 200 include/uapi/misc/habanalabs.h | 158 ++- 11 files changed, 2842 insertions(+), 7 deletions(-) create mode 100644 drivers/misc/habanalabs/command_submission.c create mode 100644 drivers/misc/habanalabs/memory.c diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile index b5607233d216..d2fd0e18b1eb 100644 --- a/drivers/misc/habanalabs/Makefile +++ b/drivers/misc/habanalabs/Makefile @@ -5,7 +5,8 @@ obj-m := habanalabs.o habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \ - command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o + command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o memory.o \ + command_submission.o include $(src)/goya/Makefile habanalabs-y += $(HL_GOYA_FILES) diff --git 
a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c new file mode 100644 index ..0116c2262f17 --- /dev/null +++ b/drivers/misc/habanalabs/command_submission.c @@ -0,0 +1,787 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright 2016-2018 HabanaLabs, Ltd. + * All Rights Reserved. + */ + +#include +#include "habanalabs.h" + +#include +#include +#include +#include +#include +#include + +static void job_wq_completion(struct work_struct *work); +static long _hl_cs_wait_ioctl(struct hl_device *hdev, + struct hl_ctx *ctx, u64 timeout_us, u64 seq); +static void cs_do_release(struct kref *ref); + +static const char *hl_fence_get_driver_name(struct dma_fence *fence) +{ + return "HabanaLabs"; +} + +static const char *hl_fence_get_timeline_name(struct dma_fence *fence) +{ + struct hl_dma_fence *hl_fence = + container_of(fence, struct hl_dma_fence, base_fence); + + return dev_name(hl_fence->hdev->dev); +} + +static bool hl_fence_enable_signaling(struct dma_fence *fence) +{ + return true; +} + +static void hl_fence_release(struct dma_fence *fence) +{ + struct hl_dma_fence *hl_fence = + container_of(fence, struct hl_dma_fence, base_fence); + + kfree_rcu(hl_fence, base_fence.rcu); +} + +static const struct dma_fence_ops hl_fence_ops = { + .get_driver_name = hl_fence_get_driver_name, + .get_timeline_name = hl_fence_get_timeline_name, + .enable_signaling = hl_fence_enable_signaling, + .wait = dma_fence_default_wait, + .release = hl_fence_release +}; + +static void cs_get(struct hl_cs *cs) +{ + kref_get(>refcount); +} + +static int cs_get_unless_zero(struct hl_cs *cs) +{ + return kref_get_unless_zero(>refcount); +} + +static void cs_put(struct hl_cs *cs) +{ + kref_put(>refcount, cs_do_release); +} + +/** + * cs_parser - parse the user command submission + * + * @hpriv : pointer to the private data of the fd + * @job: pointer to the job that holds the command submission info + * + * The function parses the command submission of the user. 
It calls the + * ASIC specific parser, which returns a list of memory blocks to send + * to the device as different command buffers + * + */ +static int cs_parser(struct hl_fpriv *hpriv, struct hl_cs_job *job) +{ + struct hl_device *hdev = hpriv->hdev; + struct hl_cs_parser parser; + int rc; + + parser.ctx_id = job->cs->ctx->asid; + parser.cs_sequence = job->cs->sequence; + parser.job_id = job->id; + + parser.hw_queue_id = job->hw_queue_id; + parser.job_userptr_list = >userptr_list; + parser.patched_cb = NULL; + parser.user_cb = job->user_cb; +
[PATCH 13/15] habanalabs: implement INFO IOCTL
This patch implements the INFO IOCTL. That IOCTL is used by the user to query information that is relevant/needed by the user in order to submit deep learning jobs to Goya. The information is divided into several categories, such as H/W IP, Events that happened, DDR usage and more. Signed-off-by: Oded Gabbay --- drivers/misc/habanalabs/goya/goya.c| 6 + drivers/misc/habanalabs/habanalabs.h | 2 + drivers/misc/habanalabs/habanalabs_ioctl.c | 132 + include/uapi/misc/habanalabs.h | 76 +++- 4 files changed, 215 insertions(+), 1 deletion(-) diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c index 94ee4cb00a49..c21c6046f09b 100644 --- a/drivers/misc/habanalabs/goya/goya.c +++ b/drivers/misc/habanalabs/goya/goya.c @@ -6120,6 +6120,11 @@ static void goya_hw_queues_unlock(struct hl_device *hdev) spin_unlock(>hw_queues_lock); } +static u32 goya_get_pci_id(struct hl_device *hdev) +{ + return hdev->pdev->device; +} + int goya_get_eeprom_data(struct hl_device *hdev, void *data, size_t max_size) { struct goya_device *goya = hdev->asic_specific; @@ -6217,6 +6222,7 @@ static const struct hl_asic_funcs goya_funcs = { .soft_reset_late_init = goya_soft_reset_late_init, .hw_queues_lock = goya_hw_queues_lock, .hw_queues_unlock = goya_hw_queues_unlock, + .get_pci_id = goya_get_pci_id, .get_eeprom_data = goya_get_eeprom_data, .send_cpu_message = goya_send_cpu_message }; diff --git a/drivers/misc/habanalabs/habanalabs.h b/drivers/misc/habanalabs/habanalabs.h index 1abc139d4293..6c0fe76936be 100644 --- a/drivers/misc/habanalabs/habanalabs.h +++ b/drivers/misc/habanalabs/habanalabs.h @@ -462,6 +462,7 @@ enum hl_pll_frequency { * @soft_reset_late_init: perform certain actions needed after soft reset. * @hw_queues_lock: acquire H/W queues lock. * @hw_queues_unlock: release H/W queues lock. + * @get_pci_id: retrieve PCI ID. * @get_eeprom_data: retrieve EEPROM data from F/W. * @send_cpu_message: send buffer to ArmCP. 
*/ @@ -530,6 +531,7 @@ struct hl_asic_funcs { int (*soft_reset_late_init)(struct hl_device *hdev); void (*hw_queues_lock)(struct hl_device *hdev); void (*hw_queues_unlock)(struct hl_device *hdev); + u32 (*get_pci_id)(struct hl_device *hdev); int (*get_eeprom_data)(struct hl_device *hdev, void *data, size_t max_size); int (*send_cpu_message)(struct hl_device *hdev, u32 *msg, diff --git a/drivers/misc/habanalabs/habanalabs_ioctl.c b/drivers/misc/habanalabs/habanalabs_ioctl.c index 6dcad810b821..067cf640ad50 100644 --- a/drivers/misc/habanalabs/habanalabs_ioctl.c +++ b/drivers/misc/habanalabs/habanalabs_ioctl.c @@ -12,10 +12,142 @@ #include #include +static int hw_ip_info(struct hl_device *hdev, struct hl_info_args *args) +{ + struct hl_info_hw_ip_info hw_ip = {0}; + u32 size = args->return_size; + void __user *out = (void __user *) (uintptr_t) args->return_pointer; + struct asic_fixed_properties *prop = >asic_prop; + u64 sram_kmd_size, dram_kmd_size; + + if ((!size) || (!out)) + return -EINVAL; + + sram_kmd_size = (prop->sram_user_base_address - + prop->sram_base_address); + dram_kmd_size = (prop->dram_user_base_address - + prop->dram_base_address); + + hw_ip.device_id = hdev->asic_funcs->get_pci_id(hdev); + hw_ip.sram_base_address = prop->sram_user_base_address; + hw_ip.dram_base_address = prop->dram_user_base_address; + hw_ip.tpc_enabled_mask = prop->tpc_enabled_mask; + hw_ip.sram_size = prop->sram_size - sram_kmd_size; + hw_ip.dram_size = prop->dram_size - dram_kmd_size; + if (hw_ip.dram_size > 0) + hw_ip.dram_enabled = 1; + hw_ip.num_of_events = prop->num_of_events; + memcpy(hw_ip.armcp_version, + prop->armcp_info.armcp_version, VERSION_MAX_LEN); + hw_ip.armcp_cpld_version = prop->armcp_info.cpld_version; + hw_ip.psoc_pci_pll_nr = prop->psoc_pci_pll_nr; + hw_ip.psoc_pci_pll_nf = prop->psoc_pci_pll_nf; + hw_ip.psoc_pci_pll_od = prop->psoc_pci_pll_od; + hw_ip.psoc_pci_pll_div_factor = prop->psoc_pci_pll_div_factor; + + return copy_to_user(out, _ip, + 
min((size_t)size, sizeof(hw_ip))) ? -EFAULT : 0; +} + +static int hw_events_info(struct hl_device *hdev, struct hl_info_args *args) +{ + u32 size, max_size = args->return_size; + void __user *out = (void __user *) (uintptr_t) args->return_pointer; + void *arr; + + if ((!max_size) || (!out)) + return -EINVAL; + + arr = hdev->asic_funcs->get_events_stat(hdev, ); + + return copy_to_user(out, arr, min(max_size, size)) ? -EFAULT : 0; +} + +static int dram_usage_info(struct hl_device *hdev, struct hl_info_args
[PATCH 01/15] habanalabs: add skeleton driver
This patch adds the habanalabs skeleton driver. The driver does nothing at this stage except very basic operations. It contains the minimal code to insmod and rmmod the driver and to create a /dev/hlX file per PCI device. Signed-off-by: Oded Gabbay --- drivers/misc/Kconfig | 1 + drivers/misc/Makefile | 1 + drivers/misc/habanalabs/Kconfig | 22 ++ drivers/misc/habanalabs/Makefile | 7 + drivers/misc/habanalabs/device.c | 331 drivers/misc/habanalabs/habanalabs.h | 149 +++ drivers/misc/habanalabs/habanalabs_drv.c | 366 ++ .../habanalabs/include/habanalabs_device_if.h | 125 ++ 8 files changed, 1002 insertions(+) create mode 100644 drivers/misc/habanalabs/Kconfig create mode 100644 drivers/misc/habanalabs/Makefile create mode 100644 drivers/misc/habanalabs/device.c create mode 100644 drivers/misc/habanalabs/habanalabs.h create mode 100644 drivers/misc/habanalabs/habanalabs_drv.c create mode 100644 drivers/misc/habanalabs/include/habanalabs_device_if.h diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index f417b06e11c5..fecab53c4f21 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -535,4 +535,5 @@ source "drivers/misc/echo/Kconfig" source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" +source "drivers/misc/habanalabs/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index e39ccbbc1b3a..ae77dfd790a4 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -59,3 +59,4 @@ obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o obj-$(CONFIG_OCXL) += ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o +obj-$(CONFIG_HABANA_AI)+= habanalabs/ diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig new file mode 100644 index ..b7f38a14caf5 --- /dev/null +++ b/drivers/misc/habanalabs/Kconfig @@ -0,0 +1,22 @@ +# +# HabanaLabs AI accelerators driver +# + +config HABANA_AI + tristate "HabanaAI accelerators (habanalabs)" + depends on PCI + 
select FRAME_VECTOR + help + Enables PCIe card driver for Habana's AI Processors (AIP) that are + designed to accelerate Deep Learning inference and training workloads. + + The driver manages the PCIe devices and provides IOCTL interface for + the user to submit workloads to the devices. + + The user-space interface is described in + include/uapi/misc/habanalabs.h + + If unsure, say N. + + To compile this driver as a module, choose M here: the + module will be called habanalabs. diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile new file mode 100644 index ..b41433a09e02 --- /dev/null +++ b/drivers/misc/habanalabs/Makefile @@ -0,0 +1,7 @@ +# +# Makefile for HabanaLabs AI accelerators driver +# + +obj-m := habanalabs.o + +habanalabs-y := habanalabs_drv.o device.o \ No newline at end of file diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c new file mode 100644 index ..376b55eb73d4 --- /dev/null +++ b/drivers/misc/habanalabs/device.c @@ -0,0 +1,331 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright 2016-2018 HabanaLabs, Ltd. + * All Rights Reserved. 
+ */ + +#include "habanalabs.h" + +#include +#include +#include + +static void hpriv_release(struct kref *ref) +{ + struct hl_fpriv *hpriv; + struct hl_device *hdev; + + hpriv = container_of(ref, struct hl_fpriv, refcount); + + hdev = hpriv->hdev; + + put_pid(hpriv->taskpid); + + kfree(hpriv); +} + +void hl_hpriv_get(struct hl_fpriv *hpriv) +{ + kref_get(>refcount); +} + +void hl_hpriv_put(struct hl_fpriv *hpriv) +{ + kref_put(>refcount, hpriv_release); +} + +/** + * hl_device_release - release function for habanalabs device + * + * @inode: pointer to inode structure + * @filp: pointer to file structure + * + * Called when process closes an habanalabs device + */ +static int hl_device_release(struct inode *inode, struct file *filp) +{ + struct hl_fpriv *hpriv = filp->private_data; + + filp->private_data = NULL; + + hl_hpriv_put(hpriv); + + return 0; +} + +static const struct file_operations hl_ops = { + .owner = THIS_MODULE, + .open = hl_device_open, + .release = hl_device_release +}; + +/** + * device_setup_cdev - setup cdev and device for habanalabs device + * + * @hdev: pointer to habanalabs device structure + * @hclass: pointer to the class object of the device + * @minor: minor number of the specific device + * @fpos : file operations to install for this device + * + * Create a cdev and a Linux device for habanalabs's device. Need to be + * called at the end of the
[PATCH 03/15] habanalabs: add basic Goya support
This patch adds basic support for the Goya device. The code initializes
the device's PCI controller and PCI bars. It also initializes various S/W
structures and adds some basic helper functions.

Signed-off-by: Oded Gabbay
---
 drivers/misc/habanalabs/Makefile            |   5 +-
 drivers/misc/habanalabs/device.c            |  71 +++
 drivers/misc/habanalabs/goya/Makefile       |   3 +
 drivers/misc/habanalabs/goya/goya.c         | 633
 drivers/misc/habanalabs/goya/goyaP.h        | 125
 drivers/misc/habanalabs/habanalabs.h        | 131
 drivers/misc/habanalabs/habanalabs_drv.c    |   3 +
 drivers/misc/habanalabs/include/goya/goya.h | 115
 8 files changed, 1085 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/habanalabs/goya/Makefile
 create mode 100644 drivers/misc/habanalabs/goya/goya.c
 create mode 100644 drivers/misc/habanalabs/goya/goyaP.h
 create mode 100644 drivers/misc/habanalabs/include/goya/goya.h

diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
index b41433a09e02..6f1ead69bd77 100644
--- a/drivers/misc/habanalabs/Makefile
+++ b/drivers/misc/habanalabs/Makefile
@@ -4,4 +4,7 @@
 
 obj-m	:= habanalabs.o
 
-habanalabs-y := habanalabs_drv.o device.o
\ No newline at end of file
+habanalabs-y := habanalabs_drv.o device.o
+
+include $(src)/goya/Makefile
+habanalabs-y += $(HL_GOYA_FILES)
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 376b55eb73d4..a4276ef559b3 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -116,8 +116,11 @@ static int device_setup_cdev(struct hl_device *hdev, struct class *hclass,
  */
 static int device_early_init(struct hl_device *hdev)
 {
+	int rc;
+
 	switch (hdev->asic_type) {
 	case ASIC_GOYA:
+		goya_set_asic_funcs(hdev);
 		sprintf(hdev->asic_name, "GOYA");
 		break;
 	default:
@@ -126,6 +129,10 @@ static int device_early_init(struct hl_device *hdev)
 		return -EINVAL;
 	}
 
+	rc = hdev->asic_funcs->early_init(hdev);
+	if (rc)
+		return rc;
+
 	return 0;
 }
 
@@ -137,6 +144,10 @@ static int device_early_init(struct hl_device *hdev)
  */
 static void device_early_fini(struct hl_device *hdev)
 {
+
+	if (hdev->asic_funcs->early_fini)
+		hdev->asic_funcs->early_fini(hdev);
+
 }
 
 /**
@@ -150,8 +161,15 @@ static void device_early_fini(struct hl_device *hdev)
  */
 int hl_device_suspend(struct hl_device *hdev)
 {
+	int rc;
+
 	pci_save_state(hdev->pdev);
 
+	rc = hdev->asic_funcs->suspend(hdev);
+	if (rc)
+		dev_err(hdev->dev,
+			"Failed to disable PCI access of device CPU\n");
+
 	/* Shut down the device */
 	pci_disable_device(hdev->pdev);
 	pci_set_power_state(hdev->pdev, PCI_D3hot);
@@ -181,6 +199,13 @@ int hl_device_resume(struct hl_device *hdev)
 		return rc;
 	}
 
+	rc = hdev->asic_funcs->resume(hdev);
+	if (rc) {
+		dev_err(hdev->dev,
+			"Failed to enable PCI access from device CPU\n");
+		return rc;
+	}
+
 	return 0;
 }
 
@@ -208,11 +233,21 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
 	if (rc)
 		goto release_device;
 
+	/*
+	 * Start calling ASIC initialization. First S/W then H/W and finally
+	 * late init
+	 */
+	rc = hdev->asic_funcs->sw_init(hdev);
+	if (rc)
+		goto early_fini;
+
 	dev_notice(hdev->dev,
 		"Successfully added device to habanalabs driver\n");
 
 	return 0;
 
+early_fini:
+	device_early_fini(hdev);
 release_device:
 	device_destroy(hclass, hdev->dev->devt);
 	cdev_del(&hdev->cdev);
@@ -243,6 +278,9 @@ void hl_device_fini(struct hl_device *hdev)
 	/* Mark device as disabled */
 	hdev->disabled = true;
 
+	/* Call ASIC S/W finalize function */
+	hdev->asic_funcs->sw_fini(hdev);
+
 	device_early_fini(hdev);
 
 	/* Hide device from user */
@@ -329,3 +367,36 @@ int hl_poll_timeout_device_memory(struct hl_device *hdev, void __iomem *addr,
 
 	return (*val ? 0 : -ETIMEDOUT);
 }
+
+/*
+ * MMIO register access helper functions.
+ */
+
+/**
+ * hl_rreg - Read an MMIO register
+ *
+ * @hdev: pointer to habanalabs device structure
+ * @reg: MMIO register offset (in bytes)
+ *
+ * Returns the value of the MMIO register we are asked to read
+ *
+ */
+inline u32 hl_rreg(struct hl_device *hdev, u32 reg)
+{
+	return readl(hdev->rmmio + reg);
+}
+
+/**
+ * hl_wreg - Write to an MMIO register
+ *
+ * @hdev: pointer to habanalabs device structure
+ * @reg: MMIO register offset (in bytes)
+ * @val: 32-bit value
+ *
+ * Writes the 32-bit value into the MMIO register
+ *
+ */
+inline void hl_wreg(struct hl_device *hdev, u32 reg, u32 val)
+{
+
[PATCH 00/15] Habana Labs kernel driver
Hello,

For those who don't know me, my name is Oded Gabbay (Kernel Maintainer for AMD's amdkfd driver, worked at Red Hat's Desktop group) and I have worked at Habana Labs since its inception two and a half years ago.

Habana is a leading startup in the emerging AI processor space and we have already started production of our first Goya inference processor PCIe card and delivered it to customers. The Goya processor silicon has been tested since June of 2018 and is production-qualified by now. The Gaudi training processor solution is slated to sample in the second quarter of 2019.

This patch-set contains the kernel driver for Habana's AI Processors (AIP) that are designed to accelerate Deep Learning inference and training workloads. The current version supports only the Goya processor; support for Gaudi will be upstreamed once the ASIC is available to customers.

The Goya processor has been designed from the ground up for deep learning inference workloads. It comprises a cluster of eight fully programmable Tensor Processing Cores (TPC). The TPC core is a VLIW SIMD vector processor with an ISA and hardware tailored to serve deep learning workloads efficiently. In addition, Goya contains software-managed, on-die memory along with five separate DMA channels, a PCIe Gen4 x16 system interface and 4/8/16GB of DDR4 memory.

Goya has 3 PCI BARs (64-bit), which are not exposed to user-space. They map the on-chip memory and configuration space (BAR 0-1), the MSI-X table (BAR 2-3) and the DDR4 memory (BAR 4-5).

Each TPC engine and DMA channel has an H/W queue attached to it, called a QMAN. The S/W provides command buffers to the H/W queues (through the kernel driver) and the H/W consumes the command buffers. To prevent malicious users from stealing data from other users through the Host or Device memory, Goya has an internal MMU and a security protection scheme. In addition, the kernel driver parses each command buffer and rejects it if it contains disallowed commands.
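As an illustration of the BAR layout described above: a 64-bit BAR occupies two consecutive 32-bit BAR slots, so each of the three regions starts at an even BAR index. The following is only a sketch under that assumption; the enum and helper names are hypothetical and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical names for the three regions listed in the cover letter. */
enum hl_sketch_region {
	HL_SKETCH_REGION_SRAM_CFG = 0,	/* on-chip memory + configuration space */
	HL_SKETCH_REGION_MSIX,		/* MSI-X table */
	HL_SKETCH_REGION_DDR,		/* DDR4 memory */
	HL_SKETCH_REGION_COUNT
};

/*
 * A 64-bit BAR uses two consecutive 32-bit BAR slots, so region n
 * starts at BAR index 2n (0, 2, 4).
 */
static inline int hl_sketch_region_to_bar(enum hl_sketch_region region)
{
	assert((int)region >= 0 && (int)region < HL_SKETCH_REGION_COUNT);
	return 2 * (int)region;
}

/*
 * Map any BAR slot (0..5) back to the region that owns it.
 * Returns -1 for a slot outside the three 64-bit BARs.
 */
static inline int hl_sketch_bar_to_region(int bar)
{
	if (bar < 0 || bar >= 2 * HL_SKETCH_REGION_COUNT)
		return -1;
	return bar / 2;
}
```

With this mapping, BAR slots 4 and 5 both resolve to the DDR4 region, matching the "bar 4-5" pairing above.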
The QMANs are triggered by a write to a PI (producer index) register. The QMAN H/W logic maintains a CI (consumer index) register. When PI==CI, the queue is empty. When PI+1==CI, the queue is full (note the queue is cyclic, so the increment wraps around). Each entry in the H/W queue is 16 bytes and contains a pointer to, and the length of, a variable-size command buffer, which the user fills with specific commands that the H/W logic can read and execute.

For each DMA QMAN, there is a completion queue that the QMAN writes to when it finishes the execution of the command buffer. The QMAN also sends an MSI-X interrupt after writing the completion entry.

Inference workloads running on Goya are associated with an address space through the ASID (address-space ID) property. Goya supports up to 1024 ASIDs. The ASID value is updated by the kernel driver in the relevant registers before scheduling a workload.

During its initialization, the driver registers itself to the PCI subsystem. For each Habana PCI device found, a char device node (/dev/hlX) is created.

The driver currently exposes a total of five IOCTLs. One IOCTL allows the application to submit workloads to the device, and another to wait on completion of submitted workloads. The other three IOCTLs are used for memory management, command buffer creation and information/status retrieval.

In addition, the driver exposes several sensors through the hwmon subsystem and provides various system-level information in sysfs for system administrators.

The first step for an application process is to open the correct hlX device it wants to work with. Calls to open create a new "context" for that application in the driver's internal structures and a unique ASID is assigned to that context. The context object lives until the process releases the file descriptor AND its command submissions have finished executing on the device.

The next step is for the application to request information about the device, such as the amount of DDR4 memory.
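The PI/CI rule described above (empty when PI==CI, full when PI+1==CI on a cyclic queue) can be sketched as follows. This is a minimal illustration, not the driver's code: the queue length and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue length; the actual H/W queue size is not stated above. */
#define HL_SKETCH_QUEUE_LEN 256u

/* Advance an index by one slot, wrapping around the cyclic queue. */
static inline uint32_t hl_sketch_next(uint32_t idx)
{
	assert(idx < HL_SKETCH_QUEUE_LEN);
	return (idx + 1u) % HL_SKETCH_QUEUE_LEN;
}

/* The queue is empty when the producer and consumer indices are equal. */
static inline bool hl_sketch_queue_empty(uint32_t pi, uint32_t ci)
{
	return pi == ci;
}

/*
 * The queue is full when advancing PI by one would make it equal to CI.
 * One slot is always left unused so that "full" and "empty" differ.
 */
static inline bool hl_sketch_queue_full(uint32_t pi, uint32_t ci)
{
	return hl_sketch_next(pi) == ci;
}

/* Number of entries submitted by the producer and not yet consumed. */
static inline uint32_t hl_sketch_queue_used(uint32_t pi, uint32_t ci)
{
	return (pi + HL_SKETCH_QUEUE_LEN - ci) % HL_SKETCH_QUEUE_LEN;
}
```

Leaving one slot unused is the usual cost of distinguishing a full queue from an empty one with only two indices; a queue of N slots therefore holds at most N-1 pending entries.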
The application can then go on to create command buffers for its command submissions, and to allocate and map device or host memory (host memory can only be mapped) to the device's internal MMU subsystem.

At this point the application can load various deep learning topologies to the device DDR memory. After that, it can start to submit inference workloads using those topologies. For each workload, the application receives a sequence number that represents the workload. The application can then query the driver regarding the status of the workload using that sequence number.

If a workload has not finished executing within 5 seconds (configurable using a kernel module parameter) of the time it was scheduled to run, a TDR (timeout detection & recovery) event occurs in the driver. The driver will then mark that workload as "timed out", perform a minimal reset of the device (DMA and compute units only) and abort all other workloads of that context that were already submitted to the H/W queues.

I would appreciate any