Re: [PATCH v4 2/5] irqchip, gicv3: Workaround for Cavium ThunderX erratum 23154
On 14/08/15 19:28, Robert Richter wrote: From: Robert Richter This patch implements Cavium ThunderX erratum 23154. The gicv3 of ThunderX requires a modified version for reading the IAR status to ensure data synchronization. Since this is in the fast-path and called with each interrupt, runtime patching is used using jump label patching for smallest overhead (no-op). This is the same technique as used for tracepoints. v4: * simplify code to only use cpus_have_cap() in gicv3_enable_quirks() v3: * fix erratum to be dependend from midr * use arm64 errata framework v2: * implement code in a single asm() to keep instruction sequence * added comment to the code that explains the erratum * apply workaround also if running as guest, thus check MIDR Signed-off-by: Robert Richter --- arch/arm64/Kconfig | 11 ++ arch/arm64/include/asm/cpufeature.h | 3 ++- arch/arm64/include/asm/cputype.h| 18 +--- arch/arm64/kernel/cpu_errata.c | 9 drivers/irqchip/irq-gic-v3.c| 42 - 5 files changed, 74 insertions(+), 9 deletions(-) ... }; diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index c52f7ba205b4..4211c39b8744 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void) ... 
+} + static void __maybe_unused gic_write_pmr(u64 val) { asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val)); @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = { .free = gic_irq_domain_free, }; +static void gicv3_enable_quirks(void) +{ + if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154)) + static_key_slow_inc(&is_cavium_thunderx); May be you could use the enable() method added to struct arm64_cpu_capability here to perform the above operation, added by James : commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d Author: James Morse Date: Tue Jul 21 13:23:28 2015 +0100 arm64: kernel: Add cpufeature 'enable' callback +} + static int __init gic_of_init(struct device_node *node, struct device_node *parent) { void __iomem *dist_base; @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare gic_data.nr_redist_regions = nr_redist_regions; gic_data.redist_stride = redist_stride; + gicv3_enable_quirks(); + than adding a hook here ? Cheers Suzuki -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 0/8] Allow GFP_NOFS allocation to fail
Michal Hocko wrote:
> As the VM cannot do much about these requests we should face the reality
> and allow those allocations to fail. Johannes has already posted the
> patch which does that (http://marc.info/?l=linux-mm&m=142726428514236&w=2)
> but the discussion died pretty quickly.

Addition of __GFP_NOFAIL to some locations is accepted, but otherwise this patchset seems to be stalled.

> With all the patches applied none of the 4 filesystems gets aborted
> transactions and RO remount (well xfs didn't need any special
> treatment). This is obviously not sufficient to claim that failing
> GFP_NOFS is OK now but I think it is a good start for the further
> discussion. I would be grateful if FS people could have a look at those
> patches. I have simply used __GFP_NOFAIL in the critical paths. This
> might be not the best strategy but it sounds like a good first step.

I posted my comment at https://osdn.jp/projects/tomoyo/lists/archive/users-en/2015-September/000630.html .

> The third patch allows GFP_NOFS to fail and I believe it should see much
> more testing coverage. It would be really great if it could sit in the
> mmotm tree for few release cycles so that we can catch more fallouts.

Guessing from responses to this patchset, sitting in the mmotm tree can hardly acquire testing coverage. Also, FS is not the only location that needs to be tested. If you really want to push "GFP_NOFS can fail" patch, I think you need to make a lot of effort to encourage kernel developers to test using mandatory fault injection.

> Thoughts? Opinions?

To me, fixing callers (adding __GFP_NORETRY to callers) in a step-by-step fashion after adding proactive countermeasure sounds better than changing the default behavior (implicitly applying __GFP_NORETRY inside).
[PATCH V1] audit: add warning that an old auditd may be starved out by a new auditd
Nothing prevents a new auditd starting up and replacing a valid audit_pid when an old auditd is still running, effectively starving out the old auditd since audit_pid no longer points to the old valid auditd. There isn't an easy way to detect if an old auditd is still running on the existing audit_pid other than attempting to send a message to see if it fails. If no message to auditd has been attempted since auditd died unnaturally or got killed, audit_pid will still indicate it is alive. Signed-off-by: Richard Guy Briggs --- Note: Would it be too bold to actually block the registration of a new auditd if the netlink_getsockbyportid() call succeeded? Would other checks be appropriate? kernel/audit.c |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 18cdfe2..1fa1e0d 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -872,6 +872,11 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) if (s.mask & AUDIT_STATUS_PID) { int new_pid = s.pid; + if (audit_pid && new_pid && + !IS_ERR(netlink_getsockbyportid(audit_sock, audit_nlk_portid))) + pr_warn("auditd replaced by new auditd before normal shutdown: " + "(old)audit_pid=%d (by)pid=%d new_pid=%d", + audit_pid, pid, new_pid); if ((!new_pid) && (task_tgid_vnr(current) != audit_pid)) return -EACCES; if (audit_enabled != AUDIT_OFF) -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144
On 07.09.15 17:44:41, Marc Zyngier wrote: > On 25/08/15 11:18, Ganapatrao Kulkarni wrote: > > The patch below adds a workaround for gicv3 in a numa environment. It > > is on top of Robert's recent gicv3 errata patch submission v4 and my > > arm64 numa patches v5. > > > > This implements a workaround for gicv3-its erratum 23144 on Cavium's > > ThunderX dual-socket platforms, where LPI cannot be routed to a > > redistributors present on a foreign node. > > > > v2: > > updatated as per Marc Zyngier's review comments. > > > > Signed-off-by: Ganapatrao Kulkarni > > Signed-off-by: Robert Richter > > --- > > drivers/irqchip/irq-gic-v3-its.c | 53 > > +--- > > 1 file changed, 44 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > > b/drivers/irqchip/irq-gic-v3-its.c > > index 614a367..d3fe0a4 100644 > > --- a/drivers/irqchip/irq-gic-v3-its.c > > +++ b/drivers/irqchip/irq-gic-v3-its.c > > @@ -40,7 +40,8 @@ > > #include "irqchip.h" > > > > #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0) > > -#define ITS_FLAGS_CAVIUM_THUNDERX (1ULL << 1) > > +#define ITS_WORKAROUND_CAVIUM_22375(1ULL << 1) > > +#define ITS_WORKAROUND_CAVIUM_23144(1ULL << 2) > > Please move this to Robert's series, as it doesn't make much sense to > add a quirk flag just to modify it in the next patch. This will help > declutter this patch. I will merge the bits in and rebase and rework this one on top (we will post this separately due to dependencies to other patch sets). Thanks, -Robert -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144
On 25/08/15 11:18, Ganapatrao Kulkarni wrote: > The patch below adds a workaround for gicv3 in a numa environment. It > is on top of Robert's recent gicv3 errata patch submission v4 and my > arm64 numa patches v5. > > This implements a workaround for gicv3-its erratum 23144 on Cavium's > ThunderX dual-socket platforms, where LPI cannot be routed to a > redistributors present on a foreign node. > > v2: > updatated as per Marc Zyngier's review comments. > > Signed-off-by: Ganapatrao Kulkarni > Signed-off-by: Robert Richter > --- > drivers/irqchip/irq-gic-v3-its.c | 53 > +--- > 1 file changed, 44 insertions(+), 9 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > b/drivers/irqchip/irq-gic-v3-its.c > index 614a367..d3fe0a4 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -40,7 +40,8 @@ > #include "irqchip.h" > > #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0) > -#define ITS_FLAGS_CAVIUM_THUNDERX (1ULL << 1) > +#define ITS_WORKAROUND_CAVIUM_22375(1ULL << 1) > +#define ITS_WORKAROUND_CAVIUM_23144(1ULL << 2) Please move this to Robert's series, as it doesn't make much sense to add a quirk flag just to modify it in the next patch. This will help declutter this patch. 
> > #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING (1 << 0) > > @@ -73,6 +74,7 @@ struct its_node { > struct list_headits_device_list; > u64 flags; > u32 ite_size; > + int numa_node; > }; > > #define ITS_ITT_ALIGNSZ_256 > @@ -607,11 +609,20 @@ static void its_eoi_irq(struct irq_data *d) > static int its_set_affinity(struct irq_data *d, const struct cpumask > *mask_val, > bool force) > { > - unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask); > + unsigned int cpu; > + const struct cpumask *cpu_mask = cpu_online_mask; > struct its_device *its_dev = irq_data_get_irq_chip_data(d); > struct its_collection *target_col; > u32 id = its_get_event_id(d); > > + /* lpi cannot be routed to a redistributor that is on a foreign node */ > + if (its_dev->its->flags & ITS_WORKAROUND_CAVIUM_23144) { > + cpu_mask = cpumask_of_node(its_dev->its->numa_node); > + if (!cpumask_intersects(mask_val, cpu_mask)) > + return -EINVAL; > + } > + > + cpu = cpumask_any_and(mask_val, cpu_mask); > if (cpu >= nr_cpu_ids) > return -EINVAL; > > @@ -1338,9 +1349,14 @@ static void its_irq_domain_activate(struct irq_domain > *domain, > { > struct its_device *its_dev = irq_data_get_irq_chip_data(d); > u32 event = its_get_event_id(d); > + const struct cpumask *cpu_mask = cpu_online_mask; > + > + /* get the cpu_mask of local node */ > + if (IS_ENABLED(CONFIG_NUMA)) > + cpu_mask = cpumask_of_node(its_dev->its->numa_node); > > /* Bind the LPI to the first possible CPU */ > - its_dev->event_map.col_map[event] = cpumask_first(cpu_online_mask); > + its_dev->event_map.col_map[event] = cpumask_first(cpu_mask); > > /* Map the GIC IRQ and event to the device */ > its_send_mapvi(its_dev, d->hwirq, event); > @@ -1423,11 +1439,19 @@ static int its_force_quiescent(void __iomem *base) > } > } > > -static void its_enable_cavium_thunderx(void *data) > +static void its_enable_cavium_thunderx_22375(void *data) > { > struct its_node *its = data; > > - its->flags |= ITS_FLAGS_CAVIUM_THUNDERX; > + its->flags |= 
ITS_WORKAROUND_CAVIUM_22375; > +} > + > +static void its_enable_cavium_thunderx_23144(void *data) > +{ > + struct its_node *its = data; > + > + if (num_possible_nodes() > 1) > + its->flags |= ITS_WORKAROUND_CAVIUM_23144; > } > > static const struct gic_capabilities its_errata[] = { > @@ -1435,10 +1459,16 @@ static const struct gic_capabilities its_errata[] = { > .desc = "ITS: Cavium errata 22375, 24313", > .iidr = 0xa100034c, /* ThunderX pass 1.x */ > .mask = 0x0fff, > - .init = its_enable_cavium_thunderx, > - }, > - { > - } > + .init = its_enable_cavium_thunderx_22375, > + }, > + { > + .desc = "ITS: Cavium errata 23144", > + .iidr = 0xa100034c, /* ThunderX pass 1.x */ > + .mask = 0x0fff, > + .init = its_enable_cavium_thunderx_23144, > + }, > + { > + } > }; > > static void its_enable_quirks(struct its_node *its) > @@ -1456,6 +1486,7 @@ static int its_probe(struct device_node *node, struct > irq_domain *parent) > u32 val; > u64 baser, tmp; > int err; > + int numa_node; > > err = of_address_to_resource(node, 0, &res); > if (err) { > @@ -1463,6 +14
Re: [PATCH v4 12/20] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
On Mon, 7 Sep 2015, Julien Grall wrote: > For ARM64 guests, Linux is able to support either 64K or 4K page > granularity. Although, the hypercall interface is always based on 4K > page granularity. > > With 64K page granularity, a single page will be spread over multiple > Xen frame. > > To avoid splitting the page into 4K frame, take advantage of the > extent_order field to directly allocate/free chunk of the Linux page > size. > > Note that PVMMU is only used for PV guest (which is x86) and the page > granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure > that because the code has not been modified. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > Cc: Konrad Rzeszutek Wilk > Cc: Boris Ostrovsky > Cc: David Vrabel > Cc: Wei Liu > > Note that two BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE) in code built > for the PV MMU code is kept in order to have at least one even if we > ever decide to drop of code section. > > Changes in v4: > - s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming > - Use the field lru in the page to get a list of pages when > decreasing the memory reservation. It avoids to use a static > array to store the pages (see v3). > - Update comment for EXTENT_ORDER. > > Changes in v3: > - Fix errors reported by checkpatch.pl > - s/mfn/gfn/ based on the new naming > - Rather than splitting the page into 4KB chunk, use the > extent_order field to allocate directly a Linux page size. This > is avoid lots of code for no benefits. > > Changes in v2: > - Use xen_apply_to_page to split a page in 4K chunk > - It's not necessary to have a smaller frame list. Re-use > PAGE_SIZE > - Convert reserve_additional_memory to use XEN_... 
macro > --- > drivers/xen/balloon.c | 59 > ++- > 1 file changed, 44 insertions(+), 15 deletions(-) > > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c > index c79329f..3babf13 100644 > --- a/drivers/xen/balloon.c > +++ b/drivers/xen/balloon.c > @@ -70,6 +70,11 @@ > #include > #include > > +/* Use one extent per PAGE_SIZE to avoid to break down the page into > + * multiple frame. > + */ > +#define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1) > + > /* > * balloon_process() state: > * > @@ -230,6 +235,11 @@ static enum bp_state reserve_additional_memory(long > credit) > nid = memory_add_physaddr_to_nid(hotplug_start_paddr); > > #ifdef CONFIG_XEN_HAVE_PVMMU > + /* We don't support PV MMU when Linux and Xen is using > + * different page granularity. > + */ > + BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE); > + > /* > * add_memory() will build page tables for the new memory so > * the p2m must contain invalid entries so the correct > @@ -326,11 +336,11 @@ static enum bp_state reserve_additional_memory(long > credit) > static enum bp_state increase_reservation(unsigned long nr_pages) > { > int rc; > - unsigned long pfn, i; > + unsigned long i; > struct page *page; > struct xen_memory_reservation reservation = { > .address_bits = 0, > - .extent_order = 0, > + .extent_order = EXTENT_ORDER, > .domid= DOMID_SELF > }; > > @@ -352,7 +362,11 @@ static enum bp_state increase_reservation(unsigned long > nr_pages) > nr_pages = i; > break; > } > - frame_list[i] = page_to_pfn(page); > + > + /* XENMEM_populate_physmap requires a PFN based on Xen > + * granularity. > + */ > + frame_list[i] = page_to_xen_pfn(page); > page = balloon_next_page(page); > } > > @@ -366,10 +380,15 @@ static enum bp_state increase_reservation(unsigned long > nr_pages) > page = balloon_retrieve(false); > BUG_ON(page == NULL); > > - pfn = page_to_pfn(page); > - > #ifdef CONFIG_XEN_HAVE_PVMMU > + /* We don't support PV MMU when Linux and Xen is using > + * different page granularity. 
> + */ > + BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE); > + > if (!xen_feature(XENFEAT_auto_translated_physmap)) { > + unsigned long pfn = page_to_pfn(page); > + > set_phys_to_machine(pfn, frame_list[i]); > > /* Link back into the page tables if not highmem. */ > @@ -396,14 +415,15 @@ static enum bp_state increase_reservation(unsigned long > nr_pages) > static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > { > enum bp_state state = BP_DONE; > - unsigned long pfn, i; > - struct page *page; > + unsigned long i; > + struct page *page, *tmp; > int ret; > struct xen_memory_reserv
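The EXTENT_ORDER definition in the patch above, (fls(XEN_PFN_PER_PAGE) - 1), can be checked with a small user-space sketch. This is only an illustration, not the kernel code: fls_sketch() is a portable stand-in for the kernel's fls(), and the helper names and the page sizes passed in are assumptions made for the example.

```c
#include <assert.h>

/*
 * Portable stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, 0 when x == 0. (Hypothetical helper for this sketch.)
 */
static int fls_sketch(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Number of Xen frames covered by one Linux page. Both sizes are
 * assumed to be powers of two, with page_size a multiple of
 * xen_page_size.
 */
static unsigned int xen_pfn_per_page(unsigned long page_size,
				     unsigned long xen_page_size)
{
	assert(xen_page_size != 0 && page_size % xen_page_size == 0);
	return (unsigned int)(page_size / xen_page_size);
}

/* Same arithmetic as the patch's EXTENT_ORDER macro. */
static int extent_order(unsigned long page_size, unsigned long xen_page_size)
{
	return fls_sketch(xen_pfn_per_page(page_size, xen_page_size)) - 1;
}
```

With a 64K Linux page and 4K Xen frames, one page spans 16 frames, fls(16) is 5, and the order is 4, so a single extent of order 4 covers exactly one Linux page. With 4K on both sides the order is 0, which matches the value the code used before the patch.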
Fwd: Use-after-free in page_cache_async_readahead
On Thu, Sep 3, 2015 at 1:49 PM, Andrey Konovalov wrote:
> On Wed, Sep 2, 2015 at 9:40 PM, Tejun Heo wrote:
>> Hello, Andrey.
>
> Hello Tejun,
>
>> On Wed, Sep 02, 2015 at 01:08:52PM +0200, Andrey Konovalov wrote:
>>> While running KASAN on 4.2 with Trinity I got the following report:
>>>
>>> ==
>>> BUG: KASan: use after free in page_cache_async_readahead+0x2cb/0x3f0
>>> at addr 880034bf6690
>>> Read of size 8 by task sshd/2571
>>> =
>>> BUG kmalloc-16 (Tainted: GW ): kasan: bad access detected
>>> -
>>>
>>> Disabling lock debugging due to kernel taint
>>> INFO: Allocated in bdi_init+0x168/0x960 age=554826 cpu=0 pid=6
>>
>> Can you please verify that the following patch fixes the issue?
>
> I've hit this bug only twice during 24 hours of fuzzing, so there's no
> fast way to verify this.
> I'll be testing with your patch now, and I'll let you know if I hit
> the bug again.

Hello Tejun,

I haven't seen any reports while testing with your patch for the last few days, so I think it's safe to say that your patch fixes the issue.

Thanks!

>
> Thanks!
Re: [PATCH v4 0/5] irqchip, gicv3: Updates and Cavium ThunderX errata workarounds
Hi Robert,

On 14/08/15 19:28, Robert Richter wrote:
> From: Robert Richter
>
> This patch series adds gicv3 updates and workarounds for HW errata in
> Cavium's ThunderX GICV3.
>
> The first one is an unchanged resubmission of a patch from a gicv3
> series I sent a while ago.
>
> The next patches implement the workarounds for ThunderX's gicv3. Patch
> #2 implements the cpu workaround for gicv3 on ThunderX. Patch #3 is a
> prerequisite for patch #5. Patch #4 adds generic code to parse the hw
> revision provided by an IIDR. This patch is used for the
> implementation of the actual gicv3-its workaround in #5.
>
> All current review comments addressed so far with v4.

There has been a small number of comments on this series. Would you mind respinning it so that it can make it into a 4.3-rc?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.
On Sep 8, 2015, at 1:06 AM, James Morse wrote:
> On 07/09/15 16:48, Jungseok Lee wrote:
>> On Sep 7, 2015, at 11:36 PM, James Morse wrote:
>>
>> Hi James,
>>
>>> Having to handle interrupts on top of an existing kernel stack means the
>>> kernel stack must be large enough to accommodate both the maximum kernel
>>> usage, and the maximum irq handler usage. Switching to a different stack
>>> when processing irqs allows us to make the stack size smaller.
>>>
>>> Maximum kernel stack usage (running ltp and generating usb+ethernet
>>> interrupts) was 7256 bytes. With this patch, the same workload gives
>>> a maximum stack usage of 5816 bytes.
>>
>> I'd like to know how to measure the max stack depth.
>> AFAIK, a stack tracer on ftrace does not work well. Did you dump a stack
>> region and find or track down an untouched region?
>
> I enabled the 'Trace max stack' option under menuconfig 'Kernel Hacking' ->
> 'Tracers', then looked in debugfs:/tracing/stack_max_size.
>
> What problems did you encounter?
> (I may be missing something…)

When I enabled the feature, all entries had *0* size except the last entry. It can be reproduced easily by looking in debugfs:/tracing/stack_trace.

You can track down my report and Akashi's changes with the following links:
- http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/354126.html
- https://lkml.org/lkml/2015/7/13/29

Although it is impossible to measure an exact depth at this moment, the feature could be utilized to check improvement.

Cc'ing Akashi for additional comments if needed.

Best Regards
Jungseok Lee
Re: [RFC PATCH v1 2/4] irqchip: GICv3: set non-percpu irqs status with _IRQ_MOVE_PCNTXT
On 2015/9/7 22:56, Marc Zyngier wrote:
> Hi Thomas,
>
> On 07/09/15 14:24, Thomas Gleixner wrote:
>> On Mon, 7 Sep 2015, Marc Zyngier wrote:
>>> On 06/09/15 06:56, Jiang Liu wrote:
>>>> On 2015/9/6 12:23, Yang Yingliang wrote:
>>>>> Use irq_settings_set_move_pcntxt() helper to set irqs status with
>>>>> _IRQ_MOVE_PCNTXT. So that it can do set affinity when calling
>>>>> irq_set_affinity_locked().
>>>> Hi Yingliang,
>>>> We could only set the _IRQ_MOVE_PCNTXT flag to enable migrating IRQ in
>>>> process context if your hardware platform supports atomically
>>>> changing IRQ configuration. Not sure whether that's true for GICv3.
>>>> If GICv3 doesn't support atomically changing irq configuration, this
>>>> change may cause trouble.
>>>
>>> I think it boils down to what exactly "process context" means here. If
>>> this means "we do not need to mask the interrupt" while moving it, then
>>> it should be fine (the GIC architecture guarantees that a pending
>>> interrupt will be migrated).
>>>
>>> Is there any other requirement for this flag?
>>
>> The history of this flag is as follows:
>>
>> On x86 interrupts can only be safely migrated while the interrupt is
>> handled.
>
> Woa! That's creative! :-) I suppose this doesn't work very well with CPU
> hotplug though...

X86 has special handling of this case when hot-removing a CPU. Basically, it does:
1) mask an irq
2) migrate irq to other cpus with set_affinity
3) redirect (retrigger) irq to other CPUs if it's pending on the CPU to be removed.

Thanks!
Gerry

>
>> With the introduction of IRQ remapping this requirement
>> changed. Remapped interrupts can be migrated in any context.
>>
>> If you look at irq_set_affinity_locked()
>>
>>    if (irq_can_move_pcntxt(data)) {
>>            irq_do_set_affinity(data,...)
>>                    chip->irq_set_affinity(data,...);
>>    } else {
>>            irqd_set_move_pending(data);
>>    }
>>
>> So if IRQ_MOVE_PCNTXT is not set, we handle the migration of the
>> interrupt from the next interrupt. If it's set, set_affinity() is
>> called right away.
>
> OK, that is now starting to make more sense.
>
>> All architectures which do not select GENERIC_PENDING_IRQ are using
>> the direct method.
>
> Right. On ARM, only the direct method makes sense so far (we have no
> constraint such as the one you describe above).
>
> So I wonder why we bother introducing the IRQ_MOVE_PCNTXT flag on ARM at
> all. Is that just because migration.c is only compiled when
> GENERIC_PENDING_IRQ is set?
>
> Thanks,
>
> M.
>
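The two migration paths Thomas describes for irq_set_affinity_locked() can be modelled in a few lines of user-space C. This is a rough sketch only: struct irq_sketch and both functions are hypothetical names invented for the illustration, and no real genirq data structures or locking are represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt; not a kernel structure. */
struct irq_sketch {
	bool move_pcntxt;         /* models the _IRQ_MOVE_PCNTXT setting */
	bool move_pending;        /* models the "move pending" state */
	unsigned int cpu;         /* CPU the interrupt currently targets */
	unsigned int pending_cpu; /* target recorded for a deferred move */
};

/*
 * Models the branch quoted in the thread: move right away when the
 * flag is set, otherwise only record that a move is pending.
 */
static void set_affinity_sketch(struct irq_sketch *irq, unsigned int cpu)
{
	assert(irq != NULL);

	if (irq->move_pcntxt) {
		irq->cpu = cpu;           /* direct method */
	} else {
		irq->pending_cpu = cpu;   /* deferred method */
		irq->move_pending = true;
	}
}

/*
 * Models the interrupt handling path, where a deferred move is
 * finally carried out.
 */
static void handle_irq_sketch(struct irq_sketch *irq)
{
	assert(irq != NULL);

	if (irq->move_pending) {
		irq->cpu = irq->pending_cpu;
		irq->move_pending = false;
	}
}
```

In this model an interrupt with move_pcntxt set changes its target as soon as set_affinity_sketch() returns, while one without it keeps its old target until handle_irq_sketch() runs once. That matches the distinction drawn in the thread between the direct method and the x86-style deferred method.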
Re: [PATCH v4 5/5] irqchip, gicv3-its: Workaround for Cavium ThunderX errata 22375, 24313
On 14/08/15 19:28, Robert Richter wrote: > From: Robert Richter > > This implements two gicv3-its errata workarounds for ThunderX. Both > with small impact affecting only ITS table allocation. > > erratum 22375: only alloc 8MB table size > erratum 24313: ignore memory access type > > The fixes are in ITS initialization and basically ignore memory access > type and table size provided by the TYPER and BASER registers. > > v3: > * fix erratum to be dependend from iidr > > Signed-off-by: Robert Richter > --- > drivers/irqchip/irq-gic-v3-its.c | 35 +++ > 1 file changed, 31 insertions(+), 4 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > b/drivers/irqchip/irq-gic-v3-its.c > index 697421e834ee..30459df2ee2c 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -39,7 +39,8 @@ > #include "irq-gic-common.h" > #include "irqchip.h" > > -#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING(1 << 0) > +#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING(1ULL << 0) > +#define ITS_FLAGS_CAVIUM_THUNDERX(1ULL << 1) I think you might need something slightly more explicit, as I'd expect some ulterior revision of ThunderX to be eventually fixed... ITS_FLAGS_THUNDERX_BOGUS_TYPER? Or something based on the errata numbers? > > #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING (1 << 0) > > @@ -803,9 +804,22 @@ static int its_alloc_tables(struct its_node *its) > int i; > int psz = SZ_64K; > u64 shr = GITS_BASER_InnerShareable; > - u64 cache = GITS_BASER_WaWb; > - u64 typer = readq_relaxed(its->base + GITS_TYPER); > - u32 ids = GITS_TYPER_DEVBITS(typer); > + u64 cache; > + u64 typer; > + u32 ids; > + > + if (its->flags & ITS_FLAGS_CAVIUM_THUNDERX) { > + /* > + * erratum 22375: only alloc 8MB table size > + * erratum 24313: ignore memory access type > + */ > + cache = 0; > + ids = 0x13; /* 20 bits, 8MB */ > + } else { You can move the typer definition here, as it is only used here. 
> + cache = GITS_BASER_WaWb; > + typer = readq_relaxed(its->base + GITS_TYPER); > + ids = GITS_TYPER_DEVBITS(typer); > + } > > for (i = 0; i < GITS_BASER_NR_REGS; i++) { > u64 val = readq_relaxed(its->base + GITS_BASER + i * 8); > @@ -1391,8 +1405,21 @@ static int its_force_quiescent(void __iomem *base) > } > } > > +static void its_enable_cavium_thunderx(void *data) > +{ > + struct its_node *its = data; > + > + its->flags |= ITS_FLAGS_CAVIUM_THUNDERX; > +} > + > static const struct gic_capabilities its_errata[] = { > { > + .desc = "ITS: Cavium errata 22375, 24313", > + .iidr = 0xa100034c, /* ThunderX pass 1.x */ > + .mask = 0x0fff, > + .init = its_enable_cavium_thunderx, > + }, > + { > } > }; > > Otherwise looks OK to me. Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] ASoC: atmel-classd: DT binding for Class D audio amplifier driver
On Sun, Sep 06, 2015 at 05:44:30PM +0800, Wu, Songjun wrote:
> On 9/3/2015 19:43, Mark Brown wrote:
> >Why is this a separate DT node? It seems that this IP is entirely self
> >contained so I'm not clear why we need a separate node for the card, the
> >card is usually a separate node because it ties together multiple
> >different devices in the system but that's not the case here.

> The classD can finish the audio function without other devices.
> But I want to reuse the code in ASoC, leave many things (like creating PCM,
> DMA operations) to ASoC, then the driver can only focus on how to configure
> classD.
> The classD IP is divided into three parts logically, platform, CPU dai,
> and codec, and these parts are registered to ASoC.
> This separate DT node is needed in ASoC, ties these three parts in ClassD.

Sure, there's no problem at all having that structure in software but it should be possible to do this without having to represent this structure in DT. It should be possible to register the card at the same time as the rest of the components rather than needing the separate device in the DT.
Re: [PATCH v4 4/5] irqchip, gicv3-its: Add HW revision detection and configuration
Hi Robert,

On 14/08/15 19:28, Robert Richter wrote:
> From: Robert Richter
>
> Some GIC revisions require an individual configuration to esp. add
> workarounds for HW bugs. This patch implements generic code to parse
> the hw revision provided by an IIDR register value and runs specific
> code if hw matches. There are functions that read the IIDR registers
> for GICV3 and ITS (GICD_IIDR/GITS_IIDR) and then go through a list of
> init functions to be called for specific versions.
>
> A MIDR register value may also be used, this is especially useful for
> hw detection from a guest.

I don't think this sentence is relevant anymore.

>
> The patch is needed to implement workarounds for HW errata in Cavium's
> ThunderX GICV3.
>
> v4:
>  * only enable hw detection for its in its_enable_quirks()
>  * removed gicv3_check_capabilities()
>
> v3:
>  * use arm64 errata framework for midr check
>
> v2:
>  * adding MIDR check
>
> Signed-off-by: Robert Richter
> ---
>  drivers/irqchip/irq-gic-common.c | 11 +++
>  drivers/irqchip/irq-gic-common.h |  9 +
>  drivers/irqchip/irq-gic-v3-its.c | 15 +++
>  3 files changed, 35 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
> index 9448e391cb71..ee789b07f2d1 100644
> --- a/drivers/irqchip/irq-gic-common.c
> +++ b/drivers/irqchip/irq-gic-common.c
> @@ -21,6 +21,17 @@
>
>  #include "irq-gic-common.h"
>
> +void gic_check_capabilities(u32 iidr, const struct gic_capabilities *cap,
> +			void *data)

Let's call a duck a duck, and replace all occurrences of
capabilit{y,ies} with "quirk".

> +{
> +	for (; cap->desc; cap++) {
> +		if (cap->iidr != (cap->mask & iidr))
> +			continue;
> +		cap->init(data);
> +		pr_info("%s\n", cap->desc);
> +	}
> +}
> +
>  int gic_configure_irq(unsigned int irq, unsigned int type,
>  		void __iomem *base, void (*sync_access)(void))
>  {
> diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
> index 35a9884778bd..ca12635bbe3c 100644
> --- a/drivers/irqchip/irq-gic-common.h
> +++ b/drivers/irqchip/irq-gic-common.h
> @@ -20,10 +20,19 @@
>  #include
>  #include
>
> +struct gic_capabilities {
> +	const char *desc;
> +	void (*init)(void *data);
> +	u32 iidr;
> +	u32 mask;
> +};
> +
>  int gic_configure_irq(unsigned int irq, unsigned int type,
>  		void __iomem *base, void (*sync_access)(void));
>  void gic_dist_config(void __iomem *base, int gic_irqs,
>  		void (*sync_access)(void));
>  void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
> +void gic_check_capabilities(u32 iidr, const struct gic_capabilities *cap,
> +		void *data);
>
>  #endif /* _IRQ_GIC_COMMON_H */
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 06131db7a198..697421e834ee 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -36,6 +36,7 @@
>  #include
>  #include
>
> +#include "irq-gic-common.h"
>  #include "irqchip.h"
>
>  #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING		(1 << 0)
> @@ -1390,6 +1391,18 @@ static int its_force_quiescent(void __iomem *base)
>  	}
>  }
>
> +static const struct gic_capabilities its_errata[] = {
> +	{
> +	}
> +};
> +
> +static void its_enable_quirks(struct its_node *its)
> +{
> +	u32 iidr = readl_relaxed(its->base + GITS_IIDR);
> +
> +	gic_check_capabilities(iidr, its_errata, its);
> +}
> +
>  static int its_probe(struct device_node *node, struct irq_domain *parent)
>  {
>  	struct resource res;
> @@ -1448,6 +1461,8 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
>  	}
>  	its->cmd_write = its->cmd_base;
>
> +	its_enable_quirks(its);
> +
>  	err = its_alloc_tables(its);
>  	if (err)
>  		goto out_free_cmd;
>

Otherwise looks good to me.

	M.
-- 
Jazz is not dead. It just smells funny...
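For illustration only, here is how the matching loop from the patch might look with the capabilit{y,ies} -> quirk rename applied. This is a user-space sketch, not the kernel code: the names gic_quirk, gic_enable_quirks, example_quirk_init and example_quirks are assumptions, printf() stands in for pr_info(), and the IIDR value and mask in the table are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per known hardware quirk; the table ends with a NULL desc. */
struct gic_quirk {
	const char *desc;
	void (*init)(void *data);
	uint32_t iidr;
	uint32_t mask;
};

/*
 * Call init() for every entry whose iidr field equals the masked
 * register value. Returns the number of quirks that were applied.
 * (Hypothetical name; the patch calls this gic_check_capabilities().)
 */
static int gic_enable_quirks(uint32_t iidr, const struct gic_quirk *quirks,
			     void *data)
{
	int applied = 0;

	assert(quirks != NULL);
	for (; quirks->desc; quirks++) {
		if (quirks->iidr != (quirks->mask & iidr))
			continue;
		quirks->init(data);
		printf("%s\n", quirks->desc);
		applied++;
	}
	return applied;
}

/* Hypothetical handler: records the workaround as a flag bit. */
static void example_quirk_init(void *data)
{
	*(unsigned int *)data |= 1u;
}

/* Hypothetical table; the IIDR value and mask below are invented. */
static const struct gic_quirk example_quirks[] = {
	{
		.desc = "example erratum",
		.init = example_quirk_init,
		.iidr = 0x0000034c,
		.mask = 0x00000fff,
	},
	{ .desc = NULL }	/* sentinel */
};
```

A caller would read the IIDR register once and pass it in together with the table and its private data, as its_enable_quirks() does in the patch.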
[PATCH] ARM: dts: omap3-igep: Move eth IRQ pinmux to IGEPv2 common dtsi
Only the IGEPv2 boards have a LAN9221i chip connected to the GPMC so the pinmux configuration for the GPIO connected to the IRQ line of the LAN chip should not be defined in the IGEP common dtsi but in the one common to the IGEPv2 boards. While there, use the OMAP3_CORE1_IOPAD() macro for the padconf reg. Suggested-by: Ladislav Michl Signed-off-by: Javier Martinez Canillas --- arch/arm/boot/dts/omap3-igep.dtsi| 6 -- arch/arm/boot/dts/omap3-igep0020-common.dtsi | 6 ++ 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm/boot/dts/omap3-igep.dtsi b/arch/arm/boot/dts/omap3-igep.dtsi index d5e5cd449b16..2230e1c03320 100644 --- a/arch/arm/boot/dts/omap3-igep.dtsi +++ b/arch/arm/boot/dts/omap3-igep.dtsi @@ -78,12 +78,6 @@ >; }; - smsc9221_pins: pinmux_smsc9221_pins { - pinctrl-single,pins = < - 0x1a2 (PIN_INPUT | MUX_MODE4) /* mcspi1_cs2.gpio_176 */ - >; - }; - i2c1_pins: pinmux_i2c1_pins { pinctrl-single,pins = < 0x18a (PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */ diff --git a/arch/arm/boot/dts/omap3-igep0020-common.dtsi b/arch/arm/boot/dts/omap3-igep0020-common.dtsi index e458c2185e3c..5ad688c57a00 100644 --- a/arch/arm/boot/dts/omap3-igep0020-common.dtsi +++ b/arch/arm/boot/dts/omap3-igep0020-common.dtsi @@ -156,6 +156,12 @@ OMAP3_CORE1_IOPAD(0x217a, PIN_INPUT | MUX_MODE0) /* uart2_rx.uart2_rx */ >; }; + + smsc9221_pins: pinmux_smsc9221_pins { + pinctrl-single,pins = < + OMAP3_CORE1_IOPAD(0x21d2, PIN_INPUT | MUX_MODE4) /* mcspi1_cs2.gpio_176 */ + >; + }; }; &omap3_pmx_core2 { -- 2.4.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 2/2] leds: leds-ipaq-micro: Fix coding style issues
Spaces at the starting of a line are removed, indentation using tab, instead of space. Also, line width of more than 80 characters is also taken care of. Two warnings are left alone to aid better readability. Signed-off-by: Muhammad Falak R Wani --- drivers/leds/leds-ipaq-micro.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/leds/leds-ipaq-micro.c b/drivers/leds/leds-ipaq-micro.c index 1206215..fa262b6 100644 --- a/drivers/leds/leds-ipaq-micro.c +++ b/drivers/leds/leds-ipaq-micro.c @@ -16,9 +16,9 @@ #define LED_YELLOW 0x00 #define LED_GREEN 0x01 -#define LED_EN (1 << 4)/* LED ON/OFF 0:off, 1:on */ -#define LED_AUTOSTOP(1 << 5)/* LED ON/OFF auto stop set 0:disable, 1:enable */ -#define LED_ALWAYS (1 << 6)/* LED Interrupt Mask 0:No mask, 1:mask */ +#define LED_EN (1 << 4) /* LED ON/OFF 0:off, 1:on */ +#define LED_AUTOSTOP (1 << 5) /* LED ON/OFF auto stop set 0:disable, 1:enable */ +#define LED_ALWAYS (1 << 6) /* LED Interrupt Mask 0:No mask, 1:mask */ static void micro_leds_brightness_set(struct led_classdev *led_cdev, enum led_brightness value) @@ -79,14 +79,14 @@ static int micro_leds_blink_set(struct led_classdev *led_cdev, }; msg.tx_data[0] = LED_GREEN; -if (*delay_on > IPAQ_LED_MAX_DUTY || + if (*delay_on > IPAQ_LED_MAX_DUTY || *delay_off > IPAQ_LED_MAX_DUTY) -return -EINVAL; + return -EINVAL; -if (*delay_on == 0 && *delay_off == 0) { -*delay_on = 100; -*delay_off = 100; -} + if (*delay_on == 0 && *delay_off == 0) { + *delay_on = 100; + *delay_off = 100; + } msg.tx_data[1] = 0; if (*delay_on >= IPAQ_LED_MAX_DUTY) -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ASoC: atmel-classd: add the Audio Class D Amplifier code
On Sun, Sep 06, 2015 at 05:44:21PM +0800, Wu, Songjun wrote:
> On 9/3/2015 19:37, Mark Brown wrote:
> >On Tue, Sep 01, 2015 at 01:41:40PM +0800, Songjun Wu wrote:

> >>+static const char * const eqcfg_bass_text[] = {
> >>+	"-12 dB", "-6 dB", "0 dB", "+6 dB", "+12 dB"
> >>+};
> >
> >>+static const unsigned int eqcfg_bass_value[] = {
> >>+	CLASSD_INTPMR_EQCFG_B_CUT_12,
> >>+	CLASSD_INTPMR_EQCFG_B_CUT_6, CLASSD_INTPMR_EQCFG_FLAT,
> >>+	CLASSD_INTPMR_EQCFG_B_BOOST_6, CLASSD_INTPMR_EQCFG_B_BOOST_12
> >>+};

> >This should be a Volume control with TLV information, as should the
> >following few controls.

> The Volume control with TLV information is not suitable for this case.
> Bass, Medium, and treble are mutually exclusive.
> So I think the SOC_ENUM control is suitable for this case.
> The register layout is not very good,
> The register is defined as below.
> • EQCFG: Equalization Selection
> Value  Name      Description
> 0      FLAT      Flat Response
> 1      BBOOST12  Bass boost +12 dB
> 2      BBOOST6   Bass boost +6 dB
> 3      BCUT12    Bass cut -12 dB
> 4      BCUT6     Bass cut -6 dB
> 5      MBOOST3   Medium boost +3 dB
> 6      MBOOST8   Medium boost +8 dB
> 7      MCUT3     Medium cut -3 dB
> 8      MCUT8     Medium cut -8 dB
> 9      TBOOST12  Treble boost +12 dB
> 10     TBOOST6   Treble boost +6 dB
> 11     TCUT12    Treble cut -12 dB
> 12     TCUT6     Treble cut -6 dB

OK, so that's not actually what the code was doing - it had separate
enums for bass, mid and treble. If you make this a single enum with all
the above options in it that seems like the best way of handling things.

> >>+static const struct snd_kcontrol_new atmel_classd_snd_controls[] = {
> >>+SOC_SINGLE_TLV("Left Volume", CLASSD_INTPMR,
> >>+		CLASSD_INTPMR_ATTL_SHIFT, 78, 1, classd_digital_tlv),
> >>+
> >>+SOC_SINGLE_TLV("Right Volume", CLASSD_INTPMR,
> >>+		CLASSD_INTPMR_ATTR_SHIFT, 78, 1, classd_digital_tlv),
> >
> >This should be a single stereo control rather than separate left and
> >right controls.

> Since the classD IP defines two register fields to control left volume and
> right volume respectively, I think it's better to provide two controls to
> user.

No, this is really common, we combine them in Linux to present a
consistent interface to userspace.

> >>+	dev_info(dev,
> >>+		"Atmel Class D Amplifier (CLASSD) device at 0x%p (irq %d)\n",
> >>+		io_base, dd->irq);

> >This is a bit noisy and not really based on interaction with the
> >hardware... dev_dbg() seems better.

> This information will occur only once when linux kernel starts.
> It shows the classD is loaded to linux kernel.
> I think it's better to provide more information to user.

This stuff all adds up and since it'll go out on the console by default
it both makes things more noisy and slows down boot - printing on the
serial port isn't free. If we want to have this sort of information
printed we should really do it in the driver core so it appears
consistently for all devices rather than having individual code in each
driver.
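To make the "single enum" suggestion concrete, here is a minimal user-space sketch, assuming the EQCFG encoding quoted above. Because the register values 0..12 are contiguous, one text table indexed by register value is enough. The names eqcfg_text, EQCFG_NUM_ITEMS and eqcfg_value_from_text and the label strings are invented for illustration; in the driver the table would presumably be handed to one of the ASoC enum control helpers rather than searched by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * One list of user-visible names covering every EQCFG setting.
 * The array index is the EQCFG register value (0 = FLAT ... 12 = TCUT6).
 */
static const char * const eqcfg_text[] = {
	"Flat",
	"Bass +12 dB", "Bass +6 dB", "Bass -12 dB", "Bass -6 dB",
	"Medium +3 dB", "Medium +8 dB", "Medium -3 dB", "Medium -8 dB",
	"Treble +12 dB", "Treble +6 dB", "Treble -12 dB", "Treble -6 dB",
};

#define EQCFG_NUM_ITEMS (sizeof(eqcfg_text) / sizeof(eqcfg_text[0]))

/* Map a name to its EQCFG register value, or -1 if it is unknown. */
static int eqcfg_value_from_text(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < EQCFG_NUM_ITEMS; i++)
		if (strcmp(eqcfg_text[i], name) == 0)
			return (int)i;
	return -1;
}
```

With a single list the mutual exclusion between bass, medium and treble settings falls out of the data structure: only one entry can be selected at a time.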
Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig
On 7 September 2015 at 17:37, Dietmar Eggemann wrote:
> On 04/09/15 00:51, Steve Muckle wrote:
>> Hi Morten, Dietmar,
>>
>> On 08/14/2015 09:23 AM, Morten Rasmussen wrote:
>> ...
>>> + * cfs_rq.avg.util_avg is the sum of running time of runnable tasks plus the
>>> + * recent utilization of currently non-runnable tasks on a CPU. It represents
>>> + * the amount of utilization of a CPU in the range [0..capacity_orig] where
>>
>> I see util_sum is scaled by SCHED_LOAD_SHIFT at the end of
>> __update_load_avg(). If there is now an assumption that util_avg may be
>> used directly as a capacity value, should it be changed to
>> SCHED_CAPACITY_SHIFT? These are equal right now, not sure if they will
>> always be or if they can be combined.
>
> You're referring to the code line
>
> 2647   sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
>
> in __update_load_avg()?
>
> Here we actually scale by 'SCHED_LOAD_SCALE/LOAD_AVG_MAX' so both values are
> load related.

I agree with Steve that there is an issue from a unit point of view.

sa->util_sum and LOAD_AVG_MAX have the same unit, so sa->util_avg is a
load because of the "<< SCHED_LOAD_SHIFT".

Before this patch, the translation from load to capacity unit was done
in get_cpu_usage() with "* capacity) >> SCHED_LOAD_SHIFT".

So you still have to change the unit from load to capacity with a
"/ SCHED_LOAD_SCALE * SCHED_CAPACITY_SCALE" somewhere:

sa->util_avg = ((sa->util_sum << SCHED_LOAD_SHIFT) / SCHED_LOAD_SCALE
                * SCHED_CAPACITY_SCALE) / LOAD_AVG_MAX
             = (sa->util_sum << SCHED_CAPACITY_SHIFT) / LOAD_AVG_MAX;

Regards,
Vincent

>
> LOAD (UTIL) and CAPACITY have the same SCALE and SHIFT values because
> SCHED_LOAD_RESOLUTION is always defined to 0. scale_load() and
> scale_load_down() are also NOPs so this area is probably
> worth a separate clean-up.
> Beyond that, I'm not sure if the current functionality is
> broken if we use different SCALE and SHIFT values for LOAD and CAPACITY?
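As a worked example of the unit argument above, the following user-space sketch compares the two ways of computing util_avg. It assumes LOAD_AVG_MAX is 47742 and SCHED_CAPACITY_SHIFT is 10, and takes the load shift as a parameter so that a hypothetical configuration with a larger load resolution (a shift of 20, say) can be tried. The helper names are invented for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX		47742	/* upper bound of util_sum */
#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1UL << SCHED_CAPACITY_SHIFT)

/* util_avg expressed in load units for a given load shift. */
static uint64_t util_avg_load_units(uint64_t util_sum, unsigned int load_shift)
{
	return (util_sum << load_shift) / LOAD_AVG_MAX;
}

/* util_avg expressed directly in capacity units. */
static uint64_t util_avg_capacity_units(uint64_t util_sum)
{
	return (util_sum << SCHED_CAPACITY_SHIFT) / LOAD_AVG_MAX;
}

/* Convert a load-unit value into capacity units. */
static uint64_t load_to_capacity(uint64_t load, unsigned int load_shift)
{
	return (load * SCHED_CAPACITY_SCALE) >> load_shift;
}
```

While both shifts are 10 the two computations agree, which is why the mismatch goes unnoticed today. With a load shift of 20 the load-unit result is 1024 times larger, and only the capacity-unit form (or an explicit conversion) stays within [0..SCHED_CAPACITY_SCALE].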
> >> >>> + * capacity_orig is the cpu_capacity available at * the highest frequency >> >> spurious * >> >> thanks, >> Steve >> > > Fixed. > > Thanks, > > -- Dietmar > > -- >8 -- > > From: Dietmar Eggemann > Date: Fri, 14 Aug 2015 17:23:13 +0100 > Subject: [PATCH] sched/fair: Get rid of scaling utilization by capacity_orig > > Utilization is currently scaled by capacity_orig, but since we now have > frequency and cpu invariant cfs_rq.avg.util_avg, frequency and cpu scaling > now happens as part of the utilization tracking itself. > So cfs_rq.avg.util_avg should no longer be scaled in cpu_util(). > > Cc: Ingo Molnar > Cc: Peter Zijlstra > Signed-off-by: Dietmar Eggemann > Signed-off-by: Morten Rasmussen > --- > kernel/sched/fair.c | 38 ++ > 1 file changed, 22 insertions(+), 16 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 2074d45a67c2..a73ece2372f5 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -4824,33 +4824,39 @@ static int select_idle_sibling(struct task_struct *p, > int target) > done: > return target; > } > + > /* > * cpu_util returns the amount of capacity of a CPU that is used by CFS > * tasks. The unit of the return value must be the one of capacity so we can > * compare the utilization with the capacity of the CPU that is available for > * CFS task (ie cpu_capacity). > - * cfs.avg.util_avg is the sum of running time of runnable tasks on a > - * CPU. It represents the amount of utilization of a CPU in the range > - * [0..SCHED_LOAD_SCALE]. The utilization of a CPU can't be higher than the > - * full capacity of the CPU because it's about the running time on this CPU. > - * Nevertheless, cfs.avg.util_avg can be higher than SCHED_LOAD_SCALE > - * because of unfortunate rounding in util_avg or just > - * after migrating tasks until the average stabilizes with the new running > - * time. So we need to check that the utilization stays into the range > - * [0..cpu_capacity_orig] and cap if necessary. 
> - * Without capping the utilization, a group could be seen as overloaded (CPU0 > - * utilization at 121% + CPU1 utilization at 80%) whereas CPU1 has 20% of > - * available capacity. > + * > + * cfs_rq.avg.util_avg is the sum of running time of runnable tasks plus the > + * recent utilization of currently non-runnable tasks on a CPU. It represents > + * the amount of utilization of a CPU in the range [0..capacity_orig] where > + * capacity_orig is the cpu_capacity available at the highest frequency > + * (arch_scale_freq_capacity()). > + * The utilization of a CPU converges towards a sum equal to or less than the > + * current capacity (capacity_curr <= capacity_orig) of the CPU because it is > + * the running time on this CPU scaled by capacity_curr. > + * > + * Nevertheless, cfs_rq.avg.util_avg can be higher than capacity_curr or even > + * higher than capacity_orig because of unfortunate rounding in > + * cfs.avg.util_avg or just after migrating tasks and new
[PATCH v2 1/9] [picked] powerpc: allocate sys_membarrier system call number
Allow it to be used from SPU, since it should not have unwanted side-effects. [ Picked-by: Michael Ellerman ] Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: linux-...@vger.kernel.org CC: Benjamin Herrenschmidt CC: Paul Mackerras CC: Michael Ellerman CC: linuxppc-...@lists.ozlabs.org --- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h index 4d65499..126d0c4 100644 --- a/arch/powerpc/include/asm/systbl.h +++ b/arch/powerpc/include/asm/systbl.h @@ -369,3 +369,4 @@ SYSCALL_SPU(bpf) COMPAT_SYS(execveat) PPC64ONLY(switch_endian) SYSCALL_SPU(userfaultfd) +SYSCALL_SPU(membarrier) diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index 4a055b6..13411be 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -12,7 +12,7 @@ #include -#define __NR_syscalls 365 +#define __NR_syscalls 366 #define __NR__exit __NR_exit #define NR_syscalls__NR_syscalls diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h index 6ad58d4..6337738 100644 --- a/arch/powerpc/include/uapi/asm/unistd.h +++ b/arch/powerpc/include/uapi/asm/unistd.h @@ -387,5 +387,6 @@ #define __NR_execveat 362 #define __NR_switch_endian 363 #define __NR_userfaultfd 364 +#define __NR_membarrier365 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */ -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 3/9] sparc/sparc64: allocate sys_membarrier system call number
Signed-off-by: Mathieu Desnoyers Acked-by: "David S. Miller" CC: Andrew Morton CC: linux-...@vger.kernel.org CC: sparcli...@vger.kernel.org --- arch/sparc/include/uapi/asm/unistd.h | 3 ++- arch/sparc/kernel/systbls_32.S | 2 +- arch/sparc/kernel/systbls_64.S | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/sparc/include/uapi/asm/unistd.h b/arch/sparc/include/uapi/asm/unistd.h index 6f35f4d..efe9479 100644 --- a/arch/sparc/include/uapi/asm/unistd.h +++ b/arch/sparc/include/uapi/asm/unistd.h @@ -416,8 +416,9 @@ #define __NR_memfd_create 348 #define __NR_bpf 349 #define __NR_execveat 350 +#define __NR_membarrier351 -#define NR_syscalls351 +#define NR_syscalls352 /* Bitmask values returned from kern_features system call. */ #define KERN_FEATURE_MIXED_MODE_STACK 0x0001 diff --git a/arch/sparc/kernel/systbls_32.S b/arch/sparc/kernel/systbls_32.S index e31a905..cc23b62 100644 --- a/arch/sparc/kernel/systbls_32.S +++ b/arch/sparc/kernel/systbls_32.S @@ -87,4 +87,4 @@ sys_call_table: /*335*/.long sys_syncfs, sys_sendmmsg, sys_setns, sys_process_vm_readv, sys_process_vm_writev /*340*/.long sys_ni_syscall, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr /*345*/.long sys_renameat2, sys_seccomp, sys_getrandom, sys_memfd_create, sys_bpf -/*350*/.long sys_execveat +/*350*/.long sys_execveat, sys_membarrier diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S index d72f76a..f229468 100644 --- a/arch/sparc/kernel/systbls_64.S +++ b/arch/sparc/kernel/systbls_64.S @@ -88,7 +88,7 @@ sys_call_table32: .word sys_syncfs, compat_sys_sendmmsg, sys_setns, compat_sys_process_vm_readv, compat_sys_process_vm_writev /*340*/.word sys_kern_features, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr .word sys32_renameat2, sys_seccomp, sys_getrandom, sys_memfd_create, sys_bpf -/*350*/.word sys32_execveat +/*350*/.word sys32_execveat, sys_membarrier #endif /* CONFIG_COMPAT */ @@ -168,4 +168,4 @@ sys_call_table: .word 
sys_syncfs, sys_sendmmsg, sys_setns, sys_process_vm_readv, sys_process_vm_writev /*340*/.word sys_kern_features, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr .word sys_renameat2, sys_seccomp, sys_getrandom, sys_memfd_create, sys_bpf -/*350*/.word sys64_execveat +/*350*/.word sys64_execveat, sys_membarrier -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH v2 5/9] alpha: allocate sys_membarrier system call number
[ Untested on this architecture. To try it out: fetch linux-next/akpm, apply this patch, build/run a membarrier-enabled kernel, and do make kselftest. ] Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: linux-...@vger.kernel.org CC: Richard Henderson CC: Ivan Kokshaysky CC: Matt Turner CC: linux-al...@vger.kernel.org --- arch/alpha/include/asm/unistd.h | 2 +- arch/alpha/include/uapi/asm/unistd.h | 1 + arch/alpha/kernel/systbls.S | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h index a56e608..07aa4ca 100644 --- a/arch/alpha/include/asm/unistd.h +++ b/arch/alpha/include/asm/unistd.h @@ -3,7 +3,7 @@ #include -#define NR_SYSCALLS514 +#define NR_SYSCALLS515 #define __ARCH_WANT_OLD_READDIR #define __ARCH_WANT_STAT64 diff --git a/arch/alpha/include/uapi/asm/unistd.h b/arch/alpha/include/uapi/asm/unistd.h index aa33bf5..7725619 100644 --- a/arch/alpha/include/uapi/asm/unistd.h +++ b/arch/alpha/include/uapi/asm/unistd.h @@ -475,5 +475,6 @@ #define __NR_getrandom 511 #define __NR_memfd_create 512 #define __NR_execveat 513 +#define __NR_membarrier514 #endif /* _UAPI_ALPHA_UNISTD_H */ diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S index 9b62e3f..1ea64f4 100644 --- a/arch/alpha/kernel/systbls.S +++ b/arch/alpha/kernel/systbls.S @@ -532,6 +532,7 @@ sys_call_table: .quad sys_getrandom .quad sys_memfd_create .quad sys_execveat + .quad sys_membarrier .size sys_call_table, . - sys_call_table .type sys_call_table, @object -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/9] parisc: allocate sys_membarrier system call number
Signed-off-by: Mathieu Desnoyers Tested-by: Helge Deller CC: Andrew Morton CC: linux-...@vger.kernel.org CC: "James E.J. Bottomley" CC: linux-par...@vger.kernel.org --- arch/parisc/include/uapi/asm/unistd.h | 3 ++- arch/parisc/kernel/syscall_table.S| 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/parisc/include/uapi/asm/unistd.h b/arch/parisc/include/uapi/asm/unistd.h index 2e639d7..dadcada 100644 --- a/arch/parisc/include/uapi/asm/unistd.h +++ b/arch/parisc/include/uapi/asm/unistd.h @@ -358,8 +358,9 @@ #define __NR_memfd_create (__NR_Linux + 340) #define __NR_bpf (__NR_Linux + 341) #define __NR_execveat (__NR_Linux + 342) +#define __NR_membarrier(__NR_Linux + 343) -#define __NR_Linux_syscalls(__NR_execveat + 1) +#define __NR_Linux_syscalls(__NR_membarrier + 1) #define __IGNORE_select/* newselect */ diff --git a/arch/parisc/kernel/syscall_table.S b/arch/parisc/kernel/syscall_table.S index 8eefb12..4e77991 100644 --- a/arch/parisc/kernel/syscall_table.S +++ b/arch/parisc/kernel/syscall_table.S @@ -438,6 +438,7 @@ ENTRY_SAME(memfd_create)/* 340 */ ENTRY_SAME(bpf) ENTRY_COMP(execveat) + ENTRY_SAME(membarrier) .ifne (. - 90b) - (__NR_Linux_syscalls * (91b - 90b)) -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH v2 9/9] s390/s390x: allocate sys_membarrier system call number
[ Untested on this architecture. To try it out: fetch linux-next/akpm, apply this patch, build/run a membarrier-enabled kernel, and do make kselftest. ] Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: linux-...@vger.kernel.org CC: Martin Schwidefsky CC: Heiko Carstens CC: linux-s...@vger.kernel.org --- arch/s390/include/uapi/asm/unistd.h | 3 ++- arch/s390/kernel/syscalls.S | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/s390/include/uapi/asm/unistd.h b/arch/s390/include/uapi/asm/unistd.h index 59d2bb4..2f1de70 100644 --- a/arch/s390/include/uapi/asm/unistd.h +++ b/arch/s390/include/uapi/asm/unistd.h @@ -290,7 +290,8 @@ #define __NR_s390_pci_mmio_write 352 #define __NR_s390_pci_mmio_read353 #define __NR_execveat 354 -#define NR_syscalls 355 +#define __NR_membarrier355 +#define NR_syscalls 356 /* * There are some system calls that are not present on 64 bit, some diff --git a/arch/s390/kernel/syscalls.S b/arch/s390/kernel/syscalls.S index f3f4a13..914c098 100644 --- a/arch/s390/kernel/syscalls.S +++ b/arch/s390/kernel/syscalls.S @@ -363,3 +363,4 @@ SYSCALL(sys_bpf,compat_sys_bpf) SYSCALL(sys_s390_pci_mmio_write,compat_sys_s390_pci_mmio_write) SYSCALL(sys_s390_pci_mmio_read,compat_sys_s390_pci_mmio_read) SYSCALL(sys_execveat,compat_sys_execveat) +SYSCALL(sys_membarrier,sys_membarrier) -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH v2 7/9] arm64: allocate sys_membarrier system call number
The sys_membarrier number is already wired for arm64 through
asm-generic/unistd.h, but needs to be allocated separately for the
32-bit compatibility layer of arm64.

[ Untested on this architecture. To try it out: fetch linux-next/akpm,
  apply this patch, build/run a membarrier-enabled kernel, and do make
  kselftest. ]

Signed-off-by: Mathieu Desnoyers
CC: Andrew Morton
CC: linux-...@vger.kernel.org
CC: Catalin Marinas
CC: Will Deacon
---
 arch/arm64/include/asm/unistd.h   | 2 +-
 arch/arm64/include/asm/unistd32.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3bc498c..e70f7e7 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -44,7 +44,7 @@
 #define __ARM_NR_compat_cacheflush	(__ARM_NR_COMPAT_BASE+2)
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE+5)
 
-#define __NR_compat_syscalls		388
+#define __NR_compat_syscalls		389
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index cef934a..d97be80 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -797,3 +797,5 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 387
 __SYSCALL(__NR_execveat, compat_sys_execveat)
+#define __NR_membarrier 388
+__SYSCALL(__NR_membarrier, sys_membarrier)
-- 
1.9.1
[RFC PATCH v2 8/9] ia64: allocate sys_membarrier system call number
[ Untested on this architecture. To try it out: fetch linux-next/akpm, apply this patch, build/run a membarrier-enabled kernel, and do make kselftest. ] Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: linux-...@vger.kernel.org CC: Tony Luck CC: Fenghua Yu CC: linux-i...@vger.kernel.org --- arch/ia64/include/asm/unistd.h | 2 +- arch/ia64/include/uapi/asm/unistd.h | 1 + arch/ia64/kernel/entry.S| 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h index 95c39b9..1d54e17 100644 --- a/arch/ia64/include/asm/unistd.h +++ b/arch/ia64/include/asm/unistd.h @@ -11,7 +11,7 @@ -#define NR_syscalls319 /* length of syscall table */ +#define NR_syscalls320 /* length of syscall table */ /* * The following defines stop scripts/checksyscalls.sh from complaining about diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h index 4610795..b7aae55 100644 --- a/arch/ia64/include/uapi/asm/unistd.h +++ b/arch/ia64/include/uapi/asm/unistd.h @@ -332,5 +332,6 @@ #define __NR_memfd_create 1340 #define __NR_bpf 1341 #define __NR_execveat 1342 +#define __NR_membarrier1343 #endif /* _UAPI_ASM_IA64_UNISTD_H */ diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S index ae0de7b..1ce01f9 100644 --- a/arch/ia64/kernel/entry.S +++ b/arch/ia64/kernel/entry.S @@ -1768,5 +1768,6 @@ sys_call_table: data8 sys_memfd_create // 1340 data8 sys_bpf data8 sys_execveat + data8 sys_membarrier .org sys_call_table + 8*NR_syscalls // guard against failures to increase NR_syscalls -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH v2 6/9] arm: allocate sys_membarrier system call number
[ Untested on this architecture. To try it out: fetch linux-next/akpm, apply this patch, build/run a membarrier-enabled kernel, and do make kselftest. ] Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: linux-...@vger.kernel.org CC: Russell King --- arch/arm/include/asm/unistd.h | 2 +- arch/arm/include/uapi/asm/unistd.h | 1 + arch/arm/kernel/calls.S| 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h index 32640c4..d93876c 100644 --- a/arch/arm/include/asm/unistd.h +++ b/arch/arm/include/asm/unistd.h @@ -19,7 +19,7 @@ * This may need to be greater than __NR_last_syscall+1 in order to * account for the padding in the syscall table */ -#define __NR_syscalls (388) +#define __NR_syscalls (389) /* * *NOTE*: This is a ghost syscall private to the kernel. Only the diff --git a/arch/arm/include/uapi/asm/unistd.h b/arch/arm/include/uapi/asm/unistd.h index 0c3f5a0..436bb32 100644 --- a/arch/arm/include/uapi/asm/unistd.h +++ b/arch/arm/include/uapi/asm/unistd.h @@ -414,6 +414,7 @@ #define __NR_memfd_create (__NR_SYSCALL_BASE+385) #define __NR_bpf (__NR_SYSCALL_BASE+386) #define __NR_execveat (__NR_SYSCALL_BASE+387) +#define __NR_membarrier(__NR_SYSCALL_BASE+388) /* * The following SWIs are ARM private. diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S index 05745eb..310699c 100644 --- a/arch/arm/kernel/calls.S +++ b/arch/arm/kernel/calls.S @@ -397,6 +397,7 @@ /* 385 */ CALL(sys_memfd_create) CALL(sys_bpf) CALL(sys_execveat) + CALL(sys_membarrier) #ifndef syscalls_counted .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls #define syscalls_counted -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/9] Allocate sys_membarrier on main architectures
Following feedback from architecture maintainers, this is v2 of this
patchset.

Status:

* Picked into maintainer's tree:
  - powerpc

* Ready to be picked into maintainer's tree (acked/tested):
  - mips, sparc/sparc64, parisc

* Awaiting feedback/testing:
  - arm, arm64, ia64, s390/s390x

Thanks,

Mathieu

Mathieu Desnoyers (9):
  [picked] powerpc: allocate sys_membarrier system call number
  mips: allocate sys_membarrier system call number
  sparc/sparc64: allocate sys_membarrier system call number
  parisc: allocate sys_membarrier system call number
  alpha: allocate sys_membarrier system call number
  arm: allocate sys_membarrier system call number
  arm64: allocate sys_membarrier system call number
  ia64: allocate sys_membarrier system call number
  s390/s390x: allocate sys_membarrier system call number

 arch/alpha/include/asm/unistd.h        |  2 +-
 arch/alpha/include/uapi/asm/unistd.h   |  1 +
 arch/alpha/kernel/systbls.S            |  1 +
 arch/arm/include/asm/unistd.h          |  2 +-
 arch/arm/include/uapi/asm/unistd.h     |  1 +
 arch/arm/kernel/calls.S                |  1 +
 arch/arm64/include/asm/unistd.h        |  2 +-
 arch/arm64/include/asm/unistd32.h      |  2 ++
 arch/ia64/include/asm/unistd.h         |  2 +-
 arch/ia64/include/uapi/asm/unistd.h    |  1 +
 arch/ia64/kernel/entry.S               |  1 +
 arch/mips/include/uapi/asm/unistd.h    | 15 +--
 arch/mips/kernel/scall32-o32.S         |  1 +
 arch/mips/kernel/scall64-64.S          |  1 +
 arch/mips/kernel/scall64-n32.S         |  1 +
 arch/mips/kernel/scall64-o32.S         |  1 +
 arch/parisc/include/uapi/asm/unistd.h  |  3 ++-
 arch/parisc/kernel/syscall_table.S     |  1 +
 arch/powerpc/include/asm/systbl.h      |  1 +
 arch/powerpc/include/asm/unistd.h      |  2 +-
 arch/powerpc/include/uapi/asm/unistd.h |  1 +
 arch/s390/include/uapi/asm/unistd.h    |  3 ++-
 arch/s390/kernel/syscalls.S            |  1 +
 arch/sparc/include/uapi/asm/unistd.h   |  3 ++-
 arch/sparc/kernel/systbls_32.S         |  2 +-
 arch/sparc/kernel/systbls_64.S         |  4 ++--
 26 files changed, 39 insertions(+), 17 deletions(-)

-- 
1.9.1
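For anyone wanting a quick check outside kselftest on one of the architectures above, a minimal user-space sketch along the following lines may help. It assumes the installed headers provide __NR_membarrier (it reports ENOSYS otherwise) and that the query command is 0 as in <linux/membarrier.h>. The names EXAMPLE_MEMBARRIER_CMD_QUERY and membarrier_query are invented for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Value of MEMBARRIER_CMD_QUERY in <linux/membarrier.h>, repeated here
 * (assumption) so the sketch also builds against older headers.
 */
#define EXAMPLE_MEMBARRIER_CMD_QUERY	0

/*
 * Ask the kernel which membarrier commands it supports.
 * Returns a bitmask (>= 0) on success, or -1 with errno set, for
 * instance to ENOSYS when the system call number is not wired up.
 */
static long membarrier_query(void)
{
#ifdef __NR_membarrier
	return syscall(__NR_membarrier, EXAMPLE_MEMBARRIER_CMD_QUERY, 0);
#else
	errno = ENOSYS;
	return -1;
#endif
}
```

A return value of -1 with errno set to ENOSYS on a kernel built from this series would suggest that the table entry for that architecture is missing or misnumbered.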
[PATCH v2 2/9] mips: allocate sys_membarrier system call number
Signed-off-by: Mathieu Desnoyers Acked-by: Ralf Baechle CC: Andrew Morton CC: linux-...@vger.kernel.org CC: linux-m...@linux-mips.org --- arch/mips/include/uapi/asm/unistd.h | 15 +-- arch/mips/kernel/scall32-o32.S | 1 + arch/mips/kernel/scall64-64.S | 1 + arch/mips/kernel/scall64-n32.S | 1 + arch/mips/kernel/scall64-o32.S | 1 + 5 files changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/uapi/asm/unistd.h b/arch/mips/include/uapi/asm/unistd.h index d0bdfaa..b107983 100644 --- a/arch/mips/include/uapi/asm/unistd.h +++ b/arch/mips/include/uapi/asm/unistd.h @@ -378,16 +378,17 @@ #define __NR_bpf (__NR_Linux + 355) #define __NR_execveat (__NR_Linux + 356) #define __NR_mlock2(__NR_Linux + 357) +#define __NR_membarrier(__NR_Linux + 358) /* * Offset of the last Linux o32 flavoured syscall */ -#define __NR_Linux_syscalls357 +#define __NR_Linux_syscalls358 #endif /* _MIPS_SIM == _MIPS_SIM_ABI32 */ #define __NR_O32_Linux 4000 -#define __NR_O32_Linux_syscalls357 +#define __NR_O32_Linux_syscalls358 #if _MIPS_SIM == _MIPS_SIM_ABI64 @@ -713,16 +714,17 @@ #define __NR_bpf (__NR_Linux + 315) #define __NR_execveat (__NR_Linux + 316) #define __NR_mlock2(__NR_Linux + 317) +#define __NR_membarrier(__NR_Linux + 318) /* * Offset of the last Linux 64-bit flavoured syscall */ -#define __NR_Linux_syscalls317 +#define __NR_Linux_syscalls318 #endif /* _MIPS_SIM == _MIPS_SIM_ABI64 */ #define __NR_64_Linux 5000 -#define __NR_64_Linux_syscalls 317 +#define __NR_64_Linux_syscalls 318 #if _MIPS_SIM == _MIPS_SIM_NABI32 @@ -1052,15 +1054,16 @@ #define __NR_bpf (__NR_Linux + 319) #define __NR_execveat (__NR_Linux + 320) #define __NR_mlock2(__NR_Linux + 321) +#define __NR_membarrier(__NR_Linux + 322) /* * Offset of the last N32 flavoured syscall */ -#define __NR_Linux_syscalls321 +#define __NR_Linux_syscalls322 #endif /* _MIPS_SIM == _MIPS_SIM_NABI32 */ #define __NR_N32_Linux 6000 -#define __NR_N32_Linux_syscalls321 +#define __NR_N32_Linux_syscalls322 #endif /* _UAPI_ASM_UNISTD_H 
*/ diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S index b0b377a..9265542 100644 --- a/arch/mips/kernel/scall32-o32.S +++ b/arch/mips/kernel/scall32-o32.S @@ -600,3 +600,4 @@ EXPORT(sys_call_table) PTR sys_bpf /* 4355 */ PTR sys_execveat PTR sys_mlock2 + PTR sys_membarrier diff --git a/arch/mips/kernel/scall64-64.S b/arch/mips/kernel/scall64-64.S index f12eb03..79d4fb0 100644 --- a/arch/mips/kernel/scall64-64.S +++ b/arch/mips/kernel/scall64-64.S @@ -437,4 +437,5 @@ EXPORT(sys_call_table) PTR sys_bpf /* 5315 */ PTR sys_execveat PTR sys_mlock2 + PTR sys_membarrier .size sys_call_table,.-sys_call_table diff --git a/arch/mips/kernel/scall64-n32.S b/arch/mips/kernel/scall64-n32.S index ecdd65a..235892a 100644 --- a/arch/mips/kernel/scall64-n32.S +++ b/arch/mips/kernel/scall64-n32.S @@ -430,4 +430,5 @@ EXPORT(sysn32_call_table) PTR sys_bpf PTR compat_sys_execveat /* 6320 */ PTR sys_mlock2 + PTR sys_membarrier .size sysn32_call_table,.-sysn32_call_table diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S index 7a8b2df..c051bd3 100644 --- a/arch/mips/kernel/scall64-o32.S +++ b/arch/mips/kernel/scall64-o32.S @@ -585,4 +585,5 @@ EXPORT(sys32_call_table) PTR sys_bpf /* 4355 */ PTR compat_sys_execveat PTR sys_mlock2 + PTR sys_membarrier .size sys32_call_table,.-sys32_call_table -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] virtio-blk: use VIRTIO_BLK_F_WCE and VIRTIO_BLK_F_CONFIG_WCE in virtio1
On 22/08/2015 00:53, Paolo Bonzini wrote:
> VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance
> (up to 2x, though more realistically +30-40%) in latency-bound workloads.
> However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH.
>
> It will be restored in the next revision of the virtio 1.0 standard, so
> do the same in Linux.
>
> Signed-off-by: Paolo Bonzini
> ---
>  drivers/block/virtio_blk.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index d4d05f064d39..ea2c17c66dfb 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -478,8 +478,7 @@ static int virtblk_get_cache_mode(struct virtio_device *vdev)
>       struct virtio_blk_config, wce,
>       &writeback);
>   if (err)
> -     writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE) ||
> -                 virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> +     writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE);
>
>   return writeback;
>  }
> @@ -840,7 +839,7 @@ static unsigned int features_legacy[] = {
>  static unsigned int features[] = {
>   VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
>   VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
> - VIRTIO_BLK_F_TOPOLOGY,
> + VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
>   VIRTIO_BLK_F_MQ,
>  };
>
>

Ping?

Paolo
Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.
On 07/09/15 16:48, Jungseok Lee wrote:
> On Sep 7, 2015, at 11:36 PM, James Morse wrote:
>
> Hi James,
>
>> Having to handle interrupts on top of an existing kernel stack means the
>> kernel stack must be large enough to accommodate both the maximum kernel
>> usage, and the maximum irq handler usage. Switching to a different stack
>> when processing irqs allows us to make the stack size smaller.
>>
>> Maximum kernel stack usage (running ltp and generating usb+ethernet
>> interrupts) was 7256 bytes. With this patch, the same workload gives
>> a maximum stack usage of 5816 bytes.
>
> I'd like to know how to measure the max stack depth.
> AFAIK, a stack tracer on ftrace does not work well. Did you dump a stack
> region and find or track down an untouched region?

I enabled the 'Trace max stack' option under menuconfig 'Kernel Hacking' ->
'Tracers', then looked in debugfs:/tracing/stack_max_size.

What problems did you encounter? (I may be missing something...)

Thanks,

James
Re: [PATCH 2/3] selftests: add membarrier syscall test
- On Sep 3, 2015, at 11:36 PM, Michael Ellerman m...@ellerman.id.au wrote: > On Thu, 2015-09-03 at 15:47 +, Mathieu Desnoyers wrote: >> - On Sep 3, 2015, at 5:33 AM, Michael Ellerman m...@ellerman.id.au wrote: >> >> > On Tue, 2015-09-01 at 11:32 -0700, Andy Lutomirski wrote: >> >> On Tue, Sep 1, 2015 at 10:11 AM, Mathieu Desnoyers >> >> wrote: >> >> > Just to make sure I understand: should we expect that >> >> > everyone will issue "make headers_install" on their system >> >> > before doing a make kselftest ? >> >> > >> >> > I see that a few selftests (e.g. memfd) are adding the >> >> > source tree include paths to the compiler include paths, >> >> > which I guess is to ensure that the kselftest will >> >> > work even if the system headers are not up to date. >> >> >> >> It would be really nice if there were a clean way for selftests to >> >> include the kernel headers. >> > >> > What's wrong with make headers_install? >> > >> > Or do you mean when writing the tests? That we could fix by adding the >> > ../../../../usr/include path to CFLAGS in lib.mk. And fixing all the tests >> > that >> > overwrite CFLAGS to append to CFLAGS. >> > >> >> Perhaps make should build the exportable headers somewhere as a >> >> dependency of >> >> kselftests. >> > >> > Yeah the top-level kselftest target could do that I think. >> > >> > Folks who don't want the headers installed can just run the selftests >> > Makefile >> > directly. >> > >> > Does this work for you? >> > >> > diff --git a/Makefile b/Makefile >> > index c361593..c8841d3 100644 >> > --- a/Makefile >> > +++ b/Makefile >> > @@ -1080,7 +1080,7 @@ headers_check: headers_install >> > # Kernel selftest >> > >> > PHONY += kselftest >> > -kselftest: >> > +kselftest: headers_install >> >$(Q)$(MAKE) -C tools/testing/selftests run_tests >> >> My personal experience is that make headers_install does not necessarily play >> well with the distribution header file hierarchy, which requires some tweaks >> to be done by the users (e.g. 
asm vs x86_64-linux-gnu). > > OK, I've never had issues. What exactly are you doing and how is it going > wrong? After some investigation, I noticed the following: 1) I first ran make headers_install as root, which installed the headers within my build tree. I later tried it again as user, and it failed due to permission issues (my bad). This is where I tried to install it into my system rather than under my build directory, which caused a mess. 2) Since make kselftest should be run as root (according to make help), this means that all the output files generated by the build are owned by root. It leads to permissions issues when trying to rebuild the tests as user afterward. Perhaps we could introduce a distinction between make kselftest_build and make kselftest_run ? The former could be executed as user, and the latter as root. > >> Also, headers_install typically expects a INSTALL_HDR_PATH. > > You can specify it, but the default is just usr/, ie. in the kernel directory, > that is what I was proposing. (Actually it's $(objtree)/usr). OK, trying it out. > >> It would be interesting if we could install the kernel headers into a >> specific location that is then re-used by kselftest, so using it without too >> much manual configuration does not require to overwrite the distribution >> header files to run tests. > > I think we can do that now, ie: > > $ ls /usr/include/linux/membarrier.h > ls: cannot access /usr/include/linux/membarrier.h: No such file or directory > > $ cd linux-next > $ make mrproper > $ make headers_install > ... 
> $ ls usr/include/linux/membarrier.h > usr/include/linux/membarrier.h > $ make -C tools/testing/selftests TARGETS=membarrier > make: Entering directory > '/home/michael/work/topics/selftests/linux-next/tools/testing/selftests' > for TARGET in membarrier; do \ > make -C $TARGET; \ > done; > make[1]: Entering directory > > '/home/michael/work/topics/selftests/linux-next/tools/testing/selftests/membarrier' > gcc -g -I../../../../usr/include/ membarrier_test.c -o membarrier_test > make[1]: Leaving directory > > '/home/michael/work/topics/selftests/linux-next/tools/testing/selftests/membarrier' > make: Leaving directory > '/home/michael/work/topics/selftests/linux-next/tools/testing/selftests' > > $ ./tools/testing/selftests/membarrier/membarrier_test > membarrier MEMBARRIER_CMD_QUERY failed. Function not implemented. > $ > > > So that seems to be working for me. Are you doing some different work flow, or > am I just missing something? When doing make headers_install, it indeed installs membarrier.h where we expect it under the build output dir: $ ls usr/include/linux/membarrier.h usr/include/linux/membarrier.h However, if I issue $ make -C tools/testing/selftests TARGETS=membarrier make: Entering directory `/home/efficios/git/linux-next/tools/testing/selftests' for TARGET in membarrier; do \ make -C $TARGET; \
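For reference, the check that prints "MEMBARRIER_CMD_QUERY failed. Function not implemented." in the run above can be approximated with the following sketch. It is not the selftest source; it assumes the libc headers expose __NR_membarrier (falling back to ENOSYS otherwise), and CMD_QUERY is a local stand-in for MEMBARRIER_CMD_QUERY (value 0 in linux/membarrier.h).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Local stand-in for MEMBARRIER_CMD_QUERY from <linux/membarrier.h>. */
#define CMD_QUERY 0

/*
 * Ask the kernel which membarrier commands it supports.
 * Returns a bitmask (>= 0) on success, or -1 with errno set;
 * errno is ENOSYS when the running kernel lacks the syscall.
 */
static long membarrier_query(void)
{
#ifdef __NR_membarrier
    return syscall(__NR_membarrier, CMD_QUERY, 0);
#else
    errno = ENOSYS;   /* headers too old to know the syscall number */
    return -1;
#endif
}
```

A test built against fresh headers but run on an older kernel would take the ENOSYS path, which is presumably what happened in the output quoted above.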
Re: [PATCH v4 0/3] mtd: nand: jz4780: Add NAND and BCH drivers
On 7 September 2015 at 11:54, Alex Smith wrote:
> On 06/09/2015 21:38, Ezequiel Garcia wrote:
>> On 27 Jul 02:50 PM, Alex Smith wrote:
>>> Hi,
>>>
>>> This series adds support for the BCH controller and NAND devices on
>>> the Ingenic JZ4780 SoC.
>>>
>>> Tested on the MIPS Creator Ci20 board. All dependencies are now in
>>> mainline so it should be possible to compile test now.
>>>
>>> This version of the series has been rebased on 4.2-rc4, and also adds
>>> an additional patch to fix an issue that was encountered in the
>>> external Ci20 3.18 kernel branch.
>>>
>>> Review and feedback welcome.
>>>
>>
>> The NEMC driver seems to be upstream. Any chance you submit devicetree
>> changes as well for Ci20 (so we can actually test this)?
>
> Sure, can do. The pinctrl driver is not yet upstream (needs some work) which
> is why I didn't add the DT changes initially, but at least if you boot the
> board from the NAND then U-Boot should have left everything in a state usable
> by the kernel.
>

Great, thanks! I definitely look forward to testing this.
-- 
Ezequiel García, VanguardiaSur
www.vanguardiasur.com.ar
Re: [PATCH 1/5] acpi: Add basic device probing infrastructure
[+M.Salter] On Fri, Sep 04, 2015 at 06:06:48PM +0100, Marc Zyngier wrote: > IRQ controllers and timers are the two types of device the kernel > requires before being able to use the device driver model. > > ACPI so far lacks a proper probing infrastructure similar to the one > we have with DT, where we're able to declare IRQ chips and > clocksources inside the driver code, and let the core code pick it up > and call us back on a match. This leads to all kind of really ugly > hacks all over the arm64 code and even in the ACPI layer. > > In order to allow some basic probing based on the ACPI tables, > introduce "struct acpi_probe_entry" which contains just enough > data and callbacks to match a table, an optional subtable, and > call a probe function. A driver can, at build time, register itself > and expect being called if the right entry exists in the ACPI > table. > > A acpi_probe_device_init() is provided, taking an ACPI table > identifier, and iterating over the registered entries. > > Signed-off-by: Marc Zyngier > --- > drivers/acpi/scan.c | 41 > include/asm-generic/vmlinux.lds.h | 11 > include/linux/acpi.h | 56 > +++ > 3 files changed, 108 insertions(+) > > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c > index ec25635..9e920ec 100644 > --- a/drivers/acpi/scan.c > +++ b/drivers/acpi/scan.c > @@ -2793,3 +2793,44 @@ int __init acpi_scan_init(void) > mutex_unlock(&acpi_scan_lock); > return result; > } > + > +static const struct acpi_probe_entry device_acpi_probe_end > + __used __section(__device_acpi_probe_table_end); > +extern struct acpi_probe_entry __device_acpi_probe_table[]; > +static struct acpi_probe_entry *ape; > +static int acpi_probe_count; > +static DEFINE_SPINLOCK(acpi_probe_lock); > + > +static int __init acpi_match_madt(struct acpi_subtable_header *header, > + const unsigned long end) > +{ > + if (!ape->validate_subtbl || ape->validate_subtbl(header, ape)) > + if (!ape->probe_subtbl(header, end)) > + acpi_probe_count++; > + > + return 0; > 
+} > + > +int __init acpi_probe_device_table(const char *id) > +{ > + int count = 0; > + > + if (acpi_disabled) > + return 0; > + > + spin_lock(&acpi_probe_lock); > + for (ape = __device_acpi_probe_table; ape->probe_table; ape++) { > + if (!ACPI_COMPARE_NAME(id, ape->id)) > + continue; > + if (ACPI_COMPARE_NAME(ACPI_SIG_MADT, ape->id)) { > + acpi_probe_count = 0; > + acpi_table_parse_madt(ape->type, acpi_match_madt, 0); > + count += acpi_probe_count; > + } else { > + count = acpi_table_parse(ape->id, ape->probe_table); > + } > + } > + spin_unlock(&acpi_probe_lock); > + > + return count; > +} We should add a mechanism to prevent re-parsing the same entries multiple times (in case this function is called with the same signature multiple times). We could create a separate table of device entries, per-subsystem, that we want to parse (irqchip specific table, timers, etc.) instead of adding all the devices to the same table (ie linker section), you can do this already with the current patchset by just choosing different table names as DT does. We may also want to extend this set so that it can be used to parse the same table, same subtype multiple times at different stages in the boot path (but let's first see if it is a) really needed b) feasible). Basically it is to avoid parsing the MADT multiple times: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/340267.html Those can be extensions to the current patchset (because basically they are not real issues at present), it is just a heads-up. Thanks for putting it together ! 
Lorenzo > diff --git a/include/asm-generic/vmlinux.lds.h > b/include/asm-generic/vmlinux.lds.h > index 8bd374d..875397a 100644 > --- a/include/asm-generic/vmlinux.lds.h > +++ b/include/asm-generic/vmlinux.lds.h > @@ -181,6 +181,16 @@ > #define CPUIDLE_METHOD_OF_TABLES() OF_TABLE(CONFIG_CPU_IDLE, cpuidle_method) > #define EARLYCON_OF_TABLES() OF_TABLE(CONFIG_SERIAL_EARLYCON, earlycon) > > +#ifdef CONFIG_ACPI > +#define ACPI_PROBE_TABLE(name) > \ > + . = ALIGN(8); \ > + VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .;\ > + *(__##name##_acpi_probe_table) \ > + *(__##name##_acpi_probe_table_end) > +#else > +#define ACPI_PROBE_TABLE(name) > +#endif > + > #define KERNEL_DTB() \ > STRUCT_ALIGN(); \ > VMLINUX_SYMBOL(__dtb_start) = .;
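To make the "do not re-parse the same entries" suggestion concrete, here is a small userspace C sketch of a probe table with a per-entry guard flag. Everything in it is hypothetical (struct probe_entry, the parsed field, probe_table(), demo_probe()); it only illustrates the idea and is neither the code in this patch nor an existing ACPI API.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct acpi_probe_entry. */
struct probe_entry {
    const char *id;        /* table signature, e.g. "APIC" */
    int (*probe)(void);    /* returns 0 on success */
    bool parsed;           /* guard: set once the entry has been handled */
};

/* Example probe callback that just counts how often it ran. */
static int demo_calls;
static int demo_probe(void)
{
    demo_calls++;
    return 0;
}

/*
 * Walk a table terminated by an entry with a NULL probe callback and
 * run every matching entry at most once across repeated calls.
 */
static int probe_table(struct probe_entry *tbl, const char *id)
{
    int count = 0;

    for (struct probe_entry *e = tbl; e->probe; e++) {
        if (strcmp(e->id, id) != 0 || e->parsed)
            continue;
        e->parsed = true;
        if (e->probe() == 0)
            count++;
    }
    return count;
}
```

A second call with the same signature finds every matching entry already marked and returns 0. Whether such a flag belongs in the entry itself or in a separate per-subsystem table is exactly the open design question raised above.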
[PATCH v2 1/1] Add Corsair Vengeance K90 driver
This patch implements a HID driver for the Corsair Vengeance K90 keyboard. It fixes the behaviour of the keys using incorrect HID usage codes and exposes the macro playback mode and current profile to the user space through sysfs attributes. It also adds two LED class devices controlling the "record" LED and the backlight. Signed-off-by: Clément Vuchener --- Documentation/ABI/testing/sysfs-driver-hid-corsair | 15 + drivers/hid/Kconfig| 10 + drivers/hid/Makefile | 1 + drivers/hid/hid-core.c | 1 + drivers/hid/hid-corsair.c | 555 + drivers/hid/hid-ids.h | 3 + 6 files changed, 585 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-hid-corsair create mode 100644 drivers/hid/hid-corsair.c diff --git a/Documentation/ABI/testing/sysfs-driver-hid-corsair b/Documentation/ABI/testing/sysfs-driver-hid-corsair new file mode 100644 index 000..b8827f0 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-hid-corsair @@ -0,0 +1,15 @@ +What: /sys/bus/drivers/corsair//macro_mode +Date: August 2015 +KernelVersion: 4.2 +Contact: Clement Vuchener +Description: Get/set the current playback mode. "SW" for software mode + where G-keys triggers their regular key codes. "HW" for + hardware playback mode where the G-keys play their macro + from the on-board memory. + + +What: /sys/bus/drivers/corsair//current_profile +Date: August 2015 +KernelVersion: 4.2 +Contact: Clement Vuchener +Description: Get/set the current selected profile. Values are from 1 to 3. diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig index 6ab51ae..3fe9678 100644 --- a/drivers/hid/Kconfig +++ b/drivers/hid/Kconfig @@ -171,6 +171,16 @@ config HID_CHICONY ---help--- Support for Chicony Tactical pad. +config HID_CORSAIR + tristate "Corsair devices" + depends on HID && USB && LEDS_CLASS + ---help--- + Support for Corsair devices that are not fully compliant with the + HID standard. 
+ + Supported devices: + - Vengeance K90 + config HID_PRODIKEYS tristate "Prodikeys PC-MIDI Keyboard support" depends on HID && SND diff --git a/drivers/hid/Makefile b/drivers/hid/Makefile index e6441bc..edaa0f2 100644 --- a/drivers/hid/Makefile +++ b/drivers/hid/Makefile @@ -29,6 +29,7 @@ obj-$(CONFIG_HID_BELKIN) += hid-belkin.o obj-$(CONFIG_HID_BETOP_FF) += hid-betopff.o obj-$(CONFIG_HID_CHERRY) += hid-cherry.o obj-$(CONFIG_HID_CHICONY) += hid-chicony.o +obj-$(CONFIG_HID_CORSAIR) += hid-corsair.o obj-$(CONFIG_HID_CP2112) += hid-cp2112.o obj-$(CONFIG_HID_CYPRESS) += hid-cypress.o obj-$(CONFIG_HID_DRAGONRISE) += hid-dr.o diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index bcd914a..d5fc4d1 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1828,6 +1828,7 @@ static const struct hid_device_id hid_have_special_driver[] = { { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_WIRELESS2) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_AK1D) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_ACER_SWITCH12) }, + { HID_USB_DEVICE(USB_VENDOR_ID_CORSAIR, USB_DEVICE_ID_CORSAIR_K90) }, { HID_USB_DEVICE(USB_VENDOR_ID_CREATIVELABS, USB_DEVICE_ID_PRODIKEYS_PCMIDI) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYGNAL, USB_DEVICE_ID_CYGNAL_CP2112) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYPRESS, USB_DEVICE_ID_CYPRESS_BARCODE_1) }, diff --git a/drivers/hid/hid-corsair.c b/drivers/hid/hid-corsair.c new file mode 100644 index 000..580c214 --- /dev/null +++ b/drivers/hid/hid-corsair.c @@ -0,0 +1,555 @@ +/* + * HID driver for Corsair devices + * + * Supported devices: + * - Vengeance K90 Keyboard + * + * Copyright (c) 2015 Clement Vuchener + */ + +/* + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. 
+ */ + +#include +#include +#include +#include + +#include "hid-ids.h" + +struct k90_led { + struct led_classdev cdev; + int brightness; + struct work_struct work; + int removed; +}; + +struct k90_drvdata { + int current_profile; + int macro_mode; + int meta_locked; + struct k90_led backlight; + struct k90_led record_led; +}; + +#define K90_GKEY_COUNT 18 + +static int k90_usage_to_gkey(unsigned int usage) +{ + /* G1 (0xd0) to G16 (0xdf) */ + if (usage >= 0xd0 && usage <= 0xdf) + return usage - 0xd0 + 1; + /* G17 (0xe8) to G18 (0xe9) */ + if (usage >= 0xe8 && usage <= 0
[PATCH v2 0/1] Corsair Vengeance K90 driver
I removed the k90_profile class completely. I cannot write a good enough ABI
with what I know of the keyboard so I am leaving that part out of the kernel.
If I change my mind in the future, it will be done in another patch.

I also fixed a bug I had when unregistering the led device. Work was being
scheduled after the led device was unregistered.

On the name change, I kept a lot of K90 references. As far as I know, the only
similar keyboard is the K60 that shares the same firmware but does not have all
the special keys and backlight, and for which the hid-generic driver should be
enough. The more recent RGB keyboard series uses a different protocol from what
I have seen from the unofficial userspace driver (CKB from MSC).

changes in v2:
 - Removed the k90_profile class and devices
 - Renamed driver for a more generic name ("corsair" driver in hid-corsair.c)
 - Fixed led devices clean up (hang when unplugging and led state reset)
 - Added dependency on USB and LEDS_CLASS in Kconfig

Clément Vuchener (1):
  Add Corsair Vengeance K90 driver

 Documentation/ABI/testing/sysfs-driver-hid-corsair |  15 +
 drivers/hid/Kconfig                                |  10 +
 drivers/hid/Makefile                               |   1 +
 drivers/hid/hid-core.c                             |   1 +
 drivers/hid/hid-corsair.c                          | 555 +
 drivers/hid/hid-ids.h                              |   3 +
 6 files changed, 585 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-hid-corsair
 create mode 100644 drivers/hid/hid-corsair.c

-- 
2.4.3
[PATCH v1] usb: core: driver: Use kmalloc_array
Use kmalloc_array instead of kmalloc to allocate memory for an array.
Also, remove the dev_warn for a memory leak, making the if check more sleek.

Signed-off-by: Muhammad Falak R Wani
---
On suggestion by Joe Perches

Changes since v0
 -remove dev_warn for memory leak
 -remove unnecessary parens for if
---
 drivers/usb/core/driver.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 818369a..e0636c1 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -416,12 +416,10 @@ static int usb_unbind_interface(struct device *dev)
   if (ep->streams == 0)
     continue;
   if (j == 0) {
-    eps = kmalloc(USB_MAXENDPOINTS * sizeof(void *),
+    eps = kmalloc_array(USB_MAXENDPOINTS, sizeof(void *),
         GFP_KERNEL);
-    if (!eps) {
-      dev_warn(dev, "oom, leaking streams\n");
+    if (!eps)
       break;
-    }
   }
   eps[j++] = ep;
 }
-- 
1.9.1
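The usual reason to prefer kmalloc_array() over an open-coded kmalloc(n * size) is that the array variant refuses requests whose size computation would overflow. The sketch below shows that check in plain userspace C; alloc_array() is a hypothetical stand-in built on malloc(), not the kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of kmalloc_array(): return NULL
 * instead of allocating when n * size would wrap around size_t.
 */
static void *alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* multiplication would overflow */
    return malloc(n * size);
}
```

For a compile-time constant count such as USB_MAXENDPOINTS the overflow cannot actually happen, so in this patch the change is mostly about using the idiomatic helper.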
[PATCH v4 RESEND] x86/asm/entry/32, selftests: Add 'test_syscall_vdso' test
This new test checks that all x86 registers are preserved across 32-bit syscalls. It tests syscalls through VDSO (if available) and through INT 0x80, normally and under ptrace. If kernel is a 64-bit one, high registers (r8..r15) are poisoned before the syscall is called and are checked afterwards. They must be either preserved, or cleared to zero (but r11 is special); r12..15 must be preserved for INT 0x80. EFLAGS is checked for changes too, but change there is not considered to be a bug (paravirt kernels do not preserve arithmetic flags). Run-tested on 64-bit kernel: $ ./test_syscall_vdso_32 [RUN] Executing 6-argument 32-bit syscall via VDSO [OK]Arguments are preserved across syscall [NOTE] R11 has changed:00200ed7 - assuming clobbered by SYSRET insn [OK]R8..R15 did not leak kernel data [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK]Arguments are preserved across syscall [OK]R8..R15 did not leak kernel data [RUN] Running tests under ptrace [RUN] Executing 6-argument 32-bit syscall via VDSO [OK]Arguments are preserved across syscall [OK]R8..R15 did not leak kernel data [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK]Arguments are preserved across syscall [OK]R8..R15 did not leak kernel data On 32-bit paravirt kernel: $ ./test_syscall_vdso_32 [NOTE] Not a 64-bit kernel, won't test R8..R15 leaks [RUN] Executing 6-argument 32-bit syscall via VDSO [WARN] Flags before=00200ed7 id 0 00 o d i s z 0 a 0 p 1 c [WARN] Flags after=00200246 id 0 00 i z 0 0 p 1 [WARN] Flags change=0c91 0 00 o d s 0 a 0 0 c [OK]Arguments are preserved across syscall [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK]Arguments are preserved across syscall [RUN] Running tests under ptrace [RUN] Executing 6-argument 32-bit syscall via VDSO [OK]Arguments are preserved across syscall [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK]Arguments are preserved across syscall Signed-off-by: Denys Vlasenko CC: Linus Torvalds CC: Steven Rostedt CC: Ingo Molnar CC: Borislav 
Petkov CC: "H. Peter Anvin" CC: Andy Lutomirski CC: Oleg Nesterov CC: Frederic Weisbecker CC: Alexei Starovoitov CC: Will Drewry CC: Kees Cook CC: x...@kernel.org CC: linux-kernel@vger.kernel.org --- Changes in v2: does not fail if VDSO can't be found; tests INT 80 syscall method; tests syscalls under ptrace; switched to /* */ comments Changes in v3: added checking for r8..r15 info leaks Changes in v4: re-added Makefile change tools/testing/selftests/x86/Makefile| 2 +- tools/testing/selftests/x86/test_syscall_vdso.c | 401 tools/testing/selftests/x86/thunks_32.S | 55 3 files changed, 457 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/x86/test_syscall_vdso.c create mode 100644 tools/testing/selftests/x86/thunks_32.S diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile index caa60d5..84effa6 100644 --- a/tools/testing/selftests/x86/Makefile +++ b/tools/testing/selftests/x86/Makefile @@ -5,7 +5,7 @@ include ../lib.mk .PHONY: all all_32 all_64 warn_32bit_failure clean TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs ldt_gdt syscall_nt -TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault sigreturn +TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault sigreturn test_syscall_vdso TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY) BINARIES_32 := $(TARGETS_C_32BIT_ALL:%=%_32) @@ -60,3 +60,4 @@ endif # Some tests have additional dependencies. sysret_ss_attrs_64: thunks.S +test_syscall_vdso_32: thunks_32.S diff --git a/tools/testing/selftests/x86/test_syscall_vdso.c b/tools/testing/selftests/x86/test_syscall_vdso.c new file mode 100644 index 000..0792aef --- /dev/null +++ b/tools/testing/selftests/x86/test_syscall_vdso.c @@ -0,0 +1,401 @@ +/* + * 32-bit syscall ABI conformance test. 
+ * + * Copyright (c) 2015 Denys Vlasenko + * + * This program is free software; you can redistribute it and/or modify + * it under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ +/* + * Can be built statically: + * gcc -Os -Wall -static -m32 test_syscall_vdso.c thunks_32.S + */ +#undef _GNU_SOURCE +#define _GNU_SOURCE 1 +#undef __USE_GNU +#define __USE_GNU 1 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#if !defined(__i386__) +int main(int argc, char **argv, char **envp) +{ + printf("[SKIP]\tNot a 32-bit x86 userspace\n"); + return 0; +} +#else + +long syscal
Re: [RFC PATCH 1/3] arm64: entry: Remove unnecessary calculation for S_SP in EL1h
On Sep 7, 2015, at 11:56 PM, Mark Rutland wrote:

Hi Mark,

> On Fri, Sep 04, 2015 at 03:23:05PM +0100, Jungseok Lee wrote:
>> Under EL1h, S_SP data is not seen in kernel_exit. Thus, x21 calculation
>> is not needed in kernel_entry. Currently, S_SP information is valid only
>> when sp_el0 is used.
>
> I don't think this is true. The generic BUG implementation will grab the
> saved SP from the pt_regs, and with this change we'll report whatever
> happened to be in x21 instead.
>
>> Signed-off-by: Jungseok Lee
>> ---
>>  arch/arm64/kernel/entry.S | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index e163518..d23ca0d 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -91,8 +91,6 @@
>>  get_thread_info tsk // Ensure MDSCR_EL1.SS is clear,
>>  ldr x19, [tsk, #TI_FLAGS] // since we can unmask debug
>>  disable_step_tsk x19, x20 // exceptions when scheduling.
>> -.else
>> -add x21, sp, #S_FRAME_SIZE
>>  .endif
>>  mrs x22, elr_el1
>>  mrs x23, spsr_el1
>
> Immediately after this we do:
>
> stp lr, x21, [sp, #S_LR]
>
> To store the LR and SP to the pt_regs which bug_handler would use.
>
> Am I missing something?

No, you're right. As James mentioned, x21 is used in do_sp_pc_abort.
Thanks for the comment.

Best Regards
Jungseok Lee
Re: [PATCH] ARM: fix bug which lowmem size is limited to 760MB
On Mon, 7 Sep 2015, Arnd Bergmann wrote:

> On Monday 07 September 2015 11:34:36 Nicolas Pitre wrote:
> >
> > That shifts the risk to user space though. But if there is a regression
> > there, it will manifest itself on all systems and not only with some
> > particular hardware.
>
> I'd consider that a good thing, as it makes it easier to test when
> you see the same behavior on systems with any memory size.

Sure, that was my point, although I admittedly didn't say it clearly.

Nicolas
Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.
On Sep 7, 2015, at 11:36 PM, James Morse wrote:

Hi James,

> Having to handle interrupts on top of an existing kernel stack means the
> kernel stack must be large enough to accommodate both the maximum kernel
> usage, and the maximum irq handler usage. Switching to a different stack
> when processing irqs allows us to make the stack size smaller.
>
> Maximum kernel stack usage (running ltp and generating usb+ethernet
> interrupts) was 7256 bytes. With this patch, the same workload gives
> a maximum stack usage of 5816 bytes.

I'd like to know how to measure the max stack depth.
AFAIK, a stack tracer on ftrace does not work well. Did you dump a stack
region and find or track down an untouched region?

I will leave comments after reading and playing with this change carefully.

Best Regards
Jungseok Lee

> Signed-off-by: James Morse
> ---
>  arch/arm64/include/asm/irq.h | 12 +
>  arch/arm64/include/asm/thread_info.h | 8 --
>  arch/arm64/kernel/entry.S | 33 ---
>  arch/arm64/kernel/irq.c | 52
>  arch/arm64/kernel/smp.c | 4 +++
>  arch/arm64/kernel/stacktrace.c | 4 ++-
>  6 files changed, 107 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index bbb251b14746..050d4196c736 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -2,14 +2,20 @@
>  #define __ASM_IRQ_H
>
>  #include
> +#include
>
>  #include
> +#include
> +
> +DECLARE_PER_CPU(unsigned long, irq_sp);
>
>  struct pt_regs;
>
>  extern void migrate_irqs(void);
>  extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
>
> +extern int alloc_irq_stack(unsigned int cpu);
> +
>  static inline void acpi_irq_init(void)
>  {
>  /*
> @@ -21,4 +27,10 @@ static inline void acpi_irq_init(void)
>  }
>  #define acpi_irq_init acpi_irq_init
>
> +static inline bool is_irq_stack(unsigned long sp)
> +{
> + struct thread_info *ti = get_thread_info(sp);
> + return (get_thread_info(per_cpu(irq_sp, ti->cpu)) == ti);
> +}
> +
>  #endif
> diff --git
a/arch/arm64/include/asm/thread_info.h > b/arch/arm64/include/asm/thread_info.h > index dcd06d18a42a..b906254fc400 100644 > --- a/arch/arm64/include/asm/thread_info.h > +++ b/arch/arm64/include/asm/thread_info.h > @@ -69,12 +69,16 @@ register unsigned long current_stack_pointer asm ("sp"); > /* > * how to get the thread information struct from C > */ > +static inline struct thread_info *get_thread_info(unsigned long sp) > +{ > + return (struct thread_info *)(sp & ~(THREAD_SIZE - 1)); > +} > + > static inline struct thread_info *current_thread_info(void) > __attribute_const__; > > static inline struct thread_info *current_thread_info(void) > { > - return (struct thread_info *) > - (current_stack_pointer & ~(THREAD_SIZE - 1)); > + return get_thread_info(current_stack_pointer); > } > > #define thread_saved_pc(tsk) \ > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index e16351819fed..d42371f3f5a1 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -190,10 +190,37 @@ tsk .reqx28 // current thread_info > * Interrupt handling. > */ > .macro irq_handler > - adrpx1, handle_arch_irq > - ldr x1, [x1, #:lo12:handle_arch_irq] > - mov x0, sp > + mrs x21, tpidr_el1 > + adr_l x20, irq_sp > + add x20, x20, x21 > + > + ldr x21, [x20] > + mov x20, sp > + > + mov x0, x21 > + mov x1, x20 > + bl irq_copy_thread_info > + > + /* test for recursive use of irq_sp */ > + cbz w0, 1f > + mrs x30, elr_el1 > + mov sp, x21 > + > + /* > + * Create a fake stack frame to bump unwind_frame() onto the original > + * stack. This relies on x29 not being clobbered by kernel_entry(). 
> + */ > + pushx29, x30 > + > +1: ldr_l x1, handle_arch_irq > + mov x0, x20 > blr x1 > + > + mov x0, x20 > + mov x1, x21 > + bl irq_copy_thread_info > + mov sp, x20 > + > .endm > > .text > diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c > index 463fa2e7e34c..10b57a006da8 100644 > --- a/arch/arm64/kernel/irq.c > +++ b/arch/arm64/kernel/irq.c > @@ -26,11 +26,14 @@ > #include > #include > #include > +#include > #include > #include > > unsigned long irq_err_count; > > +DEFINE_PER_CPU(unsigned long, irq_sp) = 0; > + > int arch_show_interrupts(struct seq_file *p, int prec) > { > #ifdef CONFIG_SMP > @@ -55,6 +58,10 @@ void __init init_IRQ(void) > irqchip_init(); > if (!handle_arch_irq) > panic("No interrupt controller found."); > + > + /* Allocate an irq stack for the boot cpu */ > + if (alloc_irq_stack(smp_processor_id())) > + panic("Failed to allocate irq stack for boot cpu."); > } >
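The get_thread_info() and is_irq_stack() helpers in the quoted patch both depend on every stack being THREAD_SIZE-aligned, so that masking any stack pointer yields the base of the stack it points into. Below is a minimal userspace sketch of that arithmetic. The 16KB THREAD_SIZE is an assumption taken from the arm64 stack size mentioned elsewhere in this thread, and stack_base() and same_stack() are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16KB, naturally aligned stacks. */
#define THREAD_SIZE 16384UL

/* The mask trick only works when THREAD_SIZE is a power of two. */
static_assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0,
	      "THREAD_SIZE must be a power of two");

/* Stand-in for get_thread_info(): round sp down to the stack base. */
static inline uintptr_t stack_base(uintptr_t sp)
{
	return sp & ~(uintptr_t)(THREAD_SIZE - 1);
}

/*
 * Stand-in for is_irq_stack(): sp is on the irq stack when both
 * pointers round down to the same THREAD_SIZE-aligned base.
 */
static inline bool same_stack(uintptr_t sp, uintptr_t irq_sp)
{
	return stack_base(sp) == stack_base(irq_sp);
}
```

The comparison needs no stored bounds for the irq stack: one mask per pointer is enough, which is why the patch can reuse the same helper for current_thread_info().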
[PATCH v4 4/4] ARM: dts: add suspend opp to exynos4412
Mark 800MHz OPP as a suspend opp for Exynos4412 based boards so
effectively cpufreq-dt driver behavior w.r.t. suspend frequency
matches what the old exynos-cpufreq driver has been doing.

This patch fixes suspend/resume support on Exynos4412 based
Trats2 board and reboot hang on Exynos4412 based Odroid U3 board.

Cc: Thomas Abraham
Cc: Javier Martinez Canillas
Cc: Krzysztof Kozlowski
Cc: Marek Szyprowski
Cc: Tobias Jakobi
Acked-by: Viresh Kumar
Signed-off-by: Bartlomiej Zolnierkiewicz
---
 arch/arm/boot/dts/exynos4412.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/exynos4412.dtsi b/arch/arm/boot/dts/exynos4412.dtsi
index ca0e3c1..294cfe4 100644
--- a/arch/arm/boot/dts/exynos4412.dtsi
+++ b/arch/arm/boot/dts/exynos4412.dtsi
@@ -98,6 +98,7 @@
 			opp-hz = /bits/ 64 <8>;
 			opp-microvolt = <100>;
 			clock-latency-ns = <20>;
+			opp-suspend;
 		};
 		opp07 {
 			opp-hz = /bits/ 64 <9>;
-- 
1.9.1
Re: [PATCH] ARM: fix bug which lowmem size is limited to 760MB
On Monday 07 September 2015 11:34:36 Nicolas Pitre wrote:
>
> That shifts the risk to user space though. But if there is a regression
> there, it will manifest itself on all systems and not only with some
> particular hardware.

I'd consider that a good thing, as it makes it easier to test when you
see the same behavior on systems with any memory size.

	Arnd
[PATCH v4 3/4] cpufreq-dt: add suspend frequency support
Add suspend frequency support and if needed set it to
the frequency obtained from the suspend opp (can be defined
using opp-v2 bindings and is optional).

Cc: Viresh Kumar
Cc: Thomas Abraham
Cc: Javier Martinez Canillas
Cc: Krzysztof Kozlowski
Cc: Marek Szyprowski
Cc: Tobias Jakobi
Signed-off-by: Bartlomiej Zolnierkiewicz
---
 drivers/cpufreq/cpufreq-dt.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index c3583cd..e08ae40 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -196,6 +196,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	struct device *cpu_dev;
 	struct regulator *cpu_reg;
 	struct clk *cpu_clk;
+	struct dev_pm_opp *suspend_opp;
 	unsigned long min_uV = ~0, max_uV = 0;
 	unsigned int transition_latency;
 	bool need_update = false;
@@ -329,6 +330,13 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 
 	policy->driver_data = priv;
 	policy->clk = cpu_clk;
+
+	rcu_read_lock();
+	suspend_opp = dev_pm_opp_get_suspend_opp(cpu_dev);
+	if (suspend_opp)
+		policy->suspend_freq = dev_pm_opp_get_freq(suspend_opp) / 1000;
+	rcu_read_unlock();
+
 	ret = cpufreq_table_validate_and_show(policy, freq_table);
 	if (ret) {
 		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
@@ -419,6 +427,9 @@ static struct cpufreq_driver dt_cpufreq_driver = {
 	.ready = cpufreq_ready,
 	.name = "cpufreq-dt",
 	.attr = cpufreq_dt_attr,
+#ifdef CONFIG_PM
+	.suspend = cpufreq_generic_suspend,
+#endif
 };
 
 static int dt_cpufreq_probe(struct platform_device *pdev)
-- 
1.9.1
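The division by 1000 in the cpufreq_init() hunk converts the OPP rate (Hz) into the kHz unit that cpufreq policies use, and the suspend frequency stays unset when no suspend opp exists. A minimal userspace sketch of that logic follows. struct opp_model and suspend_freq_khz() are hypothetical stand-ins for illustration only, and the RCU locking around the lookup is deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pm_opp: only the rate matters here. */
struct opp_model {
	unsigned long rate_hz;
};

/*
 * Mirrors the logic added to cpufreq_init(): if a suspend opp exists,
 * convert its rate from Hz to kHz; otherwise return 0, which means
 * "no suspend frequency defined".
 */
static unsigned int suspend_freq_khz(const struct opp_model *suspend_opp)
{
	unsigned long khz;

	if (!suspend_opp)
		return 0;

	khz = suspend_opp->rate_hz / 1000;
	assert(khz <= UINT_MAX);	/* policy frequencies are unsigned int */
	return (unsigned int)khz;
}
```

With the 800MHz opp from patch 4/4, this yields a suspend frequency of 800000 kHz; a board without an opp-suspend entry keeps the value 0.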
[PATCH v4 1/4] PM / OPP: add dev_pm_opp_get_suspend_opp() helper
Add dev_pm_opp_get_suspend_opp() helper to obtain suspend opp.

Cc: Viresh Kumar
Cc: Thomas Abraham
Cc: Javier Martinez Canillas
Cc: Krzysztof Kozlowski
Cc: Marek Szyprowski
Cc: Tobias Jakobi
Signed-off-by: Bartlomiej Zolnierkiewicz
---
 drivers/base/power/opp.c | 30 ++
 include/linux/pm_opp.h   |  6 ++
 2 files changed, 36 insertions(+)

diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
index eb25449..3d948ea 100644
--- a/drivers/base/power/opp.c
+++ b/drivers/base/power/opp.c
@@ -341,6 +341,36 @@ unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_clock_latency);
 
 /**
+ * dev_pm_opp_get_suspend_opp() - Get suspend opp
+ * @dev:	device for which we do this operation
+ *
+ * Return: This function returns pointer to the suspend opp if it is
+ * defined, otherwise it returns NULL.
+ *
+ * Locking: This function must be called under rcu_read_lock(). opp is a rcu
+ * protected pointer. The reason for the same is that the opp pointer which is
+ * returned will remain valid for use with opp_get_{voltage, freq} only while
+ * under the locked area. The pointer returned must be used prior to unlocking
+ * with rcu_read_unlock() to maintain the integrity of the pointer.
+ */
+struct dev_pm_opp *dev_pm_opp_get_suspend_opp(struct device *dev)
+{
+	struct device_opp *dev_opp;
+	struct dev_pm_opp *opp;
+
+	opp_rcu_lockdep_assert();
+
+	dev_opp = _find_device_opp(dev);
+	if (IS_ERR(dev_opp))
+		opp = NULL;
+	else
+		opp = dev_opp->suspend_opp;
+
+	return opp;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_get_suspend_opp);
+
+/**
  * dev_pm_opp_get_opp_count() - Get number of opps available in the opp list
  * @dev:	device for which we do this operation
  *
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index cab7ba5..e817722 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -34,6 +34,7 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp);
 
 int dev_pm_opp_get_opp_count(struct device *dev);
 unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev);
+struct dev_pm_opp *dev_pm_opp_get_suspend_opp(struct device *dev);
 
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      unsigned long freq,
@@ -80,6 +81,11 @@ static inline unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 	return 0;
 }
 
+static inline struct dev_pm_opp *dev_pm_opp_get_suspend_opp(struct device *dev)
+{
+	return NULL;
+}
+
 static inline struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					unsigned long freq, bool available)
 {
-- 
1.9.1
[PATCH v4 2/4] cpufreq: allow cpufreq_generic_suspend() to work without suspend frequency
Some cpufreq drivers may set suspend frequency only for
selected setups but still would like to use the generic
suspend handler. Thus don't treat !policy->suspend_freq
condition as an incorrect one.

Cc: Viresh Kumar
Cc: Thomas Abraham
Cc: Javier Martinez Canillas
Cc: Krzysztof Kozlowski
Cc: Marek Szyprowski
Cc: Tobias Jakobi
Signed-off-by: Bartlomiej Zolnierkiewicz
---
 drivers/cpufreq/cpufreq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b3d9368..a634fcb 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1626,8 +1626,8 @@ int cpufreq_generic_suspend(struct cpufreq_policy *policy)
 	int ret;
 
 	if (!policy->suspend_freq) {
-		pr_err("%s: suspend_freq can't be zero\n", __func__);
-		return -EINVAL;
+		pr_debug("%s: suspend_freq not defined\n", __func__);
+		return 0;
 	}
 
 	pr_debug("%s: Setting suspend-freq: %u\n", __func__,
-- 
1.9.1
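The behavioural change in this patch can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct policy_model and generic_suspend_model() are hypothetical names, and the real handler asks the driver to switch to the target frequency rather than assigning a field as done here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct cpufreq_policy. */
struct policy_model {
	unsigned int suspend_freq;	/* kHz, 0 when not defined */
	unsigned int cur;		/* kHz, current frequency */
};

/*
 * strict == true models the old behaviour (a missing suspend frequency
 * is an error); strict == false models the behaviour after this patch
 * (a missing suspend frequency is silently accepted).
 */
static int generic_suspend_model(struct policy_model *p, bool strict)
{
	assert(p != NULL);

	if (!p->suspend_freq)
		return strict ? -EINVAL : 0;

	/* Simplification: the kernel requests a frequency change here. */
	p->cur = p->suspend_freq;
	return 0;
}
```

The lenient variant is what lets cpufreq-dt register the generic handler unconditionally: boards without a suspend opp suspend at whatever frequency they were already running.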
Re: [RFC PATCH 0/3] Implement IRQ stack on ARM64
On Sep 7, 2015, at 11:33 PM, James Morse wrote:
> On 04/09/15 15:23, Jungseok Lee wrote:
>> ARM64 kernel allocates 16KB kernel stack when creating a process. In case
>> of low memory platforms with tough workloads on userland, this order-2
>> allocation request reaches to memory pressure and performance degradation
>> simultaneously since VM page allocator falls into slowpath frequently,
>> which triggers page reclaim and compaction.
>>
>> I believe that one of the best solutions is to reduce kernel stack size.
>> According to the following data from stack tracer with some fixes, [1],
>> a separate IRQ stack would greatly help to decrease a kernel stack depth.
>>
>
> Hi Jungseok Lee,

Hi James Morse,

> I was working on a similar patch for irq stack, (patch as a follow up email).
>
> I suggest we work together on a single implementation. I think the only
> major difference is that you're using sp_el0 as a temporary register to
> store a copy of the stack-pointer to find struct thread_info, whereas I was
> copying it between stacks (ends up as 2x ldp/stps), which keeps the change
> restricted to irq_stack setup code.
>
> We should get some feedback as to which approach is preferred.

Great idea! I'd really like to figure out the most ideal implementation of
this feature.

Best Regards
Jungseok Lee
[PATCH v4 0/4] cpufreq-dt: add suspend frequency support
Hi,

This patch series adds suspend frequency support (using opp-v2
bindings and suspend-opp functionality) to cpufreq-dt driver
and then adds suspend opp for Exynos4412 based boards.

This patch series fixes suspend/resume support on Exynos4412
based Trats2 board and reboot hang on Exynos4412 based Odroid
U3 board.

Changes since v3:
- fixed dev_pm_opp_get_suspend_opp() locking
- shortened variable name in dev_pm_opp_get_suspend_opp()
- adjusted cpufreq_generic_suspend() to work with cpufreq-dt
- removed no longer needed cpufreq_dt_suspend()
- added Acked-by tag from Viresh to patch #4

Changes since v2:
- rewrote to use suspend-opp functionality

Changes since v1:
- removed superfluous ";"

Depends on:
- next-20150902 branch of linux-next kernel tree

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


Bartlomiej Zolnierkiewicz (4):
  PM / OPP: add dev_pm_opp_get_suspend_opp() helper
  cpufreq: allow cpufreq_generic_suspend() to work without suspend
    frequency
  cpufreq-dt: add suspend frequency support
  ARM: dts: add suspend opp to exynos4412

 arch/arm/boot/dts/exynos4412.dtsi |  1 +
 drivers/base/power/opp.c          | 30 ++
 drivers/cpufreq/cpufreq-dt.c      | 11 +++
 drivers/cpufreq/cpufreq.c         |  4 ++--
 include/linux/pm_opp.h            |  6 ++
 5 files changed, 50 insertions(+), 2 deletions(-)

-- 
1.9.1
[PATCH v4 18/20] net/xen-netback: Make it running on 64KB page granularity
The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity working as a
network backend on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data in small
chunk of 4KB. The rest of the code is relying on the grant table code.

Signed-off-by: Julien Grall

---
Cc: Ian Campbell
Cc: Wei Liu
Cc: net...@vger.kernel.org

Improvement such as support of 64KB grant is not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on a non-modified Xen.

Note that I haven't added a comment why the offset is 0 after the first
iteration. See [1] for more details.

[1] https://lkml.org/lkml/2015/8/10/456

Changes in v4:
    - Add a comment to explain how we compute MAX_XEN_SKB_FRAGS

Changes in v3:
    - Fix errors reported by checkpatch.pl
    - s/mfn/gfn/ based on the new naming
    - gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
    - The grant callback doesn't allow anymore to use less data. A
      helper has been added in netback to handle this.
Changes in v2: - Correctly set MAX_GRANT_COPY_OPS and XEN_NETBK_RX_SLOTS_MAX - Don't use XEN_PAGE_SIZE in handle_frag_list as we coalesce fragment into a new skb - Use gnntab_foreach_grant to split a Linux page into grant --- drivers/net/xen-netback/common.h | 18 +++-- drivers/net/xen-netback/netback.c | 153 -- 2 files changed, 110 insertions(+), 61 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 8a495b3..24cb365 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -44,6 +44,7 @@ #include #include #include +#include #include typedef unsigned int pending_ring_idx_t; @@ -64,8 +65,8 @@ struct pending_tx_info { struct ubuf_info callback_struct; }; -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE) +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE) struct xenvif_rx_meta { int id; @@ -80,16 +81,21 @@ struct xenvif_rx_meta { /* Discriminate from any valid pending_idx value. */ #define INVALID_PENDING_IDX 0x -#define MAX_BUFFER_OFFSET PAGE_SIZE +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE +/* The maximum number of frags is derived from the size of a grant (same + * as a Xen page size for now). + */ +#define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1) + /* It's possible for an skb to have a maximal number of frags * but still be less than MAX_BUFFER_OFFSET in size. Thus the - * worst-case number of copy operations is MAX_SKB_FRAGS per + * worst-case number of copy operations is MAX_XEN_SKB_FRAGS per * ring slot. 
*/ -#define MAX_GRANT_COPY_OPS (MAX_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE) +#define MAX_GRANT_COPY_OPS (MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE) #define NETBACK_INVALID_HANDLE -1 @@ -203,7 +209,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */ /* Maximum number of Rx slots a to-guest packet may use, including the * slot needed for GSO meta-data. */ -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1) +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_XEN_SKB_FRAGS + 1)) enum state_bit_shift { /* This bit marks that the vif is connected */ diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index d4c1bc7..b1649aa 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -263,6 +263,80 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif_queue *queue, return meta; } +struct gop_frag_copy { + struct xenvif_queue *queue; + struct netrx_pending_operations *npo; + struct xenvif_rx_meta *meta; + int head; + int gso_type; + + struct page *page; +}; + +static void xenvif_setup_copy_gop(unsigned long gfn, + unsigned int offset, + unsigned int *len, + struct gop_frag_copy *info) +{ + struct gnttab_copy *copy_gop; + struct xen_page_foreign *foreign; + /* Convenient aliases */ + struct xenvif_queue *queue = info->queue; + struct netrx_pending_operations *npo = info->npo; + struct page *page = info->page; + + BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET); + + if (npo->copy_off == MAX_BUFFER_OFFSET) + info->meta = get_next_rx_buffer(queue, npo); + + if (npo->copy_off + *len > MAX_BUFFER_OFFSET) + *len = MAX_BUFFER_OFFSET - npo->copy_off; + + copy_gop = npo->copy + npo->copy_prod++; + copy_gop->flags = GNTCOPY_de
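Two pieces of arithmetic in the netback hunks above are easy to check in isolation: the value of MAX_XEN_SKB_FRAGS, and the length clamp that xenvif_setup_copy_gop() applies so that a copy never crosses a 4KB receive buffer. The sketch below assumes XEN_PAGE_SIZE is 4096. clamp_copy_len() is a hypothetical helper, and the switch to the next rx buffer (done via get_next_rx_buffer() in the patch) is left out.

```c
#include <assert.h>

/* Assumption: Xen pages (and therefore grants) are 4KB. */
#define XEN_PAGE_SIZE		4096u
#define MAX_BUFFER_OFFSET	XEN_PAGE_SIZE

/* Same expression as in the patch: 64KB of data in 4KB grants, plus one. */
#define MAX_XEN_SKB_FRAGS	(65536 / XEN_PAGE_SIZE + 1)

/*
 * Hypothetical helper modelling the clamp in xenvif_setup_copy_gop():
 * a copy starting at copy_off may not run past MAX_BUFFER_OFFSET.
 */
static unsigned int clamp_copy_len(unsigned int copy_off, unsigned int len)
{
	assert(copy_off <= MAX_BUFFER_OFFSET);

	if (copy_off + len > MAX_BUFFER_OFFSET)
		len = MAX_BUFFER_OFFSET - copy_off;
	return len;
}
```

With these values MAX_XEN_SKB_FRAGS evaluates to 17, independent of the Linux page size, which is the point of replacing MAX_SKB_FRAGS in the ring sizing macros.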
[PATCH v4 17/20] net/xen-netfront: Make it running on 64KB page granularity
The PV network protocol is using 4KB page granularity. The goal of this patch is to allow a Linux using 64KB page granularity using network device on a non-modified Xen. It's only necessary to adapt the ring size and break skb data in small chunk of 4KB. The rest of the code is relying on the grant table code. Note that we allocate a Linux page for each rx skb but only the first 4KB is used. We may improve the memory usage by extending the size of the rx skb. Signed-off-by: Julien Grall Reviewed-by: David Vrabel --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: net...@vger.kernel.org Improvement such as support of 64KB grant is not taken into consideration in this patch because we have the requirement to run a Linux using 64KB pages on a non-modified Xen. Tested with workload such as ping, ssh, wget, git... I would happy if someone give details how to test all the path. Changes in v4: - s/gnttab_one_grant/gnttab_for_one_grant/ based on the new naming - Add David's reviewed-by Changes in v3: - Fix errors reported by checkpatch.pl - s/mfn/gfn/ base on the new naming - xennet_tx_setup_grant was calling itself resulting an guest stall when using iperf. 
- The grant callback doesn't allow anymore to change the len (wasn't used here) - gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range - gnttab_page_grant_foreign_ref has been renamed to gnttab_foreach_grant_foreign_ref_one Changes in v2: - Use gnttab_foreach_grant to split a Linux page in grant - Fix count slots --- drivers/net/xen-netfront.c | 122 - 1 file changed, 86 insertions(+), 36 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 47f791e..17b1013 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -74,8 +74,8 @@ struct netfront_cb { #define GRANT_INVALID_REF 0 -#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) -#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) +#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE) +#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE) /* Minimum number of Rx slots (includes slot for GSO metadata). 
*/ #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1) @@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue) struct sk_buff *skb; unsigned short id; grant_ref_t ref; - unsigned long gfn; + struct page *page; struct xen_netif_rx_request *req; skb = xennet_alloc_one_rx_buffer(queue); @@ -307,14 +307,13 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue) BUG_ON((signed short)ref < 0); queue->grant_rx_ref[id] = ref; - gfn = xen_page_to_gfn(skb_frag_page(&skb_shinfo(skb)->frags[0])); + page = skb_frag_page(&skb_shinfo(skb)->frags[0]); req = RING_GET_REQUEST(&queue->rx, req_prod); - gnttab_grant_foreign_access_ref(ref, - queue->info->xbdev->otherend_id, - gfn, - 0); - + gnttab_page_grant_foreign_access_ref_one(ref, + queue->info->xbdev->otherend_id, +page, +0); req->id = id; req->gref = ref; } @@ -415,25 +414,33 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue) xennet_maybe_wake_tx(queue); } -static struct xen_netif_tx_request *xennet_make_one_txreq( - struct netfront_queue *queue, struct sk_buff *skb, - struct page *page, unsigned int offset, unsigned int len) +struct xennet_gnttab_make_txreq { + struct netfront_queue *queue; + struct sk_buff *skb; + struct page *page; + struct xen_netif_tx_request *tx; /* Last request */ + unsigned int size; +}; + +static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset, + unsigned int len, void *data) { + struct xennet_gnttab_make_txreq *info = data; unsigned int id; struct xen_netif_tx_request *tx; grant_ref_t ref; - - len = min_t(unsigned int, PAGE_SIZE - offset, len); + /* convenient aliases */ + struct page *page = info->page; + struct netfront_queue *queue = info->queue; + struct sk_buff *skb = info->skb; id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs); tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++); ref = gnttab_claim_
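The netfront changes hand each 4KB piece of a Linux page to a callback instead of clamping the length to PAGE_SIZE by hand. The loop behind that idea can be sketched in userspace as follows. This is only loosely modelled on the grant helpers named in the changelog: foreach_xen_chunk(), chunk_fn and count_chunk() are hypothetical names, and a 64KB Linux page with 4KB Xen pages is assumed.

```c
#include <assert.h>

/* Assumptions: 64KB Linux pages, 4KB Xen pages. */
#define LINUX_PAGE_SIZE	65536u
#define XEN_PAGE_SIZE	4096u

/* Hypothetical callback: one call per 4KB chunk touched by the range. */
typedef void (*chunk_fn)(unsigned int xen_idx, unsigned int offset,
			 unsigned int len, void *data);

/*
 * Walk [offset, offset + len) inside one Linux page and invoke fn once
 * per Xen page, passing the offset within that Xen page and the number
 * of bytes that fall into it.
 */
static void foreach_xen_chunk(unsigned int offset, unsigned int len,
			      chunk_fn fn, void *data)
{
	assert(offset + len <= LINUX_PAGE_SIZE);

	while (len) {
		unsigned int idx = offset / XEN_PAGE_SIZE;
		unsigned int off = offset % XEN_PAGE_SIZE;
		unsigned int chunk = XEN_PAGE_SIZE - off;

		if (chunk > len)
			chunk = len;

		fn(idx, off, chunk, data);
		offset += chunk;
		len -= chunk;
	}
}

/* Example callback: count how many chunks (tx requests) are needed. */
static void count_chunk(unsigned int xen_idx, unsigned int offset,
			unsigned int len, void *data)
{
	(void)xen_idx;
	(void)offset;
	(void)len;
	++*(unsigned int *)data;
}
```

A 5000-byte fragment starting at offset 4000 touches three Xen pages, so it needs three tx requests even though it fits comfortably inside one Linux page.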
[PATCH v4 15/20] block/xen-blkfront: Make it running on 64KB page granularity
The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity using block
device on a non-modified Xen.

The block API is using segment which should at least be the size of a
Linux page. Therefore, the driver will have to break the page in chunk
of 4K before giving the page to the backend.

When breaking a 64KB segment in 4KB chunks, it is possible that some
chunks are empty. As the PV protocol always require to have data in the
chunk, we have to count the number of Xen page which will be in use and
avoid sending empty chunks.

Note that, a pre-defined number of grants are reserved before preparing
the request. This pre-defined number is based on the number and the
maximum size of the segments. If each segment contains a very small
amount of data, the driver may reserve too many grants (16 grants is
reserved per segment with 64KB page granularity).

Furthermore, in the case of persistent grants we allocate one Linux page
per grant although only the first 4KB of the page will be effectively
in use. This could be improved by sharing the page with multiple grants.

Signed-off-by: Julien Grall
Acked-by: Roger Pau Monné

---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel

Improvement such as support 64KB grant is not taken into consideration in
this patch because we have the requirement to run a Linux using 64KB page
on a non-modified Xen.

Changes in v4:
    - Rebase after d50babbe300eedf33ea5b00a12c5df3a05bd96c7 "
    xen-blkfront: introduce blkfront_gather_backend_features()"
    - Fix typos
    - Add Roger's acked-by

Changes in v3:
    - Use DIV_ROUND_UP in INDIRECT_GREFS
    - Split lines over 80 characters whenever it's possible
    - s/mfn/gfn/ based on the new naming
    - The grant callback doesn't allow anymore to change the len
      (wasn't used here).
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range - Use gnttab_count_grant to get the number of grants in a sg - Do some renaming to use the correct variable every time Changes in v2: - Use gnttab_foreach_grant to split a Linux page into grant --- drivers/block/xen-blkfront.c | 324 --- 1 file changed, 213 insertions(+), 111 deletions(-) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 4232cbd..f2cdc73 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -78,6 +78,7 @@ struct blk_shadow { struct grant **grants_used; struct grant **indirect_grants; struct scatterlist *sg; + unsigned int num_sg; }; struct split_bio { @@ -107,8 +108,12 @@ static unsigned int xen_blkif_max_ring_order; module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, S_IRUGO); MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the shared ring"); -#define BLK_RING_SIZE(info) __CONST_RING_SIZE(blkif, PAGE_SIZE * (info)->nr_ring_pages) -#define BLK_MAX_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE * XENBUS_MAX_RING_PAGES) +#define BLK_RING_SIZE(info)\ + __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * (info)->nr_ring_pages) + +#define BLK_MAX_RING_SIZE \ + __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * XENBUS_MAX_RING_PAGES) + /* * ring-ref%i i=(-1UL) would take 11 characters + 'ring-ref' is 8, so 19 * characters are enough. Define to 20 to keep consist with backend. 
@@ -147,6 +152,7 @@ struct blkfront_info unsigned int discard_granularity; unsigned int discard_alignment; unsigned int feature_persistent:1; + /* Number of 4KB segments handled */ unsigned int max_indirect_segments; int is_ready; struct blk_mq_tag_set tag_set; @@ -175,10 +181,23 @@ static DEFINE_SPINLOCK(minor_lock); #define DEV_NAME "xvd" /* name in /dev */ -#define SEGS_PER_INDIRECT_FRAME \ - (PAGE_SIZE/sizeof(struct blkif_request_segment)) -#define INDIRECT_GREFS(_segs) \ - ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME) +/* + * Grants are always the same size as a Xen page (i.e 4KB). + * A physical segment is always the same size as a Linux page. + * Number of grants per physical segment + */ +#define GRANTS_PER_PSEG(PAGE_SIZE / XEN_PAGE_SIZE) + +#define GRANTS_PER_INDIRECT_FRAME \ + (XEN_PAGE_SIZE / sizeof(struct blkif_request_segment)) + +#define PSEGS_PER_INDIRECT_FRAME \ + (GRANTS_INDIRECT_FRAME / GRANTS_PSEGS) + +#define INDIRECT_GREFS(_grants)\ + DIV_ROUND_UP(_grants, GRANTS_PER_INDIRECT_FRAME) + +#define GREFS(_psegs) ((_psegs) * GRANTS_PER_PSEG) static int blkfront_setup_indirect(struct blkfront_info *info); static int blkfront_gather_backend_features(struct blkfront_info *info); @@ -466,14 +485,100 @@ static int blkif_queue_discard_req(struct request *req) return 0; } +struct setu
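The grant accounting described in the commit message (16 grants reserved per 64KB segment, while only the non-empty 4KB chunks are sent) can be checked with a small userspace sketch. Everything below is an illustration under stated assumptions: 64KB Linux pages, 4KB Xen pages, and struct segment_model as a stand-in for struct blkif_request_segment with an assumed size of 8 bytes. grants_in_use() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 64KB Linux pages, 4KB Xen pages. */
#define LINUX_PAGE_SIZE	65536u
#define XEN_PAGE_SIZE	4096u

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Stand-in for struct blkif_request_segment (assumed 8 bytes). */
struct segment_model {
	uint32_t gref;
	uint8_t first_sect, last_sect;
};
static_assert(sizeof(struct segment_model) == 8, "unexpected padding");

#define GRANTS_PER_PSEG		(LINUX_PAGE_SIZE / XEN_PAGE_SIZE)
#define GRANTS_PER_INDIRECT_FRAME \
	(XEN_PAGE_SIZE / sizeof(struct segment_model))
#define INDIRECT_GREFS(grants) \
	DIV_ROUND_UP(grants, GRANTS_PER_INDIRECT_FRAME)
#define GREFS(psegs)		((psegs) * GRANTS_PER_PSEG)

/*
 * Hypothetical helper: number of 4KB grants that actually carry data
 * for a segment of len bytes (len > 0) starting at offset within a
 * Linux page. Empty chunks before and after the data are not counted.
 */
static unsigned int grants_in_use(unsigned int offset, unsigned int len)
{
	unsigned int start = offset % XEN_PAGE_SIZE;

	assert(len > 0 && offset + len <= LINUX_PAGE_SIZE);
	return DIV_ROUND_UP(start + len, XEN_PAGE_SIZE);
}
```

A 512-byte segment still reserves GREFS(1) = 16 grants up front but only uses one, which is the over-reservation the commit message warns about.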
[PATCH v4 20/20] arm/xen: Add support for 64KB page granularity
The hypercall interface is always using 4KB page granularity. This is
requiring to use xen page definition macro when we deal with hypercall.

Note that pfn_to_gfn is working with a Xen pfn (i.e 4KB). We may want to
rename pfn_gfn to make this explicit.

We also allocate a 64KB page for the shared page even though only the
first 4KB is used. I don't think this is really important for now as it
helps to have the pointer 4KB aligned (XENMEM_add_to_physmap is taking a
Xen PFN).

Signed-off-by: Julien Grall
Reviewed-by: Stefano Stabellini

---
Cc: Russell King

Stefano, I've dropped your reviewed-by given I've updated the doc and do
changes to avoid usage of XEN_PAGE_SHIFT

Changes in v4:
    - Add Stefano's Reviewed-by

Changes in v3:
    - s/MFN/GFN/ base on the new naming
    - Use virt_to_gfn to avoid use XEN_PAGE_SHIFT
    - Drop Stefano's reviewed-by
    - Add some docs in arch/arm/asm/xen/page.h

Changes in v2
    - Add Stefano's reviewed-by
---
 arch/arm/include/asm/xen/page.h | 15 +--
 arch/arm/xen/enlighten.c        |  6 +++---
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 98c9fc3..e3d94cf 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -28,6 +28,17 @@ typedef struct xpaddr {
 
 #define INVALID_P2M_ENTRY      (~0UL)
 
+/*
+ * The pseudo-physical frame (pfn) used in all the helpers is always based
+ * on Xen page granularity (i.e 4KB).
+ *
+ * A Linux page may be split across multiple non-contiguous Xen page so we
+ * have to keep track with frame based on 4KB page granularity.
+ *
+ * PV drivers should never make a direct usage of those helpers (particularly
+ * pfn_to_gfn and gfn_to_pfn).
+ */
+
 unsigned long __pfn_to_mfn(unsigned long pfn);
 extern struct rb_root phys_to_mach;
 
@@ -64,8 +75,8 @@ static inline unsigned long bfn_to_pfn(unsigned long bfn)
 #define bfn_to_local_pfn(bfn)	bfn_to_pfn(bfn)
 
 /* VIRT <-> GUEST conversion */
-#define virt_to_gfn(v)		(pfn_to_gfn(virt_to_pfn(v)))
-#define gfn_to_virt(m)		(__va(gfn_to_pfn(m) << PAGE_SHIFT))
+#define virt_to_gfn(v)		(pfn_to_gfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
+#define gfn_to_virt(m)		(__va(gfn_to_pfn(m) << XEN_PAGE_SHIFT))
 
 /* Only used in PV code. But ARM guests are always HVM. */
 static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index eeeab07..50b4769 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -89,8 +89,8 @@ static void xen_percpu_init(void)
 	pr_info("Xen: initializing cpu%d\n", cpu);
 	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
-	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
-	info.offset = offset_in_page(vcpup);
+	info.mfn = virt_to_gfn(vcpup);
+	info.offset = xen_offset_in_page(vcpup);
 
 	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
 	BUG_ON(err);
@@ -213,7 +213,7 @@ static int __init xen_guest_init(void)
 	xatp.domid = DOMID_SELF;
 	xatp.idx = 0;
 	xatp.space = XENMAPSPACE_shared_info;
-	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+	xatp.gpfn = virt_to_gfn(shared_info_page);
 	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
 		BUG();
 
-- 
2.1.4
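The enlighten.c hunks replace PAGE_SHIFT and offset_in_page() because, with 64KB Linux pages, those give a frame number and offset in Linux page units, while the hypervisor expects 4KB units. A minimal userspace sketch of the difference follows. The shift values are assumptions (4KB Xen pages, 64KB Linux pages), and the *_model functions are hypothetical stand-ins for the kernel macros, operating on a plain physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4KB Xen pages, 64KB Linux pages. */
#define XEN_PAGE_SHIFT		12
#define LINUX_PAGE_SHIFT	16
#define XEN_PAGE_SIZE		(UINT64_C(1) << XEN_PAGE_SHIFT)
#define LINUX_PAGE_SIZE		(UINT64_C(1) << LINUX_PAGE_SHIFT)

static_assert(LINUX_PAGE_SHIFT >= XEN_PAGE_SHIFT,
	      "a Linux page cannot be smaller than a Xen page");

/* Frame number as the hypervisor expects it (4KB units). */
static inline uint64_t xen_pfn_model(uint64_t phys)
{
	return phys >> XEN_PAGE_SHIFT;
}

/* Frame number in Linux page units: wrong for hypercalls on 64KB. */
static inline uint64_t linux_pfn_model(uint64_t phys)
{
	return phys >> LINUX_PAGE_SHIFT;
}

/* Offset inside the 4KB Xen page, in the spirit of xen_offset_in_page(). */
static inline uint64_t xen_offset_model(uint64_t phys)
{
	return phys & (XEN_PAGE_SIZE - 1);
}

/* Offset inside the 64KB Linux page, in the spirit of offset_in_page(). */
static inline uint64_t linux_offset_model(uint64_t phys)
{
	return phys & (LINUX_PAGE_SIZE - 1);
}
```

For the address 0x80012345 the two views disagree on both values: frame 0x80012 with offset 0x345 in Xen units, versus frame 0x8001 with offset 0x2345 in Linux units.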
[PATCH v4 12/20] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. However, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frames.

To avoid splitting the page into 4K frames, take advantage of the
extent_order field to directly allocate/free chunks of the Linux page
size.

Note that PVMMU is only used for PV guests (which are x86 only) and the
page granularity is always 4KB there. Some BUILD_BUG_ON have been added
to enforce this, because the PV MMU code has not been modified.

Signed-off-by: Julien Grall
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel
Cc: Wei Liu

Note that the two BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE) in the code
built for the PV MMU are both kept in order to have at least one left
even if we ever decide to drop one of the code sections.

Changes in v4:
    - s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming
    - Use the field lru in the page to get a list of pages when
    decreasing the memory reservation. This avoids using a static
    array to store the pages (see v3).
    - Update comment for EXTENT_ORDER.

Changes in v3:
    - Fix errors reported by checkpatch.pl
    - s/mfn/gfn/ based on the new naming
    - Rather than splitting the page into 4KB chunks, use the
    extent_order field to directly allocate a Linux page size. This
    avoids lots of code for no benefit.

Changes in v2:
    - Use xen_apply_to_page to split a page in 4K chunks
    - It's not necessary to have a smaller frame list. Re-use
    PAGE_SIZE
    - Convert reserve_additional_memory to use XEN_... macro
---
 drivers/xen/balloon.c | 59 ++-
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index c79329f..3babf13 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -70,6 +70,11 @@
 #include
 #include

+/* Use one extent per PAGE_SIZE to avoid to break down the page into
+ * multiple frame.
+ */
+#define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1)
+
 /*
  * balloon_process() state:
  *
@@ -230,6 +235,11 @@ static enum bp_state reserve_additional_memory(long credit)
 	nid = memory_add_physaddr_to_nid(hotplug_start_paddr);

 #ifdef CONFIG_XEN_HAVE_PVMMU
+	/* We don't support PV MMU when Linux and Xen is using
+	 * different page granularity.
+	 */
+	BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 	/*
 	 * add_memory() will build page tables for the new memory so
 	 * the p2m must contain invalid entries so the correct
@@ -326,11 +336,11 @@ static enum bp_state reserve_additional_memory(long credit)
 static enum bp_state increase_reservation(unsigned long nr_pages)
 {
 	int rc;
-	unsigned long pfn, i;
+	unsigned long i;
 	struct page *page;
 	struct xen_memory_reservation reservation = {
 		.address_bits = 0,
-		.extent_order = 0,
+		.extent_order = EXTENT_ORDER,
 		.domid        = DOMID_SELF
 	};

@@ -352,7 +362,11 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 			nr_pages = i;
 			break;
 		}
-		frame_list[i] = page_to_pfn(page);
+
+		/* XENMEM_populate_physmap requires a PFN based on Xen
+		 * granularity.
+		 */
+		frame_list[i] = page_to_xen_pfn(page);
 		page = balloon_next_page(page);
 	}

@@ -366,10 +380,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 		page = balloon_retrieve(false);
 		BUG_ON(page == NULL);

-		pfn = page_to_pfn(page);
-
 #ifdef CONFIG_XEN_HAVE_PVMMU
+		/* We don't support PV MMU when Linux and Xen is using
+		 * different page granularity.
+		 */
+		BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+			unsigned long pfn = page_to_pfn(page);
+
 			set_phys_to_machine(pfn, frame_list[i]);

 			/* Link back into the page tables if not highmem. */
@@ -396,14 +415,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 {
 	enum bp_state state = BP_DONE;
-	unsigned long pfn, i;
-	struct page *page;
+	unsigned long i;
+	struct page *page, *tmp;
 	int ret;
 	struct xen_memory_reservation reservation = {
 		.address_bits = 0,
-		.extent_order = 0,
+		.extent_order = EXTENT_ORDER,
 		.domid        = DOMID_SELF
 	};
+	LIST_HEAD(pages);

 #ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
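For reference, the EXTENT_ORDER arithmetic from the balloon patch can be sketched in user space. This is only an illustration under stated assumptions: Xen frames are 4KB, and fls_sketch(), xen_pfn_per_page() and extent_order() are hypothetical helpers written for this example, not kernel functions.

```c
#include <assert.h>

/* Assumed for illustration: Xen always uses 4KB frames. */
#define XEN_PAGE_SHIFT	12
#define XEN_PAGE_SIZE	(1UL << XEN_PAGE_SHIFT)

/*
 * Hypothetical stand-in for the kernel's fls(): 1-based index of the
 * most significant set bit, 0 when no bit is set.
 */
static int fls_sketch(unsigned long x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Number of 4KB Xen frames backing one Linux page of the given size. */
static unsigned long xen_pfn_per_page(unsigned long linux_page_size)
{
	assert(linux_page_size >= XEN_PAGE_SIZE);
	return linux_page_size / XEN_PAGE_SIZE;
}

/* Mirrors EXTENT_ORDER: log2 of the number of Xen frames per Linux page. */
static int extent_order(unsigned long linux_page_size)
{
	return fls_sketch(xen_pfn_per_page(linux_page_size)) - 1;
}
```

With 64KB Linux pages one page covers 16 Xen frames, so a single extent of order 4 describes the whole page. With 4KB pages the order is 0, which matches the value the code used before the patch.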
[PATCH v4 16/20] block/xen-blkback: Make it running on 64KB page granularity
The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to behave as a
block backend on a non-modified Xen.

It's only necessary to adapt the ring size and the number of requests
per indirect frame. The rest of the code relies on the grant table
code.

Note that the grant table code allocates a Linux page per grant, which
wastes 60KB for every grant when Linux is using 64KB page granularity.
This could be improved by sharing the page between multiple grants.

Signed-off-by: Julien Grall
Acked-by: "Roger Pau Monné"
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on a non-modified Xen.

This has been tested only with a loop device. I plan to test passing a
hard drive partition, but I haven't converted the swiotlb code yet.

Changes in v4:
    - Add Roger's acked-by

Changes in v3:
    - Use DIV_ROUND_UP in INDIRECT_PAGES to avoid a line over 80
    characters
---
 drivers/block/xen-blkback/blkback.c |  5 +++--
 drivers/block/xen-blkback/common.h  | 17 +
 drivers/block/xen-blkback/xenbus.c  |  9 ++---
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 954c002..802319a 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -961,7 +961,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req,
 		seg[n].nsec = segments[i].last_sect - segments[i].first_sect + 1;
 		seg[n].offset = (segments[i].first_sect << 9);
-		if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) ||
+		if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
 		    (segments[i].last_sect < segments[i].first_sect)) {
 			rc = -EINVAL;
 			goto unmap;
@@ -1210,6 +1210,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	req_operation = req->operation == BLKIF_OP_INDIRECT ?
 			req->u.indirect.indirect_op : req->operation;
+
 	if ((req->operation == BLKIF_OP_INDIRECT) &&
 	    (req_operation != BLKIF_OP_READ) &&
 	    (req_operation != BLKIF_OP_WRITE)) {
@@ -1268,7 +1269,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 			seg[i].nsec = req->u.rw.seg[i].last_sect - req->u.rw.seg[i].first_sect + 1;
 			seg[i].offset = (req->u.rw.seg[i].first_sect << 9);
-			if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
+			if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
 			    (req->u.rw.seg[i].last_sect < req->u.rw.seg[i].first_sect))
 				goto fail_response;
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 45a044a..68e87a0 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -51,12 +52,20 @@ extern unsigned int xen_blkif_max_ring_order;
  */
 #define MAX_INDIRECT_SEGMENTS 256

-#define SEGS_PER_INDIRECT_FRAME \
-	(PAGE_SIZE/sizeof(struct blkif_request_segment))
+/*
+ * Xen use 4K pages. The guest may use different page size (4K or 64K)
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+#define SEGS_PER_INDIRECT_FRAME\
+	(XEN_PAGES_PER_INDIRECT_FRAME / XEN_PAGES_PER_SEGMENT)
+
 #define MAX_INDIRECT_PAGES \
 	((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
-#define INDIRECT_PAGES(_segs) \
-	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+#define INDIRECT_PAGES(_segs) DIV_ROUND_UP(_segs, XEN_PAGES_PER_INDIRECT_FRAME)

 /* Not a real protocol. Used to generate ring structs which contain
  * the elements common to all protocols only. This way we get a
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index deb3f00..edd27e4 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -176,21 +176,24 @@ static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t *gref,
 	{
 		struct blkif_sring *sring;
 		sring = (struct blkif_sring *)blkif->blk_ring;
-		BACK_RING_INIT(&blkif->blk_rings.native, sring, PAGE_SIZE * nr_grefs);
+		BACK_RING_INIT(&blkif->blk_rings.native, sring,
+
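To see what the new macros in common.h evaluate to, here is a small user-space sketch. It rests on assumptions that are not part of the patch: the segment descriptor layout in struct segment_sketch is modelled on the public blkif interface (8 bytes on common ABIs), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_PAGE_SIZE		4096UL	/* Xen always uses 4KB frames */
#define SECTOR_SHIFT_SKETCH	9	/* 512-byte sectors, as in "<< 9" */
#define MAX_INDIRECT_SEGMENTS	256UL

/* Assumed layout of one indirect segment descriptor. */
struct segment_sketch {
	uint32_t gref;
	uint8_t  first_sect;
	uint8_t  last_sect;
};

/* Mirrors SEGS_PER_INDIRECT_FRAME for a given Linux page size. */
static unsigned long segs_per_indirect_frame(unsigned long linux_page_size)
{
	unsigned long xen_pages_per_segment = linux_page_size / XEN_PAGE_SIZE;
	unsigned long descs_per_frame =
		XEN_PAGE_SIZE / sizeof(struct segment_sketch);

	assert(xen_pages_per_segment >= 1);
	return descs_per_frame / xen_pages_per_segment;
}

/* Mirrors MAX_INDIRECT_PAGES: round up MAX_INDIRECT_SEGMENTS. */
static unsigned long max_indirect_pages(unsigned long linux_page_size)
{
	unsigned long segs = segs_per_indirect_frame(linux_page_size);

	return (MAX_INDIRECT_SEGMENTS + segs - 1) / segs;
}

/* Mirrors the sector check: sectors must stay inside one 4KB Xen page. */
static int segment_is_valid(unsigned int first_sect, unsigned int last_sect)
{
	return last_sect < (XEN_PAGE_SIZE >> SECTOR_SHIFT_SKETCH) &&
	       last_sect >= first_sect;
}
```

Under these assumptions a 4KB frame holds 512 descriptors. With 64KB Linux pages each segment spans 16 Xen pages, so one frame describes 32 segments and 256 segments need 8 frames. The sector check accepts sectors 0 to 7 only, whatever the Linux page size is.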
[PATCH v4 13/20] xen/events: fifo: Make it running on 64KB granularity
Only use the first 4KB of the page to store the event channel info. It
means that, with 64KB pages, we will waste 60KB every time we allocate
a page for:
     * control block: a page is allocated per CPU
     * event array: a page is allocated every time we need to expand it

I think we can reduce the memory waste for the 2 areas by:

    * control block: sharing the page between multiple vCPUs. However,
    it will require some bookkeeping in order to not free the page when
    a CPU goes offline while the other CPUs sharing the page are still
    there.

    * event array: always extend the event array by 64KB (i.e. 16 4KB
    chunks). That would require more care when we fail to expand the
    event channel.

Signed-off-by: Julien Grall
Reviewed-by: David Vrabel
Reviewed-by: Stefano Stabellini
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky

Note I haven't updated the suggestion to reduce the memory waste after
David's email [1]. I can do it if necessary.

Changes in v3:
    - Add David and Stefano's reviewed-by

[1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg04596.html
---
 drivers/xen/events/events_base.c | 2 +-
 drivers/xen/events/events_fifo.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index c49bb7a..00dd923 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -40,11 +40,11 @@
 #include
 #include
 #include
-#include
 #endif
 #include
 #include
 #include
+#include
 #include
 #include

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 1d4baf5..e3e9e3d 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -54,7 +54,7 @@

 #include "events_internal.h"

-#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t))
+#define EVENT_WORDS_PER_PAGE (XEN_PAGE_SIZE / sizeof(event_word_t))
 #define MAX_EVENT_ARRAY_PAGES (EVTCHN_FIFO_NR_CHANNELS / EVENT_WORDS_PER_PAGE)

 struct evtchn_fifo_queue {
--
2.1.4
[PATCH v4 10/20] xen/xenbus: Use Xen page definition
All the rings (xenstore and PV rings) are always based on the page
granularity of Xen.

Signed-off-by: Julien Grall
Reviewed-by: David Vrabel
Reviewed-by: Stefano Stabellini
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky

Changes in v3:
    - Fix errors reported by checkpatch.pl
    - s/MFN/GFN/ based on the new naming
    - Add David and Stefano's reviewed-by

Changes in v2:
    - Also update the ring mapping function
---
 drivers/xen/xenbus/xenbus_client.c | 6 +++---
 drivers/xen/xenbus/xenbus_probe.c  | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 2ba09c1..359e654 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -388,7 +388,7 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
 		}
 		grefs[i] = err;

-		vaddr = vaddr + PAGE_SIZE;
+		vaddr = vaddr + XEN_PAGE_SIZE;
 	}

 	return 0;
@@ -555,7 +555,7 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
 	if (!node)
 		return -ENOMEM;

-	area = alloc_vm_area(PAGE_SIZE * nr_grefs, ptes);
+	area = alloc_vm_area(XEN_PAGE_SIZE * nr_grefs, ptes);
 	if (!area) {
 		kfree(node);
 		return -ENOMEM;
@@ -750,7 +750,7 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
 		unsigned long addr;

 		memset(&unmap[i], 0, sizeof(unmap[i]));
-		addr = (unsigned long)vaddr + (PAGE_SIZE * i);
+		addr = (unsigned long)vaddr + (XEN_PAGE_SIZE * i);
 		unmap[i].host_addr = arbitrary_virt_to_machine(
 			lookup_address(addr, &level)).maddr;
 		unmap[i].dev_bus_addr = 0;
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 3cbe055..33a31cf 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -802,7 +802,8 @@ static int __init xenbus_init(void)
 			goto out_error;
 		xen_store_gfn = (unsigned long)v;
 		xen_store_interface =
-			xen_remap(xen_store_gfn << PAGE_SHIFT, PAGE_SIZE);
+			xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
+				  XEN_PAGE_SIZE);
 		break;
 	default:
 		pr_warn("Xenstore state unknown\n");
--
2.1.4
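The reason the shift matters can be shown with a few lines of user-space C. This is a sketch under the assumption of 4KB Xen frames; the helper names are hypothetical and only model the address arithmetic touched by the xenbus patch.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_PAGE_SHIFT	12
#define XEN_PAGE_SIZE	(UINT64_C(1) << XEN_PAGE_SHIFT)

/* Physical address of a Xen frame number: must use the Xen shift. */
static uint64_t xen_gfn_to_phys(uint64_t gfn)
{
	return gfn << XEN_PAGE_SHIFT;
}

/* What the pre-patch expression computes for a given Linux PAGE_SHIFT. */
static uint64_t gfn_to_phys_with_shift(uint64_t gfn, unsigned int page_shift)
{
	assert(page_shift < 64);
	return gfn << page_shift;
}

/* Offset of the i-th granted 4KB chunk inside a multi-grant ring. */
static uint64_t ring_chunk_offset(unsigned int i)
{
	return (uint64_t)i * XEN_PAGE_SIZE;
}
```

A GFN handed out by Xen counts 4KB frames, so shifting it by a 64KB PAGE_SHIFT of 16 yields an address 16 times too large. In the same way, consecutive grants of a ring are 4KB apart, not one Linux page apart.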
[PATCH v4 14/20] xen/grant-table: Make it running on 64KB granularity
The Xen interface uses 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page between
multiple grants. It will require some care with the
{Set,Clear}ForeignPage macros.

Note that no changes have been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.

Signed-off-by: Julien Grall
Reviewed-by: David Vrabel
Reviewed-by: Stefano Stabellini
---
Cc: Stefano Stabellini
Cc: Russell King
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky

Changes in v3:
    - Add Stefano's reviewed-by

Changes in v2:
    - Add David's reviewed-by
---
 arch/arm/xen/p2m.c        | 6 +++---
 drivers/xen/grant-table.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 887596c..0ed01f2 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 	for (i = 0; i < count; i++) {
 		if (map_ops[i].status)
 			continue;
-		set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
-				    map_ops[i].dev_bus_addr >> PAGE_SHIFT);
+		set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
 	}

 	return 0;
@@ -108,7 +108,7 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 	int i;

 	for (i = 0; i < count; i++) {
-		set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
+		set_phys_to_machine(unmap_ops[i].host_addr >> XEN_PAGE_SHIFT,
 				    INVALID_P2M_ENTRY);
 	}

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 7b4e1cf..99ed9c2 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -642,7 +642,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 	if (xen_auto_xlat_grant_frames.count)
 		return -EINVAL;

-	vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes);
+	vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
 	if (vaddr == NULL) {
 		pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
 			&addr);
@@ -654,7 +654,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 		return -ENOMEM;
 	}
 	for (i = 0; i < max_nr_gframes; i++)
-		pfn[i] = PFN_DOWN(addr) + i;
+		pfn[i] = XEN_PFN_DOWN(addr) + i;

 	xen_auto_xlat_grant_frames.vaddr = vaddr;
 	xen_auto_xlat_grant_frames.pfn = pfn;
@@ -1004,7 +1004,7 @@ static void gnttab_request_version(void)
 {
 	/* Only version 1 is used, which will always be available. */
 	grant_table_version = 1;
-	grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
+	grefs_per_grant_frame = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1);
 	gnttab_interface = &gnttab_v1_ops;

 	pr_info("Grant tables using version %d layout\n", grant_table_version);
--
2.1.4
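As a rough check of the numbers involved in the grant-table change, the following user-space sketch computes the values the patched expressions would produce. The struct layout is an assumption modelled on the version 1 grant entry (8 bytes), and every name ending in _sketch, as well as the helper functions, is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_PAGE_SHIFT	12
#define XEN_PAGE_SIZE	(1UL << XEN_PAGE_SHIFT)

/* Assumed layout of a version 1 grant entry. */
struct grant_entry_v1_sketch {
	uint16_t flags;
	uint16_t domid;
	uint32_t frame;
};

/* Mirrors grefs_per_grant_frame: entries that fit in one 4KB frame. */
static unsigned long grefs_per_grant_frame(void)
{
	return XEN_PAGE_SIZE / sizeof(struct grant_entry_v1_sketch);
}

/* Counterpart of XEN_PFN_DOWN(): frame number in 4KB units. */
static uint64_t xen_pfn_down(uint64_t addr)
{
	return addr >> XEN_PAGE_SHIFT;
}

/* Bytes left unused when one Linux page backs a single 4KB grant. */
static unsigned long wasted_per_grant(unsigned long linux_page_size)
{
	assert(linux_page_size >= XEN_PAGE_SIZE);
	return linux_page_size - XEN_PAGE_SIZE;
}
```

With these assumptions a grant frame holds 512 entries regardless of the Linux page size, and a 64KB Linux page backing one grant leaves 60KB unused, which is the waste the commit message refers to.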
[PATCH v4 11/20] tty/hvc: xen: Use xen page definition
The console ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall
Reviewed-by: Stefano Stabellini
---
Cc: Greg Kroah-Hartman
Cc: Jiri Slaby
Cc: David Vrabel
Cc: Boris Ostrovsky
Cc: linuxppc-...@lists.ozlabs.org

Changes in v4:
    - The ring is always 4K (i.e. XEN_PAGE_SIZE), so there is no need
    to map it with PAGE_SIZE. This was correctly done in v2 but lost
    with the rebase to the "s/mfn/gfn/" series

Changes in v3:
    - Some changes have been moved to the series "Use correctly the
    Xen memory terminologies in Linux".
    - Add Stefano's reviewed-by
---
 drivers/tty/hvc/hvc_xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 10beb15..fa816b7 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -230,7 +230,7 @@ static int xen_hvm_console_init(void)
 	if (r < 0 || v == 0)
 		goto err;
 	gfn = v;
-	info->intf = xen_remap(gfn << PAGE_SHIFT, PAGE_SIZE);
+	info->intf = xen_remap(gfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
 	if (info->intf == NULL)
 		goto err;
 	info->vtermno = HVC_COOKIE;
@@ -472,7 +472,7 @@ static int xencons_resume(struct xenbus_device *dev)
 	struct xencons_info *info = dev_get_drvdata(&dev->dev);

 	xencons_disconnect_backend(info);
-	memset(info->intf, 0, PAGE_SIZE);
+	memset(info->intf, 0, XEN_PAGE_SIZE);
 	return xencons_connect_backend(dev, info);
 }

--
2.1.4
[PATCH v4 19/20] xen/privcmd: Add support for Linux 64KB page granularity
The hypercall interface (as well as the toolstack) is always using 4KB
page granularity. When the toolstack asks to map a series of guest
PFNs in a batch, it expects the pages to be mapped contiguously in its
virtual memory.

When Linux is using 64KB page granularity, the privcmd driver will
have to map multiple Xen PFNs in a single Linux page.

Note that this solution works for any page granularity which is a
multiple of 4KB.

Signed-off-by: Julien Grall
Reviewed-by: David Vrabel
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky

I kept the hypercall arguments in remap_data to avoid allocating them
on the stack every time that remap_pte_fn is called. I will keep it
like that unless someone strongly disagrees.

Changes in v4:
    - s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming
    - Add David's reviewed-by

Changes in v3:
    - The function to split a Linux page into multiple Xen pages has
    been moved internally. This was its only user (it is not used
    anymore in the balloon) and it's not quite clear what the common
    interface should be. Defer the question until someone needs to
    use it.
    - s/nr_pfn/numgfns/ to make clear that we are dealing with GFN
    - Use DIV_ROUND_UP rather than round_up and fix the usage in
    xen_xlate_unmap_gfn_range

Changes in v2:
    - Use xen_apply_to_page
---
 drivers/xen/privcmd.c   |   8 ++--
 drivers/xen/xlate_mmu.c | 124
 2 files changed, 89 insertions(+), 43 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index c6deb87..c8798ee 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -446,7 +446,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 		return -EINVAL;
 	}

-	nr_pages = m.num;
+	nr_pages = DIV_ROUND_UP(m.num, XEN_PFN_PER_PAGE);
 	if ((m.num <= 0) || (nr_pages > (LONG_MAX >> PAGE_SHIFT)))
 		return -EINVAL;

@@ -494,7 +494,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 			goto out_unlock;
 		}
 		if (xen_feature(XENFEAT_auto_translated_physmap)) {
-			ret = alloc_empty_pages(vma, m.num);
+			ret = alloc_empty_pages(vma, nr_pages);
 			if (ret < 0)
 				goto out_unlock;
 		} else
@@ -518,6 +518,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 	state.global_error = 0;
 	state.version = version;

+	BUILD_BUG_ON(((PAGE_SIZE / sizeof(xen_pfn_t)) % XEN_PFN_PER_PAGE) != 0);
 	/* mmap_batch_fn guarantees ret == 0 */
 	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
 				    &pagelist, mmap_batch_fn, &state));
@@ -582,12 +583,13 @@ static void privcmd_close(struct vm_area_struct *vma)
 {
 	struct page **pages = vma->vm_private_data;
 	int numpgs = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	int numgfns = (vma->vm_end - vma->vm_start) >> XEN_PAGE_SHIFT;
 	int rc;

 	if (!xen_feature(XENFEAT_auto_translated_physmap) || !numpgs || !pages)
 		return;

-	rc = xen_unmap_domain_gfn_range(vma, numpgs, pages);
+	rc = xen_unmap_domain_gfn_range(vma, numgfns, pages);
 	if (rc == 0)
 		free_xenballooned_pages(numpgs, pages);
 	else
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index cff2387..5063c5e 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -38,31 +38,28 @@
 #include
 #include

-/* map fgfn of domid to lpfn in the current domain */
-static int map_foreign_page(unsigned long lpfn, unsigned long fgfn,
-			    unsigned int domid)
-{
-	int rc;
-	struct xen_add_to_physmap_range xatp = {
-		.domid = DOMID_SELF,
-		.foreign_domid = domid,
-		.size = 1,
-		.space = XENMAPSPACE_gmfn_foreign,
-	};
-	xen_ulong_t idx = fgfn;
-	xen_pfn_t gpfn = lpfn;
-	int err = 0;
+typedef void (*xen_gfn_fn_t)(unsigned long gfn, void *data);

-	set_xen_guest_handle(xatp.idxs, &idx);
-	set_xen_guest_handle(xatp.gpfns, &gpfn);
-	set_xen_guest_handle(xatp.errs, &err);
+/* Break down the pages in 4KB chunk and call fn for each gfn */
+static void xen_for_each_gfn(struct page **pages, unsigned nr_gfn,
+			     xen_gfn_fn_t fn, void *data)
+{
+	unsigned long xen_pfn = 0;
+	struct page *page;
+	int i;

-	rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
-	return rc < 0 ? rc : err;
+	for (i = 0; i < nr_gfn; i++) {
+		if ((i % XEN_PFN_PER_PAGE) == 0) {
+			page = pages[i / XEN_PFN_PER_PAGE];
+			xen_pfn = page_to_xen_pfn(page);
+
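The page/frame bookkeeping in the privcmd patch can be modelled in user space as follows. This is a sketch, not the kernel code: it assumes 64KB Linux pages, represents a page simply by its Linux PFN, and assumes that the truncated loop body ends by calling fn for each consecutive Xen PFN. All names are hypothetical.

```c
#include <assert.h>

/* Assumed: 64KB Linux page / 4KB Xen frame. */
#define XEN_PFN_PER_PAGE	16UL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

typedef void (*gfn_fn_t)(unsigned long gfn, void *data);

/* Linux pages needed to back nr_gfn Xen frames (nr_pages in the patch). */
static unsigned long linux_pages_for(unsigned long nr_gfn)
{
	return DIV_ROUND_UP(nr_gfn, XEN_PFN_PER_PAGE);
}

/*
 * Model of the iteration: a "page" is its Linux PFN, and its first
 * Xen PFN is assumed to be pfn * XEN_PFN_PER_PAGE.
 */
static void for_each_gfn_sketch(const unsigned long *linux_pfns,
				unsigned long nr_gfn,
				gfn_fn_t fn, void *data)
{
	unsigned long xen_pfn = 0;
	unsigned long i;

	for (i = 0; i < nr_gfn; i++) {
		if ((i % XEN_PFN_PER_PAGE) == 0)
			xen_pfn = linux_pfns[i / XEN_PFN_PER_PAGE] *
				  XEN_PFN_PER_PAGE;
		fn(xen_pfn++, data);
	}
}

struct gfn_log_sketch {
	unsigned long count;
	unsigned long first;
	unsigned long last;
};

/* Callback recording how many frames were visited and which ones. */
static void record_gfn(unsigned long gfn, void *data)
{
	struct gfn_log_sketch *log = data;

	if (log->count == 0)
		log->first = gfn;
	log->last = gfn;
	log->count++;
}

/* Walk 20 Xen frames backed by Linux PFNs 3 and 10 and return the log. */
static struct gfn_log_sketch demo_walk(void)
{
	const unsigned long linux_pfns[2] = { 3, 10 };
	struct gfn_log_sketch log = { 0, 0, 0 };

	for_each_gfn_sketch(linux_pfns, 20, record_gfn, &log);
	return log;
}
```

In demo_walk(), 20 Xen frames need two Linux pages: the first page supplies frames 48 to 63, and the second page supplies only its first four frames, 160 to 163. This is why privcmd_close() has to track numpgs and numgfns separately.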
[PATCH v4 00/20] xen/arm64: Add support for 64KB page in Linux
Hi all,

ARM64 Linux supports both 4KB and 64KB page granularity. However, the
Xen hypercall interface and PV protocols are always based on 4KB page
granularity.

Any attempt to boot a Linux guest with 64KB pages enabled will result
in a guest crash.

This series is a first attempt to allow such a Linux to run with the
current hypercall interface and PV protocols.

This solution has been chosen because we want to run Linux 64KB on
released Xen ARM versions and/or platforms using an old version of
Linux DOM0.

There is room for improvement, such as support for 64KB grants,
modification of the PV protocols to support different page sizes...
They will be explored in a separate patch series later.

TODO list:
    - Convert swiotlb to 64KB
    - Convert xenfb to 64KB
    - Support for multiple page ring support
    - Support for 64KB in gntdev
    - Support of non-indirect grant with 64KB frontend
    - It may be possible to move some common defines between
    netback/netfront and blkfront/blkback into a header

I've got most of the patches for the TODO items. I'm planning to send
them as a follow-up as they are not a requirement for a basic guest.

All patches have been build tested for ARM32, ARM64 and x86. But I
haven't tried to run them on x86 as I don't have a box with Xen x86
running. I would be happy if someone could give it a try and look for
possible regressions on x86.

I know that Konrad has a test-suite for x86. Konrad, would it be
possible to give this series a run?

A branch based on the latest xentip/for-linus-4.3 can be found here:
    git://xenbits.xen.org/people/julieng/linux-arm.git branch xen-64k-v4

Comments and suggestions are welcome.

Sincerely yours,

Cc: david.vra...@citrix.com
Cc: konrad.w...@oracle.com
Cc: boris.ostrov...@oracle.com
Cc: wei.l...@citrix.com
Cc: roger@citrix.com

Status of each patch:
    A: Reviewed-by/Acked-by
    M: Patch modified in this series
    m: Minor changes in this series (i.e. renaming due to previous
       patches, typos)
    L: Missing Acked-by from a Linux maintainer (Boris, David, Konrad)
    N: Missing Acked-by from a Netback maintainer (Ian or Wei)

Julien Grall (20):
A       net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop
A       arm/xen: Drop pte_mfn and mfn_pte
A M L   xen: Add Xen specific page definition
A M     xen/grant: Introduce helpers to split a page into grant
A       xen/grant: Add helper gnttab_page_grant_foreign_access_ref_one
A       block/xen-blkfront: Split blkif_queue_request in 2
A m     block/xen-blkfront: Store a page rather a pfn in the grant structure
A       block/xen-blkfront: split get_grant in 2
A m L   xen/biomerge: Don't allow biovec's to be merged when Linux is not using 4KB pages
A       xen/xenbus: Use Xen page definition
A m L   tty/hvc: xen: Use xen page definition
  M L   xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
A       xen/events: fifo: Make it running on 64KB granularity
A       xen/grant-table: Make it running on 64KB granularity
A m     block/xen-blkfront: Make it running on 64KB page granularity
A       block/xen-blkback: Make it running on 64KB page granularity
A m     net/xen-netfront: Make it running on 64KB page granularity
  m N   net/xen-netback: Make it running on 64KB page granularity
A m     xen/privcmd: Add support for Linux 64KB page granularity
A       arm/xen: Add support for 64KB page granularity

 arch/arm/include/asm/xen/page.h     |  18 +-
 arch/arm/xen/enlighten.c            |   6 +-
 arch/arm/xen/p2m.c                  |   6 +-
 arch/x86/include/asm/xen/page.h     |   2 +-
 drivers/block/xen-blkback/blkback.c |   5 +-
 drivers/block/xen-blkback/common.h  |  17 +-
 drivers/block/xen-blkback/xenbus.c  |   9 +-
 drivers/block/xen-blkfront.c        | 552 +++-
 drivers/net/xen-netback/common.h    |  18 +-
 drivers/net/xen-netback/netback.c   | 163 +++
 drivers/net/xen-netfront.c          | 122 +---
 drivers/tty/hvc/hvc_xen.c           |   4 +-
 drivers/xen/balloon.c               |  59 +++-
 drivers/xen/biomerge.c              |   8 +
 drivers/xen/events/events_base.c    |   2 +-
 drivers/xen/events/events_fifo.c    |   2 +-
 drivers/xen/grant-table.c           |  32 ++-
 drivers/xen/privcmd.c               |   8 +-
 drivers/xen/xenbus/xenbus_client.c  |   6 +-
 drivers/xen/xenbus/xenbus_probe.c   |   3 +-
 drivers/xen/xlate_mmu.c             | 124 +---
 include/xen/grant_table.h           |  51
 include/xen/page.h                  |  27 +-
 23 files changed, 855 insertions(+), 389 deletions(-)

--
2.1.4
[PATCH v4 07/20] block/xen-blkfront: Store a page rather a pfn in the grant structure
All the uses of the field pfn are done using the same idiom:

    pfn_to_page(grant->pfn)

This will always return the same page. Store the page directly in the
grant to clean up the code.

Signed-off-by: Julien Grall
Acked-by: Roger Pau Monné
Reviewed-by: Stefano Stabellini
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel

Roger, Stefano, I kept your Acked-by/Reviewed-by because the rebase
was minor. Let me know if you disagree.

Changes in v4:
    - rebase after 7adf12b87f45a77d364464018fb8e9e1ac875152
    "xen-blkfront: don't add indirect pages to list when
    !feature_persistent"

Changes in v3:
    - Use the correct indentation in get_grant. The current
    indentation (i.e. without this patch) was wrong because it was
    using spaces rather than tabs.
    - Add Roger's acked and Stefano's reviewed
    - s/mfn/gfn/ based on the new naming

Changes in v2:
    - Patch added
---
 drivers/block/xen-blkfront.c | 39 +++
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index b11f084..556475d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -68,7 +68,7 @@ enum blkif_state {
 struct grant {
 	grant_ref_t gref;
-	unsigned long pfn;
+	struct page *page;
 	struct list_head node;
 };

@@ -222,7 +222,7 @@ static int fill_grant_buffer(struct blkfront_info *info, int num)
 				kfree(gnt_list_entry);
 				goto out_of_memory;
 			}
-			gnt_list_entry->pfn = page_to_pfn(granted_page);
+			gnt_list_entry->page = granted_page;
 		}

 		gnt_list_entry->gref = GRANT_INVALID_REF;
@@ -237,7 +237,7 @@ out_of_memory:
 				 &info->grants, node) {
 		list_del(&gnt_list_entry->node);
 		if (info->feature_persistent)
-			__free_page(pfn_to_page(gnt_list_entry->pfn));
+			__free_page(gnt_list_entry->page);
 		kfree(gnt_list_entry);
 		i--;
 	}
@@ -246,8 +246,8 @@ out_of_memory:
 }

 static struct grant *get_grant(grant_ref_t *gref_head,
-                               unsigned long pfn,
-                               struct blkfront_info *info)
+			       struct page *page,
+			       struct blkfront_info *info)
 {
 	struct grant *gnt_list_entry;
 	unsigned long buffer_gfn;
@@ -266,10 +266,10 @@ static struct grant *get_grant(grant_ref_t *gref_head,
 	gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
 	BUG_ON(gnt_list_entry->gref == -ENOSPC);
 	if (!info->feature_persistent) {
-		BUG_ON(!pfn);
-		gnt_list_entry->pfn = pfn;
+		BUG_ON(!page);
+		gnt_list_entry->page = page;
 	}
-	buffer_gfn = pfn_to_gfn(gnt_list_entry->pfn);
+	buffer_gfn = xen_page_to_gfn(gnt_list_entry->page);
 	gnttab_grant_foreign_access_ref(gnt_list_entry->gref,
 					info->xbdev->otherend_id,
 					buffer_gfn, 0);
@@ -525,7 +525,7 @@ static int blkif_queue_rw_req(struct request *req)
 		if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
 		    (i % SEGS_PER_INDIRECT_FRAME == 0)) {
-			unsigned long uninitialized_var(pfn);
+			struct page *uninitialized_var(page);

 			if (segments)
 				kunmap_atomic(segments);
@@ -542,15 +542,15 @@ static int blkif_queue_rw_req(struct request *req)
 				indirect_page = list_first_entry(&info->indirect_pages,
 								 struct page, lru);
 				list_del(&indirect_page->lru);
-				pfn = page_to_pfn(indirect_page);
+				page = indirect_page;
 			}
-			gnt_list_entry = get_grant(&gref_head, pfn, info);
+			gnt_list_entry = get_grant(&gref_head, page, info);
 			info->shadow[id].indirect_grants[n] = gnt_list_entry;
-			segments = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
+			segments = kmap_atomic(gnt_list_entry->page);
 			ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
 		}

-		gnt_list_entry = get_grant(&gref_head, page_to_pfn(sg_page(sg)), info);
+		gnt_list_entry = get_grant(&gref_head, sg_page(sg), info);
 		ref = gnt_list_entry->gref;

 		info->shadow[id].grants_used[i] = gnt_list_entry;
@@ -561,7 +5
[PATCH v4 06/20] block/xen-blkfront: Split blkif_queue_request in 2
Currently, blkif_queue_request has 2 distinct execution paths:
    - Send a discard request
    - Send a read/write request

The function also allocates grants to use for generating the request,
although they are only used for read/write requests.

Rather than having a function with 2 distinct execution paths,
separate the function in 2. This will also remove one level of
indentation.

Signed-off-by: Julien Grall
Reviewed-by: Roger Pau Monné
---
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel

Roger, if you really want, I can drop the else clause in
blkif_queue_request; IMHO it's clearer as it is. I've kept your
Reviewed-by anyway. Let me know if that's not fine.

Changes in v3:
    - Fix errors reported by checkpatch.pl
    - Add Roger's Reviewed-by

Changes in v2:
    - Patch added
---
 drivers/block/xen-blkfront.c | 277 ---
 1 file changed, 153 insertions(+), 124 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 432e105..b11f084 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -395,13 +395,35 @@ static int blkif_ioctl(struct block_device *bdev, fmode_t mode,
 	return 0;
 }

-/*
- * Generate a Xen blkfront IO request from a blk layer request. Reads
- * and writes are handled as expected.
- *
- * @req: a request struct
- */
-static int blkif_queue_request(struct request *req)
+static int blkif_queue_discard_req(struct request *req)
+{
+	struct blkfront_info *info = req->rq_disk->private_data;
+	struct blkif_request *ring_req;
+	unsigned long id;
+
+	/* Fill out a communications ring structure. */
+	ring_req = RING_GET_REQUEST(&info->ring, info->ring.req_prod_pvt);
+	id = get_id_from_freelist(info);
+	info->shadow[id].request = req;
+
+	ring_req->operation = BLKIF_OP_DISCARD;
+	ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
+	ring_req->u.discard.id = id;
+	ring_req->u.discard.sector_number = (blkif_sector_t)blk_rq_pos(req);
+	if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard)
+		ring_req->u.discard.flag = BLKIF_DISCARD_SECURE;
+	else
+		ring_req->u.discard.flag = 0;
+
+	info->ring.req_prod_pvt++;
+
+	/* Keep a private copy so we can reissue requests when recovering. */
+	info->shadow[id].req = *ring_req;
+
+	return 0;
+}
+
+static int blkif_queue_rw_req(struct request *req)
 {
 	struct blkfront_info *info = req->rq_disk->private_data;
 	struct blkif_request *ring_req;
@@ -421,9 +443,6 @@ static int blkif_queue_request(struct request *req)
 	struct scatterlist *sg;
 	int nseg, max_grefs;

-	if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
-		return 1;
-
 	max_grefs = req->nr_phys_segments;
 	if (max_grefs > BLKIF_MAX_SEGMENTS_PER_REQUEST)
 		/*
@@ -453,139 +472,131 @@ static int blkif_queue_request(struct request *req)
 	id = get_id_from_freelist(info);
 	info->shadow[id].request = req;

-	if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) {
-		ring_req->operation = BLKIF_OP_DISCARD;
-		ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
-		ring_req->u.discard.id = id;
-		ring_req->u.discard.sector_number = (blkif_sector_t)blk_rq_pos(req);
-		if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard)
-			ring_req->u.discard.flag = BLKIF_DISCARD_SECURE;
-		else
-			ring_req->u.discard.flag = 0;
+	BUG_ON(info->max_indirect_segments == 0 &&
+	       req->nr_phys_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
+	BUG_ON(info->max_indirect_segments &&
+	       req->nr_phys_segments > info->max_indirect_segments);
+	nseg = blk_rq_map_sg(req->q, req, info->shadow[id].sg);
+	ring_req->u.rw.id = id;
+	if (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
+		/*
+		 * The indirect operation can only be a BLKIF_OP_READ or
+		 * BLKIF_OP_WRITE
+		 */
+		BUG_ON(req->cmd_flags & (REQ_FLUSH | REQ_FUA));
+		ring_req->operation = BLKIF_OP_INDIRECT;
+		ring_req->u.indirect.indirect_op = rq_data_dir(req) ?
+			BLKIF_OP_WRITE : BLKIF_OP_READ;
+		ring_req->u.indirect.sector_number = (blkif_sector_t)blk_rq_pos(req);
+		ring_req->u.indirect.handle = info->handle;
+		ring_req->u.indirect.nr_segments = nseg;
 	} else {
-		BUG_ON(info->max_indirect_segments == 0 &&
-		       req->nr_phys_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
-		BUG_ON(info->max_indirect_segments &&
-		       req->nr_phys_segments > info->max_indirect_segments);
-		nseg = blk_r
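The rewritten blkif_queue_request() is cut off in this archive, so the following is only a guess at the shape of the resulting dispatcher, based on the connected check and the discard condition that the patch removes from the read/write path. All types, flag values and function names are hypothetical stand-ins. The helpers return a path marker instead of 0 so that the dispatch can be observed.

```c
#include <assert.h>

/* Hypothetical stand-ins for REQ_DISCARD and REQ_SECURE. */
#define REQ_DISCARD_SKETCH	(1u << 0)
#define REQ_SECURE_SKETCH	(1u << 1)

enum path_sketch { PATH_NOT_CONNECTED, PATH_DISCARD, PATH_RW };

struct request_sketch {
	unsigned int cmd_flags;
};

static enum path_sketch queue_discard_req(const struct request_sketch *req)
{
	(void)req;	/* the real helper fills a discard ring request */
	return PATH_DISCARD;
}

static enum path_sketch queue_rw_req(const struct request_sketch *req)
{
	(void)req;	/* the real helper allocates grants and maps segments */
	return PATH_RW;
}

/* Assumed dispatcher: check the connection once, then pick a helper. */
static enum path_sketch queue_request(int connected,
				      const struct request_sketch *req)
{
	if (!connected)
		return PATH_NOT_CONNECTED;

	if (req->cmd_flags & (REQ_DISCARD_SKETCH | REQ_SECURE_SKETCH))
		return queue_discard_req(req);
	else
		return queue_rw_req(req);
}

/* Convenience wrapper so the dispatch can be exercised with plain flags. */
static enum path_sketch path_for_flags(int connected, unsigned int cmd_flags)
{
	struct request_sketch req = { cmd_flags };

	return queue_request(connected, &req);
}
```

The point of the split is that only queue_rw_req() has to deal with grant allocation, while the connection check is done once in the dispatcher.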
Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig
On 04/09/15 00:51, Steve Muckle wrote: > Hi Morten, Dietmar, > > On 08/14/2015 09:23 AM, Morten Rasmussen wrote: > ... >> + * cfs_rq.avg.util_avg is the sum of running time of runnable tasks plus the >> + * recent utilization of currently non-runnable tasks on a CPU. It >> represents >> + * the amount of utilization of a CPU in the range [0..capacity_orig] where > > I see util_sum is scaled by SCHED_LOAD_SHIFT at the end of > __update_load_avg(). If there is now an assumption that util_avg may be > used directly as a capacity value, should it be changed to > SCHED_CAPACITY_SHIFT? These are equal right now, not sure if they will > always be or if they can be combined. You're referring to the code line 2647 sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX; in __update_load_avg()? Here we actually scale by 'SCHED_LOAD_SCALE/LOAD_AVG_MAX' so both values are load related. LOAD (UTIL) and CAPACITY have the same SCALE and SHIFT values because SCHED_LOAD_RESOLUTION is always defined to 0. scale_load() and scale_load_down() are also NOPs so this area is probably worth a separate clean-up. Beyond that, I'm not sure if the current functionality is broken if we use different SCALE and SHIFT values for LOAD and CAPACITY? > >> + * capacity_orig is the cpu_capacity available at * the highest frequency > > spurious * > > thanks, > Steve > Fixed. Thanks, -- Dietmar -- >8 -- From: Dietmar Eggemann Date: Fri, 14 Aug 2015 17:23:13 +0100 Subject: [PATCH] sched/fair: Get rid of scaling utilization by capacity_orig Utilization is currently scaled by capacity_orig, but since we now have frequency and cpu invariant cfs_rq.avg.util_avg, frequency and cpu scaling now happens as part of the utilization tracking itself. So cfs_rq.avg.util_avg should no longer be scaled in cpu_util(). 
Cc: Ingo Molnar Cc: Peter Zijlstra Signed-off-by: Dietmar Eggemann Signed-off-by: Morten Rasmussen --- kernel/sched/fair.c | 38 ++ 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2074d45a67c2..a73ece2372f5 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4824,33 +4824,39 @@ static int select_idle_sibling(struct task_struct *p, int target) done: return target; } + /* * cpu_util returns the amount of capacity of a CPU that is used by CFS * tasks. The unit of the return value must be the one of capacity so we can * compare the utilization with the capacity of the CPU that is available for * CFS task (ie cpu_capacity). - * cfs.avg.util_avg is the sum of running time of runnable tasks on a - * CPU. It represents the amount of utilization of a CPU in the range - * [0..SCHED_LOAD_SCALE]. The utilization of a CPU can't be higher than the - * full capacity of the CPU because it's about the running time on this CPU. - * Nevertheless, cfs.avg.util_avg can be higher than SCHED_LOAD_SCALE - * because of unfortunate rounding in util_avg or just - * after migrating tasks until the average stabilizes with the new running - * time. So we need to check that the utilization stays into the range - * [0..cpu_capacity_orig] and cap if necessary. - * Without capping the utilization, a group could be seen as overloaded (CPU0 - * utilization at 121% + CPU1 utilization at 80%) whereas CPU1 has 20% of - * available capacity. + * + * cfs_rq.avg.util_avg is the sum of running time of runnable tasks plus the + * recent utilization of currently non-runnable tasks on a CPU. It represents + * the amount of utilization of a CPU in the range [0..capacity_orig] where + * capacity_orig is the cpu_capacity available at the highest frequency + * (arch_scale_freq_capacity()). 
+ * The utilization of a CPU converges towards a sum equal to or less than the + * current capacity (capacity_curr <= capacity_orig) of the CPU because it is + * the running time on this CPU scaled by capacity_curr. + * + * Nevertheless, cfs_rq.avg.util_avg can be higher than capacity_curr or even + * higher than capacity_orig because of unfortunate rounding in + * cfs.avg.util_avg or just after migrating tasks and new task wakeups until + * the average stabilizes with the new running time. We need to check that the + * utilization stays within the range of [0..capacity_orig] and cap it if + * necessary. Without utilization capping, a group could be seen as overloaded + * (CPU0 utilization at 121% + CPU1 utilization at 80%) whereas CPU1 has 20% of + * available capacity. We allow utilization to overshoot capacity_curr (but not + * capacity_orig) as it is useful for predicting the capacity required after task + * migrations (scheduler-driven DVFS). */ static int cpu_util(int cpu) { unsigned long util = cpu_rq(cpu)->cfs.avg.util_avg; unsigned long capacity = capacity_orig_of(cpu); - if (util >= SCHED_LOAD_SCALE) - return capacity; - - return (util * capacity) >> SCHED_LOAD_SHIFT; + return (util >=
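The patch text above is cut off in this archive, so the following is only a stand-alone sketch of the capping that the comment describes, not the patch's actual code. The names cap_util() and group_util() are made up for illustration, and the numbers assume a capacity_orig of 1024 per CPU. It shows why the 121% + 80% group from the comment is not seen as overloaded once each CPU is capped at capacity_orig.

```c
#include <assert.h>

/*
 * Hypothetical model of the capping step: utilization and capacity are
 * assumed to share one unit, so no rescaling is needed, only a clamp to
 * capacity_orig.
 */
static unsigned long cap_util(unsigned long util, unsigned long capacity_orig)
{
	return (util >= capacity_orig) ? capacity_orig : util;
}

/* Sum of the capped per-CPU utilization for a group of n CPUs. */
static unsigned long group_util(const unsigned long *util,
				const unsigned long *capacity_orig, int n)
{
	unsigned long sum = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		sum += cap_util(util[i], capacity_orig[i]);
	return sum;
}
```

With two CPUs of capacity 1024, utilizations of 1239 (about 121%) and 819 (about 80%) sum to 2058 uncapped, which exceeds the group capacity of 2048; capped, the sum is 1843, leaving the spare capacity on the second CPU visible.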
[PATCH v4 04/20] xen/grant: Introduce helpers to split a page into grant
Currently, a grant is always based on the Xen page granularity (i.e. 4KB). When Linux is using a different page granularity, a single page will be split between multiple grants. The new helpers will be in charge of splitting the Linux page into grants and calling a function given by the caller on each grant. Also provide a helper to count the number of grants within a given contiguous region. Note that the x86/include/asm/xen/page.h is now including xen/interface/grant_table.h rather than xen/grant_table.h. It's necessary because xen/grant_table.h depends on asm/xen/page.h and will break the compilation. Furthermore, only the definitions in interface/grant_table.h are required. Signed-off-by: Julien Grall Reviewed-by: David Vrabel Reviewed-by: Stefano Stabellini --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Changes in v4: - Typos - Rename gnttab_one_grant into gnttab_for_one_grant - Add Stefano and David's reviewed-by - s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming Changes in v3: - Fix error reported by checkpatch.pl - Typos - s/pfn/xen_pfn/ in gnttab_foreach_grant - Drop the possibility to use less data. The complexity is moved in netback which is the only user - Rename gnttab_foreach_grant into gnttab_foreach_grant_in_range - s/offset/start/ in gnttab_count_grant and update the description of the parameter - s/mfn/gfn based on the new terminology - Add EXPORT_SYMBOL_GPL for gnttab_foreach_grant_in_range - Use xen_offset_in_page and XEN_PFN_DOWN whenever it's possible - Fix compilation on x86. 
Changes in v2: - Patch added --- arch/x86/include/asm/xen/page.h | 2 +- drivers/xen/grant-table.c | 26 + include/xen/grant_table.h | 42 + 3 files changed, 69 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h index 0b762f6..501479e 100644 --- a/arch/x86/include/asm/xen/page.h +++ b/arch/x86/include/asm/xen/page.h @@ -12,7 +12,7 @@ #include #include -#include +#include #include /* Xen machine address */ diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index 62f591f..7b4e1cf 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -776,6 +776,32 @@ void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count) } EXPORT_SYMBOL_GPL(gnttab_batch_copy); +void gnttab_foreach_grant_in_range(struct page *page, + unsigned int offset, + unsigned int len, + xen_grant_fn_t fn, + void *data) +{ + unsigned int goffset; + unsigned int glen; + unsigned long xen_pfn; + + len = min_t(unsigned int, PAGE_SIZE - offset, len); + goffset = xen_offset_in_page(offset); + + xen_pfn = page_to_xen_pfn(page) + XEN_PFN_DOWN(offset); + + while (len) { + glen = min_t(unsigned int, XEN_PAGE_SIZE - goffset, len); + fn(pfn_to_gfn(xen_pfn), goffset, glen, data); + + goffset = 0; + xen_pfn++; + len -= glen; + } +} +EXPORT_SYMBOL_GPL(gnttab_foreach_grant_in_range); + int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops, struct gnttab_map_grant_ref *kmap_ops, struct page **pages, unsigned int count) diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 4478f4b..05b5b08 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -45,8 +45,10 @@ #include #include +#include #include #include +#include #define GNTTAB_RESERVED_XENSTORE 1 @@ -224,4 +226,44 @@ static inline struct xen_page_foreign *xen_page_foreign(struct page *page) #endif } +/* Split Linux page in chunk of the size of the grant and call fn + * + * Parameters of fn: + * gfn: guest frame number + * offset: offset 
in the grant + * len: length of the data in the grant. + * data: internal information + */ +typedef void (*xen_grant_fn_t)(unsigned long gfn, unsigned int offset, + unsigned int len, void *data); + +void gnttab_foreach_grant_in_range(struct page *page, + unsigned int offset, + unsigned int len, + xen_grant_fn_t fn, + void *data); + +/* Helper to get to call fn only on the first "grant chunk" */ +static inline void gnttab_for_one_grant(struct page *page, unsigned int offset, + unsigned len, xen_grant_fn_t fn, + void *data) +{ + /* The first request is
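The splitting loop in gnttab_foreach_grant_in_range() can be tried outside the kernel with a small model. This is a sketch under stated assumptions: the model_* names are hypothetical, the Linux page is assumed to be 64KB, and the callback receives the Xen PFN directly instead of the GFN that the patch obtains through pfn_to_gfn().

```c
#include <assert.h>

#define MODEL_PAGE_SIZE      65536u  /* assumed 64KB Linux page */
#define MODEL_XEN_PAGE_SIZE  4096u   /* Xen grant granularity */

typedef void (*model_grant_fn_t)(unsigned long xen_pfn, unsigned int offset,
				 unsigned int len, void *data);

/*
 * Walk [offset, offset + len) inside one Linux page and call fn once per
 * 4KB grant-sized chunk. first_xen_pfn stands in for page_to_xen_pfn(page).
 */
static void model_foreach_grant_in_range(unsigned long first_xen_pfn,
					 unsigned int offset, unsigned int len,
					 model_grant_fn_t fn, void *data)
{
	unsigned int goffset, glen;
	unsigned long xen_pfn;

	assert(offset < MODEL_PAGE_SIZE);

	/* Never run past the end of the Linux page. */
	if (len > MODEL_PAGE_SIZE - offset)
		len = MODEL_PAGE_SIZE - offset;

	goffset = offset % MODEL_XEN_PAGE_SIZE;
	xen_pfn = first_xen_pfn + offset / MODEL_XEN_PAGE_SIZE;

	while (len) {
		glen = MODEL_XEN_PAGE_SIZE - goffset;
		if (glen > len)
			glen = len;
		fn(xen_pfn, goffset, glen, data);

		/* Only the first chunk can start in the middle of a grant. */
		goffset = 0;
		xen_pfn++;
		len -= glen;
	}
}

struct model_stats {
	unsigned int calls;
	unsigned int bytes;
	unsigned long last_pfn;
};

/* Example callback: count the chunks and the bytes they cover. */
static void model_count(unsigned long xen_pfn, unsigned int offset,
			unsigned int len, void *data)
{
	struct model_stats *s = data;

	(void)offset;
	s->calls++;
	s->bytes += len;
	s->last_pfn = xen_pfn;
}
```

For instance, an 8192-byte range starting at offset 4000 touches three grants (96 + 4096 + 4000 bytes), which is the case the first "grant chunk" helper has to be careful about.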
[PATCH v4 05/20] xen/grant: Add helper gnttab_page_grant_foreign_access_ref_one
Many PV drivers contain the idiom: pfn = page_to_gfn(...) /* Or similar */ gnttab_grant_foreign_access_ref Replace it by a new helper. Note that when Linux is using a different page granularity than Xen, the helper only gives access to the first 4KB grant. This is useful where drivers are allocating a full Linux page for each grant. Also include xen/interface/grant_table.h rather than xen/grant_table.h in asm/page.h for x86 to fix a compilation issue [1]. Only the former is useful in order to get the structure definition. [1] Interdependency between asm/page.h and xen/grant_table.h which result to page_mfn not being defined when necessary. Signed-off-by: Julien Grall Reviewed-by: David Vrabel Reviewed-by: Stefano Stabellini --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Changes in v3: - Rename gnttab_page_grant_foreign_access_ref into gnttab_page_grant_foreign_access_ref_one - Fix typo in the commit message - s/mfn/gfn based on the new naming - Add David and Stefano's reviewed-by Changes in v2: - Patch added --- include/xen/grant_table.h | 9 + 1 file changed, 9 insertions(+) diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 05b5b08..e17a4b3 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -131,6 +131,15 @@ void gnttab_cancel_free_callback(struct gnttab_free_callback *callback); void gnttab_grant_foreign_access_ref(grant_ref_t ref, domid_t domid, unsigned long frame, int readonly); +/* Give access to the first 4K of the page */ +static inline void gnttab_page_grant_foreign_access_ref_one( + grant_ref_t ref, domid_t domid, + struct page *page, int readonly) +{ + gnttab_grant_foreign_access_ref(ref, domid, xen_page_to_gfn(page), + readonly); +} + void gnttab_grant_foreign_transfer_ref(grant_ref_t, domid_t domid, unsigned long pfn); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at 
http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 09/20] xen/biomerge: Don't allow biovec's to be merged when Linux is not using 4KB pages
On ARM, the DMA-capable devices on a given platform are not necessarily all protected by an IOMMU. DMA requests have to use the BFN (i.e. the MFN on ARM) for the device to work correctly. While the DOM0 memory is allocated in a 1:1 fashion (PFN == MFN), grant mappings break this contiguous mapping. When Linux is using 64KB page granularity, a page may be split across multiple non-contiguous MFNs (Xen is using 4KB page granularity). Therefore a DMA request will likely fail. Checking that a 64KB page is using contiguous MFNs is tedious. For now, always report that biovecs are not mergeable. Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: David Vrabel There are some ideas to check whether two biovecs could be merged (see [1]) but it's not critical and can be considered a performance improvement. Changes in v4: - Fix typos in the subject - Add Stefano's reviewed-by Changes in v3: - Update commit message - s/mfn/bfn/ based on the new renaming - Update TODO Changes in v2: - Remove the workaround and check if the Linux page granularity is the same as Xen or not [1] https://lkml.org/lkml/2015/7/17/418 --- drivers/xen/biomerge.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c index 8ae2fc90..4da69db 100644 --- a/drivers/xen/biomerge.c +++ b/drivers/xen/biomerge.c @@ -6,10 +6,18 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, const struct bio_vec *vec2) { +#if XEN_PAGE_SIZE == PAGE_SIZE unsigned long bfn1 = pfn_to_bfn(page_to_pfn(vec1->bv_page)); unsigned long bfn2 = pfn_to_bfn(page_to_pfn(vec2->bv_page)); return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && ((bfn1 == bfn2) || ((bfn1+1) == bfn2)); +#else + /* +* XXX: Add support for merging bio_vec when using different page +* size in Xen and Linux. 
+*/ + return 0; +#endif } EXPORT_SYMBOL(xen_biovec_phys_mergeable); -- 2.1.4
[PATCH v4 01/20] net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop
The skb doesn't change within the function. Therefore it's only necessary to check if we need GSO once at the beginning. Signed-off-by: Julien Grall Acked-by: Wei Liu --- Cc: Ian Campbell Cc: net...@vger.kernel.org Changes in v4: - Add Wei's acked Changes in v2: - Patch added --- drivers/net/xen-netback/netback.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 7c64c74..d4c1bc7 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -277,6 +277,13 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb unsigned long bytes; int gso_type = XEN_NETIF_GSO_TYPE_NONE; + if (skb_is_gso(skb)) { + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) + gso_type = XEN_NETIF_GSO_TYPE_TCPV4; + else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) + gso_type = XEN_NETIF_GSO_TYPE_TCPV6; + } + /* Data must not cross a page boundary. */ BUG_ON(size + offset > PAGE_SIZEgso_type & SKB_GSO_TCPV6) - gso_type = XEN_NETIF_GSO_TYPE_TCPV6; - } - if (*head && ((1 << gso_type) & queue->vif->gso_mask)) queue->rx.req_cons++; -- 2.1.4
[PATCH v4 03/20] xen: Add Xen specific page definition
The Xen hypercall interface is always using 4K page granularity on the ARM and x86 architectures. With the incoming support of 64K page granularity for ARM64 guests, it won't be possible to re-use the Linux page definition in Xen drivers. Introduce Xen page definition helpers based on the Linux page definition. They have exactly the same name but prefixed with XEN_/xen_ prefix. Also modify xen_page_to_gfn to use the new Xen page definition. Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: David Vrabel Changes in v4: - Typos - Rename xen_page_to_pfn to page_to_xen_pfn Changes in v3: - Fix errors reported by checkpatch.pl - Rename pfn to xen_pfn in xen_pfn_to_page - Add a comment that we assume PAGE_SIZE to be a multiple of XEN_PAGE_SIZE - s/MFN/GFN/ according to new naming - Add Stefano's reviewed-by Changes in v2: - Add XEN_PFN_UP - Add a comment describing the behavior of page_to_pfn --- include/xen/page.h | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/include/xen/page.h b/include/xen/page.h index 1daae48..96294ac 100644 --- a/include/xen/page.h +++ b/include/xen/page.h @@ -1,11 +1,36 @@ #ifndef _XEN_PAGE_H #define _XEN_PAGE_H +#include + +/* The hypercall interface supports only 4KB page */ +#define XEN_PAGE_SHIFT 12 +#define XEN_PAGE_SIZE (_AC(1, UL) << XEN_PAGE_SHIFT) +#define XEN_PAGE_MASK (~(XEN_PAGE_SIZE-1)) +#define xen_offset_in_page(p) ((unsigned long)(p) & ~XEN_PAGE_MASK) + +/* + * We assume that PAGE_SIZE is a multiple of XEN_PAGE_SIZE + * XXX: Add a BUILD_BUG_ON? 
+ */ + +#define xen_pfn_to_page(xen_pfn) \ + ((pfn_to_page(((unsigned long)(xen_pfn) << XEN_PAGE_SHIFT) >> PAGE_SHIFT))) +#define page_to_xen_pfn(page) \ + (((page_to_pfn(page)) << PAGE_SHIFT) >> XEN_PAGE_SHIFT) + +#define XEN_PFN_PER_PAGE (PAGE_SIZE / XEN_PAGE_SIZE) + +#define XEN_PFN_DOWN(x)((x) >> XEN_PAGE_SHIFT) +#define XEN_PFN_UP(x) (((x) + XEN_PAGE_SIZE-1) >> XEN_PAGE_SHIFT) +#define XEN_PFN_PHYS(x)((phys_addr_t)(x) << XEN_PAGE_SHIFT) + #include +/* Return the GFN associated to the first 4KB of the page */ static inline unsigned long xen_page_to_gfn(struct page *page) { - return pfn_to_gfn(page_to_pfn(page)); + return pfn_to_gfn(page_to_xen_pfn(page)); } struct xen_memory_region { -- 2.1.4
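The arithmetic behind the new macros is easy to check in user space. The sketch below mirrors them with MODEL_/model_ prefixes (hypothetical names, not kernel symbols) and assumes a 64KB Linux page, i.e. a page shift of 16; it works on plain frame numbers because struct page is not available outside the kernel.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT      16   /* assumed 64KB Linux pages */
#define MODEL_XEN_PAGE_SHIFT  12   /* Xen always uses 4KB pages */
#define MODEL_XEN_PAGE_SIZE   (1UL << MODEL_XEN_PAGE_SHIFT)
#define MODEL_XEN_PAGE_MASK   (~(MODEL_XEN_PAGE_SIZE - 1))

/* Byte offset inside a 4KB Xen page. */
#define model_xen_offset_in_page(p) ((unsigned long)(p) & ~MODEL_XEN_PAGE_MASK)

/* Round a byte count down or up to a number of 4KB Xen frames. */
#define MODEL_XEN_PFN_DOWN(x) ((x) >> MODEL_XEN_PAGE_SHIFT)
#define MODEL_XEN_PFN_UP(x) \
	(((x) + MODEL_XEN_PAGE_SIZE - 1) >> MODEL_XEN_PAGE_SHIFT)

/* First Xen frame number covered by a Linux frame number. */
static unsigned long model_pfn_to_xen_pfn(unsigned long pfn)
{
	return (pfn << MODEL_PAGE_SHIFT) >> MODEL_XEN_PAGE_SHIFT;
}

/* Linux frame number that contains a given Xen frame number. */
static unsigned long model_xen_pfn_to_pfn(unsigned long xen_pfn)
{
	assert(MODEL_PAGE_SHIFT >= MODEL_XEN_PAGE_SHIFT);
	return (xen_pfn << MODEL_XEN_PAGE_SHIFT) >> MODEL_PAGE_SHIFT;
}
```

With these assumptions each Linux page covers 16 Xen frames, so Linux PFN 3 starts at Xen PFN 48, and Xen PFNs 48 to 63 all map back to Linux PFN 3.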
[PATCH v4 08/20] block/xen-blkfront: split get_grant in 2
Prepare the code to support 64KB page granularity. The first implementation will use a full Linux page per indirect and persistent grant. When non-persistent grant is used, each page of a bio request may be split in multiple grant. Furthermore, the field page of the grant structure is only used to copy data from persistent grant or indirect grant. Avoid to set it for other use case as it will have no meaning given the page will be split in multiple grant. Provide 2 functions, to setup indirect grant, the other for bio page. Signed-off-by: Julien Grall Acked-by: Roger Pau Monné --- Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Cc: David Vrabel Changes in v4: - Add Roger's acked-by Changes in v3: - Fix errors reported by checkpatch.pl - gnttab_page_grant_foreign_access_ref has been renamed to gnttab_page_grant_foreign_access_ref_one - Fix compilation by using get_indirect_grant rather than get_grant (the changes was in a later patch...). - Make grant_foreign_access static inline - s/mfn/gfn/ based on the new naming Changes in v2: - Patch added --- drivers/block/xen-blkfront.c | 88 +--- 1 file changed, 59 insertions(+), 29 deletions(-) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 556475d..4232cbd 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -245,34 +245,77 @@ out_of_memory: return -ENOMEM; } -static struct grant *get_grant(grant_ref_t *gref_head, - struct page *page, - struct blkfront_info *info) +static struct grant *get_free_grant(struct blkfront_info *info) { struct grant *gnt_list_entry; - unsigned long buffer_gfn; BUG_ON(list_empty(&info->grants)); gnt_list_entry = list_first_entry(&info->grants, struct grant, - node); + node); list_del(&gnt_list_entry->node); - if (gnt_list_entry->gref != GRANT_INVALID_REF) { + if (gnt_list_entry->gref != GRANT_INVALID_REF) info->persistent_gnts_c--; + + return gnt_list_entry; +} + +static inline void grant_foreign_access(const struct grant *gnt_list_entry, + 
const struct blkfront_info *info) +{ + gnttab_page_grant_foreign_access_ref_one(gnt_list_entry->gref, +info->xbdev->otherend_id, +gnt_list_entry->page, +0); +} + +static struct grant *get_grant(grant_ref_t *gref_head, + unsigned long gfn, + struct blkfront_info *info) +{ + struct grant *gnt_list_entry = get_free_grant(info); + + if (gnt_list_entry->gref != GRANT_INVALID_REF) return gnt_list_entry; + + /* Assign a gref to this page */ + gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head); + BUG_ON(gnt_list_entry->gref == -ENOSPC); + if (info->feature_persistent) + grant_foreign_access(gnt_list_entry, info); + else { + /* Grant access to the GFN passed by the caller */ + gnttab_grant_foreign_access_ref(gnt_list_entry->gref, + info->xbdev->otherend_id, + gfn, 0); } + return gnt_list_entry; +} + +static struct grant *get_indirect_grant(grant_ref_t *gref_head, + struct blkfront_info *info) +{ + struct grant *gnt_list_entry = get_free_grant(info); + + if (gnt_list_entry->gref != GRANT_INVALID_REF) + return gnt_list_entry; + /* Assign a gref to this page */ gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head); BUG_ON(gnt_list_entry->gref == -ENOSPC); if (!info->feature_persistent) { - BUG_ON(!page); - gnt_list_entry->page = page; + struct page *indirect_page; + + /* Fetch a pre-allocated page to use for indirect grefs */ + BUG_ON(list_empty(&info->indirect_pages)); + indirect_page = list_first_entry(&info->indirect_pages, +struct page, lru); + list_del(&indirect_page->lru); + gnt_list_entry->page = indirect_page; } - buffer_gfn = xen_page_to_gfn(gnt_list_entry->page); - gnttab_grant_foreign_access_ref(gnt_list_entry->gref, - info->xbdev->otherend_id, - buffer_gfn, 0); + grant_foreign_access(gnt_list_entry, info); + return gnt_list_entry; } @@ -525,32 +568,19 @@ static
[PATCH v4 02/20] arm/xen: Drop pte_mfn and mfn_pte
They are not used in common code except in one place in balloon.c, which is only compiled when Linux is using the PV MMU. That is not the case on ARM. Rather than worrying about how to handle the 64KB case, drop them. Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini --- Cc: Russell King Changes in v4: - Add Stefano's reviewed Changes in v3: - Patch added --- arch/arm/include/asm/xen/page.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index 1279563..98c9fc3 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -13,9 +13,6 @@ #define phys_to_machine_mapping_valid(pfn) (1) -#define pte_mfnpte_pfn -#define mfn_ptepfn_pte - /* Xen machine address */ typedef struct xmaddr { phys_addr_t maddr; -- 2.1.4
Re: [PATCH] ARM: fix bug which lowmem size is limited to 760MB
On Mon, 7 Sep 2015, Arnd Bergmann wrote: > Given how much more common 1GB hardware configurations are compared to 768MB > configuration, we could however think about adding a VMSPLIT_3G_OPT option > that x86 has (also VMSPLIT_2_75G on ARCH_TILE), to allow using the entire > 1GB of lowmem without going all the way to VMSPLIT_2G. That option would > also let us use the entire 768MB on the machines that Yongtaek Lee is > interested in. That's easy enough: diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 0d1b717e1e..a63970f211 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1470,6 +1470,8 @@ choice config VMSPLIT_3G bool "3G/1G user/kernel split" + config VMSPLIT_3G_OPT + bool "3G/1G user/kernel split (for full 1G low memory)" config VMSPLIT_2G bool "2G/2G user/kernel split" config VMSPLIT_1G @@ -1481,6 +1483,7 @@ config PAGE_OFFSET default PHYS_OFFSET if !MMU default 0x4000 if VMSPLIT_1G default 0x8000 if VMSPLIT_2G + default 0xAF00 if VMSPLIT_3G_OPT default 0xC000 config NR_CPUS That shifts the risk to user space though. But if there is a regression there, it will manifest itself on all systems and not only with some particular hardware. Nicolas
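The 760MB figure in the subject can be reproduced with a little arithmetic. The sketch below is not taken from the thread: it assumes that the 32-bit ARM kernel of that time places VMALLOC_END at 0xff000000, reserves 240MB for vmalloc by default and keeps an 8MB guard gap (VMALLOC_OFFSET) below it, and that the plain 3G/1G split puts PAGE_OFFSET at 3GB (0xC0000000). All MODEL_ names are hypothetical.

```c
#include <assert.h>

#define MODEL_VMALLOC_END     0xff000000UL   /* assumed top of vmalloc area */
#define MODEL_VMALLOC_RESERVE (240UL << 20)  /* assumed default vmalloc size */
#define MODEL_VMALLOC_OFFSET  (8UL << 20)    /* assumed gap below vmalloc */

/*
 * Lowmem is whatever fits between PAGE_OFFSET and the lowest address
 * that must stay free for the vmalloc area. Result is in megabytes.
 */
static unsigned long model_lowmem_mb(unsigned long page_offset)
{
	unsigned long vmalloc_min = MODEL_VMALLOC_END - MODEL_VMALLOC_RESERVE -
				    MODEL_VMALLOC_OFFSET;

	assert(page_offset < vmalloc_min);
	return (vmalloc_min - page_offset) >> 20;
}
```

Under those assumptions model_lowmem_mb(0xC0000000UL) gives 760, which matches the limit named in the subject, and lowering PAGE_OFFSET grows lowmem by the same amount that it takes away from user space.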
Re: [PATCH v2 1/5] ACPI: add in a bad_madt_entry() function to eventually replace the macro
Hi Al, On 19/08/15 23:07, Al Stone wrote: I finally got a chance to try this series on Juno. Well, it exposed a firmware bug in the MADT table :) [..] acpi_tbl_entry_handler handler, @@ -245,6 +484,8 @@ acpi_parse_entries(char *id, unsigned long table_size, table_end) { if (entry->type == entry_id && (!max_entries || count < max_entries)) { + if (bad_madt_entry(table_header, entry)) + return -EINVAL; Not sure if we can have the above check here unconditionally. Currently I can see there are 2 other users of acpi_parse_entries, i.e. PCC and NUMA. So maybe it can be made conditional, or bad_madt_entry could return success for non-MADT tables? Other than that, you can add for ARM64 specific parts: Reviewed-and-tested-by: Sudeep Holla Regards, Sudeep
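The second option, letting the check succeed for anything that is not a MADT, could look roughly like the stand-alone sketch below. The model_* structures and the length test are placeholders invented for illustration and are not the series' bad_madt_entry(); the only fact relied on is that the MADT uses the table signature "APIC".

```c
#include <assert.h>
#include <string.h>

/* Minimal stand-ins for the ACPI table header and subtable header. */
struct model_table_header {
	char signature[4];
};

struct model_subtable {
	unsigned char type;
	unsigned char length;
};

/* The MADT is identified by the 4-byte signature "APIC". */
static int model_is_madt(const struct model_table_header *h)
{
	return memcmp(h->signature, "APIC", 4) == 0;
}

/*
 * Returns nonzero for a bad entry. Tables other than the MADT (for
 * example SRAT or PCCT) always pass, so other acpi_parse_entries()
 * users would be unaffected.
 */
static int model_bad_madt_entry(const struct model_table_header *h,
				const struct model_subtable *e)
{
	if (!model_is_madt(h))
		return 0;

	/* Placeholder check: an entry shorter than its own header is bad. */
	return e->length < sizeof(*e);
}
```

Keeping the signature test inside the helper leaves the call site in acpi_parse_entries() unconditional, at the cost of one memcmp() per entry.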
[GIT PULL] regmap updates for v4.3
The following changes since commit 64291f7db5bd8150a74ad2036f1037e6a0428df2: Linux 4.2 (2015-08-30 11:34:09 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git tags/regmap-v4.3 for you to fetch changes up to 072502a67c9164625288cca17704808e6c06273f: Merge remote-tracking branches 'regmap/topic/lockdep' and 'regmap/topic/seq-delay' into regmap-next (2015-09-04 17:22:10 +0100) regmap: Changes for v4.3 This has been a busy release for regmap. By far the biggest set of changes here are those from Markus Pargmann which implement support for block transfers in smbus devices. This required quite a bit of refactoring but leaves us better able to handle odd restrictions that controllers may have and with better performance on smbus. Other new features include: - Fix interactions with lockdep for nested regmaps (eg, when a device using regmap is connected to a bus where the bus controller has a separate regmap). Lockdep's default class identification is too crude to work without help. - Support for must write bitfield operations, useful for operations which require writing a bit to trigger them from Kuninori Morimoto. - Support for delaying during register patch application from Nariman Poushin. - Support for overriding cache state via the debugfs implementation from Richard Fitzgerald. 
Axel Lin (1): regmap: debugfs: Fix misuse of IS_ENABLED Kuninori Morimoto (3): regmap: add force_write option on _regmap_update_bits() regmap: add regmap_write_bits() regmap: add regmap_fields_force_write() Lars-Peter Clausen (1): regmap: Add better support for devices without readback support Mark Brown (9): regmap: Silence warning on invalid zero length read Merge branches 'fix/raw', 'topic/core', 'topic/i2c', 'topic/raw' and 'topic/doc' of git://git.kernel.org/.../broonie/regmap into regmap-smbus-block regmap: Support bulk reads for devices without raw formatting Merge branch 'topic/smbus-block' of git://git.kernel.org/.../broonie/regmap into regmap-core Merge remote-tracking branch 'regmap/fix/core' into regmap-linus Merge remote-tracking branch 'regmap/fix/raw' into regmap-linus Merge remote-tracking branch 'regmap/topic/core' into regmap-next Merge remote-tracking branches 'regmap/topic/debugfs' and 'regmap/topic/force-update' into regmap-next Merge remote-tracking branches 'regmap/topic/lockdep' and 'regmap/topic/seq-delay' into regmap-next Markus Pargmann (11): regmap: Fix integertypes for register address and value regmap: Fix regmap_can_raw_write check regmap: regmap_raw_read return error on !bus->read regmap: Fix regmap_bulk_write for bus writes regmap: Split use_single_rw internally into use_single_read/write regmap: No multi_write support if bus->write does not exist regmap: Add missing comments about struct regmap_bus regmap: Introduce max_raw_read/write for regmap_bulk_read/write regmap: regmap max_raw_read/write getter functions regmap: Add raw_write/read checks for max_raw_write/read sizes regmap-i2c: Add smbus i2c block support Nariman Poushin (2): regmap: Use reg_sequence for multi_reg_write / register_patch regmap: Apply optional delay in multi_reg_write/register_patch Nicolas Boichat (4): mfd: vexpress: Add parentheses around bridge->ops->regmap_init call thermal: sti: Add parentheses around bridge->ops->regmap_init call regmap: Use different 
lockdep class for each regmap init call regmap: Move documentation to regmap.h Richard Fitzgerald (2): debugfs: Export bool read/write functions regmap: debugfs: Allow writes to cache state settings Sergey SENOZHATSKY (1): regmap: fix a NULL pointer dereference in __regmap_init Stephen Boyd (1): regulator: core: Print at debug level on debugfs creation failure Xiubo Li (1): regmap: fix typos in regmap.c drivers/base/regmap/internal.h | 12 +- drivers/base/regmap/regcache.c | 2 +- drivers/base/regmap/regmap-ac97.c| 41 ++-- drivers/base/regmap/regmap-debugfs.c | 99 - drivers/base/regmap/regmap-i2c.c | 90 +--- drivers/base/regmap/regmap-irq.c | 4 +- drivers/base/regmap/regmap-mmio.c| 52 ++--- drivers/base/regmap/regmap-spi.c | 41 ++-- drivers/base/regmap/regmap-spmi.c| 78 +++ drivers/base/regmap/regmap.c | 368 + drivers/bus/vexpress-config.c| 2 +- drivers/gpu/drm/i2c/adv7511.c| 2 +- drivers/input/misc/drv260x.c | 6 +- drivers/input/misc/drv2665.c | 2 +- drivers/input/misc/drv2667.c | 4 +- drivers/mfd/arizona-core.c | 2 +- drivers/mfd/twl6040.c
Re: [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
On Fri 2015-09-04 16:49:46, Paul E. McKenney wrote: > On Fri, Sep 04, 2015 at 02:11:30PM +0200, Petr Mladek wrote: > > The deadline to force the quiescent state (jiffies_force_qs) is currently > > updated only when the previous timeout passed. But the timeout used for > > wait_event() is always the entire original timeout. This is strange. > > They tell me that kthreads aren't supposed to ever catch signals, > hence the WARN_ON() in the early-exit case stray-signal case. Yup, I have investigated this recently. All signals are really blocked for kthreads by default. There are few threads that use signals but they explicitly enable it by allow_signal(). > In the case where we were awakened with an explicit force-quiescent-state > request, we do the scan, and then wait the full time for the next scan. > So the point of the delay is to space out the scans, not to fit a > pre-determined schedule. > > The reason we get awakened with an explicit force-quiescent-state > request is that a given CPU just got inundated with RCU callbacks > or that rcutorture wants to hammer this code path. > > So I am not seeing this as anything in need of fixing. > > Am I missing something subtle here? There is the commit 88d6df612cc3c99f5 ("rcu: Prevent spurious-wakeup DoS attack on rcu_gp_kthread()"). It suggests that the spurious wakeups are possible. I would consider this patch as a fix/clean up of this DoS attack fix. Huh, I forgot to mention it in the commit message. To be honest, I personally do not know how to trigger the spurious wakeup in the current state of the code. I am trying to convert the kthread into the kthread worker API and there I got the spurious wakeups but this is another story. Thanks a lot for reviewing. Best Regards, Petr
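The difference between the two behaviours discussed here, waiting the full timeout again after every wakeup versus waiting only until the already-set deadline, comes down to one subtraction. The helper below is a hypothetical user-space sketch of the second variant and not the patch itself; it uses the same signed-difference trick as the kernel's time_after() so that a wrapping jiffies counter is handled.

```c
#include <assert.h>
#include <limits.h>

/*
 * Ticks left until deadline, or 0 if the deadline has already passed.
 * The unsigned subtraction followed by a signed interpretation stays
 * correct across counter wrap-around as long as the two values are
 * less than half the counter range apart.
 */
static unsigned long model_remaining(unsigned long now, unsigned long deadline)
{
	long left = (long)(deadline - now);

	return left > 0 ? (unsigned long)left : 0;
}
```

A thread that is woken spuriously at tick 110 with a deadline of 130 would then sleep for the remaining 20 ticks instead of starting another full timeout.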
Re: [PATCH v8 2/2] ARM: imx: support suspend states on imx7D
On Fri, Jul 31, 2015 at 04:33:59PM -0500, Shenwei Wang wrote: > IMX7D contains a new version of GPC IP block (GPCv2). It has two > major functions: power management and wakeup source management. > > GPCv2 provides low power mode control for Cortex-A7 and Cortex-M4 > domains. And it can support WAIT, STOP, and DSM(Deep Sleep Mode) modes. > After configuring the GPCv2 module, the platform can enter into a > selected mode either automatically triggered by ARM WFI instruction or > manually by software. The system will exit the low power states > by the predefined wakeup sources which are managed by the gpcv2 > irqchip driver. > > This patch adds a new suspend driver to manage the power states on IMX7D. > It currently supports "SUSPEND_STANDBY" and "SUSPEND_MEM" states. > > Signed-off-by: Shenwei Wang > Signed-off-by: Anson Huang Please stop sending patches to my Linaro mailbox, and use shawn...@kernel.org instead. You should already get that if you ever run ./scripts/get_maintainer.pl on the patch. Also please always copy ker...@pengutronix.de for i.MX platform patches like this. > --- > arch/arm/mach-imx/Kconfig| 1 + > arch/arm/mach-imx/Makefile | 2 + > arch/arm/mach-imx/common.h | 4 + > arch/arm/mach-imx/pm-imx7.c | 917 > +++ > arch/arm/mach-imx/suspend-imx7.S | 529 ++ > 5 files changed, 1453 insertions(+) 1453 lines added to the kernel only for i.MX7D suspend support. Yes, this is the way we support suspend on i.MX6, but that's enough, and we have to stop this somewhere. I would ask you to take Sudeep's comment [1] and adopt PSCI for i.MX7D power management. Shawn [1] https://lkml.org/lkml/2015/8/26/554
similar files amd vs radeon
I ran a clone detection tool* on the drivers source code and found that there are similar files between drivers/gpu/drm/amd/ and drivers/gpu/drm/radeon, but also inside each of these folders. Some examples: drivers/gpu/drm/amd/amdgpu/dce_v11_0.c,drivers/gpu/drm/amd/amdgpu/dce_v10_0.c drivers/gpu/drm/amd/amdgpu/ci_dpm.c,drivers/gpu/drm/radeon/ci_dpm.c drivers/gpu/drm/radeon/kv_dpm.c,drivers/gpu/drm/amd/amdgpu/kv_dpm.c I use meld to see the differences and similarities. More results from the tool at: http://pastebin.com/iX3fhifG (The number in the first field is the number of probable cloned lines of code). Should these files be consolidated? And if so how? Thank you, Peter * https://github.com/petersenna/ccfinderx-core -- Peter
[PATCH v2] 9p: trans_fd, bail out if recv fcall is missing
req->rc is pre-allocated early on with p9_tag_alloc and shouldn't be missing

Signed-off-by: Dominique Martinet
---
 net/9p/trans_fd.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

Feel free to adapt error code/message if you can think of something better.

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index a270dcc..a6d89c0 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -356,13 +356,12 @@ static void p9_read_work(struct work_struct *work)
 		}
 		if (m->req->rc == NULL) {
-			m->req->rc = kmalloc(sizeof(struct p9_fcall) +
-						m->client->msize, GFP_NOFS);
-			if (!m->req->rc) {
-				m->req = NULL;
-				err = -ENOMEM;
-				goto error;
-			}
+			p9_debug(P9_DEBUG_ERROR,
+				 "No recv fcall for tag %d (req %p), disconnecting!\n",
+				 m->rc.tag, m->req);
+			m->req = NULL;
+			err = -EIO;
+			goto error;
 		}
 		m->rc.sdata = (char *)m->req->rc + sizeof(struct p9_fcall);
 		memcpy(m->rc.sdata, m->tmp_buf, m->rc.capacity);
--
1.8.3.1
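The idea behind the change can be shown with a small userspace sketch. It is
only a model: req_model, req_model_alloc() and req_model_receive() are
hypothetical names, malloc() stands in for the kernel allocator, and nothing
here is the 9p code itself. The point is that the receive buffer is created
when the request is set up, so the read path treats a missing buffer as an
error instead of allocating one on the spot.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical model of a request; only the receive buffer matters here. */
struct req_model {
	void *rc;	/* receive buffer, expected to be set at allocation time */
};

/* Models the setup step: the receive buffer is created up front. */
static int req_model_alloc(struct req_model *req, size_t msize)
{
	req->rc = malloc(msize);
	return req->rc ? 0 : -ENOMEM;
}

/* Models the read path: a missing buffer is an error, not a cue to allocate. */
static int req_model_receive(struct req_model *req)
{
	if (req->rc == NULL)
		return -EIO;
	return 0;
}
```

A caller would run req_model_alloc() once per request and release the buffer
with free(); req_model_receive() then never needs an allocation of its own.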
RE: [PATCH] powerpc32: memcpy: only use dcbz once cache is enabled
From: Christophe Leroy
> Sent: 07 September 2015 15:25
...
> diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
> index 2ef50c6..05b3096 100644
> --- a/arch/powerpc/lib/copy_32.S
> +++ b/arch/powerpc/lib/copy_32.S
> @@ -172,7 +172,16 @@ _GLOBAL(memcpy)
>  	mtctr	r0
>  	beq	63f
>  53:
> -	dcbz	r11,r6
> +	/*
> +	 * During early init, cache might not be active yet, so dcbz cannot be
> +	 * used. We put dcbt instead of dcbz. If cache is not active, it's just
> +	 * like a not. If cache is active, at least it prefetchs the line to be

	          ^^^ nop ??

	David

> +	 * overwritten.
> +	 * Will be replaced by dcbz in machine_init()
> +	 */
> +_GLOBAL(ppc32_memcpy_dcbz)
> +	dcbt	r11,r6
> +
>  	COPY_16_BYTES
>  #if L1_CACHE_BYTES >= 32
>  	COPY_16_BYTES
> --
> 2.1.0
Re: [PATCH v2 1/2] leds: leds-ipaq-micro: Use devm_led_classdev_register
On 09/07/2015 04:13 PM, Muhammad Falak R Wani wrote:
> Use of resource-managed function devm_led_classdev_register
> instead of led_classdev_register is preferred, consequently
> remove redundant function micro_leds_remove.
>
> Signed-off-by: Muhammad Falak R Wani
> ---
>  drivers/leds/leds-ipaq-micro.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)

Merged, thanks.

--
Best Regards,
Jacek Anaszewski
Re: [RFC PATCH v1 2/4] irqchip: GICv3: set non-percpu irqs status with _IRQ_MOVE_PCNTXT
On Mon, 7 Sep 2015, Marc Zyngier wrote:
> On 07/09/15 14:24, Thomas Gleixner wrote:
> > The history of this flag is as follows:
> >
> > On x86 interrupts can only be safely migrated while the interrupt is
> > handled.
>
> Woa! That's creative! :-) I suppose this doesn't work very well with CPU
> hotplug though...

Go figure

> So I wonder why we bother introducing the IRQ_MOVE_PCNTXT flag on ARM at
> all. Is that just because migration.c is only compiled when
> GENERIC_PENDING_IRQ is set?

Looks like. We can disentangle that, if this code needs to be reusable.

Thanks,

	tglx
Re: [PATCH 1/2] rcu: Show the real fqs_state
On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote: > On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote: > > The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE. > > > > The real state is stored in a local variable in rcu_gp_kthread(). > > It is modified by rcu_gp_fqs() via parameter and return value. > > But the actual value is never stored to rsp->fqs_state. > > > > The result is that print_one_rcu_state() does not show the real > > state. > > > > This code has been added 3 years ago by the commit 4cdfc175c25c89ee > > ("rcu: Move quiescent-state forcing into kthread"). I guess that it > > was an overlook or optimization. > > > > Anyway, the value seems to be manipulated only by the thread, except > > for shoving the status. I do not see any risk in updating it directly > > in the struct. > > > > Signed-off-by: Petr Mladek > > Good catch, but how about the following fix instead? > > Thanx, Paul > > > > rcu: Finish folding ->fqs_state into ->gp_state > > Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing > into kthread") started the process of folding the old ->fqs_state > into ->gp_state, but did not complete it. This situation does not > cause any malfunction, but can result in extremely confusing trace > output. This commit completes this task of eliminating ->fqs_state > in favor of ->gp_state. It makes sense but it breaks dynticks handling in rcu_gp_fqs(), see below. > > Reported-by: Petr Mladek > Signed-off-by: Paul E. McKenney > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index 69ab7ce2cf7b..04234936d897 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state > *rsp, int *gfp) > /* > * Do one round of quiescent-state forcing. 
> */ > -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in) > +static void rcu_gp_fqs(struct rcu_state *rsp) > { > - int fqs_state = fqs_state_in; > bool isidle = false; > unsigned long maxj; > struct rcu_node *rnp = rcu_get_root(rsp); > > WRITE_ONCE(rsp->gp_activity, jiffies); > rsp->n_force_qs++; > - if (fqs_state == RCU_SAVE_DYNTICK) { > + if (rsp->gp_state == RCU_SAVE_DYNTICK) { This will never happen because rcu_gp_kthread() modifies rsp->gp_state many times. The last value before calling rcu_gp_fqs() is RCU_GP_DOING_FQS. I think about passing this information via a separate bool. [...] > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h > index d5f58e717c8b..9faad70a8246 100644 > --- a/kernel/rcu/tree.h > +++ b/kernel/rcu/tree.h > @@ -417,12 +417,11 @@ struct rcu_data { > struct rcu_state *rsp; > }; > > -/* Values for fqs_state field in struct rcu_state. */ > +/* Values for gp_state field in struct rcu_state. */ > #define RCU_GP_IDLE 0 /* No grace period in progress. */ This value seems to be used instead of the new RCU_GP_WAIT_INIT. > #define RCU_GP_INIT 1 /* Grace period being > #initialized. */ This value is unused. > #define RCU_SAVE_DYNTICK 2 /* Need to scan dyntick > #state. */ This one is not longer preserved when merged with the other state. > #define RCU_FORCE_QS 3 /* Need to force quiescent > #state. */ The meaning of this one is strange. If I get it correctly, it is set after the state was forced. But the comment suggests that it is before. By other words, these states seems to get obsoleted by /* Values for rcu_state structure's gp_flags field. */ #define RCU_GP_WAIT_INIT 0 /* Initial state. */ #define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */ #define RCU_GP_DONE_GPS 2 /* Wait done for grace-period start. */ #define RCU_GP_WAIT_FQS 3 /* Wait for force-quiescent-state time. */ #define RCU_GP_DOING_FQS 4 /* Wait done for force-quiescent-state time. */ #define RCU_GP_CLEANUP 5 /* Grace-period cleanup started. 
*/ #define RCU_GP_CLEANED 6 /* Grace-period cleanup complete. */ Please, find below your commit updated with my ideas: + used bool save_dyntick instead of RCU_SAVE_DYNTICK and RCU_FORCE_QS states + rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE + remove all the obsolete states I am sorry if I handled "Signed-off-by" flags a wrong way. It is basically your patch with few small updates from me. I am not sure what is the right process in this case. Feel free to use Reviewed-by instead of Signed-off-by with my name. Well, I guess that this is not the final state ;-) >From 61a1bf6659f4f4c0c4021f185bc156f8c83f9ea5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 4 Sep 2015 16:24:22 -0700 Subject: [PATCH] rcu: Finish folding ->fqs_state into ->gp_state Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-sta
Re: [RFC PATCH 1/3] arm64: entry: Remove unnecessary calculation for S_SP in EL1h
On Fri, Sep 04, 2015 at 03:23:05PM +0100, Jungseok Lee wrote:
> Under EL1h, S_SP data is not seen in kernel_exit. Thus, x21 calculation
> is not needed in kernel_entry. Currently, S_SP information is vaild only
> when sp_el0 is used.

I don't think this is true. The generic BUG implementation will grab the
saved SP from the pt_regs, and with this change we'll report whatever
happened to be in x21 instead.

> Signed-off-by: Jungseok Lee
> ---
>  arch/arm64/kernel/entry.S | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e163518..d23ca0d 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -91,8 +91,6 @@
>  	get_thread_info tsk			// Ensure MDSCR_EL1.SS is clear,
>  	ldr	x19, [tsk, #TI_FLAGS]		// since we can unmask debug
>  	disable_step_tsk x19, x20		// exceptions when scheduling.
> -	.else
> -	add	x21, sp, #S_FRAME_SIZE
>  	.endif
>  	mrs	x22, elr_el1
>  	mrs	x23, spsr_el1

Immediately after this we do:

	stp	lr, x21, [sp, #S_LR]

To store the LR and SP to the pt_regs which bug_handler would use.

Am I missing something?

Thanks,
Mark.
Re: [RFC PATCH v1 2/4] irqchip: GICv3: set non-percpu irqs status with _IRQ_MOVE_PCNTXT
Hi Thomas, On 07/09/15 14:24, Thomas Gleixner wrote: > On Mon, 7 Sep 2015, Marc Zyngier wrote: >> On 06/09/15 06:56, Jiang Liu wrote: >>> On 2015/9/6 12:23, Yang Yingliang wrote: Use irq_settings_set_move_pcntxt() helper irqs status with _IRQ_MOVE_PCNTXT. So that it can do set affinity when calling irq_set_affinity_locked(). >>> Hi Yingliang, >>> We could only set _IRQ_MOVE_PCNTCT flag to enable migrating >>> IRQ in process context if your hardware platform supports atomically >>> change IRQ configuration. Not sure whether that's true for GICv3. >>> If GICv3 doesn't support atomically change irq configuration, this >>> change may cause trouble. >> >> I think it boils down to what exactly "process context" means here. If >> this means "we do not need to mask the interrupt" while moving it, then >> it should be fine (the GIC architecture guarantees that a pending >> interrupt will be migrated). >> >> Is there any other requirement for this flag? > > The history of this flag is as follows: > > On x86 interrupts can only be safely migrated while the interrupt is > handled. Woa! That's creative! :-) I suppose this doesn't work very well with CPU hotplug though... > With the introduction of IRQ remapping this requirement > changed. Remapped interrupts can be migrated in any context. > > If you look at irq_set_affinity_locked() > >if (irq_can_move_pcntxt(data) { > irq_do_set_affinity(data,...) > chip->irq_set_affinity(data,...); >} else { > irqd_set_move_pending(data); >} > > So if IRQ_MOVE_PCNTXT is not set, we handle the migration of the > interrupt from next the interrupt. If it's set set_affinity() is > called right away. OK, that is now starting to make more sense. > All architectures which do not select GENERIC_PENDING_IRQ are using > the direct method. Right. On ARM, only the direct method makes sense so far (we have no constraint such as the one you describe above). So I wonder why we bother introducing the IRQ_MOVE_PCNTXT flag on ARM at all. 
Is that just because migration.c is only compiled when GENERIC_PENDING_IRQ is set? Thanks, M. -- Jazz is not dead. It just smells funny...
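The two migration paths discussed in this thread can be modelled in a few
lines of userspace C. This is a sketch under stated assumptions, not the
genirq implementation: irq_model, model_set_affinity() and
model_handle_irq() are hypothetical names, and a plain integer stands in for
programming the interrupt controller. With the flag set the new target takes
effect right away (the direct method); without it the request is parked and
applied from the next interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line. */
struct irq_model {
	bool can_move_pcntxt;	/* models the IRQ_MOVE_PCNTXT setting */
	bool move_pending;	/* migration deferred to the next interrupt */
	int target_cpu;		/* CPU the "hardware" currently routes to */
	int pending_cpu;	/* requested CPU while a move is pending */
};

/* Called from process context to request a new target CPU. */
static void model_set_affinity(struct irq_model *irq, int cpu)
{
	assert(cpu >= 0);

	if (irq->can_move_pcntxt) {
		irq->target_cpu = cpu;		/* direct method */
	} else {
		irq->pending_cpu = cpu;		/* remember the request */
		irq->move_pending = true;
	}
}

/* Called while the interrupt is being handled: apply a parked request. */
static void model_handle_irq(struct irq_model *irq)
{
	if (irq->move_pending) {
		irq->target_cpu = irq->pending_cpu;
		irq->move_pending = false;
	}
}
```

In this model an architecture with no "only while handled" constraint would
always take the first branch, which is the situation described for ARM above.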
Re: [PATCH] powerpc32: memcpy: only use dcbz once cache is enabled
On Mon, Sep 07 2015, Christophe Leroy wrote:
> memcpy() uses instruction dcbz to speed up copy by not wasting time
> loading cache line with data that will be overwritten.
> Some platform like mpc52xx do no have cache active at startup and
> can therefore not use memcpy(). Allthough no part of the code
> explicitly uses memcpy(), GCC makes calls to it.
>
> This patch modifies memcpy() such that at startup, the 'dcbz'
> instruction is replaced by 'dcbt' which is harmless if cache is not
> enabled, and which helps a bit (allthough not as much as dcbz) if
> cache is already enabled.
>
> Once the initial MMU is setup, in machine_init() we patch memcpy()
> by replacing the temporary 'dcbt' by 'dcbz'
>
> Reported-by: Michal Sojka
> Signed-off-by: Christophe Leroy
> ---
> @Michal, can you please test it ?

Yes, it works.

Tested-by: Michal Sojka

-Michal
Re: [PATCH v4 0/3] mtd: nand: jz4780: Add NAND and BCH drivers
On 06/09/2015 21:38, Ezequiel Garcia wrote: > On 27 Jul 02:50 PM, Alex Smith wrote: >> Hi, >> >> This series adds support for the BCH controller and NAND devices on >> the Ingenic JZ4780 SoC. >> >> Tested on the MIPS Creator Ci20 board. All dependencies are now in >> mainline so it should be possible to compile test now. >> >> This version of the series has been rebased on 4.2-rc4, and also adds >> an additional patch to fix an issue that was encountered in the >> external Ci20 3.18 kernel branch. >> >> Review and feedback welcome. >> > > The NEMC driver seems to be upstream. Any chance you submit devicetree > changes as well for Ci20 (so we can actually test this)? Sure, can do. The pinctrl driver is not yet upstream (needs some work) which is why I didn't add the DT changes initially, but at least if you boot the board from the NAND then U-Boot should have left everything in a state usable by the kernel. Thanks, Alex > >> Thanks, >> Alex >> >> Alex Smith (3): >> mtd: nand: increase ready wait timeout and report timeouts >> dt-bindings: binding for jz4780-{nand,bch} >> mtd: nand: jz4780: driver for NAND devices on JZ4780 SoCs >> >> .../bindings/mtd/ingenic,jz4780-nand.txt | 57 >> drivers/mtd/nand/Kconfig | 7 + >> drivers/mtd/nand/Makefile | 1 + >> drivers/mtd/nand/jz4780_bch.c | 354 +++ >> drivers/mtd/nand/jz4780_bch.h | 42 +++ >> drivers/mtd/nand/jz4780_nand.c | 376 >> + >> drivers/mtd/nand/nand_base.c | 15 +- >> 7 files changed, 849 insertions(+), 3 deletions(-) >> create mode 100644 >> Documentation/devicetree/bindings/mtd/ingenic,jz4780-nand.txt >> create mode 100644 drivers/mtd/nand/jz4780_bch.c >> create mode 100644 drivers/mtd/nand/jz4780_bch.h >> create mode 100644 drivers/mtd/nand/jz4780_nand.c >> >> -- >> 2.4.6 >> >> >> __ >> Linux MTD discussion mailing list >> http://lists.infradead.org/mailman/listinfo/linux-mtd/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More 
majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] leds: leds-ipaq-micro: Fix coding style issues
Hi Muhammad, On 09/07/2015 04:13 PM, Muhammad Falak R Wani wrote: Spaces at the starting of a line are removed, indentation using tab, instead of space. Also, warnings related to line width of more than 80 characters is also taken care of. Two warnings have been left alone to aid better readability. Signed-off-by: Muhammad Falak R Wani --- drivers/leds/leds-ipaq-micro.c | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/drivers/leds/leds-ipaq-micro.c b/drivers/leds/leds-ipaq-micro.c index 1206215..86716ea 100644 --- a/drivers/leds/leds-ipaq-micro.c +++ b/drivers/leds/leds-ipaq-micro.c @@ -16,9 +16,9 @@ #define LED_YELLOW0x00 #define LED_GREEN 0x01 -#define LED_EN (1 << 4)/* LED ON/OFF 0:off, 1:on */ -#define LED_AUTOSTOP(1 << 5)/* LED ON/OFF auto stop set 0:disable, 1:enable */ -#define LED_ALWAYS (1 << 6)/* LED Interrupt Mask 0:No mask, 1:mask */ +#define LED_EN (1 << 4) /* LED ON/OFF 0:off, 1:on*/ +#define LED_AUTOSTOP (1 << 5) /* LED ON/OFF auto stop set 0:disable,1:enable*/ +#define LED_ALWAYS(1 << 6) /* LED Interrupt Mask 0:No mask, 1:mask*/ Please keep comments ending in the same column. static void micro_leds_brightness_set(struct led_classdev *led_cdev, enum led_brightness value) @@ -27,14 +27,14 @@ static void micro_leds_brightness_set(struct led_classdev *led_cdev, /* * In this message: * Byte 0 = LED color: 0 = yellow, 1 = green -* yellow LED is always ~30 blinks per minute +*yellow LED is always ~30 blinks per minute * Byte 1 = duration (flags?) 
appears to be ignored * Byte 2 = green ontime in 1/10 sec (deciseconds) -* 1 = 1/10 second -* 0 = 256/10 second +*1 = 1/10 second +*0 = 256/10 second * Byte 3 = green offtime in 1/10 sec (deciseconds) -* 1 = 1/10 second -* 0 = 256/10 seconds +*1 = 1/10 second +*0 = 256/10 seconds */ struct ipaq_micro_msg msg = { .id = MSG_NOTIFY_LED, @@ -64,14 +64,14 @@ static int micro_leds_blink_set(struct led_classdev *led_cdev, /* * In this message: * Byte 0 = LED color: 0 = yellow, 1 = green -* yellow LED is always ~30 blinks per minute +*yellow LED is always ~30 blinks per minute * Byte 1 = duration (flags?) appears to be ignored * Byte 2 = green ontime in 1/10 sec (deciseconds) -* 1 = 1/10 second -* 0 = 256/10 second +*1 = 1/10 second +*0 = 256/10 second * Byte 3 = green offtime in 1/10 sec (deciseconds) -* 1 = 1/10 second -* 0 = 256/10 seconds +*1 = 1/10 second +*0 = 256/10 seconds */ This looks worse after applying the patch. Why actually did you change it? AFAICS checkpatch.pl doesn't complain here. struct ipaq_micro_msg msg = { .id = MSG_NOTIFY_LED, @@ -79,14 +79,14 @@ static int micro_leds_blink_set(struct led_classdev *led_cdev, }; msg.tx_data[0] = LED_GREEN; -if (*delay_on > IPAQ_LED_MAX_DUTY || + if (*delay_on > IPAQ_LED_MAX_DUTY || *delay_off > IPAQ_LED_MAX_DUTY) -return -EINVAL; + return -EINVAL; -if (*delay_on == 0 && *delay_off == 0) { -*delay_on = 100; -*delay_off = 100; -} + if (*delay_on == 0 && *delay_off == 0) { + *delay_on = 100; + *delay_off = 100; + } msg.tx_data[1] = 0; if (*delay_on >= IPAQ_LED_MAX_DUTY) -- Best Regards, Jacek Anaszewski -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4 3/3] mtd: nand: jz4780: driver for NAND devices on JZ4780 SoCs
Hi, On 06/09/2015 22:21, Ezequiel Garcia wrote: > On 27 Jul 03:21 PM, Alex Smith wrote: >> Add a driver for NAND devices connected to the NEMC on JZ4780 SoCs, as >> well as the hardware BCH controller. DMA is not currently implemented. >> >> While older 47xx SoCs also have a BCH controller, they are incompatible >> with the one in the 4780 due to differing register/bit positions, which >> would make implementing a common driver for them quite messy. >> > > If the difference is only in register/bit positions, a common driver > might be fairly simple. See drivers/i2c/busses/i2c-mv64xxx.c, > which supports two different register layouts. I've just gone back and looked at the older SoCs and it doesn't seem as though this commit message really applies to the JZ4740, which is the only other Ingenic SoC currently supported upstream. The 4740 doesn't have a BCH controller at all and the NAND interface is fairly different. I think this driver could potentially be reused if support for the JZ4770 makes it upstream, for now though a separate driver is certainly needed for the 4780. >> +return 0; >> +} >> + >> +static const struct of_device_id jz4780_bch_dt_match[] = { >> +{ .compatible = "ingenic,jz4780-bch" }, >> +{}, >> +}; >> +MODULE_DEVICE_TABLE(of, jz4780_bch_dt_match); >> + >> +static struct platform_driver jz4780_bch_driver = { >> +.probe = jz4780_bch_probe, > > Why no remove? Is it needed? Everything should be cleaned up due to the use of devm functions. >> +static int jz4780_nand_init_chips(struct jz4780_nand *nand, struct device >> *dev) >> +{ >> +struct jz4780_nand_chip *chip; >> +const __be32 *prop; >> +u64 addr, size; >> +int i = 0; >> + >> +/* >> + * Iterate over each bank assigned to this device and request resources. >> + * The bank numbers may not be consecutive, but nand_scan_ident() >> + * expects chip numbers to be, so fill out a consecutive array of chips >> + * which map chip number to actual bank number. 
>> + */ >> +while ((prop = of_get_address(dev->of_node, i, &size, NULL))) { >> +chip = &nand->chips[i]; >> +chip->bank = of_read_number(prop, 1); >> + >> +jz4780_nemc_set_type(nand->dev, chip->bank, >> + JZ4780_NEMC_BANK_NAND); >> + >> +addr = of_translate_address(dev->of_node, prop); > > Are you sure you must translate the address yourself? > Isn't this handled by the OF magic behing the ranges property > in the NEMC DT node? I think the reasoning behind doing this was because I already have to get the address property here in order to get the bank number out of it. You're right though that I can just do "platform_get_resource(pdev, i)" and avoid doing the translation again, so I have changed it to do that. I've fixed the rest of your comments as well. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] arm64: entry: Remove unnecessary calculation for S_SP in EL1h
On 04/09/15 15:23, Jungseok Lee wrote:
> Under EL1h, S_SP data is not seen in kernel_exit. Thus, x21 calculation
> is not needed in kernel_entry. Currently, S_SP information is vaild only
> when sp_el0 is used.
>
> Signed-off-by: Jungseok Lee

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e163518..d23ca0d 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -91,8 +91,6 @@
>  	get_thread_info tsk			// Ensure MDSCR_EL1.SS is clear,
>  	ldr	x19, [tsk, #TI_FLAGS]		// since we can unmask debug
>  	disable_step_tsk x19, x20		// exceptions when scheduling.
> -	.else
> -	add	x21, sp, #S_FRAME_SIZE
>  	.endif
>  	mrs	x22, elr_el1
>  	mrs	x23, spsr_el1
>

This sp value gets written to the struct pt_regs that is built on the
stack, and passed to the fault handlers, see 'el1_sp_pc' in
kernel/entry.S, which goes on to call do_sp_pc_abort() which prints this
value out. (Other fault handlers may make decisions based on this
value). It should be present and correct.

Thanks,

James
Re: [RFC PATCH 2/3] arm64: Introduce IRQ stack
On 04/09/15 15:23, Jungseok Lee wrote: > Currently, kernel context and interrupts are handled using a single > kernel stack navigated by sp_el1. This forces many systems to use > 16KB stack, not 8KB one. Low memory platforms naturally suffer from > both memory pressure and performance degradation simultaneously as > VM page allocator falls into slowpath frequently. > > This patch, thus, solves the problem as introducing a separate percpu > IRQ stack to handle both hard and soft interrupts with two ground rules: > > - Utilize sp_el0 in EL1 context, which is not used currently > - Do *not* complicate current_thread_info calculation > > struct thread_info can be tracked easily using sp_el0, not sp_el1 when > this feature is enabled. > > Signed-off-by: Jungseok Lee > --- > arch/arm64/Kconfig.debug | 10 ++ > arch/arm64/include/asm/irq.h | 8 ++ > arch/arm64/include/asm/thread_info.h | 11 ++ > arch/arm64/kernel/asm-offsets.c | 8 ++ > arch/arm64/kernel/entry.S| 83 +++- > arch/arm64/kernel/head.S | 7 ++ > arch/arm64/kernel/irq.c | 18 > 7 files changed, 142 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug > index d6285ef..e16d91f 100644 > --- a/arch/arm64/Kconfig.debug > +++ b/arch/arm64/Kconfig.debug > @@ -18,6 +18,16 @@ config ARM64_PTDUMP > kernel. > If in doubt, say "N" > > +config IRQ_STACK > + bool "Use separate kernel stack when handling interrupts" > + depends on ARM64_4K_PAGES > + help > + Say Y here if you want to use separate kernel stack to handle both > + hard and soft interrupts. As reduceing memory footprint regarding > + kernel stack, it benefits low memory platforms. > + > + If in doubt, say N. > + I don't think it is necessary to have a debug-only Kconfig option for this. Reducing memory use is good for everyone! 
This would let you get rid of all the #ifdefs > config STRICT_DEVMEM > bool "Filter access to /dev/mem" > depends on MMU > diff --git a/arch/arm64/include/asm/thread_info.h > b/arch/arm64/include/asm/thread_info.h > index dcd06d1..5345a67 100644 > --- a/arch/arm64/include/asm/thread_info.h > +++ b/arch/arm64/include/asm/thread_info.h > @@ -71,11 +71,22 @@ register unsigned long current_stack_pointer asm ("sp"); > */ > static inline struct thread_info *current_thread_info(void) > __attribute_const__; > > +#ifndef CONFIG_IRQ_STACK > static inline struct thread_info *current_thread_info(void) > { > return (struct thread_info *) > (current_stack_pointer & ~(THREAD_SIZE - 1)); > } > +#else > +static inline struct thread_info *current_thread_info(void) > +{ > + unsigned long sp_el0; > + > + asm volatile("mrs %0, sp_el0" : "=r" (sp_el0)); > + > + return (struct thread_info *)(sp_el0 & ~(THREAD_SIZE - 1)); > +} > +#endif Because sp_el0 is only used as a stack value to find struct thread_info, you could just store the struct thread_info pointer in sp_el0, and save the masking on each read of the value. > > #define thread_saved_pc(tsk) \ > ((unsigned long)(tsk->thread.cpu_context.pc)) > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index d23ca0d..f1fdfa9 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -88,7 +88,11 @@ > > .if \el == 0 > mrs x21, sp_el0 > +#ifndef CONFIG_IRQ_STACK > get_thread_info tsk // Ensure MDSCR_EL1.SS is clear, > +#else > + get_thread_info \el, tsk > +#endif > ldr x19, [tsk, #TI_FLAGS] // since we can unmask debug > disable_step_tsk x19, x20 // exceptions when scheduling. 
> .endif > @@ -168,11 +172,56 @@ > eret// return to kernel > .endm > > +#ifndef CONFIG_IRQ_STACK > .macro get_thread_info, rd > mov \rd, sp > - and \rd, \rd, #~(THREAD_SIZE - 1) // top of stack > + and \rd, \rd, #~(THREAD_SIZE - 1) // bottom of stack > + .endm > +#else > + .macro get_thread_info, el, rd > + .if \el == 0 > + mov \rd, sp > + .else > + mrs \rd, sp_el0 > + .endif > + and \rd, \rd, #~(THREAD_SIZE - 1) // bottom of thread stack > + .endm > + > + .macro get_irq_stack > + get_thread_info 1, tsk > + ldr w22, [tsk, #TI_CPU] > + adr_l x21, irq_stacks > + mov x23, #IRQ_STACK_SIZE > + maddx21, x22, x23, x21 > .endm Using per_cpu variables would save the multiply here. You then wouldn't need IRQ_STACK_SIZE. > > + .macro irq_stack_entry > + get_irq_stack > + ldr w23, [x21, #IRQ_COUNT] > + cbnzw23, 1f > + mov x23, sp > + str x23, [x21, #IRQ_THREAD_SP] > + ldr x23, [x21, #IRQ_STACK] > + mov sp, x23 > + mov x23, xzr > +1: add w23, w2
similar files: fusbh200-hcd.c and fotg210-hcd.c
I executed a clone detection tool* on the drivers source code and I found
that the files drivers/usb/host/fusbh200-hcd.c and
drivers/usb/host/fotg210-hcd.c are very similar. The main difference
between the two files is the replacement of the string 'USBH20' by
'OTG21', plus some white space fixes.

Some changes are being applied to only one of the files, such as the
commit f848a88d223cafa43cb318839a1171b498cf5ec8 that changes fotg210-hcd.c
but not fusbh200-hcd.c.

Should these files be consolidated? And if so how?

Thank you,

Peter

* https://github.com/petersenna/ccfinderx-core

--
Peter
Re: [PATCH 0/6] sched/fair: Compute capacity invariant load/utilization tracking
On 07/09/15 13:42, Peter Zijlstra wrote: > On Mon, Aug 31, 2015 at 11:24:49AM +0200, Peter Zijlstra wrote: > >> A quick run here gives: >> >> IVB-EP (2*20*2): > > As noted by someone; that should be 2*10*2, for a total of 40 cpus in > this machine. > >> >> perf stat --null --repeat 10 -- perf bench sched messaging -g 50 -l 5000 >> >> Before: After: >> 5.484170711 ( +- 0.74% )5.590001145 ( +- 0.45% ) >> >> Which is an almost 2% slowdown :/ >> >> I've yet to look at what happens. > > OK, so it appears this is link order nonsense. When I compared profiles > between the series, the one function that had significant change was > skb_release_data(), which doesn't make much sense. > > If I do a 'make clean' in front of each build, I get a repeatable > improvement with this patch set (although how much of that is due to the > patches itself or just because of code movement is as yet undetermined). > > I'm of a mind to apply these patches; with two patches on top, which > I'll post shortly. > -- >8 -- From: Dietmar Eggemann Date: Mon, 7 Sep 2015 14:57:22 +0100 Subject: [PATCH] sched/fair: Defer calling scaling functions Do not call the scaling functions in case time goes backwards or the last update of the sched_avg structure has happened less than 1024ns ago. 
Signed-off-by: Dietmar Eggemann
---
 kernel/sched/fair.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d6ca8d987a63..3445d2fb38f4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2552,8 +2552,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	u64 delta, scaled_delta, periods;
 	u32 contrib;
 	unsigned int delta_w, scaled_delta_w, decayed = 0;
-	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);
-	unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+	unsigned long scale_freq, scale_cpu;

 	delta = now - sa->last_update_time;
 	/*
@@ -2574,6 +2573,9 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 		return 0;
 	sa->last_update_time = now;

+	scale_freq = arch_scale_freq_capacity(NULL, cpu);
+	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+
 	/* delta_w is the amount already accumulated against our next period */
 	delta_w = sa->period_contrib;
 	if (delta + delta_w >= 1024) {
--
1.9.1
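The effect of deferring the scaling calls can be sketched in userspace C.
Everything named here is an assumption made for illustration: avg_model,
update_avg_model() and fake_scale_capacity() are hypothetical, the counter
exists only to show how often the "expensive" hook runs, and the arithmetic
is far simpler than __update_load_avg(). The structure is what matters: both
early exits (time going backwards, less than 1024ns elapsed) happen before
the scaling hook is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how often the hypothetical scaling hook is invoked. */
static unsigned int scale_calls;

/* Hypothetical stand-in for an arch scaling function. */
static unsigned long fake_scale_capacity(int cpu)
{
	(void)cpu;
	scale_calls++;
	return 1024;	/* full capacity on a 0..1024 scale */
}

struct avg_model {
	uint64_t last_update_time;	/* in ns */
	uint64_t scaled_sum;		/* accumulated, scaled 1024ns units */
};

/* Returns 1 if the sum was updated, 0 if the update was skipped. */
static int update_avg_model(struct avg_model *sa, uint64_t now, int cpu)
{
	int64_t delta = (int64_t)(now - sa->last_update_time);
	unsigned long scale;

	/* Time went backwards: resync and bail out before any scaling call. */
	if (delta < 0) {
		sa->last_update_time = now;
		return 0;
	}

	/* Work in 1024ns units; less than one unit means nothing to account. */
	delta >>= 10;
	if (!delta)
		return 0;
	sa->last_update_time = now;

	/* Only now pay for the scaling lookup. */
	scale = fake_scale_capacity(cpu);
	sa->scaled_sum += ((uint64_t)delta * scale) >> 10;
	return 1;
}
```

With this ordering, calls that hit either early exit never reach
fake_scale_capacity(), which is the saving the patch is after.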
[PATCH v5 2/9] Input: goodix - use actual config length for each device type
Each of the Goodix devices supported by this driver has a fixed size for the configuration information registers. The size varies depending on the device and is specified in the datasheet. Use the proper configuration length as specified in the datasheet for each device model, so we do not read more than the actual size of the configuration registers. Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 25 +++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 6ae28c5..7be6eab 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -36,6 +36,7 @@ struct goodix_ts_data { unsigned int max_touch_num; unsigned int int_trigger_type; bool rotated_screen; + int cfg_len; }; #define GOODIX_MAX_HEIGHT 4096 @@ -45,6 +46,8 @@ struct goodix_ts_data { #define GOODIX_MAX_CONTACTS10 #define GOODIX_CONFIG_MAX_LENGTH 240 +#define GOODIX_CONFIG_911_LENGTH 186 +#define GOODIX_CONFIG_967_LENGTH 228 /* Register defines */ #define GOODIX_READ_COOR_ADDR 0x814E @@ -115,6 +118,23 @@ static int goodix_i2c_read(struct i2c_client *client, return ret < 0 ? ret : (ret != ARRAY_SIZE(msgs) ? 
-EIO : 0); } +static int goodix_get_cfg_len(u16 id) +{ + switch (id) { + case 911: + case 9271: + case 9110: + case 927: + case 928: + return GOODIX_CONFIG_911_LENGTH; + case 912: + case 967: + return GOODIX_CONFIG_967_LENGTH; + default: + return GOODIX_CONFIG_MAX_LENGTH; + } +} + static int goodix_ts_read_input_report(struct goodix_ts_data *ts, u8 *data) { int touch_num; @@ -230,8 +250,7 @@ static void goodix_read_config(struct goodix_ts_data *ts) int error; error = goodix_i2c_read(ts->client, GOODIX_REG_CONFIG_DATA, - config, - GOODIX_CONFIG_MAX_LENGTH); + config, ts->cfg_len); if (error) { dev_warn(&ts->client->dev, "Error reading config (%d), using defaults\n", @@ -398,6 +417,8 @@ static int goodix_ts_probe(struct i2c_client *client, return error; } + ts->cfg_len = goodix_get_cfg_len(id_info); + goodix_read_config(ts); error = goodix_request_input_dev(ts, version_info, id_info); -- 1.9.1
[PATCH v5 4/9] Input: goodix - write configuration data to device
Goodix devices can be configured by writing custom data to the device at init. The configuration data is read with request_firmware from "goodix_<id>_cfg.bin", where <id> is the product id read from the device (e.g.: goodix_911_cfg.bin for Goodix GT911, goodix_9271_cfg.bin for GT9271). The configuration information has a specific format described in the Goodix datasheet. It includes X/Y resolution, maximum supported touch points, interrupt flags, various sensitivity factors and settings for advanced features (like gesture recognition). Before writing the firmware, it is necessary to reset the device. If the device ACPI/DT information does not declare gpio pins (needed for reset), writing the firmware will not be available for these devices. This is based on Goodix datasheets for GT911 and GT9271 and on Goodix driver gt9xx.c for Android (publicly available in Android kernel trees for various devices). Signed-off-by: Octavian Purdila Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 225 +++-- 1 file changed, 192 insertions(+), 33 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 8edfc06..9cf16ff7 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -40,6 +41,9 @@ struct goodix_ts_data { int cfg_len; struct gpio_desc *gpiod_int; struct gpio_desc *gpiod_rst; + u16 id; + u16 version; + char *cfg_name; }; #define GOODIX_MAX_HEIGHT 4096 @@ -145,6 +149,39 @@ static int goodix_i2c_read(struct i2c_client *client, return ret < 0 ? ret : (ret != ARRAY_SIZE(msgs) ? -EIO : 0); } +/** + * goodix_i2c_write - write data to a register of the i2c slave device. + * + * @client: i2c device. + * @reg: the register to write to. + * @buf: raw data buffer to write.
+ * @len: length of the buffer to write + */ +static int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, + unsigned len) +{ + u8 *addr_buf; + struct i2c_msg msg; + int ret; + + addr_buf = kmalloc(len + 2, GFP_KERNEL); + if (!addr_buf) + return -ENOMEM; + + addr_buf[0] = reg >> 8; + addr_buf[1] = reg & 0xFF; + memcpy(&addr_buf[2], buf, len); + + msg.flags = 0; + msg.addr = client->addr; + msg.buf = addr_buf; + msg.len = len + 2; + + ret = i2c_transfer(client->adapter, &msg, 1); + kfree(addr_buf); + return ret < 0 ? ret : (ret != 1 ? -EIO : 0); +} + static int goodix_get_cfg_len(u16 id) { switch (id) { @@ -264,6 +301,73 @@ static irqreturn_t goodix_ts_irq_handler(int irq, void *dev_id) return IRQ_HANDLED; } +/** + * goodix_check_cfg - Checks if config fw is valid + * + * @ts: goodix_ts_data pointer + * @cfg: firmware config data + */ +static int goodix_check_cfg(struct goodix_ts_data *ts, + const struct firmware *cfg) +{ + int i, raw_cfg_len; + u8 check_sum = 0; + + if (cfg->size > GOODIX_CONFIG_MAX_LENGTH) { + dev_err(&ts->client->dev, + "The length of the config fw is not correct"); + return -EINVAL; + } + + raw_cfg_len = cfg->size - 2; + for (i = 0; i < raw_cfg_len; i++) + check_sum += cfg->data[i]; + check_sum = (~check_sum) + 1; + if (check_sum != cfg->data[raw_cfg_len]) { + dev_err(&ts->client->dev, + "The checksum of the config fw is not correct"); + return -EINVAL; + } + + if (cfg->data[raw_cfg_len + 1] != 1) { + dev_err(&ts->client->dev, + "Config fw must have Config_Fresh register set"); + return -EINVAL; + } + + return 0; +} + +/** + * goodix_send_cfg - Write fw config to device + * + * @ts: goodix_ts_data pointer + * @cfg: config firmware to write to device + */ +static int goodix_send_cfg(struct goodix_ts_data *ts, + const struct firmware *cfg) +{ + int error; + + error = goodix_check_cfg(ts, cfg); + if (error) + return error; + + error = goodix_i2c_write(ts->client, GOODIX_REG_CONFIG_DATA, cfg->data, +cfg->size); + if (error) { + 
dev_err(&ts->client->dev, "Failed to write config data: %d", + error); + return error; + } + dev_dbg(&ts->client->dev, "Config sent successfully."); + + /* Let the firmware reconfigure itself, so sleep for 10ms */ + usleep_range(10000, 11000); + + return 0; +} + static int goodix_int_sync(struct goodix_ts_data *ts) { int error; @@ -406,30 +510,29 @@ static void goodix_read_config(struct goodix_ts_data *ts) /** *
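Editorial sketch, not part of the patch: when preparing a configuration blob offline it can help to mirror the checks in goodix_check_cfg() in user space. The helper names goodix_cfg_checksum() and goodix_cfg_is_valid() are hypothetical. The layout (payload, then one checksum byte, then the Config_Fresh byte) is taken from the code above, and the lower size bound is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GOODIX_CONFIG_MAX_LENGTH 240

/* Two's complement of the byte-wise sum, as computed in goodix_check_cfg(). */
static uint8_t goodix_cfg_checksum(const uint8_t *data, size_t len)
{
	uint8_t sum = 0;
	size_t i;

	assert(data != NULL || len == 0);
	for (i = 0; i < len; i++)
		sum += data[i];
	return (uint8_t)(~sum + 1);
}

/* Returns 1 if the blob passes the same checks as the driver, else 0. */
static int goodix_cfg_is_valid(const uint8_t *cfg, size_t size)
{
	size_t raw_len;

	/* The size < 2 guard is not in the patch; it protects size - 2 here. */
	if (size < 2 || size > GOODIX_CONFIG_MAX_LENGTH)
		return 0;

	raw_len = size - 2;
	if (goodix_cfg_checksum(cfg, raw_len) != cfg[raw_len])
		return 0;

	/* The last byte is the Config_Fresh flag and must be 1. */
	return cfg[raw_len + 1] == 1;
}
```

For a three-byte payload 01 02 03 the sum is 6, so the checksum byte would be 0xFA, followed by 01 for Config_Fresh.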
[PATCH v5 6/9] Input: goodix - use goodix_i2c_write_u8 instead of i2c_master_send
Use goodix_i2c_write_u8 instead of i2c_master_send to simplify code. Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 3d4a004..03f3968 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -295,16 +295,11 @@ static void goodix_process_events(struct goodix_ts_data *ts) */ static irqreturn_t goodix_ts_irq_handler(int irq, void *dev_id) { - static const u8 end_cmd[] = { - GOODIX_READ_COOR_ADDR >> 8, - GOODIX_READ_COOR_ADDR & 0xff, - 0 - }; struct goodix_ts_data *ts = dev_id; goodix_process_events(ts); - if (i2c_master_send(ts->client, end_cmd, sizeof(end_cmd)) < 0) + if (goodix_i2c_write_u8(ts->client, GOODIX_READ_COOR_ADDR, 0) < 0) dev_err(&ts->client->dev, "I2C write end_cmd error\n"); return IRQ_HANDLED; -- 1.9.1
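Editorial sketch, not part of the patch: the removed end_cmd[] array and the new goodix_i2c_write_u8() call should put the same three bytes on the wire, assuming goodix_i2c_write_u8() wraps goodix_i2c_write() as in patch 5/9 of this series. The user-space helper below (goodix_build_write_frame() is a hypothetical name) only builds the frame so the layout can be inspected; it performs no I2C transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GOODIX_READ_COOR_ADDR 0x814E

/*
 * Build the write payload: 16-bit register address, high byte first,
 * followed by the data bytes. Returns the frame length, or 0 if the
 * output buffer is too small.
 */
static size_t goodix_build_write_frame(uint8_t *out, size_t out_len,
				       uint16_t reg, const uint8_t *buf,
				       size_t len)
{
	assert(out != NULL);
	if (out_len < len + 2)
		return 0;

	out[0] = reg >> 8;
	out[1] = reg & 0xFF;
	if (len)
		memcpy(&out[2], buf, len);
	return len + 2;
}
```

Writing the single value 0 to GOODIX_READ_COOR_ADDR gives the frame 81 4E 00, which matches the contents of the old end_cmd[] array.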
[PATCH v5 8/9] Input: goodix - add sysfs interface to dump config
Goodix devices have a configuration information register area that specifies various parameters for the device. The configuration information has a specific format described in the Goodix datasheet. It includes X/Y resolution, maximum supported touch points, interrupt flags, various sensitivity factors and settings for advanced features (like gesture recognition). Export a sysfs interface that would allow reading the configuration information. The default device configuration can be used as a starting point for creating a valid configuration firmware used by the device at init time to update its configuration. This sysfs interface will be exported only if the gpio pins are properly initialized from ACPI/DT. Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 23 +++ 1 file changed, 23 insertions(+) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 33a7b81..3179767 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -530,12 +530,35 @@ static ssize_t goodix_esd_timeout_store(struct device *dev, return count; } +static ssize_t goodix_dump_config_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct goodix_ts_data *ts = dev_get_drvdata(dev); + u8 config[GOODIX_CONFIG_MAX_LENGTH]; + int error, count = 0, i; + + error = goodix_i2c_read(ts->client, GOODIX_REG_CONFIG_DATA, + config, ts->cfg_len); + if (error) { + dev_warn(&ts->client->dev, +"Error reading config (%d)\n", error); + return error; + } + + for (i = 0; i < ts->cfg_len; i++) + count += scnprintf(buf + count, PAGE_SIZE - count, "%02x ", + config[i]); + return count; +} + /* ESD timeout in ms. Default disabled (0). Recommended 2000 ms.
*/ static DEVICE_ATTR(esd_timeout, S_IRUGO | S_IWUSR, goodix_esd_timeout_show, goodix_esd_timeout_store); +static DEVICE_ATTR(dump_config, S_IRUGO, goodix_dump_config_show, NULL); static struct attribute *goodix_attrs[] = { &dev_attr_esd_timeout.attr, + &dev_attr_dump_config.attr, NULL }; -- 1.9.1
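Editorial sketch, not part of the patch: goodix_dump_config_show() prints each configuration byte as "%02x " text, so a tool that wants to use the dump as a starting point for a configuration firmware first has to turn that text back into bytes. The function below is one possible way to do that in user space. parse_hex_dump() is a hypothetical name, and how the text is obtained from the dump_config attribute is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a string of space-separated hex bytes ("41 d0 02 ...") into out.
 * Returns the number of bytes parsed, or -1 on malformed input or if the
 * output buffer is too small.
 */
static int parse_hex_dump(const char *text, uint8_t *out, size_t out_len)
{
	size_t n = 0;

	assert(text != NULL && out != NULL);
	for (;;) {
		char *end;
		unsigned long v;

		/* Skip separators between bytes and any trailing newline. */
		while (*text == ' ' || *text == '\n')
			text++;
		if (*text == '\0')
			break;

		v = strtoul(text, &end, 16);
		if (end == text || v > 0xFF || n == out_len)
			return -1;

		out[n++] = (uint8_t)v;
		text = end;
	}
	return (int)n;
}
```

The parsed buffer would still need its checksum byte recomputed and the Config_Fresh byte set after any edit, following the checks described in patch 4/9.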
[PATCH] arm64: kernel: Use a separate stack for irq interrupts.
Having to handle interrupts on top of an existing kernel stack means the kernel stack must be large enough to accommodate both the maximum kernel usage, and the maximum irq handler usage. Switching to a different stack when processing irqs allows us to make the stack size smaller. Maximum kernel stack usage (running ltp and generating usb+ethernet interrupts) was 7256 bytes. With this patch, the same workload gives a maximum stack usage of 5816 bytes. Signed-off-by: James Morse --- arch/arm64/include/asm/irq.h | 12 + arch/arm64/include/asm/thread_info.h | 8 -- arch/arm64/kernel/entry.S | 33 --- arch/arm64/kernel/irq.c | 52 arch/arm64/kernel/smp.c | 4 +++ arch/arm64/kernel/stacktrace.c | 4 ++- 6 files changed, 107 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h index bbb251b14746..050d4196c736 100644 --- a/arch/arm64/include/asm/irq.h +++ b/arch/arm64/include/asm/irq.h @@ -2,14 +2,20 @@ #define __ASM_IRQ_H #include +#include #include +#include + +DECLARE_PER_CPU(unsigned long, irq_sp); struct pt_regs; extern void migrate_irqs(void); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); +extern int alloc_irq_stack(unsigned int cpu); + static inline void acpi_irq_init(void) { /* @@ -21,4 +27,10 @@ static inline void acpi_irq_init(void) } #define acpi_irq_init acpi_irq_init +static inline bool is_irq_stack(unsigned long sp) +{ + struct thread_info *ti = get_thread_info(sp); + return (get_thread_info(per_cpu(irq_sp, ti->cpu)) == ti); +} + #endif diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h index dcd06d18a42a..b906254fc400 100644 --- a/arch/arm64/include/asm/thread_info.h +++ b/arch/arm64/include/asm/thread_info.h @@ -69,12 +69,16 @@ register unsigned long current_stack_pointer asm ("sp"); /* * how to get the thread information struct from C */ +static inline struct thread_info *get_thread_info(unsigned long sp) +{ + return (struct thread_info *)(sp & 
~(THREAD_SIZE - 1)); +} + static inline struct thread_info *current_thread_info(void) __attribute_const__; static inline struct thread_info *current_thread_info(void) { - return (struct thread_info *) - (current_stack_pointer & ~(THREAD_SIZE - 1)); + return get_thread_info(current_stack_pointer); } #define thread_saved_pc(tsk) \ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index e16351819fed..d42371f3f5a1 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -190,10 +190,37 @@ tsk .req x28 // current thread_info * Interrupt handling. */ .macro irq_handler - adrp x1, handle_arch_irq - ldr x1, [x1, #:lo12:handle_arch_irq] - mov x0, sp + mrs x21, tpidr_el1 + adr_l x20, irq_sp + add x20, x20, x21 + + ldr x21, [x20] + mov x20, sp + + mov x0, x21 + mov x1, x20 + bl irq_copy_thread_info + + /* test for recursive use of irq_sp */ + cbz w0, 1f + mrs x30, elr_el1 + mov sp, x21 + + /* +* Create a fake stack frame to bump unwind_frame() onto the original +* stack. This relies on x29 not being clobbered by kernel_entry(). 
+*/ + push x29, x30 + +1: ldr_l x1, handle_arch_irq + mov x0, x20 blr x1 + + mov x0, x20 + mov x1, x21 + bl irq_copy_thread_info + mov sp, x20 + .endm .text diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c index 463fa2e7e34c..10b57a006da8 100644 --- a/arch/arm64/kernel/irq.c +++ b/arch/arm64/kernel/irq.c @@ -26,11 +26,14 @@ #include #include #include +#include #include #include unsigned long irq_err_count; +DEFINE_PER_CPU(unsigned long, irq_sp) = 0; + int arch_show_interrupts(struct seq_file *p, int prec) { #ifdef CONFIG_SMP @@ -55,6 +58,10 @@ void __init init_IRQ(void) irqchip_init(); if (!handle_arch_irq) panic("No interrupt controller found."); + + /* Allocate an irq stack for the boot cpu */ + if (alloc_irq_stack(smp_processor_id())) + panic("Failed to allocate irq stack for boot cpu."); } #ifdef CONFIG_HOTPLUG_CPU @@ -117,3 +124,48 @@ void migrate_irqs(void) local_irq_restore(flags); } #endif /* CONFIG_HOTPLUG_CPU */ + +/* Allocate an irq_stack for a cpu that is about to be brought up. */ +int alloc_irq_stack(unsigned int cpu) +{ + struct page *irq_stack_page; + union thread_union *irq_stack; + + /* reuse stack allocated previously */ + if (per_cpu(irq_sp, cpu)) + return 0; + + irq_stack_page = alloc_kmem_pages(THREADINFO_GFP, THREAD_S
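Editorial sketch, not part of the patch: get_thread_info() and is_irq_stack() above both rely on stacks being THREAD_SIZE-aligned, so masking off the low bits of any stack pointer gives the base of the stack it belongs to. The user-space fragment below illustrates only that arithmetic. The THREAD_SIZE value is assumed for the example, and stack_base() and same_stack() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration only; the kernel sets the real one per arch. */
#define THREAD_SIZE 16384UL

/* The mask trick only works when THREAD_SIZE is a power of two. */
static_assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0,
	      "THREAD_SIZE must be a power of two");

/* Round a stack pointer down to the base of its THREAD_SIZE-aligned stack. */
static uintptr_t stack_base(uintptr_t sp)
{
	return sp & ~(THREAD_SIZE - 1);
}

/* Two stack pointers lie on the same stack if they share a base. */
static int same_stack(uintptr_t sp_a, uintptr_t sp_b)
{
	return stack_base(sp_a) == stack_base(sp_b);
}
```

With a 16 KiB stack based at 0x10000000, any pointer from 0x10000000 to 0x10003fff maps to that base, while 0x10004000 already belongs to the next stack.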
[PATCH v5 9/9] Input: goodix - add runtime power management support
Add support for runtime power management so that the device is turned off when not used (when the userspace holds no open handles of the input device). The device uses autosuspend with a default delay of 2 seconds, so the device will suspend if no handles to it are open for 2 seconds. The runtime management support is only available if the gpio pins are properly initialized from ACPI/DT. Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 57 +++--- 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 3179767..34c0183 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include @@ -75,6 +76,8 @@ struct goodix_ts_data { #define MAX_CONTACTS_LOC 5 #define TRIGGER_LOC6 +#define GOODIX_AUTOSUSPEND_DELAY_MS2000 + static const unsigned long goodix_irq_flags[] = { IRQ_TYPE_EDGE_RISING, IRQ_TYPE_EDGE_FALLING, @@ -566,6 +569,27 @@ static const struct attribute_group goodix_attr_group = { .attrs = goodix_attrs, }; +static int goodix_open(struct input_dev *input_dev) +{ + struct goodix_ts_data *ts = input_get_drvdata(input_dev); + int error; + + error = pm_runtime_get_sync(&ts->client->dev); + if (error < 0) { + pm_runtime_put_noidle(&ts->client->dev); + return error; + } + return 0; +} + +static void goodix_close(struct input_dev *input_dev) +{ + struct goodix_ts_data *ts = input_get_drvdata(input_dev); + + pm_runtime_mark_last_busy(&ts->client->dev); + pm_runtime_put_autosuspend(&ts->client->dev); +} + /** * goodix_get_gpio_config - Get GPIO config from ACPI/DT * @@ -751,6 +775,9 @@ static int goodix_request_input_dev(struct goodix_ts_data *ts) ts->input_dev->id.vendor = 0x0416; ts->input_dev->id.product = ts->id; ts->input_dev->id.version = ts->version; + ts->input_dev->open = goodix_open; + ts->input_dev->close = goodix_close; + input_set_drvdata(ts->input_dev, ts); error 
= input_register_device(ts->input_dev); if (error) { @@ -798,7 +825,8 @@ static int goodix_configure_dev(struct goodix_ts_data *ts) * @ts: our goodix_ts_data pointer * * request_firmware_wait callback that finishes - * initialization of the device. + * initialization of the device. This will only be called + * when ts->gpiod_int and ts->gpiod_rst are properly initialized. */ static void goodix_config_cb(const struct firmware *cfg, void *ctx) { @@ -811,7 +839,21 @@ static void goodix_config_cb(const struct firmware *cfg, void *ctx) if (error) goto err_release_cfg; } - goodix_configure_dev(ts); + error = goodix_configure_dev(ts); + if (error) + goto err_release_cfg; + + error = pm_runtime_set_active(&ts->client->dev); + if (error) { + dev_err(&ts->client->dev, "failed to set active: %d\n", error); + goto err_release_cfg; + } + /* input_dev is a child of client->dev, ignore it for runtime pm */ + pm_suspend_ignore_children(&ts->client->dev, true); + pm_runtime_enable(&ts->client->dev); + pm_runtime_set_autosuspend_delay(&ts->client->dev, +GOODIX_AUTOSUSPEND_DELAY_MS); + pm_runtime_use_autosuspend(&ts->client->dev); err_release_cfg: release_firmware(cfg); @@ -915,8 +957,12 @@ static int goodix_ts_remove(struct i2c_client *client) { struct goodix_ts_data *ts = i2c_get_clientdata(client); - if (ts->gpiod_int && ts->gpiod_rst) + if (ts->gpiod_int && ts->gpiod_rst) { + pm_runtime_disable(&client->dev); + pm_runtime_set_suspended(&client->dev); + pm_runtime_put_noidle(&client->dev); sysfs_remove_group(&client->dev.kobj, &goodix_attr_group); + } goodix_disable_esd(ts); kfree(ts->cfg_name); return 0; @@ -990,7 +1036,10 @@ static int __maybe_unused goodix_resume(struct device *dev) return goodix_enable_esd(ts); } -static SIMPLE_DEV_PM_OPS(goodix_pm_ops, goodix_suspend, goodix_resume); +static const struct dev_pm_ops goodix_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(goodix_suspend, goodix_resume) + SET_RUNTIME_PM_OPS(goodix_suspend, goodix_resume, NULL) +}; static const struct 
i2c_device_id goodix_ts_id[] = { { "GDIX1001:00", 0 }, -- 1.9.1
[PATCH v5 7/9] Input: goodix - add support for ESD
Add ESD (Electrostatic Discharge) protection mechanism. The driver enables ESD protection in HW and checks a register to determine if ESD occurred. If ESD is signalled by the HW, the driver will reset the device. The ESD poll time (in ms) can be set through the sysfs property esd_timeout. If it is set to 0, ESD protection is disabled. Recommended value is 2000 ms. The initial value for ESD timeout can be set through esd-recovery-timeout-ms ACPI/DT property. If there is no such property defined, ESD protection is disabled. For ACPI 5.1, the property can be specified using _DSD properties: Device (STAC) { Name (_HID, "GDIX1001") ... Name (_DSD, Package () { ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"), Package () { Package (2) { "esd-recovery-timeout-ms", Package(1) { 2000 }}, ... } }) } The ESD protection mechanism is only available if the gpio pins are properly initialized from ACPI/DT. This is based on Goodix datasheets for GT911 and GT9271 and on Goodix driver gt9xx.c for Android (publicly available in Android kernel trees for various devices). Signed-off-by: Irina Tirdea --- .../bindings/input/touchscreen/goodix.txt | 6 + drivers/input/touchscreen/goodix.c | 174 - 2 files changed, 173 insertions(+), 7 deletions(-) diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt index c0715f8..5891ad1 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt @@ -14,6 +14,12 @@ Required properties: - interrupts : Interrupt to which the chip is connected - gpios : GPIOS the chip is connected to: first one is the interrupt gpio and second one the reset gpio. +Optional properties: + + - esd-recovery-timeout-ms : ESD poll time (in milli seconds) for the driver to +check if ESD occurred and in that case reset the +device. ESD is disabled if this property is not set +or is set to 0. 
Example: diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 03f3968..33a7b81 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -45,8 +45,12 @@ struct goodix_ts_data { u16 version; char *cfg_name; unsigned long irq_flags; + atomic_t esd_timeout; + struct delayed_work esd_work; }; +#define GOODIX_DEVICE_ESD_TIMEOUT_PROPERTY "esd-recovery-timeout-ms" + #define GOODIX_MAX_HEIGHT 4096 #define GOODIX_MAX_WIDTH 4096 #define GOODIX_INT_TRIGGER 1 @@ -60,6 +64,8 @@ struct goodix_ts_data { /* Register defines */ #define GOODIX_REG_COMMAND 0x8040 #define GOODIX_CMD_SCREEN_OFF 0x05 +#define GOODIX_CMD_ESD_ENABLED 0xAA +#define GOODIX_REG_ESD_CHECK 0x8041 #define GOODIX_READ_COOR_ADDR 0x814E #define GOODIX_REG_CONFIG_DATA 0x8047 @@ -426,6 +432,117 @@ static int goodix_reset(struct goodix_ts_data *ts) return goodix_int_sync(ts); } +static void goodix_disable_esd(struct goodix_ts_data *ts) +{ + if (!atomic_read(&ts->esd_timeout)) + return; + cancel_delayed_work_sync(&ts->esd_work); +} + +static int goodix_enable_esd(struct goodix_ts_data *ts) +{ + int error, esd_timeout; + + esd_timeout = atomic_read(&ts->esd_timeout); + if (!esd_timeout) + return 0; + + error = goodix_i2c_write_u8(ts->client, GOODIX_REG_ESD_CHECK, + GOODIX_CMD_ESD_ENABLED); + if (error) { + dev_err(&ts->client->dev, "Failed to enable ESD: %d\n", error); + return error; + } + + schedule_delayed_work(&ts->esd_work, round_jiffies_relative( + msecs_to_jiffies(esd_timeout))); + return 0; +} + +static void goodix_esd_work(struct work_struct *work) +{ + struct goodix_ts_data *ts = container_of(work, struct goodix_ts_data, +esd_work.work); + int retries = 3, error; + u8 esd_data[2]; + const struct firmware *cfg = NULL; + + while (--retries) { + error = goodix_i2c_read(ts->client, GOODIX_REG_COMMAND, + esd_data, sizeof(esd_data)); + if (error) + continue; + if (esd_data[0] != GOODIX_CMD_ESD_ENABLED && + esd_data[1] == GOODIX_CMD_ESD_ENABLED) 
{ + /* feed the watchdog */ + goodix_i2c_write_u8(ts->client, + GOODIX_REG_COMMAND, +
Re: [PATCH v4 1/3] mtd: nand: increase ready wait timeout and report timeouts
Hi Ezequiel, Thanks for reviewing the series. On 06/09/2015 21:37, Ezequiel Garcia wrote: > On 27 Jul 02:50 PM, Alex Smith wrote: >> If nand_wait_ready() times out, this is silently ignored, and its >> caller will then proceed to read from/write to the chip before it is >> ready. This can potentially result in corruption with no indication as >> to why. >> >> While a 20ms timeout seems like it should be plenty enough, certain >> behaviour can cause it to timeout much earlier than expected. The >> situation which prompted this change was that CPU 0, which is >> responsible for updating jiffies, was holding interrupts disabled >> for a fairly long time while writing to the console during a printk, >> causing several jiffies updates to be delayed. If CPU 1 happens to >> enter the timeout loop in nand_wait_ready() just before CPU 0 re- >> enables interrupts and updates jiffies, CPU 1 will immediately time >> out when the delayed jiffies updates are made. The result of this is >> that nand_wait_ready() actually waits less time than the NAND chip >> would normally take to be ready, and then read_page() proceeds to >> read out bad data from the chip. >> >> The situation described above may seem unlikely, but in fact it can be >> reproduced almost every boot on the MIPS Creator Ci20. >> > > Not only unlikely but scary :) BTW, can't find SMP patches for Ci20, > are you sure this behavior will apply once SMP is upstreamed? Certainly made for fun debugging ;) SMP support only exists in our 3.18 branch [1] at the moment, which was where this problem was encountered. Support should be upstreamed at some point, and I would guess that this behaviour could still happen then (even though it's a really obscure edge case that we were somehow managing to almost always hit on boot). 
[1] https://github.com/MIPS/CI20_linux > >> Debugging this was made more difficult by the misleading comment above >> nand_wait_ready() stating "The timeout is caught later" - no timeout >> was ever reported, leading me away from the real source of the problem. >> >> Therefore, this patch increases the timeout to 200ms. This should be >> enough to cover cases where jiffies updates get delayed. Additionally, >> add a pr_warn() when a timeout does occur so that it is easier to >> pinpoint any problems in future caused by the chip not becoming ready. >> >> Signed-off-by: Alex Smith >> Cc: Zubair Lutfullah Kakakhel >> Cc: David Woodhouse >> Cc: Brian Norris >> Cc: linux-...@lists.infradead.org >> Cc: linux-kernel@vger.kernel.org >> --- >> v3 -> v4: >> - New patch to fix issue encountered in external Ci20 3.18 kernel >>branch which also applies upstream. >> --- >> drivers/mtd/nand/nand_base.c | 15 --- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c >> index ceb68ca8277a..a0dab3414f16 100644 >> --- a/drivers/mtd/nand/nand_base.c >> +++ b/drivers/mtd/nand/nand_base.c >> @@ -543,23 +543,32 @@ static void panic_nand_wait_ready(struct mtd_info >> *mtd, unsigned long timeo) >> } >> } >> >> -/* Wait for the ready pin, after a command. The timeout is caught later. */ >> +/** >> + * nand_wait_ready - [GENERIC] Wait for the ready pin after commands. >> + * @mtd: MTD device structure >> + * >> + * Wait for the ready pin after a command, and warn if a timeout occurs. >> + */ >> void nand_wait_ready(struct mtd_info *mtd) >> { >> struct nand_chip *chip = mtd->priv; >> -unsigned long timeo = jiffies + msecs_to_jiffies(20); >> +unsigned long timeo = jiffies + msecs_to_jiffies(200); >> >> /* 400ms timeout */ >> if (in_interrupt() || oops_in_progress) >> return panic_nand_wait_ready(mtd, 400); >> >> led_trigger_event(nand_led_trigger, LED_FULL); >> + > > Spurious change here. Removed. 
> >> /* Wait until command is processed or timeout occurs */ >> do { >> if (chip->dev_ready(mtd)) >> -break; >> +goto out; >> touch_softlockup_watchdog(); >> } while (time_before(jiffies, timeo)); >> + >> +pr_warn("timeout while waiting for chip to become ready\n"); >> +out: >> led_trigger_event(nand_led_trigger, LED_OFF); >> } > > This change looks reasonable, a timeout value should be large enough > to be confident the operation has _really_ timed out. On non-error > path, this change shouldn't make any difference. > > And the warning is probably helpful too, so: > > Reviewed-by: Ezequiel Garcia Great, thanks. Alex
[PATCH v5 5/9] Input: goodix - add power management support
Implement suspend/resume for goodix driver. The suspend and resume process uses the gpio pins. If the device ACPI/DT information does not declare gpio pins, suspend/resume will not be available for these devices. This is based on Goodix datasheets for GT911 and GT9271 and on Goodix driver gt9xx.c for Android (publicly available in Android kernel trees for various devices). Signed-off-by: Octavian Purdila Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 94 -- 1 file changed, 89 insertions(+), 5 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 9cf16ff7..3d4a004 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -44,6 +44,7 @@ struct goodix_ts_data { u16 id; u16 version; char *cfg_name; + unsigned long irq_flags; }; #define GOODIX_MAX_HEIGHT 4096 @@ -57,6 +58,9 @@ struct goodix_ts_data { #define GOODIX_CONFIG_967_LENGTH 228 /* Register defines */ +#define GOODIX_REG_COMMAND 0x8040 +#define GOODIX_CMD_SCREEN_OFF 0x05 + #define GOODIX_READ_COOR_ADDR 0x814E #define GOODIX_REG_CONFIG_DATA 0x8047 #define GOODIX_REG_ID 0x8140 @@ -182,6 +186,11 @@ static int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, return ret < 0 ? ret : (ret != 1 ? 
-EIO : 0); } +static int goodix_i2c_write_u8(struct i2c_client *client, u16 reg, u8 value) +{ + return goodix_i2c_write(client, reg, &value, sizeof(value)); +} + static int goodix_get_cfg_len(u16 id) { switch (id) { @@ -301,6 +310,18 @@ static irqreturn_t goodix_ts_irq_handler(int irq, void *dev_id) return IRQ_HANDLED; } +static void goodix_free_irq(struct goodix_ts_data *ts) +{ + devm_free_irq(&ts->client->dev, ts->client->irq, ts); +} + +static int goodix_request_irq(struct goodix_ts_data *ts) +{ + return devm_request_threaded_irq(&ts->client->dev, ts->client->irq, +NULL, goodix_ts_irq_handler, +ts->irq_flags, ts->client->name, ts); +} + /** * goodix_check_cfg - Checks if config fw is valid * @@ -617,7 +638,6 @@ static int goodix_request_input_dev(struct goodix_ts_data *ts) static int goodix_configure_dev(struct goodix_ts_data *ts) { int error; - unsigned long irq_flags; goodix_read_config(ts); @@ -625,10 +645,8 @@ static int goodix_configure_dev(struct goodix_ts_data *ts) if (error) return error; - irq_flags = goodix_irq_flags[ts->int_trigger_type] | IRQF_ONESHOT; - error = devm_request_threaded_irq(&ts->client->dev, ts->client->irq, - NULL, goodix_ts_irq_handler, - irq_flags, ts->client->name, ts); + ts->irq_flags = goodix_irq_flags[ts->int_trigger_type] | IRQF_ONESHOT; + error = goodix_request_irq(ts); if (error) { dev_err(&ts->client->dev, "request IRQ failed: %d\n", error); return error; @@ -732,6 +750,71 @@ static int goodix_ts_probe(struct i2c_client *client, return goodix_configure_dev(ts); } +static int __maybe_unused goodix_suspend(struct device *dev) +{ + struct i2c_client *client = to_i2c_client(dev); + struct goodix_ts_data *ts = i2c_get_clientdata(client); + int error; + + /* We need gpio pins to suspend/resume */ + if (!ts->gpiod_int || !ts->gpiod_rst) + return 0; + + /* Free IRQ as IRQ pin is used as output in the suspend sequence */ + goodix_free_irq(ts); + /* Output LOW on the INT pin for 5 ms */ + error = gpiod_direction_output(ts->gpiod_int, 
0); + if (error) { + goodix_request_irq(ts); + return error; + } + usleep_range(5000, 6000); + + error = goodix_i2c_write_u8(ts->client, GOODIX_REG_COMMAND, + GOODIX_CMD_SCREEN_OFF); + if (error) { + dev_err(&ts->client->dev, "Screen off command failed\n"); + gpiod_direction_input(ts->gpiod_int); + goodix_request_irq(ts); + return -EAGAIN; + } + + /* +* The datasheet specifies that the interval between sending screen-off +* command and wake-up should be longer than 58 ms. To avoid waking up +* sooner, delay 58ms here. +*/ + msleep(58); + return 0; +} + +static int __maybe_unused goodix_resume(struct device *dev) +{ + struct i2c_client *client = to_i2c_client(dev); + struct goodix_ts_data *ts = i2c_get_clientdata(client); + int error; + + if (!ts->gpiod_int || !ts->gpiod_rst) + return 0; + + /* +* Exit sleep mode by outputting HIGH level to INT pin +* for 2ms~5ms. +
[PATCH v5 1/9] Input: goodix - sort includes alphabetically
Signed-off-by: Irina Tirdea --- drivers/input/touchscreen/goodix.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index e36162b..6ae28c5 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -14,18 +14,18 @@ * Software Foundation; version 2 of the License. */ -#include +#include +#include #include #include #include #include -#include -#include -#include #include -#include -#include +#include +#include +#include #include +#include #include struct goodix_ts_data { -- 1.9.1
[PATCH v5 3/9] Input: goodix - reset device at init
After power on, it is recommended that the driver resets the device. The reset procedure timing is described in the datasheet and is used at device init (before writing device configuration) and for power management. It is a sequence of setting the interrupt and reset pins high/low at specific timing intervals. This procedure also includes setting the slave address to the one specified in the ACPI/device tree. This is based on Goodix datasheets for GT911 and GT9271 and on Goodix driver gt9xx.c for Android (publicly available in Android kernel trees for various devices). For reset the driver needs to control the interrupt and reset gpio pins (configured through ACPI/device tree). For devices that do not have the gpio pins declared, the functionality depending on these pins will not be available, but the device can still be used with basic functionality. Signed-off-by: Octavian Purdila Signed-off-by: Irina Tirdea --- .../bindings/input/touchscreen/goodix.txt | 5 + drivers/input/touchscreen/goodix.c | 136 + 2 files changed, 141 insertions(+) diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt index 8ba98ee..c0715f8 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt @@ -12,6 +12,8 @@ Required properties: - reg : I2C address of the chip. Should be 0x5d or 0x14 - interrupt-parent: Interrupt controller to which the chip is connected - interrupts : Interrupt to which the chip is connected + - gpios : GPIOS the chip is connected to: first one is the + interrupt gpio and second one the reset gpio. Example: @@ -23,6 +25,9 @@ Example: reg = <0x5d>; interrupt-parent = <&gpio>; interrupts = <0 0>; + + gpios = <&gpio1 0 0>, /* INT */ + <&gpio1 1 0>; /* RST */ }; /* ... 
*/ diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 7be6eab..8edfc06 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -37,6 +38,8 @@ struct goodix_ts_data { unsigned int int_trigger_type; bool rotated_screen; int cfg_len; + struct gpio_desc *gpiod_int; + struct gpio_desc *gpiod_rst; }; #define GOODIX_MAX_HEIGHT 4096 @@ -89,6 +92,30 @@ static const struct dmi_system_id rotated_screen[] = { {} }; +/* + * ACPI table specifies gpio pins in this order: first rst pin and + * then interrupt pin. + */ +static const struct dmi_system_id goodix_rst_pin_first[] = { +#if defined(CONFIG_DMI) && defined(CONFIG_X86) + { + .ident = "WinBook TW100", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "WinBook"), + DMI_MATCH(DMI_PRODUCT_NAME, "TW100") + } + }, + { + .ident = "WinBook TW700", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "WinBook"), + DMI_MATCH(DMI_PRODUCT_NAME, "TW700") + }, + }, +#endif + {} +}; + /** * goodix_i2c_read - read data from a register of the i2c slave device. 
* @@ -237,6 +264,102 @@ static irqreturn_t goodix_ts_irq_handler(int irq, void *dev_id) return IRQ_HANDLED; } +static int goodix_int_sync(struct goodix_ts_data *ts) +{ + int error; + + error = gpiod_direction_output(ts->gpiod_int, 0); + if (error) + return error; + msleep(50); /* T5: 50ms */ + + return gpiod_direction_input(ts->gpiod_int); +} + +/** + * goodix_reset - Reset device during power on + * + * @ts: goodix_ts_data pointer + */ +static int goodix_reset(struct goodix_ts_data *ts) +{ + int error; + + /* begin select I2C slave addr */ + error = gpiod_direction_output(ts->gpiod_rst, 0); + if (error) + return error; + msleep(20); /* T2: > 10ms */ + /* HIGH: 0x28/0x29, LOW: 0xBA/0xBB */ + error = gpiod_direction_output(ts->gpiod_int, ts->client->addr == 0x14); + if (error) + return error; + usleep_range(100, 2000); /* T3: > 100us */ + error = gpiod_direction_output(ts->gpiod_rst, 1); + if (error) + return error; + usleep_range(6000, 10000); /* T4: > 5ms */ + /* end select I2C slave addr */ + error = gpiod_direction_input(ts->gpiod_rst); + if (error) + return error; + return goodix_int_sync(ts); +} + +/