Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method
On Monday 24 August 2015 07:24 PM, Lee Jones wrote:
> On Wed, 08 Jul 2015, Vaibhav Hiremath wrote:
>
> > As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe (page 0)
> > controls the method of clearing interrupt status of 88pm800
> > family of devices;
> >
> >   0: clear on read
> >   1: clear on write
> >
> > If pdata is not coming from board file, then set the default
> > irq clear method to irq clear on write
> >
> > Also, as suggested by Lee Jones renaming variable field
> > to appropriate name and removed unnecessary field
> > pm80x_chip.irq_mode, using platform_data.irq_clr_method.
> >
> > Signed-off-by: Zhao Ye zh...@marvell.com
> > Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
> > Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com
> > ---
> >  drivers/mfd/88pm800.c       | 15 ++-
> >  include/linux/mfd/88pm80x.h |  9 +++--
> >  2 files changed, 17 insertions(+), 7 deletions(-)
>
> [...]
>
> > +#define PM800_WAKEUP2_INT_READ_CLEAR	(0 << 1)
> > +#define PM800_WAKEUP2_INT_WRITE_CLEAR	(1 << 1)
>
> Use BIT().
>
> > +/* Used by irq_clr_method */
> > +#define PM800_IRQ_CLR_ON_READ	0
> > +#define PM800_IRQ_CLR_ON_WRITE	1
>
> > -	int irq_mode;		/* Clear interrupt by read/write(0/1) */
> > +	bool irq_clr_method;	/* Clear interrupt by read/write(0/1) */
>
> > +	irq_clr_mode = pdata->irq_clr_method == PM800_IRQ_CLR_ON_WRITE ?
> > +		PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
> > +	ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);
>
> This is pretty convoluted.
>
> For starters you're abusing the 'bool' type here. Bool is either
> 'true' or 'false', so at the very least you should rename
> 'irq_clr_method' to 'irq_clr_on_write'. Then you can do:
>
>   irq_clr_mode = pdata->irq_clr_on_write ?
>           PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;

We have discussed on this, and went back-n-forth. I think if I remember
correctly, one of the version was using true/false then we decided to
rename it to relevant macro.

If I am not wrong V4 version of this series is exactly same as what you
are referring to.

> However, what I suggest you really do is share
> PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass
> the value through directly.

I think we discussed about this also, and the reason I recall here is,
we may need to control this from DT in the future so we decided to keep
it boolean in platform_data and have simple check before writing to
register.

And I think that was also another reason we introduced

/* Used by irq_clr_method */
#define PM800_IRQ_CLR_ON_READ	0
#define PM800_IRQ_CLR_ON_WRITE	1

(Earlier it was true/false in V4)

Thanks,
Vaibhav
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
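For readers following the review, the two suggestions (use BIT(), and derive the register value from a boolean named for what it means) can be sketched in userspace C as below. This is only an illustration under stated assumptions: BIT() and update_bits() are local stand-ins for the kernel's BIT() and regmap_update_bits(), and PM800_WAKEUP2_INT_MASK and pm800_irq_clr_bits() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() helper. */
#define BIT(n) (1U << (n))

/* Macro names from the patch; values follow the quoted spec (bit 1). */
#define PM800_WAKEUP2_INT_READ_CLEAR	0U	/* bit 1 cleared */
#define PM800_WAKEUP2_INT_WRITE_CLEAR	BIT(1)	/* bit 1 set */
#define PM800_WAKEUP2_INT_MASK		BIT(1)	/* hypothetical name */

/* Hypothetical helper: map the boolean to the register field value. */
static uint8_t pm800_irq_clr_bits(bool irq_clr_on_write)
{
	return irq_clr_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR
				: PM800_WAKEUP2_INT_READ_CLEAR;
}

/* Simplified model of regmap_update_bits(): touch only the masked bits. */
static uint8_t update_bits(uint8_t reg, uint8_t mask, uint8_t val)
{
	return (uint8_t)((reg & ~mask) | (val & mask));
}
```

With a boolean named irq_clr_on_write the ternary reads as a plain true/false test, which is the point Lee makes above; whether the value should instead be passed through platform data unchanged is the open question in the thread.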
Re: [PATCH] kernel/sysctl.c: If count including the terminating byte '\0' the write system call should retrun success.
On Mon, Aug 24, 2015 at 8:27 PM, Eric W. Biederman ebied...@xmission.com wrote:
> On August 24, 2015 1:56:13 AM PDT, Sean Fu fxinr...@gmail.com wrote:
> > when the input argument "count" including the terminating byte "\0",
> > The write system call return EINVAL on proc file.
> > But it return success on regular file.
>
> Nonsense. It will write the '\0' to a regular file because it is just
> data.
>
> Integers in proc are more than data.
>
> So I see no justification for this change.

In fact, write(fd, "1\0", 2) on Integers proc file return success on
2.6 kernel. I already tested it on 2.6.6.60 kernel.

So, the latest behavior of write(fd, "1\0", 2) is different from old
kernel (2.6). This may impact the compatibility of some user space
programs.

> Eric
>
> > E.g. Writing two bytes ("1\0") to
> > "/proc/sys/net/ipv4/conf/eth0/rp_filter".
> > write(fd, "1\0", 2) return EINVAL.
> >
> > Signed-off-by: Sean Fu fxinr...@gmail.com
> > ---
> >  kernel/sysctl.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 19b62b5..c2b0594 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -2004,7 +2004,7 @@ static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp,
> >  	return 0;
> >  }
> >
> > -static const char proc_wspace_sep[] = { ' ', '\t', '\n' };
> > +static const char proc_wspace_sep[] = { ' ', '\t', '\n', '\0' };
> >
> >  static int __do_proc_dointvec(void *tbl_data, struct ctl_table *table,
> >  		  int write, void __user *buffer,
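The effect of adding '\0' to the separator set can be illustrated with a small userspace model. This is a deliberately simplified sketch and not the kernel's parser: parse_ok(), seps_before and seps_after are hypothetical names, and the model only checks that a run of digits is followed by nothing but separator bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified userspace model (not the kernel parser): accept a run of
 * digits followed only by bytes taken from the separator set.
 */
static bool parse_ok(const char *buf, size_t len,
		     const char *seps, size_t nseps)
{
	size_t i = 0;

	/* At least one digit is required. */
	while (i < len && buf[i] >= '0' && buf[i] <= '9')
		i++;
	if (i == 0)
		return false;

	/* Every remaining byte must be a known separator. */
	for (; i < len; i++)
		if (!memchr(seps, buf[i], nseps))
			return false;

	return true;
}

/* Separator sets before and after the proposed one-line change. */
static const char seps_before[] = { ' ', '\t', '\n' };
static const char seps_after[]  = { ' ', '\t', '\n', '\0' };
```

In this model the two-byte buffer "1\0" is rejected with the old set and accepted with the new one, which matches the behaviour change the patch asks for; whether that change is desirable is exactly what Eric disputes.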
[PATCH v3 1/1] USB:option:add ZTE PIDs
This is intended to add ZTE device PIDs on kernel.

Signed-off-by: Liu.Zhao lzsos...@163.com
---
 drivers/usb/serial/option.c | 24
 1 file changed, 24 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 876423b..6b4a766 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -285,6 +285,10 @@ static void option_instat_callback(struct urb *urb);
 #define ZTE_PRODUCT_MC2718			0xffe8
 #define ZTE_PRODUCT_AD3812			0xffeb
 #define ZTE_PRODUCT_MC2716			0xffed
+#define ZTE_PRODUCT_ZM8620_X			0x0396
+#define ZTE_PRODUCT_ME3620_MBIM			0x0426
+#define ZTE_PRODUCT_ME3620_X			0x1432
+#define ZTE_PRODUCT_ME3620_L			0x1433
 
 #define BENQ_VENDOR_ID				0x04a5
 #define BENQ_PRODUCT_H10			0x4068
@@ -544,6 +548,18 @@ static const struct option_blacklist_info zte_mc2716_z_blacklist = {
 	.sendsetup = BIT(1) | BIT(2) | BIT(3),
 };
 
+static const struct option_blacklist_info zte_zm8620_x_blacklist = {
+	.reserved = BIT(3) | BIT(4) | BIT(5),
+};
+
+static const struct option_blacklist_info zte_me3620_xl_blacklist = {
+	.reserved = BIT(3) | BIT(4) | BIT(5),
+};
+
+static const struct option_blacklist_info zte_me3620_mbim_blacklist = {
+	.reserved = BIT(2) | BIT(3) | BIT(4),
+};
+
 static const struct option_blacklist_info huawei_cdc12_blacklist = {
 	.reserved = BIT(1) | BIT(2),
 };
@@ -1591,6 +1607,14 @@ static const struct usb_device_id option_ids[] = {
 	 .driver_info = (kernel_ulong_t)&zte_ad3812_z_blacklist },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, ZTE_PRODUCT_MC2716, 0xff, 0xff, 0xff),
 	 .driver_info = (kernel_ulong_t)&zte_mc2716_z_blacklist },
+	{ USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_L),
+	 .driver_info = (kernel_ulong_t)&zte_me3620_xl_blacklist },
+	{ USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_X),
+	 .driver_info = (kernel_ulong_t)&zte_me3620_xl_blacklist },
+	{ USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ZM8620_X),
+	 .driver_info = (kernel_ulong_t)&zte_zm8620_x_blacklist },
+	{ USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_MBIM),
+	 .driver_info = (kernel_ulong_t)&zte_me3620_mbim_blacklist },
 	{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x01) },
 	{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x05) },
 	{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x86, 0x10) },
-- 
1.9.1
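The .reserved masks in the patch are bitmaps indexed by USB interface number. A minimal userspace sketch of how such a mask can be tested follows. The struct is a cut-down model of the one the patch initialises, BIT() is a local stand-in for the kernel macro, and iface_is_reserved() is a hypothetical helper, not the driver's own function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's BIT() helper. */
#define BIT(n) (1UL << (n))

/* Cut-down model of the structure initialised in the patch. */
struct option_blacklist_info {
	unsigned long sendsetup;
	unsigned long reserved;
};

/* Same mask as the ME3620 MBIM entry in the patch: interfaces 2, 3, 4. */
static const struct option_blacklist_info zte_me3620_mbim_blacklist = {
	.reserved = BIT(2) | BIT(3) | BIT(4),
};

/* Hypothetical helper: true if the interface number is marked reserved. */
static bool iface_is_reserved(const struct option_blacklist_info *bl,
			      unsigned int ifnum)
{
	if (!bl || ifnum >= 8 * sizeof(bl->reserved))
		return false;
	return (bl->reserved & BIT(ifnum)) != 0;
}
```

A reserved interface is one the serial driver is expected to leave alone, so in this sketch interfaces 2 to 4 of the MBIM variant would be skipped while 0, 1 and 5 would not.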
[PATCH] ipmi: add of_device_id in MODULE_DEVICE_TABLE
Fix autoloading ipmi modules when using device tree.

Signed-off-by: Brijesh Singh brijeshkumar.si...@amd.com
---
 drivers/char/ipmi/ipmi_si_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 8a45e92..cddc7b0 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2785,6 +2785,7 @@ static struct platform_driver ipmi_driver = {
 	.probe		= ipmi_probe,
 	.remove		= ipmi_remove,
 };
+MODULE_DEVICE_TABLE(of, ipmi_match);
 
 #ifdef CONFIG_PARISC
 static int ipmi_parisc_probe(struct parisc_device *dev)
-- 
1.9.1
[PATCH v3 5/6] ARCv2: perf: SMP support
* split off pmu info into singleton and per-cpu bits
* setup PMU on all cores

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

No changes since v2.

Compared to v1:
 [1] Rebase on top of previous patches hence changes in patch itself
 [2] Cosmetics

 arch/arc/kernel/perf_event.c | 69 ++--
 1 file changed, 54 insertions(+), 15 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 997ccbd..80f5a85 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -21,10 +21,22 @@
 
 struct arc_pmu {
 	struct pmu	pmu;
+	unsigned int	irq;
 	int		n_counters;
-	unsigned long	used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
 	u64		max_period;
 	int		ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
+};
+
+struct arc_pmu_cpu {
+	/*
+	 * A 1 bit for an index indicates that the counter is being used for
+	 * an event. A 0 means that the counter can be used.
+	 */
+	unsigned long	used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
+
+	/*
+	 * The events that are active on the PMU for the given index.
+	 */
 	struct perf_event *act_counter[ARC_PERF_MAX_COUNTERS];
 };
 
@@ -67,6 +79,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 }
 
 static struct arc_pmu *arc_pmu;
+static DEFINE_PER_CPU(struct arc_pmu_cpu, arc_pmu_cpu);
 
 /* read counter #idx; note that counter# != event# on ARC! */
 static uint64_t arc_pmu_read_counter(int idx)
@@ -304,10 +317,12 @@ static void arc_pmu_stop(struct perf_event *event, int flags)
 
 static void arc_pmu_del(struct perf_event *event, int flags)
 {
+	struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
+
 	arc_pmu_stop(event, PERF_EF_UPDATE);
-	__clear_bit(event->hw.idx, arc_pmu->used_mask);
+	__clear_bit(event->hw.idx, pmu_cpu->used_mask);
 
-	arc_pmu->act_counter[event->hw.idx] = 0;
+	pmu_cpu->act_counter[event->hw.idx] = 0;
 
 	perf_event_update_userpage(event);
 }
@@ -315,22 +330,23 @@ static void arc_pmu_del(struct perf_event *event, int flags)
 /* allocate hardware counter and optionally start counting */
 static int arc_pmu_add(struct perf_event *event, int flags)
 {
+	struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
-	if (__test_and_set_bit(idx, arc_pmu->used_mask)) {
-		idx = find_first_zero_bit(arc_pmu->used_mask,
+	if (__test_and_set_bit(idx, pmu_cpu->used_mask)) {
+		idx = find_first_zero_bit(pmu_cpu->used_mask,
 					  arc_pmu->n_counters);
 		if (idx == arc_pmu->n_counters)
 			return -EAGAIN;
 
-		__set_bit(idx, arc_pmu->used_mask);
+		__set_bit(idx, pmu_cpu->used_mask);
 		hwc->idx = idx;
 	}
 
 	write_aux_reg(ARC_REG_PCT_INDEX, idx);
 
-	arc_pmu->act_counter[idx] = event;
+	pmu_cpu->act_counter[idx] = event;
 
 	if (is_sampling_event(event)) {
 		/* Mimic full counter overflow as other arches do */
@@ -357,7 +373,7 @@ static int arc_pmu_add(struct perf_event *event, int flags)
 static irqreturn_t arc_pmu_intr(int irq, void *dev)
 {
 	struct perf_sample_data data;
-	struct arc_pmu *arc_pmu = (struct arc_pmu *)dev;
+	struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
 	struct pt_regs *regs;
 	int active_ints;
 	int idx;
@@ -369,7 +385,7 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
 	regs = get_irq_regs();
 
 	for (idx = 0; idx < arc_pmu->n_counters; idx++) {
-		struct perf_event *event = arc_pmu->act_counter[idx];
+		struct perf_event *event = pmu_cpu->act_counter[idx];
 		struct hw_perf_event *hwc;
 
 		if (!(active_ints & (1 << idx)))
@@ -412,6 +428,17 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
 
 #endif /* CONFIG_ISA_ARCV2 */
 
+void arc_cpu_pmu_irq_init(void)
+{
+	struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
+
+	arc_request_percpu_irq(arc_pmu->irq, smp_processor_id(), arc_pmu_intr,
+			       "ARC perf counters", pmu_cpu);
+
+	/* Clear all pending interrupt flags */
+	write_aux_reg(ARC_REG_PCT_INT_ACT, 0x);
+}
+
 static int arc_pmu_device_probe(struct platform_device *pdev)
 {
 	struct arc_reg_pct_build pct_bcr;
@@ -488,18 +515,30 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 
 	if (has_interrupts) {
 		int irq = platform_get_irq(pdev, 0);
+		unsigned long flags;
 
 		if (irq < 0) {
 			pr_err("Cannot get IRQ number for the platform\n");
 			return
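The counter allocation that arc_pmu_add() and arc_pmu_del() perform on the per-CPU used_mask can be sketched in plain userspace C as below. This is a model under stated assumptions and not the driver code: N_COUNTERS is a hypothetical fixed count (the driver reads it from hardware), the mask is a single 32-bit word, and struct pmu_cpu, counter_alloc() and counter_free() are illustrative names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define N_COUNTERS 4	/* hypothetical; the driver reads this from hardware */

/* One instance per CPU, mirroring the split the patch introduces. */
struct pmu_cpu {
	uint32_t used_mask;	/* bit i set: counter i is in use */
};

/*
 * Try the preferred index first; if it is taken, fall back to the first
 * free counter. Returns the counter index, or -EAGAIN if all are busy.
 */
static int counter_alloc(struct pmu_cpu *c, int idx)
{
	int i;

	if (idx >= 0 && idx < N_COUNTERS && !(c->used_mask & (1U << idx))) {
		c->used_mask |= 1U << idx;
		return idx;
	}

	for (i = 0; i < N_COUNTERS; i++) {
		if (!(c->used_mask & (1U << i))) {
			c->used_mask |= 1U << i;
			return i;
		}
	}

	return -EAGAIN;
}

/* Release a counter so later allocations can reuse it. */
static void counter_free(struct pmu_cpu *c, int idx)
{
	c->used_mask &= ~(1U << idx);
}
```

Because each CPU owns its own mask, allocating counter 2 on one CPU does not prevent another CPU from also using its counter 2, which is the reason the patch moves used_mask and act_counter out of the shared struct arc_pmu.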
[PATCH v3 4/6] ARCv2: perf: implement exclusion of event counting in user or kernel mode
Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

No changes since v2.

No changes since v1.

 arch/arc/include/asm/perf_event.h |  3 +++
 arch/arc/kernel/perf_event.c      | 16 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index 9ed593e..876e216 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -34,6 +34,9 @@
 #define ARC_REG_PCT_INT_CTRL	0x25E
 #define ARC_REG_PCT_INT_ACT	0x25F
 
+#define ARC_REG_PCT_CONFIG_USER	(1 << 18)	/* count in user mode */
+#define ARC_REG_PCT_CONFIG_KERN	(1 << 19)	/* count in kernel mode */
+
 #define ARC_REG_PCT_CONTROL_CC	(1 << 16)	/* clear counts */
 #define ARC_REG_PCT_CONTROL_SN	(1 << 17)	/* snapshot */
 
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index ce0fa60..997ccbd 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -147,13 +147,25 @@ static int arc_pmu_event_init(struct perf_event *event)
 		local64_set(&hwc->period_left, hwc->sample_period);
 	}
 
+	hwc->config = 0;
+
+	if (is_isa_arcv2()) {
+		/* exclude user means count only kernel */
+		if (event->attr.exclude_user)
+			hwc->config |= ARC_REG_PCT_CONFIG_KERN;
+
+		/* exclude kernel means count only user */
+		if (event->attr.exclude_kernel)
+			hwc->config |= ARC_REG_PCT_CONFIG_USER;
+	}
+
 	switch (event->attr.type) {
 	case PERF_TYPE_HARDWARE:
 		if (event->attr.config >= PERF_COUNT_HW_MAX)
 			return -ENOENT;
 		if (arc_pmu->ev_hw_idx[event->attr.config] < 0)
 			return -ENOENT;
-		hwc->config = arc_pmu->ev_hw_idx[event->attr.config];
+		hwc->config |= arc_pmu->ev_hw_idx[event->attr.config];
 		pr_debug("init event %d with h/w %d \'%s\'\n",
 			 (int) event->attr.config, (int) hwc->config,
 			 arc_pmu_ev_hw_map[event->attr.config]);
@@ -163,7 +175,7 @@ static int arc_pmu_event_init(struct perf_event *event)
 		ret = arc_pmu_cache_event(event->attr.config);
 		if (ret < 0)
 			return ret;
-		hwc->config = arc_pmu->ev_hw_idx[ret];
+		hwc->config |= arc_pmu->ev_hw_idx[ret];
 		return 0;
 	default:
 		return -ENOENT;
-- 
2.4.3
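The way the patch composes hwc->config can be shown in isolation. The sketch below is a userspace model, not the driver: the two bit positions are taken from the patch, while arc_pct_config() is a hypothetical helper that folds the exclusion flags and the hardware event index into one word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the patch. */
#define ARC_REG_PCT_CONFIG_USER	(1U << 18)	/* count in user mode */
#define ARC_REG_PCT_CONFIG_KERN	(1U << 19)	/* count in kernel mode */

/*
 * Hypothetical helper: build the config word from the perf exclusion
 * flags and the hardware event index, following the patch's logic.
 */
static uint32_t arc_pct_config(bool exclude_user, bool exclude_kernel,
			       uint32_t hw_idx)
{
	uint32_t config = 0;

	/* Excluding user space means counting only in kernel mode. */
	if (exclude_user)
		config |= ARC_REG_PCT_CONFIG_KERN;

	/* Excluding the kernel means counting only in user mode. */
	if (exclude_kernel)
		config |= ARC_REG_PCT_CONFIG_USER;

	/* OR in the event index so the mode bits set above survive. */
	return config | hw_idx;
}
```

The switch from `=` to `|=` in the two later hunks matters for the same reason: a plain assignment of the event index would discard the mode bits set earlier in arc_pmu_event_init().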
[PATCH v3 1/6] ARC: perf: cap the number of counters to hardware max of 32
From: Vineet Gupta vgu...@synopsys.com

The number of counters in PCT can never be more than 32 (while
countable conditions could be 100+) for both ARCompact and ARCv2

And while at it update copyright dates.

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Vineet Gupta vgu...@synopsys.com
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

Compared to v2:
 [1] Updated copyright date in arch/arc/kernel/perf_event.c

No changes since v1.

 arch/arc/include/asm/perf_event.h | 5 +++--
 arch/arc/kernel/perf_event.c      | 6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index 2b8880e..e7b16c2 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -1,6 +1,7 @@
 /*
  * Linux performance counter support for ARC
  *
+ * Copyright (C) 2014-2015 Synopsys, Inc. (www.synopsys.com)
  * Copyright (C) 2011-2013 Synopsys, Inc. (www.synopsys.com)
  *
  * This program is free software; you can redistribute it and/or modify
@@ -12,8 +13,8 @@
 #ifndef __ASM_PERF_EVENT_H
 #define __ASM_PERF_EVENT_H
 
-/* real maximum varies per CPU, this is the maximum supported by the driver */
-#define ARC_PMU_MAX_HWEVENTS	64
+/* Max number of counters that PCT block may ever have */
+#define ARC_PERF_MAX_COUNTERS	32
 
 #define ARC_REG_CC_BUILD	0xF6
 #define ARC_REG_CC_INDEX	0x240
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 1287388..d7ee5b2 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -1,7 +1,7 @@
 /*
  * Linux performance counter support for ARC700 series
  *
- * Copyright (C) 2013 Synopsys, Inc. (www.synopsys.com)
+ * Copyright (C) 2013-2015 Synopsys, Inc. (www.synopsys.com)
  *
  * This code is inspired by the perf support of various other architectures.
  *
@@ -22,7 +22,7 @@ struct arc_pmu {
 	struct pmu	pmu;
 	int		counter_size;	/* in bits */
 	int		n_counters;
-	unsigned long	used_mask[BITS_TO_LONGS(ARC_PMU_MAX_HWEVENTS)];
+	unsigned long	used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
 	int		ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
 };
 
@@ -284,7 +284,7 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 		pr_err("This core does not have performance counters!\n");
 		return -ENODEV;
 	}
-	BUG_ON(pct_bcr.c > ARC_PMU_MAX_HWEVENTS);
+	BUG_ON(pct_bcr.c > ARC_PERF_MAX_COUNTERS);
 
 	READ_BCR(ARC_REG_CC_BUILD, cc_bcr);
 	BUG_ON(!cc_bcr.v); /* Counters exist but No countable conditions ? */
-- 
2.4.3
Re: [PATCH] clockevents/drivers/mtk: Fix spurious interrupt leading to crash
On Mon, 2015-08-24 at 15:30 +0200, Daniel Lezcano wrote:
> After analysis done by Yingjoe Chen, the timer appears to have a pending
> interrupt when it is enabled.
>
> Fix this by acknowledging the pending interrupt when enabling the timer
> interrupt.
>
> Signed-off-by: Daniel Lezcano daniel.lezc...@linaro.org

Hi Daniel,

Thanks for your patch, this can fix the boot issue.

Tested-by: Yingjoe Chen yingjoe.c...@mediatek.com

> ---
>  drivers/clocksource/mtk_timer.c | 13 +++--
>  1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/clocksource/mtk_timer.c b/drivers/clocksource/mtk_timer.c
> index 4cd16fb..13543a8 100644
> --- a/drivers/clocksource/mtk_timer.c
> +++ b/drivers/clocksource/mtk_timer.c
> @@ -156,14 +156,6 @@ static irqreturn_t mtk_timer_interrupt(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>
> -static void mtk_timer_global_reset(struct mtk_clock_event_device *evt)
> -{
> -	/* Disable all interrupts */
> -	writel(0x0, evt->gpt_base + GPT_IRQ_EN_REG);
> -	/* Acknowledge all interrupts */
> -	writel(0x3f, evt->gpt_base + GPT_IRQ_ACK_REG);
> -}
> -
>  static void
>  mtk_timer_setup(struct mtk_clock_event_device *evt, u8 timer, u8 option)
>  {
> @@ -183,6 +175,9 @@ static void mtk_timer_enable_irq(struct mtk_clock_event_device *evt, u8 timer)
>  {
>  	u32 val;
>
> +/* Acknowledge all spurious pending interrupts */
> +writel(0x3f, evt->gpt_base + GPT_IRQ_ACK_REG);

This should use tab to indent.

> +
>  	val = readl(evt->gpt_base + GPT_IRQ_EN_REG);
>  	writel(val | GPT_IRQ_ENABLE(timer),
>  			evt->gpt_base + GPT_IRQ_EN_REG);
> @@ -232,8 +227,6 @@ static void __init mtk_timer_init(struct device_node *node)
>  	}
>  	rate = clk_get_rate(clk);
>
> -	mtk_timer_global_reset(evt);
> -

I think we should keep this one, or at least disable irq first in
mtk_timer_enable_irq. MT8173 firmware didn't use this GPT, but I think
it is a good idea to do it just in case firmware in some other platform
uses it.

Joe.C
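Why acknowledging before enabling avoids the spurious interrupt can be shown with a toy model. Everything below is an assumption made for illustration: struct gpt is not the real MMIO layout, gpt_enable_irq() is a hypothetical function, and the `spurious` field simply counts interrupts that would fire at the moment the enable bit is set.

```c
#include <assert.h>
#include <stdint.h>

#define GPT_IRQ_ENABLE(t) (1U << (t))

/* Toy model of the GPT interrupt block; not the real register layout. */
struct gpt {
	uint32_t irq_en;	/* enabled interrupt sources */
	uint32_t irq_pending;	/* latched, not yet acknowledged sources */
	unsigned int spurious;	/* interrupts that fired on enable */
};

/*
 * Enable the interrupt for one timer. With ack_first set, stale pending
 * bits are cleared before the enable bit is written, as in the patch.
 */
static void gpt_enable_irq(struct gpt *g, unsigned int timer, int ack_first)
{
	if (ack_first)
		g->irq_pending = 0;	/* models the write of 0x3f to the ack register */

	g->irq_en |= GPT_IRQ_ENABLE(timer);

	/* A source that is both pending and enabled fires immediately. */
	if (g->irq_en & g->irq_pending) {
		g->spurious++;
		g->irq_pending &= ~g->irq_en;
	}
}
```

In this model a timer left pending by earlier firmware fires once when enabled without the acknowledge and not at all with it. Joe.C's remark about disabling interrupts first is a separate concern that the model does not cover.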
[PATCH v2 3/6] perf: Annotate some of the error codes with perf_err()
This patch annotates a few semi-random error paths in perf core to
illustrate the extended error reporting facility. Most of them can
be triggered from perf tools.

Signed-off-by: Alexander Shishkin alexander.shish...@linux.intel.com
---
 kernel/events/core.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3ff28fc8bd..7beab37ea6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3382,10 +3382,10 @@ find_lively_task_by_vpid(pid_t vpid)
 	rcu_read_unlock();
 
 	if (!task)
-		return ERR_PTR(-ESRCH);
+		return PERF_ERR_PTR(-ESRCH, "task not found");
 
 	/* Reuse ptrace permission checks for now. */
-	err = -EACCES;
+	err = perf_err(-EACCES, "insufficient permissions for tracing this task");
 	if (!ptrace_may_access(task, PTRACE_MODE_READ))
 		goto errout;
 
@@ -3413,7 +3413,8 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
 	if (!task) {
 		/* Must be root to operate on a CPU event: */
 		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-			return ERR_PTR(-EACCES);
+			return PERF_ERR_PTR(-EACCES,
+					    "must be root to operate on a CPU event");
 
 		/*
 		 * We could be clever and allow to attach a event to an
@@ -3421,7 +3422,7 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
 		 * that's for later.
 		 */
 		if (!cpu_online(cpu))
-			return ERR_PTR(-ENODEV);
+			return PERF_ERR_PTR(-ENODEV, "cpu is offline");
 
 		cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
 		ctx = &cpuctx->ctx;
@@ -8134,15 +8135,16 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	if (!attr.exclude_kernel) {
 		if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
-			return -EACCES;
+			return perf_err_sync(&attr, -EACCES,
+					     "kernel tracing forbidden for the unprivileged");
 	}
 
 	if (attr.freq) {
 		if (attr.sample_freq > sysctl_perf_event_sample_rate)
-			return -EINVAL;
+			return perf_err_sync(&attr, -EINVAL, "sample_freq too high");
 	} else {
 		if (attr.sample_period & (1ULL << 63))
-			return -EINVAL;
+			return perf_err_sync(&attr, -EINVAL, "sample_period too high");
 	}
 
 	/*
@@ -8152,14 +8154,14 @@ SYSCALL_DEFINE5(perf_event_open,
 	 * cgroup.
 	 */
 	if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
-		return -EINVAL;
+		return perf_err_sync(&attr, -EINVAL, "pid and cpu need to be set in cgroup mode");
 
 	if (flags & PERF_FLAG_FD_CLOEXEC)
 		f_flags |= O_CLOEXEC;
 
 	event_fd = get_unused_fd_flags(f_flags);
 	if (event_fd < 0)
-		return event_fd;
+		return perf_err_sync(&attr, event_fd, "can't obtain a file descriptor");
 
 	if (group_fd != -1) {
 		err = perf_fget_light(group_fd, &group);
-- 
2.5.0
[PATCH v2 4/6] perf/x86: Annotate some of the error codes with perf_err()
This patch annotates a few x86-specific error paths with perf's extended
error reporting facility.

Signed-off-by: Alexander Shishkin alexander.shish...@linux.intel.com
---
 arch/x86/kernel/cpu/perf_event.c           | 8 ++--
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f56cf074d0..b3b531beee 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -12,6 +12,8 @@
  *  For licencing details see kernel-base/COPYING
  */
 
+#define PERF_MODNAME "perf/x86"
+
 #include <linux/perf_event.h>
 #include <linux/capability.h>
 #include <linux/notifier.h>
@@ -426,11 +428,13 @@ int x86_setup_perfctr(struct perf_event *event)
 
 		/* BTS is currently only allowed for user-mode. */
 		if (!attr->exclude_kernel)
-			return -EOPNOTSUPP;
+			return perf_err(-EOPNOTSUPP,
+					"BTS sampling not allowed for kernel space");
 
 		/* disallow bts if conflicting events are present */
 		if (x86_add_exclusive(x86_lbr_exclusive_lbr))
-			return -EBUSY;
+			return perf_err(-EBUSY,
+					"LBR conflicts with active events");
 
 		event->destroy = hw_perf_lbr_event_destroy;
 	}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index b2c9475b7f..222b259c5e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -607,7 +607,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 	 * no LBR on this PMU
 	 */
 	if (!x86_pmu.lbr_nr)
-		return -EOPNOTSUPP;
+		return perf_err(-EOPNOTSUPP, "LBR is not supported by this cpu");
 
 	/*
 	 * setup SW LBR filter
-- 
2.5.0
Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
> On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
> > On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:
> > > > I am in the middle of implementing lock on fault this way, but I
> > > > cannot see how we will handle mremap of a lock on fault region.
> > > > Say we have the following:
> > > >
> > > >     addr = mmap(len, MAP_ANONYMOUS, ...);
> > > >     mlock(addr, len, MLOCK_ONFAULT);
> > > >     ...
> > > >     mremap(addr, len, 2 * len, ...)
> > > >
> > > > There is no way for mremap to know that the area being remapped
> > > > was lock on fault so it will be locked and prefaulted by remap.
> > > > How can we avoid this without tracking per vma if it was locked
> > > > with lock or lock on fault?
> > >
> > > remap can count filled ptes and prefault only completely populated
> > > areas.
> >
> > Does (and should) mremap really prefault non-present pages? Shouldn't
> > it just prepare the page tables and that's it?
>
> As I see mremap prefaults pages when it extends mlocked area.
>
> Also quote from manpage
> : If the memory segment specified by old_address and old_size is locked
> : (using mlock(2) or similar), then this lock is maintained when the
> : segment is resized and/or relocated. As a consequence, the amount of
> : memory locked by the process may change.

Oh, right... Well that looks like a convincing argument for having a
sticky VM_LOCKONFAULT after all. Having mremap guess by scanning
existing pte's would slow it down, and be unreliable (was the area
completely populated because MLOCK_ONFAULT was not used or because the
process faulted it already? Was it not populated because MLOCK_ONFAULT
was used, or because mmap(MAP_LOCKED) failed to populate it all?).

The only sane alternative is to populate always for mremap() of
VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as
a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be
enough for Eric's usecase, but it's somewhat ugly.

> > > There might be a problem after failed populate: remap will handle
> > > them as lock on fault. In this case we can fill ptes with swap-like
> > > non-present entries to remember that fact and count them as
> > > should-be-locked pages.
> >
> > I don't think we should strive to have mremap try to fix the inherent
> > unreliability of mmap (MAP_POPULATE)?
>
> I don't think so. MAP_POPULATE works only when mmap happens.
> Flag MREMAP_POPULATE might be a good idea. Just for symmetry.

Maybe, but please do it as a separate series.
[PATCH v11 02/20] x86/asm: Add C versions of frame pointer macros
Add C versions of the frame pointer macros which can be used to create a
stack frame in inline assembly.

Signed-off-by: Josh Poimboeuf jpoim...@redhat.com
---
 arch/x86/include/asm/frame.h | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/frame.h b/arch/x86/include/asm/frame.h
index 8a6cd26..9a30ec7 100644
--- a/arch/x86/include/asm/frame.h
+++ b/arch/x86/include/asm/frame.h
@@ -1,10 +1,10 @@
 #ifndef _ASM_X86_FRAME_H
 #define _ASM_X86_FRAME_H
 
-#ifdef __ASSEMBLY__
-
 #include <asm/asm.h>
 
+#ifdef __ASSEMBLY__
+
 /*
  * These are stack frame creation macros.  They should be used by every
  * callable non-leaf asm function to make kernel stack traces more reliable.
@@ -22,5 +22,21 @@
 #endif
 .endm
 
+#else /* !__ASSEMBLY__ */
+
+#ifdef CONFIG_FRAME_POINTER
+
+#define FRAME_BEGIN				\
+	"push %" _ASM_BP "\n"			\
+	_ASM_MOV "%" _ASM_SP ", %" _ASM_BP "\n"
+
+#define FRAME_END "pop %" _ASM_BP "\n"
+
+#else /* !CONFIG_FRAME_POINTER */
+
+#define FRAME_BEGIN
+#define FRAME_END
+
+#endif /* CONFIG_FRAME_POINTER */
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_FRAME_H */
-- 
2.4.3
[PATCH v11 00/20] Compile-time stack validation
This is v11 of the compile-time stack validation patch set, along with
proposed fixes for many of the warnings it found. It's based on the
tip/master branch.

The only real change since v10 is some improvements in patch 3 to the
documentation and changelog which attempt to better describe why stack
validation is needed.

v10 can be found here:

  https://lkml.kernel.org/r/cover.1439521412.git.jpoim...@redhat.com

For more information about the motivation behind this patch set, and
more details about what it does, please see the changelog in patch 3.
Patch 3 also has Documentation/stack-validation.txt which has further
details.

Patches 1-5 are the stackvalidate tool and some related macros.

Patches 6-20 are some proposed fixes for several of the warnings
reported by stackvalidate. They've been compile-tested and boot tested
in a VM, but I haven't attempted any meaningful testing for many of
them.

v11:
- attempt to answer the why question better in the documentation and
  commit message
- s/FP_SAVE/FRAME_BEGIN/ in documentation

v10:
- add scripts/mod to directory ignores
- remove circular dependencies for ignored objects which are built
  before stackvalidate
- fix CONFIG_MODVERSIONS incompatibility

v9:
- rename FRAME/ENDFRAME -> FRAME_BEGIN/FRAME_END
- fix jump table issue for when the original instruction is a jump
- drop paravirt thunk alignment patch
- add maintainers to CC for proposed warning fixes

v8:
- add proposed fixes for warnings
- fix all memory leaks
- process ignores earlier and add more ignore checks
- always assume POPCNT alternative is enabled
- drop hweight inline asm fix
- drop __schedule() ignore patch
- change .Ltemp_\@ to .Lstackvalidate_ignore_\@ in asm macro
- fix CONFIG_* checks in asm macros
- add C versions of ignore macros and frame macros
- change ; to \n in C macros
- add ifdef CONFIG_STACK_VALIDATION checks in C ignore macros
- use numbered label in C ignore macro
- add missing break in switch case statement in arch-x86.c

v7:
- sibling call support
- document proposed solution for inline asm() frame pointer issues
- say kernel entry/exit instead of context switch
- clarify the checking of switch statement jump tables
- discard __stackvalidate_ignore_* sections in linker script
- use .Ltemp_\@ to get a unique label instead of static 3-digit number
- change STACKVALIDATE_IGNORE_FUNC variable to a static
- move STACKVALIDATE_IGNORE_INSN to arch-specific .h file

v6:
- rename asmvalidate -> stackvalidate (again)
- gcc-generated object file support
- recursive branch state analysis
- external jump support
- fixup/exception table support
- jump label support
- switch statement jump table support
- added documentation
- detection of noreturn dead end functions
- added a Kbuild mechanism for skipping files and dirs
- moved frame pointer macros to arch/x86/include/asm/frame.h
- moved ignore macros to include/linux/stackvalidate.h

v5:
- stackvalidate -> asmvalidate
- frame pointers only required for non-leaf functions
- check for the use of the FP_SAVE/RESTORE macros instead of manually
  analyzing code to detect frame pointer usage
- additional checks to ensure each function doesn't leave its boundaries
- make the macros simpler and more flexible
- support for analyzing ALTERNATIVE macros
- simplified the arch interfaces in scripts/asmvalidate/arch.h
- fixed some asmvalidate warnings
- rebased onto latest tip asm cleanups
- many more small changes

v4:
- Changed the default to CONFIG_STACK_VALIDATION=n, until all the asm
  code can get cleaned up.
- Fixed a stackvalidate error path exit code issue found by Michal
  Marek.

v3:
- Added a patch to make the push/pop CFI macros arch-independent, as
  suggested by H. Peter Anvin

v2:
- Fixed memory leaks reported by Petr Mladek

Cc: linux-kernel@vger.kernel.org
Cc: live-patch...@vger.kernel.org
Cc: Michal Marek mma...@suse.cz
Cc: Peter Zijlstra pet...@infradead.org
Cc: Andy Lutomirski l...@kernel.org
Cc: Borislav Petkov b...@alien8.de
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Andi Kleen a...@firstfloor.org
Cc: Pedro Alves pal...@redhat.com
Cc: Namhyung Kim namhy...@gmail.com
Cc: Bernd Petrovitsch be...@petrovitsch.priv.at
Cc: Chris J Arges chris.j.ar...@canonical.com
Cc: Andrew Morton a...@linux-foundation.org

Josh Poimboeuf (20):
  x86/asm: Frame pointer macro cleanup
  x86/asm: Add C versions of frame pointer macros
  x86/stackvalidate: Compile-time stack validation
  x86/stackvalidate: Add file and directory ignores
  x86/stackvalidate: Add ignore macros
  x86/xen: Add stack frame dependency to hypercall inline asm calls
  x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  x86/amd: Set ELF function type for vide()
  x86/reboot: Add ljmp instructions to stackvalidate whitelist
  x86/xen: Add xen_cpuid() and xen_setup_gdt() to stackvalidate whitelists
  x86/asm/crypto: Create stack frames in aesni-intel_asm.S
  x86/asm/crypto: Move
[PATCH v11 05/20] x86/stackvalidate: Add ignore macros
Add new stackvalidate ignore macros: STACKVALIDATE_IGNORE_INSN and STACKVALIDATE_IGNORE_FUNC. These can be used to tell stackvalidate to skip validation of an instruction or a function, respectively.

Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com>
---
 arch/x86/include/asm/stackvalidate.h | 45 
 arch/x86/kernel/vmlinux.lds.S        |  5 +++-
 include/linux/stackvalidate.h        | 28 ++
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/stackvalidate.h
 create mode 100644 include/linux/stackvalidate.h

diff --git a/arch/x86/include/asm/stackvalidate.h b/arch/x86/include/asm/stackvalidate.h
new file mode 100644
index 000..95db052
--- /dev/null
+++ b/arch/x86/include/asm/stackvalidate.h
@@ -0,0 +1,45 @@
+#ifndef _ASM_X86_STACKVALIDATE_H
+#define _ASM_X86_STACKVALIDATE_H
+
+#include <asm/asm.h>
+
+#ifdef __ASSEMBLY__
+
+/*
+ * This asm macro tells the stack validation script to ignore the instruction
+ * immediately after the macro.  It should only be used in special cases where
+ * you're 100% sure it won't affect the reliability of frame pointers and
+ * kernel stack traces.
+ *
+ * For more information, see Documentation/stack-validation.txt.
+ */
+.macro STACKVALIDATE_IGNORE_INSN
+#ifdef CONFIG_STACK_VALIDATION
+	.Lstackvalidate_ignore_\@:
+	.pushsection __stackvalidate_ignore_insn, "a"
+	_ASM_ALIGN
+	.long .Lstackvalidate_ignore_\@ - .
+	.popsection
+#endif
+.endm
+
+#else /* !__ASSEMBLY__ */
+
+#ifdef CONFIG_STACK_VALIDATION
+
+#define STACKVALIDATE_IGNORE_INSN				\
+	"1:\n"							\
+	".pushsection __stackvalidate_ignore_insn, \"a\"\n"	\
+	_ASM_ALIGN "\n"						\
+	".long 1b - .\n"					\
+	".popsection\n"
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACKVALIDATE_IGNORE_INSN
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_STACKVALIDATE_H */
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 00bf300..f2f8d7a 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -332,7 +332,10 @@ SECTIONS
 
 	/* Sections to be discarded */
 	DISCARDS
-	/DISCARD/ : { *(.eh_frame) }
+	/DISCARD/ : {
+		*(.eh_frame)
+		*(__stackvalidate_ignore_*)
+	}
 }
 
diff --git a/include/linux/stackvalidate.h b/include/linux/stackvalidate.h
new file mode 100644
index 000..4ae242c
--- /dev/null
+++ b/include/linux/stackvalidate.h
@@ -0,0 +1,28 @@
+#ifndef _LINUX_STACKVALIDATE_H
+#define _LINUX_STACKVALIDATE_H
+
+#include <asm/stackvalidate.h>
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_STACK_VALIDATION
+/*
+ * This C macro tells the stack validation script to ignore the function.  It
+ * should only be used in special cases where you're 100% sure it won't affect
+ * the reliability of frame pointers and kernel stack traces.
+ *
+ * For more information, see Documentation/stack-validation.txt.
+ */
+#define STACKVALIDATE_IGNORE_FUNC(_func) \
+	static void __used __section(__stackvalidate_ignore_func) \
+		*__stackvalidate_ignore_func_##_func = _func
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACKVALIDATE_IGNORE_FUNC(_func)
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _LINUX_STACKVALIDATE_H */
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
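Editorial note: the ignore-instruction macros in this patch record each ignored instruction as `.long label - .`, that is, as a 32-bit offset relative to the entry's own location rather than as an absolute address. The Python sketch below only illustrates that arithmetic. The function names and the addresses are hypothetical, and a tool reading unlinked object files would more likely take the value from a relocation than from the section bytes.

```python
import struct

def encode_ignore_entry(insn_addr, entry_addr):
    """Mimic `.long label - .`: signed 32-bit offset from entry to instruction."""
    return struct.pack("<i", insn_addr - entry_addr)

def decode_ignore_entry(raw, entry_addr):
    """Recover the instruction address from the entry address plus the offset."""
    (offset,) = struct.unpack("<i", raw)
    return entry_addr + offset

# Hypothetical addresses, for illustration only.
insn_addr = 0x1000    # instruction that follows STACKVALIDATE_IGNORE_INSN
entry_addr = 0x8004   # where the .long lands in __stackvalidate_ignore_insn

raw = encode_ignore_entry(insn_addr, entry_addr)
recovered = decode_ignore_entry(raw, entry_addr)
```

Because the offset is relative, the entry stays valid wherever the two sections end up relative to a common base, which is presumably why the macro avoids an absolute pointer.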
[PATCH v11 18/20] x86/asm: Create stack frames in rwsem functions
rwsem.S has several callable non-leaf functions which don't honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com>
---
 arch/x86/lib/rwsem.S | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index 40027db..be110ef 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -15,6 +15,7 @@
 
 #include <linux/linkage.h>
 #include <asm/alternative-asm.h>
+#include <asm/frame.h>
 
 #define __ASM_HALF_REG(reg)	__ASM_SEL(reg, e##reg)
 #define __ASM_HALF_SIZE(inst)	__ASM_SEL(inst##w, inst##l)
@@ -84,24 +85,29 @@
 
 /* Fix up special calling conventions */
 ENTRY(call_rwsem_down_read_failed)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_down_read_failed
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_read_failed)
 
 ENTRY(call_rwsem_down_write_failed)
+	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
 	call rwsem_down_write_failed
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_write_failed)
 
 ENTRY(call_rwsem_wake)
+	FRAME_BEGIN
 	/* do nothing if still outstanding active readers */
 	__ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
 	jnz 1f
@@ -109,15 +115,18 @@ ENTRY(call_rwsem_wake)
 	movq %rax,%rdi
 	call rwsem_wake
 	restore_common_regs
-1:	ret
+1:	FRAME_END
+	ret
 ENDPROC(call_rwsem_wake)
 
 ENTRY(call_rwsem_downgrade_wake)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_downgrade_wake
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_downgrade_wake)
-- 
2.4.3
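Editorial note: to see why a non-leaf asm function without a frame leads to "bad stack traces", here is a toy Python model of a frame-pointer unwinder. It is a sketch under simplifying assumptions: the `Stack` class, the `caller` and `start` names and the stack address are invented for illustration, and return addresses are represented by the name of the function they point into.

```python
class Stack:
    """Toy x86-64 style stack: grows down, 8-byte slots."""

    def __init__(self):
        self.mem = {}
        self.rsp = 0x1000
        self.rbp = 0          # 0 terminates the frame chain

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def call(self, return_to, frame):
        """Enter a function: push return address, optionally create a frame."""
        self.push(return_to)
        if frame:             # models FRAME_BEGIN: push %rbp; mov %rsp, %rbp
            self.push(self.rbp)
            self.rbp = self.rsp

    def backtrace(self):
        """Follow the saved-%rbp chain and collect the return addresses."""
        trace, rbp = [], self.rbp
        while rbp:
            trace.append(self.mem[rbp + 8])
            rbp = self.mem[rbp]
        return trace

def trace_from_rwsem_wake(asm_has_frame):
    s = Stack()
    s.call("start", frame=True)                 # now inside caller()
    s.call("caller", frame=asm_has_frame)       # now inside call_rwsem_wake
    s.call("call_rwsem_wake", frame=True)       # now inside rwsem_wake()
    return s.backtrace()

with_frame = trace_from_rwsem_wake(True)
without_frame = trace_from_rwsem_wake(False)
```

In this model `with_frame` lists `call_rwsem_wake`, `caller` and `start`, while `without_frame` silently loses `caller`: the frameless function never saved a frame pointer, so the return address into its caller is never visited.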
Re: [PATCH-v3 1/2] mfd: devicetree: bindings: 88pm800: Add DT property for dual phase enable
On Monday 24 August 2015 06:32 PM, Lee Jones wrote:
> On Mon, 24 Aug 2015, Vaibhav Hiremath wrote:
>
>> 88PM860 family of device supports dual phase mode on BUCK1 supply
>> providing total 6A capacity.
>> Note that by default they operate independently with 3A capacity.
>>
>> This patch updates the devicetree binding with DT property
>> to enable dual-phase mode on BUCK1.
>>
>> Signed-off-by: Vaibhav Hiremath <vaibhav.hirem...@linaro.org>
>> ---
>>  Documentation/devicetree/bindings/mfd/88pm800.txt | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/mfd/88pm800.txt b/Documentation/devicetree/bindings/mfd/88pm800.txt
>> index dec842f..2c82fcb 100644
>> --- a/Documentation/devicetree/bindings/mfd/88pm800.txt
>> +++ b/Documentation/devicetree/bindings/mfd/88pm800.txt
>> @@ -9,6 +9,12 @@ Required parent device properties:
>>   - #interrupt-cells: should be 1.
>>     The cell is the 88pm80x local IRQ number
>>
>> +Optional properties :
>> +- marvell,88pm860-buck1-dualphase-en : If set, enable dual phase on BUCK1,
>> +  providing 6A capacity.
>> +  Without this both BUCK1A and BUCK1B operates independently with 3A capacity.
>> +  (This property is only applicable to 88PM860)
>
> This will require a Regulator Ack.
>
> My suggestion would be to remove the 'buck' number, as the same
> property could be used on any Buck, and remove the '-en' part, as this
> is implied.

Ok,
Will do it in next version.

Mark,
Any comments here before I spin V4.

Thanks,
Vaibhav
Re: [PATCH v2] scripts/checkkconfigsymbols.py: support default statements
Hi Michal,

On Mon, Aug 24, 2015 at 4:49 PM, Michal Marek <mma...@suse.cz> wrote:
> On 2015-07-27 12:33, Valentin Rothberg wrote:
>> Until now, checkkconfigsymbols.py did not check default statements for
>> references to missing Kconfig symbols (i.e., undefined Kconfig options).
>> Hence, add support to parse and check the Kconfig default statement.
>>
>> Signed-off-by: Valentin Rothberg <valentinrothb...@gmail.com>
>> ---
>> Changelog:
>> v2 (thanks to Stefan Hengelein):
>>  - update NUMERIC regex (Kconfig accepts 'X' and 'A-F')
>>  - remove mistakenly added blank line from v1
>>
>>  scripts/checkkconfigsymbols.py | 9 +++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> Applied to kbuild.git#kconfig.
>
> Michal

The patch above already went through Greg's tree to linux-next (see commit 0bd38ae35522).

Kind regards,
Valentin
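Editorial note: the patch itself is not quoted in this reply, so the following is only a rough Python sketch of what "parse and check the Kconfig default statement" could look like. The regular expressions and the `symbols_in_default` helper are assumptions made for illustration and are not the code from scripts/checkkconfigsymbols.py. The NUMERIC pattern reflects the changelog remark that Kconfig accepts 'X' and 'A-F' in numbers.

```python
import re

# Assumed patterns for illustration; the real script's regexes may differ.
SYMBOL = re.compile(r"[A-Za-z0-9_]+")
NUMERIC = re.compile(r"^(0[xX][0-9a-fA-F]+|[0-9]+)$")
DEFAULT = re.compile(r"^\s*default\s+(.*)$")
KEYWORDS = {"if", "y", "n", "m"}

def symbols_in_default(line):
    """Return the Kconfig symbols referenced by a `default` statement."""
    match = DEFAULT.match(line)
    if not match:
        return []
    expr = re.sub(r'"[^"]*"', "", match.group(1))   # drop string literals
    return [tok for tok in SYMBOL.findall(expr)
            if tok not in KEYWORDS and not NUMERIC.match(tok)]

refs = symbols_in_default("\tdefault 0xFF if X86_64 && !FOO")
```

Each returned name could then be looked up in the set of defined symbols; any name that is missing there is a reference to an undefined option.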
[PATCH v2] Documentation: add 'crashkernel=auto' entry into kernel-parameters.txt
There is no 'crashkernel=auto' entry in kernel-parameters.txt, so borrow it from the kexec-kdump-howto.txt file in the kexec-tools-2.0.0 package.

Signed-off-by: Yaowei Bai <bywxiao...@163.com>
---
 Documentation/kernel-parameters.txt | 9 +
 1 file changed, 9 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f045..9e5913e 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -797,6 +797,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
+	crashkernel=auto
+			This specification allows the kernel to decide how much
+			memory to reserve for the purposes of kdump. It will make
+			this determination based on the amount of memory you have
+			in your system, and scale the allocation accordingly.
+			Note that if you have less than 4Gb of memory in your system,
+			this specification will opt to not allocate any memory for
+			the purposes of kdump.
+
 	cs89x0_dma=	[HW,NET]
 			Format: <dma>
 
-- 
1.9.1
Re: [PATCH] mmc/sdhci-acpi: enable sdhci-acpi device to suspend/resume asynchronously
On 2015/8/17 14:51, Adrian Hunter wrote:
> On 17/08/15 06:38, Fu, Zhonghui wrote:
>> Hi,
>>
>> Any comments are welcome.
>
> Same comments as here:
>
>   http://marc.info/?l=linux-kernel&m=143979428424353&w=2

The PM core now supports an asynchronous device suspend/resume mode. If a device has been set to support asynchronous PM, its suspend/resume operation can be performed in a separate kernel thread, taking advantage of multiple cores to improve overall system suspend/resume speed. The worst case is that all device suspend/resume threads get scheduled on the same CPU, which hardly ever occurs. The PM core enforces all the suspend/resume dependencies related to a device. Async suspend/resume is a feature of the PM core, and every device subsystem may choose whether or not to use it. Once a device subsystem chooses to use this feature, its safety is up to the PM core, as long as the subsystem has fully initialized the device.

Thanks,
Zhonghui

>> Thanks,
>> Zhonghui
>>
>> On 2015/8/3 21:10, Fu, Zhonghui wrote:
>>> Enable sdhci-acpi device to suspend/resume asynchronously. This can
>>> improve system suspend/resume speed.
>>>
>>> Signed-off-by: Zhonghui Fu <zhonghui...@linux.intel.com>
>>> ---
>>>  drivers/mmc/host/sdhci-acpi.c |    2 ++
>>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c
>>> index 22d929f..67e6263 100644
>>> --- a/drivers/mmc/host/sdhci-acpi.c
>>> +++ b/drivers/mmc/host/sdhci-acpi.c
>>> @@ -379,6 +379,8 @@ static int sdhci_acpi_probe(struct platform_device *pdev)
>>>  		pm_runtime_enable(dev);
>>>  	}
>>>  
>>> +	device_enable_async_suspend(dev);
>>> +
>>>  	return 0;
>>>  
>>>  err_free:
>>> -- 1.7.1
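Editorial note: the speed-up argued for in this thread can be illustrated with an idealized timing model. The Python sketch below assumes every device is async-enabled, that there are enough CPUs for all threads, and that the only ordering constraint is that children suspend before their parent. The device names and durations are made up, and `suspend_time` is not a kernel function.

```python
def suspend_time(durations, parent, asynchronous):
    """Total suspend time for a device tree.

    durations: device -> time its suspend callback takes
    parent:    device -> parent device (None for the root)
    """
    if not asynchronous:
        return sum(durations.values())      # one device after another

    children = {}
    for dev, par in parent.items():
        children.setdefault(par, []).append(dev)

    def finish(dev):
        # A device may start only once all of its children have finished.
        start = max((finish(c) for c in children.get(dev, [])), default=0)
        return start + durations[dev]

    return max(finish(dev) for dev in durations)

# Hypothetical devices and durations (arbitrary time units).
durations = {"pci": 5, "sdhci": 30, "i2c": 20, "sensor": 10}
parent = {"pci": None, "sdhci": "pci", "i2c": "pci", "sensor": "i2c"}

sync_total = suspend_time(durations, parent, asynchronous=False)
async_total = suspend_time(durations, parent, asynchronous=True)
```

With these numbers the synchronous total is the sum of all callbacks, while the asynchronous total is the longest dependency chain. If all threads landed on one CPU, the worst case mentioned above, the asynchronous time would fall back to roughly the synchronous one.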
Re: [PATCH] i2c: enable i2c adapter to suspend/resume asynchronously
Hi,

Any comments are welcome.

Thanks,
Zhonghui

On 2015/8/18 0:17, Fu, Zhonghui wrote:
> Enable i2c adapter to suspend/resume asynchronously. This can
> improve system suspend/resume speed.
>
> Signed-off-by: Zhonghui Fu <zhonghui...@linux.intel.com>
> ---
>  drivers/i2c/i2c-core.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
> index c83e4d1..90251be 100644
> --- a/drivers/i2c/i2c-core.c
> +++ b/drivers/i2c/i2c-core.c
> @@ -1439,6 +1439,8 @@ static int i2c_register_adapter(struct i2c_adapter *adap)
>  
>  	pm_runtime_no_callbacks(&adap->dev);
>  
> +	device_enable_async_suspend(&adap->dev);
> +
>  #ifdef CONFIG_I2C_COMPAT
>  	res = class_compat_create_link(i2c_adapter_compat_class, &adap->dev,
>  				       adap->dev.parent);
> -- 1.7.1
RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
> On Fri, Aug 21, 2015 at 08:25:24PM +, Liang, Kan wrote:
>
> SNIP
>
>>>>> we need global topology information in perf.data and use the
>>>>> mapping from there, we can't use current server info
>>>>>
>>>>> we currently store core_siblings_list and thread_siblings_list
>>>>> in topology FEATURE, which is probably not enough
>>>>
>>>> core_siblings_list includes the cpu list in the same socket.
>>>> thread_siblings_list includes the cpu list in the same core.
>>>> numa_nodes includes the cpu list for each node.
>>>> It looks like we have enough data from topology FEATURE.
>>>
>>> hum, haven't checked deeply.. how will you get core id for cpu?
>>
>> from thread_siblings_list.
>> I just noticed that svg_build_topology_map did a similar thing to get
>> the topology map for timechart from the perf header.
>> What do you think about the function as below?
>> It gets the socket id from env.
>
> some sort of caching would be nice, I guess we could store those
> cpumap objects within perf_session_env

Yes, it will be stored in perf_session_env.

Thanks,
Kan
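Editorial note: the function Kan refers to was snipped from this reply, so the Python sketch below is not that code. It only illustrates the idea under discussion, deriving a cpu to socket mapping from the core_siblings_list strings stored in the topology feature. Using the index of each sibling list as the socket id is an assumption made here for illustration, and the example topology is hypothetical.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8-11" into a list of ints."""
    cpus = []
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus

def cpu_to_socket(core_siblings):
    """Map each CPU to the index of the sibling list that contains it."""
    mapping = {}
    for socket_id, entry in enumerate(core_siblings):
        for cpu in parse_cpu_list(entry):
            mapping[cpu] = socket_id
    return mapping

# Hypothetical two-socket machine with hyperthreading.
sockets = cpu_to_socket(["0-3,8-11", "4-7,12-15"])
```

The resulting dictionary is the kind of object that could be computed once and cached, as suggested in the thread, instead of being rebuilt for every sample.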
[PATCH v3 3/6] ARCv2: perf: Support sampling events using overflow interrupts
In the days of ARC 700, performance counters had no interrupt support, so on ARC we only supported non-sampling events. Put simply, only "perf stat" was functional.

ARC HS adds interrupt support to the performance counters, and this change introduces support for it.

ARC performance counters behave as follows with regard to interrupt generation:
 [1] A counter counts starting from the value set in the PCT_COUNT register pair
 [2] Once the counter reaches the value set in PCT_INT_CNT, an interrupt is raised

Basic setup looks like this:
 [1] PCT_COUNT = 0;
 [2] PCT_INT_CNT = __limit_value__;
 [3] Enable interrupts for that counter and let it run
 [4] Let the counter reach its limit
 [5] Handle the interrupt when it happens

Note that the PCT HW block is built into the CPU core, so its interrupt line (which is basically the OR of all counters' IRQs) is wired directly to the top-level IRQC. That means that to de-assert the PCT interrupt it is required to reset the IRQs of all counters that have reached their limit values.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Arnaldo Carvalho de Melo <a...@kernel.org>
Cc: Vineet Gupta <vgu...@synopsys.com>
Signed-off-by: Alexey Brodkin <abrod...@synopsys.com>
---

Compared to v2:
 [1] Moved interrupts enabling from arc_pmu_add() to arc_pmu_start()

Compared to v1:
 [1] Added commit message
 [2] Removed check for is_sampling_event() because we already set PERF_PMU_CAP_NO_INTERRUPT in probe()
 [3] Minor cosmetics

 arch/arc/include/asm/perf_event.h |   8 ++-
 arch/arc/kernel/perf_event.c      | 128 +++---
 2 files changed, 126 insertions(+), 10 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index e7b16c2..9ed593e 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -29,15 +29,19 @@
 #define ARC_REG_PCT_CONFIG	0x254
 #define ARC_REG_PCT_CONTROL	0x255
 #define ARC_REG_PCT_INDEX	0x256
+#define ARC_REG_PCT_INT_CNTL	0x25C
+#define ARC_REG_PCT_INT_CNTH	0x25D
+#define ARC_REG_PCT_INT_CTRL	0x25E
+#define ARC_REG_PCT_INT_ACT	0x25F
 
 #define ARC_REG_PCT_CONTROL_CC	(1 << 16)	/* clear counts */
 #define ARC_REG_PCT_CONTROL_SN	(1 << 17)	/* snapshot */
 
 struct arc_reg_pct_build {
 #ifdef CONFIG_CPU_BIG_ENDIAN
-	unsigned int m:8, c:8, r:6, s:2, v:8;
+	unsigned int m:8, c:8, r:5, i:1, s:2, v:8;
 #else
-	unsigned int v:8, s:2, r:6, c:8, m:8;
+	unsigned int v:8, s:2, i:1, r:5, c:8, m:8;
 #endif
 };
 
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index db53af7..ce0fa60 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -11,6 +11,7 @@
  *
  */
 #include <linux/errno.h>
+#include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/perf_event.h>
@@ -24,6 +25,7 @@ struct arc_pmu {
 	unsigned long	used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
 	u64		max_period;
 	int		ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
+	struct perf_event *act_counter[ARC_PERF_MAX_COUNTERS];
 };
 
 struct arc_callchain_trace {
@@ -139,9 +141,11 @@ static int arc_pmu_event_init(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int ret;
 
-	hwc->sample_period = arc_pmu->max_period;
-	hwc->last_period = hwc->sample_period;
-	local64_set(&hwc->period_left, hwc->sample_period);
+	if (!is_sampling_event(event)) {
+		hwc->sample_period = arc_pmu->max_period;
+		hwc->last_period = hwc->sample_period;
+		local64_set(&hwc->period_left, hwc->sample_period);
+	}
 
 	switch (event->attr.type) {
 	case PERF_TYPE_HARDWARE:
@@ -243,6 +247,11 @@ static void arc_pmu_start(struct perf_event *event, int flags)
 
 	arc_pmu_event_set_period(event);
 
+	/* Enable interrupt for this counter */
+	if (is_sampling_event(event))
+		write_aux_reg(ARC_REG_PCT_INT_CTRL,
+			      read_aux_reg(ARC_REG_PCT_INT_CTRL) | (1 << idx));
+
 	/* enable ARC pmu here */
 	write_aux_reg(ARC_REG_PCT_INDEX, idx);
 	write_aux_reg(ARC_REG_PCT_CONFIG, hwc->config);
@@ -253,6 +262,17 @@ static void arc_pmu_stop(struct perf_event *event, int flags)
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
+	/* Disable interrupt for this counter */
+	if (is_sampling_event(event)) {
+		/*
+		 * Reset interrupt flag by writing of 1. This is required
+		 * to make sure pending interrupt was not left.
+		 */
+		write_aux_reg(ARC_REG_PCT_INT_ACT, 1 << idx);
+		write_aux_reg(ARC_REG_PCT_INT_CTRL,
+			      read_aux_reg(ARC_REG_PCT_INT_CTRL) & ~(1 << idx));
+	}
+
 	if (!(event->hw.state & PERF_HES_STOPPED)) {
 		/* stop ARC pmu here */
 		write_aux_reg(ARC_REG_PCT_INDEX, idx);
@@ -275,6 +295,8 @@
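Editorial note: the enable and acknowledge sequence in arc_pmu_start() and arc_pmu_stop() can be mimicked with a small register model. The Python class below is a toy sketch, not a description of the real hardware: it assumes a counter only latches a pending bit while its enable bit is set, and it models the shared interrupt line as the OR of all pending bits, as the commit message describes.

```python
class PctModel:
    """Toy model of PCT_INT_CTRL (enable bits) and PCT_INT_ACT (pending bits)."""

    def __init__(self):
        self.int_ctrl = 0
        self.int_act = 0

    def start(self, idx):
        # read-modify-write: INT_CTRL | (1 << idx)
        self.int_ctrl |= 1 << idx

    def overflow(self, idx):
        # Assumption: a pending bit is latched only for enabled counters.
        if self.int_ctrl & (1 << idx):
            self.int_act |= 1 << idx

    def stop(self, idx):
        self.int_act &= ~(1 << idx)     # models writing 1 << idx to INT_ACT
        self.int_ctrl &= ~(1 << idx)    # INT_CTRL & ~(1 << idx)

    def irq_line(self):
        return self.int_act != 0        # OR of all counters' IRQs

pct = PctModel()
pct.start(0)
pct.start(3)
pct.overflow(0)
pct.overflow(3)
pct.stop(0)
line_after_first_stop = pct.irq_line()
pct.stop(3)
line_after_second_stop = pct.irq_line()
```

The line stays asserted after the first stop() because counter 3 is still pending, which is the reason the handler has to reset the IRQs of all counters that reached their limit.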
[PATCH v3 2/6] ARCv2: perf: implement event_set_period
This generalization prepares for support of overflow interrupts.

Hardware event counters on ARC work as follows: each counter counts from a programmed start value (set in ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT), and once the limit value is reached the counter generates an interrupt.

Even though this hardware implementation allows for more flexibility, in the Linux kernel we decided to mimic the behavior of other architectures this way:

 [1] Set the limit value to half of the counter's max value (to allow the counter to keep running after reaching its limit, see below for more explanation):
 -->8---
 arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL;
 -->8---
 [2] Set the start value to arc_pmu->max_period - sample_period and then count up to the limit

Our event counters don't stop on reaching the max value (the one we set in ARC_REG_PCT_INT_CNT) but continue to count until the kernel explicitly stops each of them.

Setting the limit to half of the counter capacity allows capturing the additional events that occur between the moment the interrupt is triggered and the moment we actually process PMU interrupts. That way we're trying to be more precise.

For example, if we count CPU cycles, we keep track of cycles while running through generic IRQ handling code:

 [1] We set the counter period to, say, 100_000 events of type "crun"
 [2] The counter reaches that limit and raises its interrupt
 [3] Once we get into the PMU IRQ handler, we read the current counter value from ARC_REG_PCT_SNAP and see there something like 105_000.

If the counters stopped on reaching the limit value, we would miss the additional 5000 cycles.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Arnaldo Carvalho de Melo <a...@kernel.org>
Signed-off-by: Vineet Gupta <vgu...@synopsys.com>
Signed-off-by: Alexey Brodkin <abrod...@synopsys.com>
---

Compared to v2:
 [1] "ARCv2: perf: set usable max period as a half of real max period" was merged into this one so we have a complete and valid commit message that covers the basics of ARC PCTs.
 [2] Fixed arc_pmu_event_set_period() in regard of incorrect hwc->period_left setup.

Compared to v1:
 [1] Added verbose commit message with explanation of how PCT HW works on ARC
 [2] Simplified arc_perf_event_update()
 [3] Removed check for is_sampling_event() because we already set PERF_PMU_CAP_NO_INTERRUPT in probe()
 [4] Minor cosmetics

 arch/arc/kernel/perf_event.c | 79 +++-
 1 file changed, 63 insertions(+), 16 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index d7ee5b2..db53af7 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -20,9 +20,9 @@
 
 struct arc_pmu {
 	struct pmu	pmu;
-	int		counter_size;	/* in bits */
 	int		n_counters;
 	unsigned long	used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
+	u64		max_period;
 	int		ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
 };
 
@@ -88,18 +88,15 @@ static uint64_t arc_pmu_read_counter(int idx)
 static void arc_perf_event_update(struct perf_event *event,
 				  struct hw_perf_event *hwc, int idx)
 {
-	uint64_t prev_raw_count, new_raw_count;
-	int64_t delta;
-
-	do {
-		prev_raw_count = local64_read(&hwc->prev_count);
-		new_raw_count = arc_pmu_read_counter(idx);
-	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-				 new_raw_count) != prev_raw_count);
-
-	delta = (new_raw_count - prev_raw_count) &
-		((1ULL << arc_pmu->counter_size) - 1ULL);
+	uint64_t prev_raw_count = local64_read(&hwc->prev_count);
+	uint64_t new_raw_count = arc_pmu_read_counter(idx);
+	int64_t delta = new_raw_count - prev_raw_count;
 
+	/*
+	 * We don't afaraid of hwc->prev_count changing beneath our feet
+	 * because there's no way for us to re-enter this function anytime.
+	 */
+	local64_set(&hwc->prev_count, new_raw_count);
 	local64_add(delta, &event->count);
 	local64_sub(delta, &hwc->period_left);
 }
@@ -142,6 +139,10 @@ static int arc_pmu_event_init(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int ret;
 
+	hwc->sample_period = arc_pmu->max_period;
+	hwc->last_period = hwc->sample_period;
+	local64_set(&hwc->period_left, hwc->sample_period);
+
 	switch (event->attr.type) {
 	case PERF_TYPE_HARDWARE:
 		if (event->attr.config >= PERF_COUNT_HW_MAX)
@@ -153,6 +154,7 @@ static int arc_pmu_event_init(struct perf_event *event)
 			 (int) event->attr.config, (int) hwc->config,
 			 arc_pmu_ev_hw_map[event->attr.config]);
 		return 0;
+
 	case PERF_TYPE_HW_CACHE:
 		ret = arc_pmu_cache_event(event->attr.config);
 		if (ret < 0)
@@ -180,6 +182,47 @@ static void arc_pmu_disable(struct pmu *pmu)
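Editorial note: the arithmetic from the commit message can be checked with a few lines of Python. This is a sketch only: the 48-bit counter width is an assumption chosen for the example (the patch reads the real width from hardware), and the helper names are invented here.

```python
COUNTER_SIZE = 48   # assumed counter width in bits, for illustration only

# Limit value written to PCT_INT_CNT: half of the counter range.
max_period = (1 << COUNTER_SIZE) // 2 - 1

def start_value(sample_period):
    """Counter start value so that sample_period events reach the limit."""
    return max_period - sample_period

def events_since_start(snapshot, sample_period):
    """Events counted so far, including those that arrived after the IRQ."""
    return snapshot - start_value(sample_period)

sample_period = 100_000
start = start_value(sample_period)

# By the time the handler runs, the counter is 5_000 events past the limit.
snapshot = max_period + 5_000
counted = events_since_start(snapshot, sample_period)

# Room left above the limit before the counter itself would wrap.
headroom = (1 << COUNTER_SIZE) - 1 - max_period
```

Here `counted` comes out as 105_000, matching the example in the commit message, and `headroom` shows that the upper half of the counter range remains available for events that arrive between the interrupt and its handling.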
[PATCH v3 0/6] ARCv2 port to Linux - (C) perf
Hi Peter,

This mini-series adds perf support for ARCv2 based cores, which brings in overflow interrupts and SMP. Additionally now raw events are supported as well.

Please review!

Compared to v2 this series has:
 [1] Removed patch with raw-events support. It needs some rework and let's then discuss it separately. Still I plan to send it shortly.
 [2] Merged "set usable max period as a half of real max period" into "implement event_set_period".
 [3] Fixed arc_pmu_event_set_period() in regard of incorrect hwc->period_left setup.
 [4] Moved interrupts enabling from arc_pmu_add() to arc_pmu_start()

Compared to v1 this series has:
 [1] Addressed review comments
 [2] More verbose commit messages and comments in sources
 [3] Minor cosmetics

Thanks,
Alexey

Alexey Brodkin (4):
  ARCv2: perf: implement event_set_period
  ARCv2: perf: Support sampling events using overflow interrupts
  ARCv2: perf: implement exclusion of event counting in user or kernel mode
  ARCv2: perf: SMP support

Vineet Gupta (2):
  ARC: perf: cap the number of counters to hardware max of 32
  ARCv2: perf: Finally introduce HS perf unit

 .../devicetree/bindings/arc/archs-pct.txt |  17 ++
 MAINTAINERS                               |   2 +-
 arch/arc/include/asm/perf_event.h         |  21 +-
 arch/arc/kernel/perf_event.c              | 271 ++---
 4 files changed, 275 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arc/archs-pct.txt

-- 
2.4.3
Re: [PATCH] scripts/kernel-doc: Improve Markdown results
On Fri, 2015-08-21 at 16:39 -0300, Danilo Cesar Lemes de Paula wrote:
> Using pandoc as the Markdown engine cause some minor side effects as
> pandoc includes main <para> tags for almost everything.
> Original Markdown support approach removes those main tags, but it caused
> some inconsistencies when that tag is not the main one, like:
> <something>..</something>
> <para>...</para>
>
> As kernel-doc was already including a <para> tag, it causes the presence of
> double <para> tags (<para><para>), which is not supported by DocBook spec.
>
> Html target gets away with it, so it causes no harm, although other
> targets might not be so lucky (pdf as example).
>
> We're now delegating the inclusion of the main <para> tag to pandoc
> only, as it knows when it's necessary or not.
>
> That behavior causes a corner case, the only situation where we're
> certainly that <para> is not needed, which is the <refpurpose> content.
> For those cases, we're using a $output_markdown_nopara = 1 control var.
>
> Signed-off-by: Danilo Cesar Lemes de Paula <danilo.ce...@collabora.co.uk>

Feel free to add my:
Tested-by: Graham Whaley <graham.wha...@linux.intel.com>

 Graham

> Cc: Randy Dunlap <rdun...@infradead.org>
> Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
> Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Herbert Xu <herb...@gondor.apana.org.au>
> Cc: Stephan Mueller <smuel...@chronox.de>
> Cc: Michal Marek <mma...@suse.cz>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: intel-gfx <intel-...@lists.freedesktop.org>
> Cc: dri-devel <dri-de...@lists.freedesktop.org>
> Cc: Graham Whaley <graham.wha...@linux.intel.com>
> ---
> Thanks to Graham Whaley who helped me to debug this.
>
>  scripts/kernel-doc | 48 ++--
>  1 file changed, 34 insertions(+), 14 deletions(-)
>
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 3850c1e..12a106c 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -288,6 +288,7 @@ my $use_markdown = 0;
>  my $verbose = 0;
>  my $output_mode = "man";
>  my $output_preformatted = 0;
> +my $output_markdown_nopara = 0;
>  my $no_doc_sections = 0;
>  my @highlights = @highlights_man;
>  my $blankline = $blankline_man;
> @@ -529,8 +530,11 @@ sub markdown_to_docbook {
>  	close(CHLD_OUT);
>  	close(CHLD_ERR);
>  
> -	# pandoc insists in adding Main <para></para>, we should remove them.
> -	$content =~ s:\A\s*<para>\s*\n(.*)\n</para>\Z$:$1:egsm;
> +	if ($output_markdown_nopara) {
> +		# pandoc insists in adding Main <para></para>, sometimes we
> +		# want to remove them.
> +		$content =~ s:\A\s*<para>\s*\n(.*)\n</para>\Z$:$1:egsm;
> +	}
>  
>  	return $content;
>  }
> @@ -605,7 +609,7 @@ sub output_highlight {
>  	    $line =~ s/^\s*//;
>  	}
>  	if ($line eq ""){
> -	    if (! $output_preformatted) {
> +	    if (! $output_preformatted && ! $use_markdown) {
>  		print $lineprefix, local_unescape($blankline);
>  	    }
>  	} else {
> @@ -1026,7 +1030,7 @@ sub output_section_xml(%) {
>  	    # programlisting is already included by pandoc
>  	    print "<programlisting>\n" unless $use_markdown;
>  	    $output_preformatted = 1;
> -	} else {
> +	} elsif (! $use_markdown) {
>  	    print "<para>\n";
>  	}
>  	output_highlight($args{'sections'}{$section});
> @@ -1034,7 +1038,7 @@ sub output_section_xml(%) {
>  	if ($section =~ m/EXAMPLE/i) {
>  	    print "</programlisting>\n" unless $use_markdown;
>  	    print "</informalexample>\n";
> -	} else {
> +	} elsif (! $use_markdown) {
>  	    print "</para>\n";
>  	}
>  	print "</refsect1>\n";
> @@ -1066,7 +1070,9 @@ sub output_function_xml(%) {
>      print "<refname>" . $args{'function'} . "</refname>\n";
>      print "<refpurpose>\n";
>      print " ";
> +    $output_markdown_nopara = 1;
>      output_highlight ($args{'purpose'});
> +    $output_markdown_nopara = 0;
>      print "</refpurpose>\n";
>      print "</refnamediv>\n";
>  
> @@ -1104,10 +1110,12 @@ sub output_function_xml(%) {
>  	    $parameter_name =~ s/\[.*//;
>  
>  	    print "<varlistentry>\n <term><parameter>$parameter</parameter></term>\n";
> -	    print "<listitem>\n<para>\n";
> +	    print "<listitem>\n";
> +	    print "<para>\n" unless $use_markdown;
>  	    $lineprefix=" ";
>  	    output_highlight($args{'parameterdescs'}{$parameter_name});
> -	    print "</para>\n </listitem>\n </varlistentry>\n";
> +	    print "</para>\n" unless $use_markdown;
> +	    print "</listitem>\n </varlistentry>\n";
>  	}
>  	print "</variablelist>\n";
>      } else {
> @@ -1143,7 +1151,9 @@ sub output_struct_xml(%) {
>      print "<refname>" . $args{'type'} . " " . $args{'struct'} . "</refname>\n";
>      print "<refpurpose>\n";
>      print " ";
> +    $output_markdown_nopara = 1;
>      output_highlight ($args{'purpose'});
> +    $output_markdown_nopara = 0;
>      print "</refpurpose>\n";
>      print "</refnamediv>\n";
>  
> @@ -1196,9 +1206,11 @@ sub output_struct_xml(%) {
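Editorial note: the effect of the new $output_markdown_nopara switch can be sketched outside of Perl. The Python below is an approximation written for this digest, not a port of kernel-doc: it assumes the pandoc output wraps its content in a single outer paragraph element, and the function name only mirrors the Perl sub for readability. Note that Python's `\Z` matches only at the very end of the string, so the pattern allows trailing whitespace explicitly.

```python
import re

# Outer <para> wrapper around the whole pandoc output (assumed shape).
MAIN_PARA = re.compile(r"\A\s*<para>\s*\n(.*)\n</para>\s*\Z", re.DOTALL)

def markdown_to_docbook(pandoc_output, nopara=False):
    """Strip the single outer <para> wrapper only when nopara is set."""
    if nopara:
        return MAIN_PARA.sub(r"\1", pandoc_output)
    return pandoc_output

simple = "<para>\nsend a packet\n</para>\n"
mixed = "<programlisting>\nx\n</programlisting>\n<para>\ny\n</para>\n"

purpose = markdown_to_docbook(simple, nopara=True)    # refpurpose case
section = markdown_to_docbook(simple)                 # pandoc keeps its <para>
untouched = markdown_to_docbook(mixed, nopara=True)   # wrapper is not the main tag
```

In the refpurpose case the wrapper is dropped. Everywhere else the paragraph tags come from pandoc alone, which avoids the doubled tags described in the commit message.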
Re: [PATCH 0/2] kbuild: Minor cleanups of fixdep
On 2015-07-24 07:18, Masahiro Yamada wrote:
> Masahiro Yamada (2):
>   kbuild: fixdep: optimize code slightly
>   kbuild: fixdep: drop meaningless hash table initialization
>
>  scripts/basic/fixdep.c | 26 --
>  1 file changed, 4 insertions(+), 22 deletions(-)

Applied to kbuild.git#kbuild.

Michal
Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi
On Mon, 24 Aug 2015, Qais Yousef wrote:
> On 08/24/2015 02:32 PM, Marc Zyngier wrote:
> > I'd rather see something more architected than this blind export, or at least some level of filtering (the idea random drivers can access such a low-level function doesn't make me feel very good).
>
> I don't know how to architect this better or how to perform the filtering, but I'm happy to hear suggestions and try them out. Keep in mind that detecting GIC and writing your own gic_send_ipi() is very simple. I have done this when the driver was out of tree. So restricting it by not exporting it will not prevent someone from really accessing the functionality, it's just they have to do it their own way.

Keep in mind that we are not talking about out of tree hackery. We are talking about a kernel code submission, and I doubt that you will get away with GIC detection/fiddling buried in your driver code.

Keep in mind that just slapping an export on some random function is not much better than doing a GIC hack in the driver.

Marc's concerns about blindly exposing IPI functionality to drivers are well justified, and that kind of coprocessor stuff is not unique to your particular SoC. We're going to see such things more frequently in the not so distant future, so we had better think now about proper solutions to that problem.

There are a couple of issues to solve:

1) How is the IPI which is received by the coprocessor reserved in the system?

2) How is it associated to a particular driver?

3) How do we ensure that a driver cannot issue random IPIs and can only send the associated ones?

None of these issues are handled by your export. So we need a core infrastructure which allows us to do that. The requirements are pretty clear from the above and Marc might have some further restrictions in mind.

Thanks,

	tglx
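To make the three requirements in this message concrete, here is a minimal user-space model, not kernel code and not a proposal from the thread. Every name in it (struct ipi_pool, model_ipi_reserve(), model_ipi_send(), MODEL_NR_IPIS) is hypothetical. It only illustrates the shape of an interface in which an IPI is reserved, bound to one owner, and can only be sent by that owner.

```c
#include <stddef.h>

#define MODEL_NR_IPIS 8	/* hypothetical size of the reservable IPI pool */

/* Hypothetical bookkeeping: one owner cookie per IPI, NULL means free. */
struct ipi_pool {
	const void *owner[MODEL_NR_IPIS];
	unsigned int sent[MODEL_NR_IPIS];
};

/*
 * Requirements 1) and 2): reserve a free IPI and associate it with a
 * single owner (for example a driver instance). Returns the IPI id,
 * or -1 if the pool is exhausted or the owner cookie is invalid.
 */
static int model_ipi_reserve(struct ipi_pool *pool, const void *owner)
{
	if (!owner)
		return -1;

	for (int i = 0; i < MODEL_NR_IPIS; i++) {
		if (!pool->owner[i]) {
			pool->owner[i] = owner;
			return i;
		}
	}
	return -1;
}

/*
 * Requirement 3): a caller can only trigger an IPI it owns. Anything
 * else is rejected instead of reaching the interrupt controller.
 */
static int model_ipi_send(struct ipi_pool *pool, const void *owner, int id)
{
	if (id < 0 || id >= MODEL_NR_IPIS)
		return -1;
	if (!owner || pool->owner[id] != owner)
		return -1;

	pool->sent[id]++;	/* stand-in for the real hardware access */
	return 0;
}
```

In this model a driver that reserved IPI 0 gets an error when it tries to send IPI 1, which is the filtering that a plain export of the low-level send function does not provide.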
Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
On Mon, 24 Aug 2015, Vlastimil Babka wrote:
On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:

I am in the middle of implementing lock on fault this way, but I cannot see how we will handle mremap of a lock on fault region. Say we have the following:

	addr = mmap(len, MAP_ANONYMOUS, ...);
	mlock(addr, len, MLOCK_ONFAULT);
	...
	mremap(addr, len, 2 * len, ...)

There is no way for mremap to know that the area being remapped was lock on fault so it will be locked and prefaulted by remap. How can we avoid this without tracking per vma if it was locked with lock or lock on fault?

remap can count filled ptes and prefault only completely populated areas.

Does (and should) mremap really prefault non-present pages? Shouldn't it just prepare the page tables and that's it?

As I see mremap prefaults pages when it extends mlocked area. Also quote from manpage

: If the memory segment specified by old_address and old_size is locked
: (using mlock(2) or similar), then this lock is maintained when the segment is
: resized and/or relocated. As a consequence, the amount of memory locked
: by the process may change.

Oh, right... Well that looks like a convincing argument for having a sticky VM_LOCKONFAULT after all. Having mremap guess by scanning existing pte's would slow it down, and be unreliable (was the area completely populated because MLOCK_ONFAULT was not used or because the process faulted it already? Was it not populated because MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to populate it all?).

Given this, I am going to stop working on v8 and leave the vma flag in place.

The only sane alternative is to populate always for mremap() of VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be enough for Eric's usecase, but it's somewhat ugly.

I don't think that this is the right solution, I would be really surprised as a user if an area I locked with MLOCK_ONFAULT was then fully locked and prepopulated after mremap().

If mremap is the only problem then we can add opposite flag for it:

MREMAP_NOPOPULATE
- do not populate new segment of locked areas
- do not copy normal areas if possible (anonymous/special must be copied)

	addr = mmap(len, MAP_ANONYMOUS, ...);
	mlock(addr, len, MLOCK_ONFAULT);
	...
	addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
	...

There might be a problem after failed populate: remap will handle them as lock on fault. In this case we can fill ptes with swap-like non-present entries to remember that fact and count them as should-be-locked pages.

I don't think we should strive to have mremap try to fix the inherent unreliability of mmap (MAP_POPULATE)?

I don't think so. MAP_POPULATE works only when mmap happens. Flag MREMAP_POPULATE might be a good idea. Just for symmetry.

Maybe, but please do it as a separate series.
Re: [PATCHv3 4/5] mm: make compound_head() robust
On 08/21/2015 09:34 PM, Andrew Morton wrote:
On Fri, 21 Aug 2015 22:31:09 +0300 Kirill A. Shutemov kir...@shutemov.name wrote:
On Fri, Aug 21, 2015 at 11:11:27AM -0500, Christoph Lameter wrote:
On Fri, 21 Aug 2015, Kirill A. Shutemov wrote:

Is this really true? For example if it's a slab page, will that page ever be inspected by code which is looking for the PageTail bit?

+Christoph.

What we know for sure is that space is not used in tail pages, otherwise it would collide with current compound_dtor.

Sl*b allocators only do a virt_to_head_page on tail pages.

The question was whether it's safe to assume that the bit 0 is always zero in the word as this bit will encode PageTail().

That wasn't my question actually... What I'm wondering is: if this page is being used for slab, will any code path ever run PageTail() against it? If not, we don't need to be concerned about that bit.

Pfn scanners such as compaction might inspect such pages and run compound_head() (and thus PageTail) on them. I think no kind of page within a zone (slab or otherwise) is protected from this, which is why it needs to be robust.

And slab was just the example I chose. The same question pertains to all other uses of that union.
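For readers following the bit 0 discussion, here is a small user-space sketch of the general technique: storing a pointer to the head page in a word of the tail page and using bit 0 of that word as the tail marker. The struct and function names (model_page, model_page_tail(), and so on) are made up for illustration, and the real struct page layout in the patch is not shown in this thread. The sketch assumes the structure is at least 2-byte aligned, so bit 0 of a valid pointer is always zero.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for struct page. For a tail page the word holds
 * the address of the head page with bit 0 set. For any other page, bit 0
 * of this word must be zero, whatever else shares the storage.
 */
struct model_page {
	uintptr_t compound_head;
};

/* Bit 0 encodes "this is a tail page". */
static int model_page_tail(const struct model_page *page)
{
	return (int)(page->compound_head & 1u);
}

/* Link a tail page to its head by storing the tagged pointer. */
static void model_set_compound_head(struct model_page *tail,
				    struct model_page *head)
{
	tail->compound_head = (uintptr_t)head | 1u;
}

/* Return the head page for a tail page, or the page itself otherwise. */
static struct model_page *model_compound_head(struct model_page *page)
{
	uintptr_t word = page->compound_head;

	if (word & 1u)
		return (struct model_page *)(word - 1);
	return page;
}
```

The concern raised in the thread maps onto the comment in the struct: if any other user of the shared storage can leave bit 0 set, a scanner calling the equivalent of model_compound_head() on that page would follow a bogus pointer.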
Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info
George> John Stoffel j...@stoffel.org wrote:

vmap_info_gen should be initialized to 1 to force an initial cache update.

Blech, it should be initialized with a proper #define VMAP_CACHE_NEEDS_UPDATE 1, instead of more magic numbers.

George> Er... this is a joke, right?

Not really. The comment made before was that by setting this variable to zero, it wasn't properly initialized. Which implies that either the API is wrong... or we should be documenting it better. I just went in the direction of the #define instead of a comment.

George> First, this number is used exactly once, and it's not part of
George> a collection of similar numbers. And the definition would be
George> adjacent to the use.

George> We have easier ways of accomplishing that, called "comments".

Sure, that would be the better solution in this case.

George> Second, your proposed name is misleading. "needs update" is defined
George> as vmap_info_gen != vmap_info_cache_gen. There is no particular value
George> of either that has this meaning.

George> For example, initializing vmap_info_cache_gen to -1 would do just as well.
George> (I actually considered that before deciding that +1 was simpler than -1.)

See, I just threw out a dumb suggestion without reading the patch properly. My fault.

George> (John, my apologies if I went over the top and am contributing to LKML's
George> reputation for flaming. I *did* actually laugh, and *do* think it's a
George> dumb idea, but my annoyance is really directed at unpleasant memories of
George> mindless application of coding style guidelines. In this case, I suspect
George> you just posted before reading carefully enough to see the subtle logic.)

Nope, I'm in the wrong here. And your comment here is wonderful, I really do appreciate how you handled my ham-fisted attempt to contribute. But I've got thick skin and I'll keep trying in my free time to comment on patches when I can.

John
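The point about "needs update" being a relation between two counters rather than a magic value can be shown with a small user-space model. This is only a sketch of the generation-counter idea as described in the thread. The struct, its fields other than the two generation counters, and the function names are invented for illustration, and the locking the real patch would need is left out.

```c
/*
 * Hypothetical model of a generation-stamped cache. The cache is stale
 * exactly when gen != cache_gen. No single value means "needs update".
 */
struct model_vmap_cache {
	unsigned long gen;		/* bumped on every change */
	unsigned long cache_gen;	/* generation the cache was built for */
	unsigned long live_total;	/* stand-in for the expensive-to-scan data */
	unsigned long cached_total;	/* cached result */
	unsigned int recomputes;	/* how often the slow path ran */
};

/* gen starts at 1 and cache_gen at 0, which forces the first update. */
#define MODEL_VMAP_CACHE_INIT { .gen = 1 }

/* Any change to the underlying data invalidates the cache by bumping gen. */
static void model_vmap_change(struct model_vmap_cache *c, unsigned long delta)
{
	c->live_total += delta;
	c->gen++;
}

/* Readers take the slow path only when the generations differ. */
static unsigned long model_vmap_read(struct model_vmap_cache *c)
{
	if (c->gen != c->cache_gen) {
		c->cached_total = c->live_total;
		c->cache_gen = c->gen;
		c->recomputes++;
	}
	return c->cached_total;
}
```

Initializing cache_gen to -1 and gen to 0 would force the first update just as well, which is why a name like VMAP_CACHE_NEEDS_UPDATE for the constant 1 would be misleading.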
Re: [RFC] sdhci: fix DMA leaks [was: [SHDCI] Heavy (thousands) DMA leaks]
On 08/06/2015 02:17 AM, Chen Bough wrote:
> I will format a patch based on your diff file firstly. I will test this on my side, If any issue, like dma issue or performance issue, I will add some modification. Then I will send the patch for review, and you can test the patch on your platform.
>
> Best Regards
> Haibo Chen

Did I miss the follow up patch or is this still pending? If it's still pending, would you mind Ccing me when it's available for testing?

Thanks,
Laura

> -Original Message-
> From: Jiri Slaby [mailto:jsl...@suse.cz]
> Sent: Thursday, August 06, 2015 5:07 PM
> To: Chen Haibo-B51421; Ulf Hansson
> Cc: linux-...@vger.kernel.org; Linux kernel mailing list
> Subject: Re: [RFC] sdhci: fix DMA leaks [was: [SHDCI] Heavy (thousands) DMA leaks]
>
> On 08/06/2015, 09:42 AM, Chen Bough wrote:
> > I read your attached log and patch, yes, dma memory leak will happen when more than one pre_request execute. The method of ++next-cookie is not good, your patch seems good, but I still need some time to test the patch, because you unmap the dma in sdhci_finish_data rather than the sdhci_post_req.
>
> Hi,
>
> yes, this is not correct. We can perhaps differentiate according to the COOKIE value.
>
> Should I fix it or are you going to prepare a patch based on my RFC?
>
> thanks,
> --
> js
> suse labs
Re: [PATCH linux-next v4 3/5] mtd: spi-nor: allow to tune the number of dummy cycles
Hi Marek,

On 24/08/2015 12:48, Marek Vasut wrote:
> On Monday, August 24, 2015 at 12:13:58 PM, Cyrille Pitchen wrote:
>> The number of dummy cycles used during Fast Read commands can be reduced to improve transfer performance. Each manufacturer has a dedicated set of registers to provide the memory with the exact number of dummy cycles it should expect. Both the memory and the (Q)SPI controller must agree on this number of dummy cycles.
>>
>> The number of dummy cycles can be found in the memory datasheet and mostly depends on the SPI clock frequency, the Fast Read op code and the Single/Dual Data Rate mode.
>>
>> Probing JEDEC Serial Flash Discoverable Parameters (SFDP) tables would only provide the driver with a high enough number of dummy cycles for each Fast Read command to be used for all clock frequencies: this solution would not be optimized.
>>
>> Signed-off-by: Cyrille Pitchen cyrille.pitc...@atmel.com
>
> Hi!
>
>>  drivers/mtd/spi-nor/spi-nor.c | 97 ++-
>>  include/linux/mtd/spi-nor.h   |  2 +
>>  2 files changed, 80 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
>> index e2a6029dc056..869e098a6841 100644
>> --- a/drivers/mtd/spi-nor/spi-nor.c
>> +++ b/drivers/mtd/spi-nor/spi-nor.c
>> @@ -119,24 +119,6 @@ static int read_cr(struct spi_nor *nor)
>>  }
>>
>>  /*
>> - * Dummy Cycle calculation for different type of read.
>> - * It can be used to support more commands with
>> - * different dummy cycle requirements.
>> - */
>> -static inline int spi_nor_read_dummy_cycles(struct spi_nor *nor)
>> -{
>> -	switch (nor->flash_read) {
>> -	case SPI_NOR_FAST:
>> -	case SPI_NOR_DUAL:
>> -	case SPI_NOR_QUAD:
>> -		return 8;
>> -	case SPI_NOR_NORMAL:
>> -		return 0;
>> -	}
>> -	return 0;
>> -}
>
> You can probably just soup up this function so that it sets the nor->read_dummy, no ?

Actually, this is what the patch does: spi_nor_read_dummy_cycles() was reused and enhanced a few lines below, where you've pointed out that the switch (nor->flash_read) block should be moved after the else block.

I think when I wrote the code I chose to move the definition of this function instead of adding forward declarations of functions such as read_cr() or write_sr_cr(), which are now called by micron_set_dummy_cycles().

>> -/*
>>   * Write status register 1 byte
>>   * Returns negative if error occurred.
>>   */
>> @@ -1012,6 +994,81 @@ static int set_quad_mode(struct spi_nor *nor, struct flash_info *info)
>>  	}
>>  }
>>
>> +static int micron_set_dummy_cycles(struct spi_nor *nor)
>> +{
>> +	int ret;
>> +	u8 val, mask;
>> +
>> +	/* read the Volatile Configuration Register (VCR) */
>
> NIT: If this is a sentence, start it with capital letter and end it with fullstop :)

done for the next version

>> +	ret = nor->read_reg(nor, SPINOR_OP_RD_VCR, &val, 1);
>> +	if (ret < 0) {
>> +		dev_err(nor->dev, "error %d reading VCR\n", ret);
>> +		return ret;
>> +	}
>> +
>> +	write_enable(nor);
>> +
>> +	/* update the number of dummy into the VCR */
>
> DTTO

done for the next version

>> +	mask = GENMASK(7, 4);
>> +	val &= ~mask;
>> +	val |= (nor->read_dummy << 4) & mask;
>> +	ret = nor->write_reg(nor, SPINOR_OP_WR_VCR, &val, 1, 0);
>> +	if (ret < 0) {
>> +		dev_err(nor->dev, "error while writing VCR register\n");
>> +		return ret;
>> +	}
>> +
>> +	ret = spi_nor_wait_till_ready(nor);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Dummy Cycle calculation for different type of read.
>> + * It can be used to support more commands with
>> + * different dummy cycle requirements.
>> + */
>> +static int spi_nor_read_dummy_cycles(struct spi_nor *nor,
>> +				     const struct flash_info *info)
>> +{
>> +	struct device_node *np = nor->dev->of_node;
>> +	u32 num_dummy_cycles;
>> +
>> +	if (np && !of_property_read_u32(np, "m25p,num-dummy-cycles",
>> +					&num_dummy_cycles)) {
>> +		nor->read_dummy = num_dummy_cycles;
>> +
>> +		/*
>> +		 * This switch block might be moved after the if...then...else
>> +		 * statement but it was not tested with all Spansion or Micron
>> +		 * memories.
>> +		 * Now the "m25p,num-dummy-cycles" property needs to be
>> +		 * explicitly set in the device tree so the switch statement is
>> +		 * executed. This should avoid unwanted side effects and keep
>> +		 * backward compatibility.
>> +		 */
>> +		switch (JEDEC_MFR(info)) {
>> +		case CFI_MFR_ST:
>> +			return micron_set_dummy_cycles(nor);
>> +		default:
>
> If you do have "m25p,num-dummy-cycles" set for non-micron flash, you have a problem here I believe.
>
>> +			break;
>> +		}
>> +	} else {
>
> The solution would be to drop this else {} bit here, so that if you fail in the DT-based configuration, you fall back to this old behavior. What do you think please ? :)

Good idea! I
Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy
On 2015-08-22 14:29, Tejun Heo wrote:
> Hello, Paul.
>
> On Fri, Aug 21, 2015 at 12:26:30PM -0700, Paul Turner wrote:
> ...
>> A very concrete example of the above is a virtual machine in which you want to guarantee scheduling for the vCPU threads which must schedule beside many hypervisor support threads. A hierarchy is the only way to fix the ratio at which these compete.
>
> Just to learn more, what sort of hypervisor support threads are we talking about? They would have to consume considerable amount of cpu cycles for problems like this to be relevant and be dynamic in numbers in a way which letting them competing against vcpus makes sense. Do IO helpers meet these criteria?

Depending on the configuration, yes they can. VirtualBox has some rather CPU intensive threads that aren't vCPU threads (their emulated APIC thread immediately comes to mind), and so does QEMU depending on the emulated hardware configuration (it gets more noticeable when the disk images are stored on a SAN and served through iSCSI, NBD, FCoE, or ATAoE, which is pretty typical usage for large virtualization deployments). I've seen cases first hand where the vCPU's can make no reasonable progress because they are constantly getting crowded out by other threads.

The use of the term 'hypervisor support threads' for this is probably not the best way of describing the contention, as it's almost always a full system virtualization issue, and the contending threads are usually storage back-end access threads.

I would argue that there are better ways to deal properly with this (isolate the non vCPU threads on separate physical CPU's from the hardware emulation threads), but such methods require large systems to be practical at any scale, and many people don't have the budget for such large systems, and this way of doing things is much more flexible for small scale use cases (for example, someone running one or two VM's on a laptop under QEMU or VirtualBox).
Re: [PATCH v2 5/5] arm64: add KASan support
2015-08-24 19:16 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:
> On 24/08/15 17:00, Andrey Ryabinin wrote:
>> 2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:
>>> Another option would be having sparse shadow memory based on page extension. I did play with that some time ago based on ideas from original v1 KASan support for x86/arm - it is how 614be38 "irqchip: gic-v3: Fix out of bounds access to cpu_logical_map" was caught. It doesn't require any VA reservations, only some contiguous memory for the page_ext itself, which serves as indirection level for the 0-order shadow pages.
>>
>> We won't be able to use inline instrumentation (I could live with that), and most importantly, we won't be able to use stack instrumentation. GCC needs to know shadow address for inline and/or stack instrumentation to generate correct code.
>
> It's definitely a trade-off ;)
>
> Just for my understanding does that stack instrumentation is controlled via -asan-stack?

Yup.
Re: [PATCH v3 4/6] ARCv2: perf: implement exclusion of event counting in user or kernel mode
On Monday 24 August 2015 08:00 PM, Vineet Gupta wrote:
> On Monday 24 August 2015 07:50 PM, Alexey Brodkin wrote:
>> Cc: Peter Zijlstra pet...@infradead.org
>> Cc: Arnaldo Carvalho de Melo a...@kernel.org
>> Signed-off-by: Alexey Brodkin abrod...@synopsys.com
>> ---
>>
>> No changes since v2.
>>
>> No changes since v1.
>>
>>  	}
>>
>> +	hwc->config = 0;
>> +
>> +	if (is_isa_arcv2()) {
>> +		/* "exclude user" means "count only kernel" */
>> +		if (event->attr.exclude_user)
>> +			hwc->config |= ARC_REG_PCT_CONFIG_KERN;
>> +
>> +		/* "exclude kernel" means "count only user" */
>> +		if (event->attr.exclude_kernel)
>> +			hwc->config |= ARC_REG_PCT_CONFIG_USER;
>> +	}
>> +
>>  	switch (event->attr.type) {
>>  	case PERF_TYPE_HARDWARE:
>>  		if (event->attr.config >= PERF_COUNT_HW_MAX)
>>  			return -ENOENT;
>>  		if (arc_pmu->ev_hw_idx[event->attr.config] < 0)
>>  			return -ENOENT;
>> -		hwc->config = arc_pmu->ev_hw_idx[event->attr.config];
>> +		hwc->config |= arc_pmu->ev_hw_idx[event->attr.config];
>
> With raw events patch dropped - this hunk need not be present.

Please ignore this stupid comment - this was written when I was presumably smoking pot !

>>  		pr_debug("init event %d with h/w %d \'%s\'\n",
>>  			 (int) event->attr.config, (int) hwc->config,
>>  			 arc_pmu_ev_hw_map[event->attr.config]);
>> @@ -163,7 +175,7 @@ static int arc_pmu_event_init(struct perf_event *event)
>>  		ret = arc_pmu_cache_event(event->attr.config);
>>  		if (ret < 0)
>>  			return ret;
>> -		hwc->config = arc_pmu->ev_hw_idx[ret];
>> +		hwc->config |= arc_pmu->ev_hw_idx[ret];
>>  		return 0;
>>  	default:
>>  		return -ENOENT;
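The reason the two assignments to hwc->config become |= in this patch can be seen in a small stand-alone model. This is only a sketch: the bit positions of the two mode flags are placeholders chosen for the example and are not taken from the ARC headers, and model_attr and model_event_config() are invented names.

```c
#include <stdint.h>

/* Placeholder bit positions, assumed for illustration only. */
#define MODEL_PCT_CONFIG_USER	(1u << 18)
#define MODEL_PCT_CONFIG_KERN	(1u << 19)

/* Hypothetical subset of the perf event attributes used here. */
struct model_attr {
	unsigned int exclude_user:1;
	unsigned int exclude_kernel:1;
};

/*
 * Build the counter configuration word: the mode bits are set first,
 * so the hardware event index has to be OR-ed in afterwards. A plain
 * assignment at that point would silently drop the mode bits.
 */
static uint32_t model_event_config(const struct model_attr *attr,
				   uint32_t hw_idx)
{
	uint32_t config = 0;

	/* "exclude user" means "count only kernel" */
	if (attr->exclude_user)
		config |= MODEL_PCT_CONFIG_KERN;

	/* "exclude kernel" means "count only user" */
	if (attr->exclude_kernel)
		config |= MODEL_PCT_CONFIG_USER;

	config |= hw_idx;
	return config;
}
```

With neither exclusion requested the result is just the event index, which matches the behaviour before the patch.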
Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method
On Mon, 24 Aug 2015, Vaibhav Hiremath wrote:
> On Monday 24 August 2015 07:24 PM, Lee Jones wrote:
>> On Wed, 08 Jul 2015, Vaibhav Hiremath wrote:
>>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe (page 0) controls the method of clearing interrupt status of 88pm800 family of devices;
>>>
>>>   0: clear on read
>>>   1: clear on write
>>>
>>> If pdata is not coming from board file, then set the default irq clear method to "irq clear on write"
>>>
>>> Also, as suggested by "Lee Jones" renaming variable field to appropriate name and removed unnecessary field pm80x_chip.irq_mode, using platform_data.irq_clr_method.
>>>
>>> Signed-off-by: Zhao Ye zh...@marvell.com
>>> Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
>>> Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com
>>> ---
>>>  drivers/mfd/88pm800.c       | 15 ++-
>>>  include/linux/mfd/88pm80x.h |  9 +++--
>>>  2 files changed, 17 insertions(+), 7 deletions(-)
>>
>> [...]
>>
>>> +#define PM800_WAKEUP2_INT_READ_CLEAR	(0 << 1)
>>> +#define PM800_WAKEUP2_INT_WRITE_CLEAR	(1 << 1)
>>
>> Use BIT().
>>
>>> +/* Used by irq_clr_method */
>>> +#define PM800_IRQ_CLR_ON_READ	0
>>> +#define PM800_IRQ_CLR_ON_WRITE	1
>>
>>> -	int irq_mode;	/* Clear interrupt by read/write(0/1) */
>>> +	bool irq_clr_method;	/* Clear interrupt by read/write(0/1) */
>>
>>> +	irq_clr_mode = pdata->irq_clr_method == PM800_IRQ_CLR_ON_WRITE ?
>>> +		PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
>>> +	ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);
>>
>> This is pretty convoluted. For starters you're abusing the 'bool' type here. Bool is either 'true' or 'false', so at the very least you should rename 'irq_clr_method' to 'irq_clr_on_write'. Then you can do:
>>
>>   irq_clr_mode = pdata->irq_clr_on_write ?
>>   	PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
>
> We have discussed on this, and went back-n-forth. I think if I remember correctly, one of the version was using true/false then we decided to rename it to relevant macro.
>
> If I am not wrong V4 version of this series is exactly same as what you are referring to.

Right. I made a few suggestions which vary in usefulness depending on how you plan to implement all of this. Unfortunately this is a bit of a bastardised version where some of it makes sense and other parts could do with some improvement.

>> However, what I suggest you really do is share PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass the value through directly.
>
> I think we discussed about this also, and the reason I recall here is, we may need to control this from DT in the future so we decided to keep it boolean in platform_data and have simple check before writing to register.
>
> And I think that was also another reason we introduced
>
>   /* Used by irq_clr_method */
>   #define PM800_IRQ_CLR_ON_READ	0
>   #define PM800_IRQ_CLR_ON_WRITE	1

I think these are still required.

So it would look like this:

== Platform data ==

  struct pdata {
  	bool clear_irq_on_write;
  };

  pdata->clear_irq_on_write = PM800_IRQ_CLR_ON_{READ,WRITE};

== Driver ==

  irq_clr_mode = pdata->clear_irq_on_write ?
  	PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;

  regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
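The outline above can be tried out as a self-contained user-space sketch. The BIT() macro is redefined locally, the register is modelled as a plain integer, and model_update_bits() only imitates the read-modify-write behaviour that regmap_update_bits() is used for here. The mask name PM800_WAKEUP2_INT_CLEAR_MODE_MASK, struct model_pdata and the model_* helpers are assumptions made for the example, not names from the driver.

```c
#include <stdbool.h>

#define BIT(n)	(1UL << (n))

/* Register values for bit 1 (INT_CLEAR_MODE), written with BIT(). */
#define PM800_WAKEUP2_INT_READ_CLEAR		0UL
#define PM800_WAKEUP2_INT_WRITE_CLEAR		BIT(1)
#define PM800_WAKEUP2_INT_CLEAR_MODE_MASK	BIT(1)	/* assumed name */

/* Used by clear_irq_on_write */
#define PM800_IRQ_CLR_ON_READ	0
#define PM800_IRQ_CLR_ON_WRITE	1

/* Hypothetical, cut-down platform data. */
struct model_pdata {
	bool clear_irq_on_write;
};

/* Translate the boolean platform setting into the register field value. */
static unsigned long model_irq_clr_mode(const struct model_pdata *pdata)
{
	return pdata->clear_irq_on_write ?
		PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
}

/* Read-modify-write of a register value: only bits in mask are changed. */
static unsigned long model_update_bits(unsigned long reg, unsigned long mask,
				       unsigned long val)
{
	return (reg & ~mask) | (val & mask);
}
```

Keeping the platform data boolean leaves the register encoding private to the driver, which is the property the thread wants to preserve for a possible device tree binding later.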
Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote:
On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
On Mon, 24 Aug 2015, Vlastimil Babka wrote:
On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:

I am in the middle of implementing lock on fault this way, but I cannot see how we will handle mremap of a lock on fault region. Say we have the following:

	addr = mmap(len, MAP_ANONYMOUS, ...);
	mlock(addr, len, MLOCK_ONFAULT);
	...
	mremap(addr, len, 2 * len, ...)

There is no way for mremap to know that the area being remapped was lock on fault so it will be locked and prefaulted by remap. How can we avoid this without tracking per vma if it was locked with lock or lock on fault?

remap can count filled ptes and prefault only completely populated areas.

Does (and should) mremap really prefault non-present pages? Shouldn't it just prepare the page tables and that's it?

As I see mremap prefaults pages when it extends mlocked area. Also quote from manpage

: If the memory segment specified by old_address and old_size is locked
: (using mlock(2) or similar), then this lock is maintained when the segment is
: resized and/or relocated. As a consequence, the amount of memory locked
: by the process may change.

Oh, right... Well that looks like a convincing argument for having a sticky VM_LOCKONFAULT after all. Having mremap guess by scanning existing pte's would slow it down, and be unreliable (was the area completely populated because MLOCK_ONFAULT was not used or because the process faulted it already? Was it not populated because MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to populate it all?).

Given this, I am going to stop working on v8 and leave the vma flag in place.

The only sane alternative is to populate always for mremap() of VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be enough for Eric's usecase, but it's somewhat ugly.

I don't think that this is the right solution, I would be really surprised as a user if an area I locked with MLOCK_ONFAULT was then fully locked and prepopulated after mremap().

If mremap is the only problem then we can add opposite flag for it:

MREMAP_NOPOPULATE
- do not populate new segment of locked areas
- do not copy normal areas if possible (anonymous/special must be copied)

	addr = mmap(len, MAP_ANONYMOUS, ...);
	mlock(addr, len, MLOCK_ONFAULT);
	...
	addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
	...

But with this, the user must remember which areas are locked with MLOCK_ONFAULT and which are locked with prepopulate, so that the correct mremap flags can be used.
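The argument for keeping a sticky per-vma flag can be illustrated with a toy user-space model. It is not kernel code: the struct, the flag names and the page accounting are invented, and it assumes the lock-on-fault flag is stored alongside the locked flag. The point is only that the resize path can decide whether to populate from the flags alone, without inspecting page tables and without the caller passing an extra flag.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for VM_LOCKED and VM_LOCKONFAULT. */
#define MODEL_VM_LOCKED		(1u << 0)
#define MODEL_VM_LOCKONFAULT	(1u << 1)

/* Toy vma: sizes and populated counts are in pages. */
struct model_vma {
	unsigned int flags;
	unsigned long len;
	unsigned long populated;
};

/* Populate eagerly only for areas that are locked but not lock-on-fault. */
static bool model_should_populate(const struct model_vma *vma)
{
	return (vma->flags & MODEL_VM_LOCKED) &&
	       !(vma->flags & MODEL_VM_LOCKONFAULT);
}

/*
 * Grow the area the way the manpage describes for locked segments: the
 * lock is kept, and whether the new part is prefaulted follows from the
 * sticky flag recorded at mlock time.
 */
static void model_mremap_grow(struct model_vma *vma, unsigned long new_len)
{
	if (new_len <= vma->len)
		return;

	if (model_should_populate(vma))
		vma->populated = new_len;	/* fully locked: prefault all */

	vma->len = new_len;			/* lock-on-fault: leave as is */
}
```

In this model a lock-on-fault area that had one page faulted in still has exactly one page populated after doubling in size, while a fully locked area ends up fully populated.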
Re: [PATCH 7/7] ipmi/kcs: Don't run the KCS state machine when it is KCS_IDLE
On 08/23/2015 08:52 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
On 08/17/2015 09:54 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard

This patch will break ATN handling on the interfaces. So we can't do this.

I understand. So how about doing like this:

 	/* All states wait for ibf, so just do it here. */
-	if (!check_ibf(kcs, status, time))
+	if (kcs->state != KCS_IDLE && !check_ibf(kcs, status, time))
 		return SI_SM_CALL_WITH_DELAY;

I think it is not necessary to wait IBF when the state is IDLE. In this way, we can also handle the ATN case.

I think it would be more reliable to go up a level and add a timeout.

It may be so, but we should address this issue separately (at least I think above solution reasonably solves the issue).

This issue happens after all queued messages are processed or dropped by timeout. There is no current message. So what should we set a timeout against? We can add a timeout into my new flush_messages(), but that is meaningful only in panic context. That doesn't help in normal context; we would perform a busy loop of smi_event_handler() and schedule() in ipmi_thread().

I'm a little confused here. Is the problem that the ATN bit is stuck high? If so, it's going to be really hard to work around this without breaking ATN handling.

-corey

Regards,
Hidehiro Kawai

One should be there, anyway. I thought they were all covered, but I may have missed something.

-corey

Regards,
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

It's going to be extremely hard to recover if the BMC is not working correctly when a panic happens. I'm not sure what can be done, but if you can fix it another way it would be good.

-corey

On 07/27/2015 12:55 AM, Hidehiro Kawai wrote:

If a BMC is unresponsive for some reason, it ends up completing the requested message as an error, then kcs_event() is called once to advance the state machine. However, since the BMC is unresponsive now, the status of the KCS interface may not be idle. As the result, the state machine can continue to run and consume CPU time indefinitely even if there is no more request message. Moreover, if this happens in run-to-completion mode (i.e. context of panic_event()), the kernel hangs up.

To fix this problem, this patch ignores kcs_event() call if there is no request message to be processed.

Signed-off-by: Hidehiro Kawai hidehiro.kawai...@hitachi.com
---
 drivers/char/ipmi/ipmi_kcs_sm.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
index 8c25f59..0e187fb 100644
--- a/drivers/char/ipmi/ipmi_kcs_sm.c
+++ b/drivers/char/ipmi/ipmi_kcs_sm.c
@@ -353,6 +353,10 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
 	if (kcs_debug & KCS_DEBUG_STATES)
 		printk(KERN_DEBUG "KCS: State = %d, %x\n", kcs->state, status);

+	/* We don't want to run the state machine when the state is IDLE */
+	if (kcs->state == KCS_IDLE)
+		return SI_SM_IDLE;
+
 	/* All states wait for ibf, so just do it here. */
 	if (!check_ibf(kcs, status, time))
 		return SI_SM_CALL_WITH_DELAY;
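The difference between the original patch and the alternative suggested in the reply can be sketched as a tiny user-space state function. This is a model only: the enum names are invented, the state machine is reduced to idle versus busy, and the status bit positions (IBF as bit 1, ATN as bit 2) are an assumption that should be checked against the IPMI specification. It models the alternative, where only the IBF wait is skipped in the idle state, so an attention request is still noticed.

```c
/* Hypothetical result codes, mirroring the ones named in the thread. */
enum model_result {
	MODEL_SM_IDLE,
	MODEL_SM_CALL_WITH_DELAY,
	MODEL_SM_CALL_WITHOUT_DELAY,
	MODEL_SM_ATTN,
};

/* The real state machine has more states; two are enough here. */
enum model_state {
	MODEL_KCS_IDLE,
	MODEL_KCS_BUSY,
};

/* Assumed status register bit positions. */
#define MODEL_STATUS_IBF	0x02
#define MODEL_STATUS_ATN	0x04

static enum model_result model_kcs_event(enum model_state state,
					 unsigned char status)
{
	/* Wait for IBF to clear, except when there is nothing in flight. */
	if (state != MODEL_KCS_IDLE && (status & MODEL_STATUS_IBF))
		return MODEL_SM_CALL_WITH_DELAY;

	if (state == MODEL_KCS_IDLE) {
		/* Attention from the BMC is still reported while idle. */
		if (status & MODEL_STATUS_ATN)
			return MODEL_SM_ATTN;
		return MODEL_SM_IDLE;
	}

	/* Any in-flight transaction keeps the state machine running. */
	return MODEL_SM_CALL_WITHOUT_DELAY;
}
```

With an unconditional early return for the idle state, as in the original patch, the attention branch would never be reached, which is the ATN breakage pointed out in the review. If the ATN bit itself is stuck high, this model keeps reporting attention, which is the case the last reply asks about.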
Re: [PATCH v2 5/5] arm64: add KASan support
2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:
> Another option would be having sparse shadow memory based on page extension. I did play with that some time ago based on ideas from original v1 KASan support for x86/arm - it is how 614be38 "irqchip: gic-v3: Fix out of bounds access to cpu_logical_map" was caught. It doesn't require any VA reservations, only some contiguous memory for the page_ext itself, which serves as indirection level for the 0-order shadow pages.

We won't be able to use inline instrumentation (I could live with that), and most importantly, we won't be able to use stack instrumentation. GCC needs to know shadow address for inline and/or stack instrumentation to generate correct code.

> In theory such design can be reused by others 32-bit arches and, I think, nommu too. Additionally, the shadow pages might be movable with help of driver-page migration patch series [1]. The cost is obvious - performance drop, although I didn't bother measuring it.
>
> [1] https://lwn.net/Articles/650917/
>
> Cheers
> Vladimir
Re: [PATCH 3.12 00/82] 3.12.47-stable review
On 08/24/2015 02:09 AM, Jiri Slaby wrote:
> This is the start of the stable review cycle for the 3.12.47 release. There are 82 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Wed Aug 26 11:08:59 CEST 2015. Anything received after that time might be too late.

Build results:
	total: 124 pass: 124 fail: 0
Qemu test results:
	total: 70 pass: 70 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter
RE: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup sources
-----Original Message-----
From: Thomas Gleixner [mailto:t...@linutronix.de]
Sent: August 23, 2015 5:58
To: Wang Shenwei-B38339
Cc: shawn@linaro.org; ja...@lakedaemon.net; linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; Huang Yongcai-B20788
Subject: Re: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup sources

> On Fri, 31 Jul 2015, Shenwei Wang wrote:
> > +struct gpcv2_irqchip_data {
> > +	struct raw_spinlock rlock;
> > +	void __iomem *gpc_base;
> > +	u32 wakeup_sources[IMR_NUM];
> > +	u32 enabled_irqs[IMR_NUM];
> > +	u32 cpu2wakeup;
>
> Can you please format that in a readable way?
>
> 	struct raw_spinlock	rlock;
> 	void __iomem		*gpc_base;

I did try to be careful about the format, but did not notice this one. Will change it in the new version. :)

> > +};
> > +
> > +static struct gpcv2_irqchip_data *imx_gpcv2_instance;
> > +
> > +u32 imx_gpcv2_get_wakeup_source(u32 **sources)
> > +{
> > +	if (!imx_gpcv2_instance)
> > +		return 0;
> > +
> > +	if (sources)
> > +		*sources = imx_gpcv2_instance->wakeup_sources;
> > +
> > +	return IMR_NUM;
> > +}
> > +
> > +static int gpcv2_wakeup_source_save(void)
> > +{
> > +	struct gpcv2_irqchip_data *cd;
> > +	void __iomem *reg;
> > +	int i;
> > +
> > +	cd = imx_gpcv2_instance;
> > +	if (!cd)
> > +		return 0;
> > +
> > +	for (i = 0; i < IMR_NUM; i++) {
> > +		reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
> > +		cd->enabled_irqs[i] = readl_relaxed(reg);
>
> You read the full state of the register and restore the full state. So
> why enabled_irqs?

There are two user scenarios:
In the CPU idle state, the system needs to be woken up by any enabled irq, not just the ones marked as wakeup sources.
In the suspend state, the system will only be woken up by the ones marked as wakeup sources.
enabled_irqs is used to save the values before suspend and restore them after resume.

> > +		writel_relaxed(cd->wakeup_sources[i], reg);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void gpcv2_wakeup_source_restore(void)
> > +{
> > +	struct gpcv2_irqchip_data *cd;
> > +	void __iomem *reg;
> > +	int i;
> > +
> > +	cd = imx_gpcv2_instance;
> > +	if (!cd)
> > +		return;
> > +
> > +	for (i = 0; i < IMR_NUM; i++) {
> > +		reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
> > +		writel_relaxed(cd->enabled_irqs[i], reg);
> > +		cd->wakeup_sources[i] = ~0;
>
> Why are you clearing that info on resume? Drivers will clear that via
> set_wake() or leave it when they want to have resume functionality?

Each time the system goes into the suspend state, it will call set_wake (ON) again to configure the wakeup sources. Clearing wakeup_sources here makes sure the system works as expected whether or not a driver calls set_wake (OFF) during the resume stage.

> > +static int __init imx_gpcv2_irqchip_init(struct device_node *node,
> > +					 struct device_node *parent)
> > +{
> > +	struct irq_domain *parent_domain, *domain;
> > +	struct gpcv2_irqchip_data *cd;
> > +	int i;
> > +
> > +	if (!parent) {
> > +		pr_err("%s: no parent, giving up\n", node->full_name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	parent_domain = irq_find_host(parent);
> > +	if (!parent_domain) {
> > +		pr_err("%s: unable to get parent domain\n", node->full_name);
> > +		return -ENXIO;
> > +	}
> > +
> > +	cd = kzalloc(sizeof(struct gpcv2_irqchip_data), GFP_KERNEL);
> > +	BUG_ON(!cd);
>
> You return an error code for all other failures. Why BUG here?

Good point. To be consistent, I will change it to return an error code.

Thanks,
Shenwei

> Otherwise this looks very clean now. Can you please resend ASAP with
> these minor points addressed?
>
> Thanks,
>
> 	tglx
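For readers following the enabled_irqs question, here is a minimal user-space sketch of the save/restore scheme described in the replies above. It is not the driver code: the struct and function names are hypothetical, the registers are modelled as a plain array, IMR_NUM is assumed to be 4, and a set bit is assumed to mean "masked" (which is what the ~0 reset on resume suggests).

```c
#include <assert.h>
#include <stdint.h>

#define IMR_NUM 4	/* assumed number of 32-bit mask registers */

/* Hypothetical model of the driver state; hw_imr[] stands in for the MMIO registers. */
struct gpc_model {
	uint32_t hw_imr[IMR_NUM];		/* assumed: bit set = interrupt masked */
	uint32_t wakeup_sources[IMR_NUM];	/* bit cleared = irq is a wakeup source */
	uint32_t enabled_irqs[IMR_NUM];		/* snapshot of hw_imr taken at suspend */
};

static void gpc_model_init(struct gpc_model *m)
{
	for (int i = 0; i < IMR_NUM; i++) {
		m->hw_imr[i] = ~UINT32_C(0);		/* everything masked */
		m->wakeup_sources[i] = ~UINT32_C(0);	/* no wakeup sources */
		m->enabled_irqs[i] = 0;
	}
}

/* Unmask one interrupt in the modelled hardware register. */
static void gpc_model_enable(struct gpc_model *m, unsigned int hwirq)
{
	assert(hwirq < IMR_NUM * 32);
	m->hw_imr[hwirq / 32] &= ~(UINT32_C(1) << (hwirq % 32));
}

/* Mark or unmark one interrupt as a wakeup source (the set_wake step). */
static void gpc_model_set_wake(struct gpc_model *m, unsigned int hwirq, int on)
{
	uint32_t mask;

	assert(hwirq < IMR_NUM * 32);
	mask = UINT32_C(1) << (hwirq % 32);
	if (on)
		m->wakeup_sources[hwirq / 32] &= ~mask;
	else
		m->wakeup_sources[hwirq / 32] |= mask;
}

/* Suspend: remember the idle-time mask, then leave only wakeup sources unmasked. */
static void gpc_model_suspend(struct gpc_model *m)
{
	for (int i = 0; i < IMR_NUM; i++) {
		m->enabled_irqs[i] = m->hw_imr[i];
		m->hw_imr[i] = m->wakeup_sources[i];
	}
}

/* Resume: put the idle-time mask back and forget the wakeup configuration. */
static void gpc_model_resume(struct gpc_model *m)
{
	for (int i = 0; i < IMR_NUM; i++) {
		m->hw_imr[i] = m->enabled_irqs[i];
		m->wakeup_sources[i] = ~UINT32_C(0);
	}
}
```

In this model an interrupt that is enabled but not a wakeup source stays masked across suspend and comes back on resume, which is the distinction between the two scenarios given in the reply.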
Re: [RESEND][PATCH 4/4] ARM: dts: keystone: Add ti,keystone-spi for SPI
On 8/24/2015 6:36 AM, Franklin S Cooper Jr. wrote: Hi Santosh, All the patches except this one are in linux-next. Yes I noticed it. I will queue this up for next merge window. Thanks for reminder. Regards, Santosh
Re: [PATCH v2 5/5] arm64: add KASan support
On 24/08/15 17:00, Andrey Ryabinin wrote: 2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com: Another option would be having sparse shadow memory based on page extension. I did play with that some time ago based on ideas from original v1 KASan support for x86/arm - it is how 614be38 irqchip: gic-v3: Fix out of bounds access to cpu_logical_map was caught. It doesn't require any VA reservations, only some contiguous memory for the page_ext itself, which serves as indirection level for the 0-order shadow pages. We won't be able to use inline instrumentation (I could live with that), and most importantly, we won't be able to use stack instrumentation. GCC needs to know shadow address for inline and/or stack instrumentation to generate correct code. It's definitely a trade-off ;) Just for my understanding does that stack instrumentation is controlled via -asan-stack? Thanks Vladimir In theory such design can be reused by others 32-bit arches and, I think, nommu too. Additionally, the shadow pages might be movable with help of driver-page migration patch series [1]. The cost is obvious - performance drop, although I didn't bother measuring it. [1] https://lwn.net/Articles/650917/ Cheers Vladimir -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 06/14] Documentation: drm/bridge: add document for analogix_dp
Am Montag, 24. August 2015, 09:48:27 schrieb Rob Herring: On Mon, Aug 24, 2015 at 7:57 AM, Russell King - ARM Linux When we adopted the graph bindings for iMX DRM, I thought exactly at that time it would be nice if this could become the standard for binding DRM components together but I don't have the authority from either the DT perspective or the DRM perspective to mandate that. Neither does anyone else. That's the _real_ problem here. I've seen several DRM bindings go by which don't use the of-graph stuff, which means that they'll never be compatible with generic components which do use the of-graph stuff. It goes beyond bindings IMO. The use of the component framework or not has been at the whim of driver writers as well. It is either used or private APIs are created. I'm using components and my need for it boils down to passing the struct drm_device pointer to the encoder. Other components like panels and bridges have different ways to attach to the DRM driver. but that is then simply implementation specific. Panels and bridges can very well be part of and created from an of_graph description without needing to be a (linux-)component - see patch 7. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] ubifs: Allow O_DIRECT
On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote: Now, some user-space fails when direct I/O is not supported. I think the whole argument rested on what it means when some user space fails; apparently that user space is just a test suite (which can/should be fixed). We can chose to fake direct I/O or fix user-space. The latter seems to be the preferred course of actions, and you are correctly pointing the man page. However, if 1. we are the only FS erroring out on O_DIRECT 2. other file-systems not supporting direct IO just fake it we may just follow the crowd and fake it too. I am kind of trusting Richard here - I assume he did the research and the above is the case, this is why I am fine with his patch. Does this logic seem acceptable to you? Other folk's opinion would be great to hear. Could work for me, though that doesn't seem ideal. Anyway, it now seems Christopher and Richard agree with me. Regards, Brian -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v9] dmaengine: Add Xilinx AXI Direct Memory Access Engine driver support
This is the driver for the AXI Direct Memory Access (AXI DMA) core, which is a soft Xilinx IP core that provides high-bandwidth direct memory access between memory and AXI4-Stream type target peripherals.

Signed-off-by: Kedareswara rao Appana <appa...@xilinx.com>
---
The device tree doc got applied in the slave-dmaengine.git.

Changes in v9:
- Used readl_poll_timeout instead of do-while loops in the driver, as suggested by Moritz Fischer.
- Initialized the residue variable to get rid of a compilation warning.
Changes in v8:
- Updated the SG handling as suggested by Nicolae Rosia.
- Removed the unnecessary xilinx_dma_channel_set_config API; the properties in this API are not used by the driver.
Changes in v7:
- Updated the license in the driver as suggested by Paul.
- Corrected the return value in the is_idle function.
Changes in v6:
- Fixed odd indentation in the Kconfig.
- Used GFP_NOWAIT instead of GFP_KERNEL during the desc allocation.
- Calculated residue in tx_status instead of complete_descriptor.
- Updated copyright to 2015.
- Modified spin_lock handling: moved the spin_lock to the appropriate functions (xilinx_dma_issue_pending takes it instead of xilinx_dma_start_transfer).
- device_control and declare slave caps updated as per newer APIs.
Changes in v5:
- Moved the xilinx_dma.h header file to include/linux/dma/xilinx_dma.h.
Changes in v4:
- Added a direction field to the DMA descriptor structure and removed it from the channel structure to avoid duplication.
- Check for the DMA idle condition before changing the configuration.
- Residue is calculated in complete_descriptor() and reported to the slave driver.
Changes in v3:
- Rebased on 3.16-rc7.
Changes in v2:
- Simplified the logic to set SOP and APP words in prep_slave_sg().
- Corrected function description comments to match the return type.
- Fixed some minor comments as suggested by Andy.
--- drivers/dma/Kconfig | 13 + drivers/dma/xilinx/Makefile |1 + drivers/dma/xilinx/xilinx_dma.c | 1178 +++ 3 files changed, 1192 insertions(+) create mode 100644 drivers/dma/xilinx/xilinx_dma.c diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index 88d474b..5e95f07 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -507,4 +507,17 @@ config QCOM_BAM_DMA Enable support for the QCOM BAM DMA controller. This controller provides DMA capabilities for a variety of on-chip devices. +config XILINX_DMA +tristate Xilinx AXI DMA Engine +depends on (ARCH_ZYNQ || MICROBLAZE) +select DMA_ENGINE +help + Enable support for Xilinx AXI DMA Soft IP. + + This engine provides high-bandwidth direct memory access + between memory and AXI4-Stream type target peripherals. + It has two stream interfaces/channels, Memory Mapped to + Stream (MM2S) and Stream to Memory Mapped (S2MM) for the + data transfers. + endif diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile index 3c4e9f2..6224a49 100644 --- a/drivers/dma/xilinx/Makefile +++ b/drivers/dma/xilinx/Makefile @@ -1 +1,2 @@ obj-$(CONFIG_XILINX_VDMA) += xilinx_vdma.o +obj-$(CONFIG_XILINX_DMA) += xilinx_dma.o diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c new file mode 100644 index 000..d19009e --- /dev/null +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -0,0 +1,1178 @@ +/* + * DMA driver for Xilinx DMA Engine + * + * Copyright (C) 2010 - 2015 Xilinx, Inc. All rights reserved. + * + * Based on the Freescale DMA driver. + * + * Description: + * The AXI DMA, is a soft IP, which provides high-bandwidth Direct Memory + * Access between memory and AXI4-Stream-type target peripherals. It can be + * configured to have one channel or two channels and if configured as two + * channels, one is to transmit data from memory to a device and another is + * to receive from a device. 
+ * + * This is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include linux/bitops.h +#include linux/dma/xilinx_dma.h +#include linux/init.h +#include linux/interrupt.h +#include linux/io.h +#include linux/iopoll.h +#include linux/module.h +#include linux/of_address.h +#include linux/of_dma.h +#include linux/of_irq.h +#include linux/of_platform.h +#include linux/slab.h + +#include ../dmaengine.h + +/* Register Offsets */ +#define XILINX_DMA_REG_CONTROL 0x00 +#define XILINX_DMA_REG_STATUS 0x04 +#define XILINX_DMA_REG_CURDESC 0x08 +#define XILINX_DMA_REG_TAILDESC0x10 +#define XILINX_DMA_REG_SRCADDR 0x18 +#define XILINX_DMA_REG_DSTADDR 0x20 +#define XILINX_DMA_REG_BTT 0x28 + +/* Channel/Descriptor Offsets */ +#define XILINX_DMA_MM2S_CTRL_OFFSET0x00 +#define
Re: [PATCH v2 5/5] arm64: add KASan support
On 24/08/15 15:15, Andrey Ryabinin wrote: 2015-08-24 16:45 GMT+03:00 Linus Walleij linus.wall...@linaro.org: On Mon, Aug 24, 2015 at 3:15 PM, Russell King - ARM Linux li...@arm.linux.org.uk wrote: On Tue, Jul 21, 2015 at 11:27:56PM +0200, Linus Walleij wrote: On Tue, Jul 21, 2015 at 4:27 PM, Andrey Ryabinin a.ryabi...@samsung.com wrote: I used vexpress. Anyway, it doesn't matter now, since I have an update with a lot of stuff fixed, and it works on hardware. I still need to do some work on it and tomorrow, probably, I will share. Ah awesome. I have a stash of ARM boards so I can test it on a range of hardware once you feel it's ready. Sorry for pulling stuff out of your hands, people are excited about KASan ARM32 as it turns out. People may be excited about it because it's a new feature, but we really need to consider whether gobbling up 512MB of userspace for it is a good idea or not. There are programs around which like to map large amounts of memory into their process space, and the more we steal from them, the more likely these programs are to fail. I looked at some different approaches over the last weeks for this when playing around with KASan. It seems since KASan was developed on 64bit systems, this was not much of an issue for them as they could take their shadow memory from the vmalloc space. I think it is possible to actually just steal as much memory as is needed to cover the kernel, and not 1/8 of the entire addressable 32bit space. So instead of covering all from 0x0-0x at least just MODULES_VADDR thru 0x should be enough. So if that is 0xbf00-0x in most cases, 0x4100 bytes, then 1/8 of that, 0x820, 130MB should be enough. (Andrey need to say if this is possible.) Yes, ~130Mb (3G/1G split) should work. 512Mb shadow is optional. The only advantage of 512Mb shadow is better handling of user memory accesses bugs (access to user memory without copy_from_user/copy_to_user/strlen_user etc API). 
In case of 512Mb shadow we could to not map anything in shadow for user addresses, so such bug will guarantee to crash the kernel. In case of 130Mb, the behavior will depend on memory layout of the current process. So, I think it's fine to keep shadow only for kernel addresses. Another option would be having sparse shadow memory based on page extension. I did play with that some time ago based on ideas from original v1 KASan support for x86/arm - it is how 614be38 irqchip: gic-v3: Fix out of bounds access to cpu_logical_map was caught. It doesn't require any VA reservations, only some contiguous memory for the page_ext itself, which serves as indirection level for the 0-order shadow pages. In theory such design can be reused by others 32-bit arches and, I think, nommu too. Additionally, the shadow pages might be movable with help of driver-page migration patch series [1]. The cost is obvious - performance drop, although I didn't bother measuring it. [1] https://lwn.net/Articles/650917/ Cheers Vladimir That will probably miss some usecases I'm not familiar with, where the kernel is actually executing something below 0xbf00... I looked at taking memory from vmalloc instead, but ran into problems since this is subject to the highmem split and KASan need to have it's address offset at compile time. On Ux500 I managed to remove all the static maps and steal memory from the top of the vmalloc area instead of the beginning, but that is probably not generally feasible. I suspect you have better ideas than what I can come up with though. Yours, Linus Walleij -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
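The shadow-size figures discussed in this thread can be checked with a few lines of arithmetic. The sketch below assumes the usual KASan scale of one shadow byte per 8 bytes of address space, and it assumes the truncated addresses in the message stand for 0xbf000000 (MODULES_VADDR on a 3G/1G split) and 0xffffffff; the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3	/* one shadow byte covers 8 bytes */

/* Bytes of shadow memory needed to cover the inclusive range [start, end]. */
static uint64_t shadow_bytes(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return (end - start + 1) >> KASAN_SHADOW_SCALE_SHIFT;
}
```

Under those assumptions, covering only 0xbf000000 through 0xffffffff needs 0x8200000 bytes (130 MiB) of shadow, while covering the whole 32-bit space needs 512 MiB, matching the two numbers quoted above.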
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote: On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote: On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote: * Paul E. McKenney paul...@linux.vnet.ibm.com wrote: here it's fully set - triggering the bug I'm worried about. So what am I missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing? The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is called from tick_init() which is called from start_kernel() shortly after rcu_init(): cpu = smp_processor_id(); if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) { pr_warning(NO_HZ: Clearing %d from nohz_full range for timekeeping\n, cpu); cpumask_clear_cpu(cpu, tick_nohz_full_mask); } This happens after the call to tick_nohz_init_all() that does the cpumask_setall() that you called out above. Ah, indeed - I somehow missed that. This brings up two other questions: 1) the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to it (housekeeping_mask - a variable mask to begin with) and recover the housekeeping CPU via: + return cpumask_any_and(housekeeping_mask, cpu_online_mask); which can be pretty expensive, and which gets executed in two hotpaths: kernel/time/hrtimer.c: return per_cpu(hrtimer_bases, get_nohz_timer_target()); kernel/time/timer.c:return per_cpu_ptr(tvec_bases, get_nohz_timer_target()); ... why not just use a single housekeeping_cpu which would be way faster to pass down to the timer code? The housekeeping_cpu came later, but that does seem like a good optimization. Well nohz full is likely to be used for HPC and that can involve big machines. Having the housekeeping duty spread per node is a likely future evolution there, if it isn't already used that way. So we need to keep it a cpumask. Fair point! Thanx, Paul 2) What happens if the boot CPU is offlined? 
(under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y) I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU is offlined. The tick_nohz_cpu_down_callback() function does this, though in a less than obvious way. The tick_do_timer_cpu variable is the housekeeping CPU that is currently handling timing, and it is not permitted to go offline. Indeed, more specifically tick-common.c makes sure to set the timekeeping duty to a housekeeper and that housekeeper is always the boot CPU due to early device initialization. But I should find a way to simplify that code and make it obvious it's always set to the boot CPU. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
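To make the mask-versus-single-CPU trade-off in this thread concrete, here is a small user-space model of what a lookup like cpumask_any_and(housekeeping_mask, cpu_online_mask) has to do. It is only a sketch: CPU masks are reduced to one 64-bit word, and the function name and NR_CPUS_MODEL are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 64	/* hypothetical: one 64-bit word models a cpumask */

/*
 * Return the lowest CPU number set in both masks, or NR_CPUS_MODEL if the
 * intersection is empty. A cached single housekeeping CPU would avoid this
 * scan, but could not express per-node housekeepers.
 */
static int first_cpu_in_both(uint64_t housekeeping, uint64_t online)
{
	uint64_t both = housekeeping & online;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NR_CPUS_MODEL;
}
```

With a single housekeeper (the boot CPU) the scan ends at once; with one housekeeper per node, offlining the first one simply makes the lookup fall through to the next, which is the flexibility argued for above.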
[Internal PATCH] ipmi: add of_device_id in MODULE_DEVICE_TABLE
Fix autoloading ipmi modules when using device tree.

Signed-off-by: Brijesh Singh <brijeshkumar.si...@amd.com>
---
 drivers/char/ipmi/ipmi_si_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 8a45e92..cddc7b0 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2785,6 +2785,7 @@ static struct platform_driver ipmi_driver = {
 	.probe = ipmi_probe,
 	.remove = ipmi_remove,
 };
+MODULE_DEVICE_TABLE(of, ipmi_match);

 #ifdef CONFIG_PARISC
 static int ipmi_parisc_probe(struct parisc_device *dev)
--
1.9.1
Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
On Mon, Aug 24, 2015 at 6:55 PM, Eric B Munson emun...@akamai.com wrote: On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote: On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote: On Mon, 24 Aug 2015, Vlastimil Babka wrote: On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote: On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote: On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote: I am in the middle of implementing lock on fault this way, but I cannot see how we will hanlde mremap of a lock on fault region. Say we have the following: addr = mmap(len, MAP_ANONYMOUS, ...); mlock(addr, len, MLOCK_ONFAULT); ... mremap(addr, len, 2 * len, ...) There is no way for mremap to know that the area being remapped was lock on fault so it will be locked and prefaulted by remap. How can we avoid this without tracking per vma if it was locked with lock or lock on fault? remap can count filled ptes and prefault only completely populated areas. Does (and should) mremap really prefault non-present pages? Shouldn't it just prepare the page tables and that's it? As I see mremap prefaults pages when it extends mlocked area. Also quote from manpage : If the memory segment specified by old_address and old_size is locked : (using mlock(2) or similar), then this lock is maintained when the segment is : resized and/or relocated. As a consequence, the amount of memory locked : by the process may change. Oh, right... Well that looks like a convincing argument for having a sticky VM_LOCKONFAULT after all. Having mremap guess by scanning existing pte's would slow it down, and be unreliable (was the area completely populated because MLOCK_ONFAULT was not used or because the process aulted it already? Was it not populated because MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to populate it all?). Given this, I am going to stop working in v8 and leave the vma flag in place. 
The only sane alternative is to populate always for mremap() of VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be enough for Eric's usecase, but it's somewhat ugly. I don't think that this is the right solution, I would be really surprised as a user if an area I locked with MLOCK_ONFAULT was then fully locked and prepopulated after mremap(). If mremap is the only problem then we can add opposite flag for it: MREMAP_NOPOPULATE - do not populate new segment of locked areas - do not copy normal areas if possible (anonymous/special must be copied) addr = mmap(len, MAP_ANONYMOUS, ...); mlock(addr, len, MLOCK_ONFAULT); ... addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE); ... But with this, the user must remember what areas are locked with MLOCK_LOCKONFAULT and which are locked the with prepopulate so the correct mremap flags can be used. Yep. Shouldn't be hard. You anyway have to do some changes in user-space. Much simpler for users-pace solution is a mm-wide flag which turns all further mlocks and MAP_LOCKED into lock-on-fault. Something like mlockall(MCL_NOPOPULATE_LOCKED). -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] pinctrl: mediatek: Implement wake handler and suspend resume
On 14/08/15 09:38, maoguang.m...@mediatek.com wrote:
> From: Maoguang Meng <maoguang.m...@mediatek.com>
>
> This patch implement irq_set_wake to get who is wakeup source and
> setup on suspend resume.
>
> Signed-off-by: Maoguang Meng <maoguang.m...@mediatek.com>
>
> ---
> changes since v3:
> -add a comment in mtk_eint_chip_read_mask.
> -delete ALIGN when allocate eint_offsets.ports.
> -fix unrelated change.
>
> changes since v2:
> -modify irq_wake to handle irq wakeup source.
> -allocate two buffers separately.
> -fix some codestyle.
>
> Changes since v1:
> -implement irq_wake handler.
> ---
>  drivers/pinctrl/mediatek/pinctrl-mt8173.c     |  1 +
>  drivers/pinctrl/mediatek/pinctrl-mtk-common.c | 91 ++-
>  drivers/pinctrl/mediatek/pinctrl-mtk-common.h |  4 ++
>  3 files changed, 95 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8173.c b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> index d0c811d..ad27184 100644
> --- a/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> +++ b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> @@ -385,6 +385,7 @@ static struct platform_driver mtk_pinctrl_driver = {
>  	.driver = {
>  		.name = "mediatek-mt8173-pinctrl",
>  		.of_match_table = mt8173_pctrl_match,
> +		.pm = &mtk_eint_pm_ops,
>  	},
>  };
>
> diff --git a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
> index ad1ea16..fe34ce9 100644
> --- a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
> +++ b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
> @@ -33,6 +33,7 @@
>  #include <linux/mfd/syscon.h>
>  #include <linux/delay.h>
>  #include <linux/interrupt.h>
> +#include <linux/pm.h>
>  #include <dt-bindings/pinctrl/mt65xx.h>
>
>  #include "../core.h"
> @@ -1062,6 +1063,77 @@ static int mtk_eint_set_type(struct irq_data *d,
>  	return 0;
>  }
>
> +static int mtk_eint_irq_set_wake(struct irq_data *d, unsigned int on)
> +{
> +	struct mtk_pinctrl *pctl = irq_data_get_irq_chip_data(d);
> +	int shift = d->hwirq & 0x1f;
> +	int reg = d->hwirq >> 5;
> +
> +	if (on)
> +		pctl->wake_mask[reg] |= BIT(shift);
> +	else
> +		pctl->wake_mask[reg] &= ~BIT(shift);
> +
> +	return 0;
> +}

Does this pinmux controller:

1. Support wake-up configuration? If not, you need to use
   IRQCHIP_SKIP_SET_WAKE. I don't see any value in writing the
   mask_{set,clear} if the same registers are used for {en,dis}able.

2. Is in always on domain? If not, you need save/restore only to
   resume back the functionality. Generally we can set
   IRQCHIP_MASK_ON_SUSPEND to ensure non-wake-up interrupts are
   disabled during suspend and re-enabled in resume path. You just
   save/restore raw values without tracking the wake-up source.

Also I see that no care is taken to set the port irq as wake enable source. It may work with current mainline, but won't with -next. Please ensure the port irq to the parent interrupt controller remains enabled (i.e. set as wake).

Regards,
Sudeep
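The bookkeeping in the quoted mtk_eint_irq_set_wake() is the standard "one bit per interrupt, 32 interrupts per word" layout. A user-space sketch of the same indexing follows; the array size EINT_PORTS and the function name are hypothetical, and a plain array stands in for pctl->wake_mask.

```c
#include <assert.h>
#include <stdint.h>

#define EINT_PORTS 8	/* hypothetical number of 32-bit wake mask words */

/* Set or clear the wake bit for hwirq: word index is hwirq / 32, bit is hwirq % 32. */
static void eint_set_wake(uint32_t *wake_mask, unsigned int hwirq, int on)
{
	unsigned int shift = hwirq & 0x1f;	/* low 5 bits: bit position in the word */
	unsigned int reg = hwirq >> 5;		/* remaining bits: which word */

	assert(reg < EINT_PORTS);

	if (on)
		wake_mask[reg] |= UINT32_C(1) << shift;
	else
		wake_mask[reg] &= ~(UINT32_C(1) << shift);
}
```

For example, hwirq 37 lands in word 1 at bit 5. Note the clear path needs the compound `&=`; a plain assignment of the inverted bit would wipe every other wake source in that word.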
Re: [PATCH] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144
Hi Marc, thanks for the suggestions. On Mon, Aug 24, 2015 at 7:17 PM, Marc Zyngier marc.zyng...@arm.com wrote: On 24/08/15 14:27, Ganapatrao Kulkarni wrote: On Mon, Aug 24, 2015 at 6:15 PM, Marc Zyngier marc.zyng...@arm.com wrote: static void its_enable_cavium_thunderx(void *data) { - struct its_node *its = data; + struct its_node __maybe_unused *its = data; - its-flags |= ITS_FLAGS_CAVIUM_THUNDERX; +#ifdef CONFIG_CAVIUM_ERRATUM_22375 + its-flags |= ITS_WORKAROUND_CAVIUM_22375; + pr_info(ITS: Enabling workaround for 22375, 24313\n); +#endif + +#ifdef CONFIG_CAVIUM_ERRATUM_23144 + if (num_possible_nodes() 1) { + its-numa_node = its_get_node_thunderx(its); I'd rather see numa_node being always initialized to something useful. If you're adding numa support, why can't this be initialized via standard topology bindings? IIUC, topology defines only cpu topology. Well, welcome to a much more complex system where both your CPUs and your IOs have some degree of affinity. This needs to be described properly, and not hacked on the side. ok, will add description for the function. I sense that you misunderstood what I meant. What I'd like to see is some topology information coming from DT, showing the relationship between a device (your ITS) and a given node (your socket). This can then be used from two purposes: sure will post next version with changes as per you comments. - find the optimal affinity for a MSI so that it doesn't default to a foreign node (a reasonable performance expectation), this can be done by adding dt associativity property to its node. i can send in next version of patch. - work around implementation bugs where an LPI cannot be routed to a redistributor that is on a foreign node. I really don't feel like adding a hack just for the second point, and I'd rather get the big picture right so that your workaround is just a special case of the generic one. Thanks, M. -- Jazz is not dead. It just smells funny... 
thanks Ganapat
Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi
On 08/24/2015 04:07 PM, Thomas Gleixner wrote: On Mon, 24 Aug 2015, Qais Yousef wrote: On 08/24/2015 02:32 PM, Marc Zyngier wrote: I'd rather see something more architected than this blind export, or at least some level of filtering (the idea random drivers can access such a low-level function doesn't make me feel very good). I don't know how to architect this better or how to perform the filtering, but I'm happy to hear suggestions and try them out. Keep in mind that detecting GIC and writing your own gic_send_ipi() is very simple. I have done this when the driver was out of tree. So restricting it by not exporting it will not prevent someone from really accessing the functionality, it's just they have to do it their own way. Keep in mind that we are not talking about out of tree hackery. We talk about a kernel code submission and I doubt, that you will get away with a GIC detection/fiddling burried in your driver code. Keep in mind that just slapping an export to some random function is not much better than doing a GIC hack in the driver. Marcs concerns about blindly exposing IPI functionality to drivers is well justified and that kind of coprocessor stuff is not unique to your particular SoC. We're going to see such things more frequently in the not so distant future, so we better think now about proper solutions to that problem. Sure I'm not trying to argue against that. There are a couple of issues to solve: 1) How is the IPI which is received by the coprocessor reserved in the system? 2) How is it associated to a particular driver? Shouldn't 'interrupts' property in DT take care of these 2 questions? Maybe we can give it an alias name to make it more readable that this interrupt is requested for external IPI. 3) How do we ensure that a driver cannot issue random IPIs and can only send the associated ones? If we get the irq number from DT then I'm not sure how feasible it is to implement a generic_send_ipi() function that takes this number to generate an IPI. 
Do you think this approach would work? None of these issues are handled by your export. So we need a core infrastructure which allows us to do that. The requirements are pretty clear from the above and Marc might have some further restrictions in mind. Another issue I'm having which is related is that I need to communicate these GIC irq numbers to AXD core when it starts up. So the logic is that these IPIs are not hardwired and it's up to the system designer to allocate 2 free GIC irqs to be used for that purpose. At the moment I have my own DT property to take these numbers. Hopefully this link would explain the issue. See the question about gic-irq property. https://lkml.org/lkml/2015/8/24/459 From what I know there's no generic way for the driver to get the hw irq number from linux irq number unless I missed something. Is it possible to add something to support this? Or maybe there's something but I failed to find? Thanks, Qais Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method
On Monday 24 August 2015 09:21 PM, Lee Jones wrote: On Mon, 24 Aug 2015, Vaibhav Hiremath wrote: On Monday 24 August 2015 07:24 PM, Lee Jones wrote: On Wed, 08 Jul 2015, Vaibhav Hiremath wrote: As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe (page 0) controls the method of clearing interrupt status of 88pm800 family of devices; 0: clear on read 1: clear on write If pdata is not coming from board file, then set the default irq clear method to irq clear on write Also, as suggested by Lee Jones renaming variable field to appropriate name and removed unnecessary field pm80x_chip.irq_mode, using platform_data.irq_clr_method. Signed-off-by: Zhao Ye zh...@marvell.com Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com --- drivers/mfd/88pm800.c | 15 ++- include/linux/mfd/88pm80x.h | 9 +++-- 2 files changed, 17 insertions(+), 7 deletions(-) [...] +#define PM800_WAKEUP2_INT_READ_CLEAR (0 1) +#define PM800_WAKEUP2_INT_WRITE_CLEAR (1 1) Use BIT(). +/* Used by irq_clr_method */ +#define PM800_IRQ_CLR_ON_READ 0 +#define PM800_IRQ_CLR_ON_WRITE 1 - int irq_mode; /* Clear interrupt by read/write(0/1) */ + bool irq_clr_method;/* Clear interrupt by read/write(0/1) */ + irq_clr_mode = pdata-irq_clr_method == PM800_IRQ_CLR_ON_WRITE ? + PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR; + ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode); This is pretty convoluted. For starters you're abusing the 'bool' type here. Bool is either 'true' or 'false', so at the very least you should rename 'irq_clr_method' to 'irq_clr_on_write'. Then you can do: irq_clr_mode = pdata-irq_clr_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR; We have discussed on this, and went back-n-forth. I think if I remember correctly, one of the version was using true/false then we decided to rename it to relevant macro. 
If I am not wrong, the V4 version of this series is exactly the same as what you are referring to. Right. I made a few suggestions which vary in usefulness depending on how you plan to implement all of this. Unfortunately this is a bit of a bastardised version where some of it makes sense and other parts could do with some improvement. This so-called bastardised version could have been avoided :) The V2 version itself was clean and ready. It just got dragged into multiple iterations. However, what I suggest you really do is share PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass the value through directly. I think we discussed this also, and the reason, as I recall, is that we may need to control this from DT in the future, so we decided to keep it boolean in platform_data and have a simple check before writing to the register. And I think that was also another reason we introduced /* Used by irq_clr_method */ #define PM800_IRQ_CLR_ON_READ 0 #define PM800_IRQ_CLR_ON_WRITE 1 I think these are still required. So it would look like this: NO. I think you are confused here. We have two different sets of macros in play: +/* Used by irq_clr_method */ +#define PM800_IRQ_CLR_ON_READ 0 +#define PM800_IRQ_CLR_ON_WRITE 1 /* Used to write to register */ +#define PM800_WAKEUP2_INT_READ_CLEAR (0 << 1) +#define PM800_WAKEUP2_INT_WRITE_CLEAR (1 << 1) == Platform data == struct pdata { bool clear_irq_on_write; }; pdata->clear_irq_on_write = PM800_IRQ_CLR_ON_{READ,WRITE}; == Driver == irq_clr_mode = pdata->clear_irq_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR; regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode); Please check V2, which is exactly the same as above. https://patchwork.kernel.org/patch/6627781/ If you are OK with it, I will spin another version and submit it.
Thanks, Vaibhav
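To make the two layers discussed in this thread concrete (a boolean in platform data, and a register value derived from it), here is a minimal standalone sketch. It is not the driver code: `BIT()` and `update_bits()` are local stand-ins written for this example (the latter only mimics a read-modify-write in the spirit of regmap_update_bits() on a plain variable), and `PM800_WAKEUP2_INT_CLEAR_MASK` and `struct pdata_sketch` are hypothetical names. The register value macros follow the patch, rewritten with BIT() as the review asks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1U << (n))

/* Bit 1 of the register selects the clear method (0: read, 1: write). */
#define PM800_WAKEUP2_INT_CLEAR_MASK  BIT(1)  /* hypothetical mask name */
#define PM800_WAKEUP2_INT_READ_CLEAR  0U
#define PM800_WAKEUP2_INT_WRITE_CLEAR BIT(1)

/* Hypothetical platform data carrying a self-describing boolean. */
struct pdata_sketch {
	bool irq_clr_on_write;
};

/* Stand-in for a regmap-style update: change only the masked bits. */
static void update_bits(uint8_t *reg, uint8_t mask, uint8_t val)
{
	assert((val & ~mask) == 0);  /* the value must fit inside the mask */
	*reg = (uint8_t)((*reg & ~mask) | (val & mask));
}

/* Translate the boolean into the register encoding and apply it. */
static void set_irq_clear_mode(uint8_t *wakeup2, const struct pdata_sketch *pdata)
{
	uint8_t mode = pdata->irq_clr_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR
					       : PM800_WAKEUP2_INT_READ_CLEAR;

	update_bits(wakeup2, PM800_WAKEUP2_INT_CLEAR_MASK, mode);
}
```

Keeping the boolean in platform data and doing the translation in one place means the register encoding never leaks into board files, which appears to be the motivation given above for not passing the register value through directly.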
RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
On Mon, Aug 24, 2015 at 02:22:08PM +, Liang, Kan wrote: On Fri, Aug 21, 2015 at 08:25:24PM +, Liang, Kan wrote: SNIP we need global topology information in perf.data and use the mapping from there, we can't use current server info we currently store core_siblings_list and thread_siblings_list, in topology FEATURE, which is probably not enough core_siblings_list includes the cpu list in the same socket. thread_siblings_list includes the cpu list in the same core. numa_nodes includes the cpu list for each node. It looks we have enough data from topology FEATURE. hum, haven't hecked deeply.. how will you get core id for cpu? from thread_siblings_list. I just noticed that svg_build_topology_map did the similar thing to get topology map for timechart from perf header. could you please provide both functions then cpu - core, cpu - socket Do you mean something like this? Store cpu-socket and cpu-core in perf_session_env. diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index 179b2bd..a01c603 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -1590,10 +1596,17 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused u32 nr, i; char *str; struct strbuf sb; + int cpu_nr = ph-env.nr_cpus_online; + struct cpu_map *map; + int j; + + ph-env.cpu = calloc(cpu_nr, sizeof(*ph-env.cpu)); + if (!ph-env.cpu) + return -1; ret = readn(fd, nr, sizeof(nr)); if (ret != sizeof(nr)) - return -1; + goto free_cpu; if (ph-needs_swap) nr = bswap_32(nr); @@ -1608,6 +1621,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused /* include a NULL character at the end */ strbuf_add(sb, str, strlen(str) + 1); + + map = cpu_map__new(str); + if (!map) + goto error; + for (j = 0; j map-nr; j++) { +ph-env.cpu[map-map[j]].socket_id = i; + } + cpu_map__put(map); free(str); } ph-env.sibling_cores = strbuf_detach(sb, NULL); @@ -1628,6 +1649,14 @@ static int process_cpu_topology(struct perf_file_section *section 
__maybe_unused /* include a NULL character at the end */ strbuf_add(sb, str, strlen(str) + 1); + + map = cpu_map__new(str); + if (!map) + goto error; + for (j = 0; j map-nr; j++) { + ph-env.cpu[map-map[j]].core_id = i; + } + cpu_map__put(map); free(str); } ph-env.sibling_threads = strbuf_detach(sb, NULL); @@ -1635,6 +1664,8 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused error: strbuf_release(sb); +free_cpu: + free(ph-env.cpu); return -1; } diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h index 9b53b65..8b8c4fc 100644 --- a/tools/perf/util/header.h +++ b/tools/perf/util/header.h @@ -66,6 +66,11 @@ struct perf_header; int perf_file_header__read(struct perf_file_header *header, struct perf_header *ph, int fd); +struct cpu_topology_map { + int socket_id; + int core_id; +}; + struct perf_session_env { char*hostname; char*os_release; @@ -89,6 +94,7 @@ struct perf_session_env { char*sibling_threads; char*numa_nodes; char*pmu_mappings; + struct cpu_topology_map *cpu; }; struct perf_header { diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 18722e7..51b4d5a 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -185,6 +185,7 @@ static void perf_session_env__exit(struct perf_session_env *env) zfree(env-sibling_threads); zfree(env-numa_nodes); zfree(env-pmu_mappings); + zfree(env-cpu); } void perf_session__delete(struct perf_session *session) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
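The idea in the patch above (walk each sibling list and record, for every CPU in it, the index of that list as its socket or core id) can be tried outside perf with a small standalone sketch. This is an illustration under stated assumptions, not perf code: `assign_ids()` is a hypothetical helper that replaces cpu_map__new() with a hand-written parser for lists such as "0-3,8-11", and it assumes the list carries no trailing newline.

```c
#include <assert.h>
#include <stdlib.h>

struct cpu_topology_map {
	int socket_id;
	int core_id;
};

/*
 * For every CPU named in a list such as "0-3,8-11", store `id` in
 * socket_id (is_socket != 0) or core_id (is_socket == 0).
 * Returns 0 on success, -1 on a malformed list or an out-of-range CPU.
 */
static int assign_ids(struct cpu_topology_map *map, int nr_cpus,
		      const char *list, int id, int is_socket)
{
	const char *p = list;

	assert(map != NULL && list != NULL);
	while (*p) {
		char *end;
		long first = strtol(p, &end, 10);
		long last = first;

		if (end == p)
			return -1;      /* no digits where a number was expected */
		p = end;
		if (*p == '-') {        /* a range "first-last" */
			last = strtol(p + 1, &end, 10);
			if (end == p + 1)
				return -1;
			p = end;
		}
		/* Validate the whole range before touching the map. */
		if (first < 0 || last < first || last >= nr_cpus)
			return -1;
		for (long cpu = first; cpu <= last; cpu++) {
			if (is_socket)
				map[cpu].socket_id = id;
			else
				map[cpu].core_id = id;
		}
		if (*p == ',')
			p++;
		else if (*p != '\0')
			return -1;      /* unexpected separator */
	}
	return 0;
}
```

A caller would invoke it once per sibling list, passing the list's position as `id`, first for the core sibling lists (sockets) and then for the thread sibling lists (cores).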
Re: [PATCH v6 3/3] qe_common: add qe_muram_ functions to manage muram
On Mon, 2015-08-24 at 17:31 +0800, Zhao Qiang wrote: muram is used for qe, add qe_muram_ functions to manage muram. Signed-off-by: Zhao Qiang qiang.z...@freescale.com --- Changes for v2: - no changes Changes for v3: - no changes Changes for v4: - no changes Changes for v5: - no changes Changes for v5: - using genalloc instead rheap to manage QE MURAM - remove qe_reset from platform file, using - subsys_initcall to call qe_init function. This patch should come before the one that moves the code. diff --git a/drivers/soc/fsl/qe/qe_common.c b/drivers/soc/fsl/qe/qe_common.c new file mode 100644 index 000..7f1762c --- /dev/null +++ b/drivers/soc/fsl/qe/qe_common.c @@ -0,0 +1,193 @@ +/* + * common qe code + * + * author: scott wood scottw...@freescale.com + * + * copyright 2007-2008,2010 freescale Semiconductor, Inc. + * + * some parts derived from commproc.c/qe2_common.c, which is: + * copyright (c) 1997 dan error_act (dma...@jlc.net) + * copyright (c) 1999-2001 dan Malek d...@embeddedalley.com + * copyright (c) 2000 montavista Software, Inc (sou...@mvista.com) + * 2006 (c) montavista software, Inc. + * vitaly bordug vbor...@ru.mvista.com Why did you lowercase everyone's names? Why is this copying code rather than moving it? diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h index 55da07e..aaf3dc2 100644 --- a/include/linux/genalloc.h +++ b/include/linux/genalloc.h @@ -30,6 +30,7 @@ #ifndef __GENALLOC_H__ #define __GENALLOC_H__ +#include linux/types.h #include linux/spinlock_types.h struct device; This does not belong in this patch. 
@@ -187,12 +190,41 @@ static inline int qe_alive_during_sleep(void) } /* we actually use cpm_muram implementation, define this for convenience */ -#define qe_muram_init cpm_muram_init -#define qe_muram_alloc cpm_muram_alloc -#define qe_muram_alloc_fixed cpm_muram_alloc_fixed -#define qe_muram_free cpm_muram_free -#define qe_muram_addr cpm_muram_addr -#define qe_muram_offset cpm_muram_offset +int qe_muram_init(void); + +#if defined(CONFIG_QUICC_ENGINE) +unsigned long qe_muram_alloc(unsigned long size, unsigned long align); +int qe_muram_free(unsigned long offset); +void __iomem *qe_muram_addr(unsigned long offset); +unsigned long qe_muram_offset(void __iomem *addr); +dma_addr_t qe_muram_dma(void __iomem *addr); +#else +static inline unsigned long qe_muram_alloc(unsigned long size, + unsigned long align) +{ + return -ENOSYS; +} What code calls these functions without CONFIG_QUICC_ENGINE? Are you converting qe without cpm? Why? -Scott -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] f2fs: fix to release inode correctly
On Mon, Aug 24, 2015 at 05:40:45PM +0800, Chao Yu wrote: In following call stack, if unfortunately we lose all chances to truncate inode page in remove_inode_page, eventually we will add the nid allocated previously into free nid cache, this nid is with NID_NEW status and with NEW_ADDR in its blkaddr pointer: - f2fs_create - f2fs_add_link - __f2fs_add_link - init_inode_metadata - new_inode_page - new_node_page - set_node_addr(, NEW_ADDR) - f2fs_init_acl failed - remove_inode_page failed - handle_failed_inode - remove_inode_page failed - iput - f2fs_evict_inode - remove_inode_page failed - alloc_nid_failed cache a nid with valid blkaddr: NEW_ADDR This may not only cause resource leak of previous inode, but also may cause incorrect use of the previous blkaddr which is located in NO.nid node entry when this nid is reused by others. This patch tries to add this inode to orphan list if we fail to truncate inode, so that we can obtain a second chance to release it in orphan recovery flow. Signed-off-by: Chao Yu chao2...@samsung.com --- fs/f2fs/f2fs.h | 2 +- fs/f2fs/inode.c | 53 ++--- fs/f2fs/node.c | 14 +- 3 files changed, 56 insertions(+), 13 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 806439f..69827ee 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1687,7 +1687,7 @@ int get_dnode_of_data(struct dnode_of_data *, pgoff_t, int); int truncate_inode_blocks(struct inode *, pgoff_t); int truncate_xattr_node(struct inode *, struct page *); int wait_on_node_pages_writeback(struct f2fs_sb_info *, nid_t); -void remove_inode_page(struct inode *); +int remove_inode_page(struct inode *); struct page *new_inode_page(struct inode *); struct page *new_node_page(struct dnode_of_data *, unsigned int, struct page *); void ra_node_page(struct f2fs_sb_info *, nid_t); diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index d1b03d0..35aae65 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -317,6 +317,7 @@ void f2fs_evict_inode(struct inode *inode) struct 
f2fs_sb_info *sbi = F2FS_I_SB(inode); struct f2fs_inode_info *fi = F2FS_I(inode); nid_t xnid = fi-i_xattr_nid; + int err = 0; /* some remained atomic pages should discarded */ if (f2fs_is_atomic_file(inode)) @@ -342,11 +343,13 @@ void f2fs_evict_inode(struct inode *inode) i_size_write(inode, 0); if (F2FS_HAS_BLOCKS(inode)) - f2fs_truncate(inode, true); + err = f2fs_truncate(inode, true); - f2fs_lock_op(sbi); - remove_inode_page(inode); - f2fs_unlock_op(sbi); + if (!err) { + f2fs_lock_op(sbi); + err = remove_inode_page(inode); + f2fs_unlock_op(sbi); + } sb_end_intwrite(inode-i_sb); no_delete: @@ -362,9 +365,26 @@ no_delete: if (is_inode_flag_set(fi, FI_UPDATE_WRITE)) add_dirty_inode(sbi, inode-i_ino, UPDATE_INO); if (is_inode_flag_set(fi, FI_FREE_NID)) { - alloc_nid_failed(sbi, inode-i_ino); + if (err err != -ENOENT) + alloc_nid_done(sbi, inode-i_ino); + else + alloc_nid_failed(sbi, inode-i_ino); clear_inode_flag(fi, FI_FREE_NID); } + + if (err err != -ENOENT) { + if (!exist_written_data(sbi, inode-i_ino, ORPHAN_INO)) { + /* + * get here because we failed to release resource + * of inode previously, reminder our user to run fsck + * for fixing. + */ + set_sbi_flag(sbi, SBI_NEED_FSCK); + f2fs_msg(sbi-sb, KERN_WARNING, + inode (ino:%lu) resource leak, run fsck + to fix this issue!, inode-i_ino); + } + } out_clear: #ifdef CONFIG_F2FS_FS_ENCRYPTION if (fi-i_crypt_info) @@ -377,6 +397,7 @@ out_clear: void handle_failed_inode(struct inode *inode) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + int err = 0; clear_nlink(inode); make_bad_inode(inode); @@ -384,9 +405,27 @@ void handle_failed_inode(struct inode *inode) i_size_write(inode, 0); if (F2FS_HAS_BLOCKS(inode)) - f2fs_truncate(inode, false); + err = f2fs_truncate(inode, false); + + if (!err) + err = remove_inode_page(inode); - remove_inode_page(inode); + /* + * if we skip truncate_node in remove_inode_page bacause we failed + * before, it's better to find another way to release resource of + * this inode (e.g. 
valid block count, node block or nid). Here we + * choose to add this inode to orphan list, so that we can call iput +
Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote: On Mon, Aug 24, 2015 at 6:55 PM, Eric B Munson emun...@akamai.com wrote: On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote: On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote: On Mon, 24 Aug 2015, Vlastimil Babka wrote: On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote: On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote: On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote: I am in the middle of implementing lock on fault this way, but I cannot see how we will handle mremap of a lock on fault region. Say we have the following: addr = mmap(len, MAP_ANONYMOUS, ...); mlock(addr, len, MLOCK_ONFAULT); ... mremap(addr, len, 2 * len, ...) There is no way for mremap to know that the area being remapped was lock on fault so it will be locked and prefaulted by remap. How can we avoid this without tracking per vma if it was locked with lock or lock on fault? remap can count filled ptes and prefault only completely populated areas. Does (and should) mremap really prefault non-present pages? Shouldn't it just prepare the page tables and that's it? As I see mremap prefaults pages when it extends mlocked area. Also quote from manpage : If the memory segment specified by old_address and old_size is locked : (using mlock(2) or similar), then this lock is maintained when the segment is : resized and/or relocated. As a consequence, the amount of memory locked : by the process may change. Oh, right... Well that looks like a convincing argument for having a sticky VM_LOCKONFAULT after all. Having mremap guess by scanning existing pte's would slow it down, and be unreliable (was the area completely populated because MLOCK_ONFAULT was not used or because the process faulted it already? Was it not populated because MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to populate it all?). Given this, I am going to stop working in v8 and leave the vma flag in place.
The only sane alternative is to populate always for mremap() of VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be enough for Eric's usecase, but it's somewhat ugly. I don't think that this is the right solution, I would be really surprised as a user if an area I locked with MLOCK_ONFAULT was then fully locked and prepopulated after mremap(). If mremap is the only problem then we can add opposite flag for it: MREMAP_NOPOPULATE - do not populate new segment of locked areas - do not copy normal areas if possible (anonymous/special must be copied) addr = mmap(len, MAP_ANONYMOUS, ...); mlock(addr, len, MLOCK_ONFAULT); ... addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE); ... But with this, the user must remember which areas are locked with MLOCK_ONFAULT and which are locked with prepopulate so the correct mremap flags can be used. Yep. Shouldn't be hard. You anyway have to do some changes in user-space. Sorry if I wasn't clear enough in my last reply, I think forcing userspace to track this is the wrong choice. The VM system is responsible for tracking these attributes and should continue to be. A much simpler solution for user-space is an mm-wide flag which turns all further mlocks and MAP_LOCKED into lock-on-fault. Something like mlockall(MCL_NOPOPULATE_LOCKED). This set certainly adds the foundation for such a change if you think it would be useful. That particular behavior was not part of my initial use case though.
[PATCH v7 5/8] Watchdog: introduce ARM SBSA watchdog driver
From: Fu Wei fu@linaro.org This driver bases on linux kernel watchdog framework, and use pretimeout in the framework. It supports getting timeout and pretimeout from parameter and FDT at the driver init stage. In first timeout, the interrupt routine run panic to save system context. Signed-off-by: Fu Wei fu@linaro.org --- drivers/watchdog/Kconfig | 14 ++ drivers/watchdog/Makefile| 1 + drivers/watchdog/sbsa_gwdt.c | 459 +++ 3 files changed, 474 insertions(+) diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 241fafd..b2734f0 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -173,6 +173,20 @@ config ARM_SP805_WATCHDOG ARM Primecell SP805 Watchdog timer. This will reboot your system when the timeout is reached. +config ARM_SBSA_WATCHDOG + tristate ARM SBSA Generic Watchdog + depends on ARM64 + depends on ARM_ARCH_TIMER + select WATCHDOG_CORE + help + ARM SBSA Generic Watchdog. This watchdog has two Watchdog timeouts. + The first timeout will trigger a panic; the second timeout will + trigger a system reset. + More details: ARM DEN0029B - Server Base System Architecture (SBSA) + + To compile this driver as module, choose M here: The module + will be called sbsa_gwdt. 
+ config AT91RM9200_WATCHDOG tristate AT91RM9200 watchdog depends on SOC_AT91RM9200 MFD_SYSCON diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile index 59ea9a1..be8e7c5 100644 --- a/drivers/watchdog/Makefile +++ b/drivers/watchdog/Makefile @@ -30,6 +30,7 @@ obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o # ARM Architecture obj-$(CONFIG_ARM_SP805_WATCHDOG) += sp805_wdt.o +obj-$(CONFIG_ARM_SBSA_WATCHDOG) += sbsa_gwdt.o obj-$(CONFIG_AT91RM9200_WATCHDOG) += at91rm9200_wdt.o obj-$(CONFIG_AT91SAM9X_WATCHDOG) += at91sam9_wdt.o obj-$(CONFIG_CADENCE_WATCHDOG) += cadence_wdt.o diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c new file mode 100644 index 000..7ae45cc --- /dev/null +++ b/drivers/watchdog/sbsa_gwdt.c @@ -0,0 +1,459 @@ +/* + * SBSA(Server Base System Architecture) Generic Watchdog driver + * + * Copyright (c) 2015, Linaro Ltd. + * Author: Fu Wei fu@linaro.org + * Suravee Suthikulpanit suravee.suthikulpa...@amd.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * The SBSA Generic watchdog driver is compatible with the pretimeout + * concept of Linux kernel. + * The timeout and pretimeout are determined by WCV or WOR. + * The first watch period is set by writing WCV directly, that can + * support more than 10s timeout at the maximum system counter + * frequency (400MHz). + * When WS0 is triggered, the second watch period (pretimeout) is + * determined by one of these registers: + * (1)WOR: 32bit register, this gives a maximum watch period of + * around 10s at the maximum system counter frequency. It's loaded + * automatically by hardware. 
+ * (2)WCV: If the pretimeout value is greater then max_wor_timeout, + * it will be loaded in WS0 interrupt routine. If system is in + * ws0_mode (reboot by kexec/kdump in panic with watchdog enabled + * and WS0 == true), the ping operation will only reload WCV. + * More details about the hardware specification of this device: + * ARM DEN0029B - Server Base System Architecture (SBSA) + * + * Kernel/API: P--| pretimeout + * |T timeout + * SBSA GWDT: P---WOR (or WCV)---WS1 pretimeout + * |---WCV--WS0~~~(ws0_mode)T timeout + */ + +#include linux/io.h +#include linux/interrupt.h +#include linux/module.h +#include linux/moduleparam.h +#include linux/of.h +#include linux/of_device.h +#include linux/platform_device.h +#include linux/uaccess.h +#include linux/watchdog.h +#include asm/arch_timer.h + +/* SBSA Generic Watchdog register definitions */ +/* refresh frame */ +#define SBSA_GWDT_WRR 0x000 + +/* control frame */ +#define SBSA_GWDT_WCS 0x000 +#define SBSA_GWDT_WOR 0x008 +#define SBSA_GWDT_WCV_LO 0x010 +#define SBSA_GWDT_WCV_HI 0x014 + +/* refresh/control frame */ +#define SBSA_GWDT_W_IIDR 0xfcc +#define SBSA_GWDT_IDR 0xfd0 + +/* Watchdog Control and Status Register */ +#define SBSA_GWDT_WCS_EN BIT(0) +#define
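The header comment in this patch says that the 32-bit WOR register gives a maximum watch period of around 10s at the maximum system counter frequency of 400MHz. The arithmetic behind that figure can be checked with a short sketch. The helper names below are hypothetical and are not taken from the driver; the sketch only assumes that the watch period is expressed in system counter ticks.

```c
#include <assert.h>
#include <stdint.h>

/* Longest period, in whole seconds, that a 32-bit tick count covers. */
static uint32_t max_wor_timeout_sec(uint32_t freq_hz)
{
	assert(freq_hz > 0);
	return UINT32_MAX / freq_hz;
}

/* Tick count for a timeout; 64-bit so it can exceed the 32-bit range. */
static uint64_t timeout_to_ticks(uint32_t timeout_sec, uint32_t freq_hz)
{
	return (uint64_t)timeout_sec * freq_hz;
}

/* Non-zero when the timeout can be held in a 32-bit register. */
static int fits_in_wor(uint32_t timeout_sec, uint32_t freq_hz)
{
	return timeout_to_ticks(timeout_sec, freq_hz) <= UINT32_MAX;
}
```

At 400MHz, 2^32 ticks last roughly 10.7 seconds, so 10 is the largest whole-second value that fits. Anything longer needs the 64-bit compare value, which matches the comment's description of falling back to WCV for larger pretimeouts.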
[PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts
From: Fu Wei fu@linaro.org This can be a example of adding SBSA Generic Watchdog device node into some dts files for the Soc which contains SBSA Generic Watchdog. Acked-by: Arnd Bergmann a...@arndb.de Signed-off-by: Fu Wei fu@linaro.org --- arch/arm64/boot/dts/arm/foundation-v8.dts | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/arm/foundation-v8.dts b/arch/arm64/boot/dts/arm/foundation-v8.dts index 4eac8dc..824431f 100644 --- a/arch/arm64/boot/dts/arm/foundation-v8.dts +++ b/arch/arm64/boot/dts/arm/foundation-v8.dts @@ -237,4 +237,11 @@ }; }; }; + watchdog@2a44 { + compatible = arm,sbsa-gwdt; + reg = 0x0 0x2a44 0 0x1000, + 0x0 0x2a45 0 0x1000; + interrupts = 0 27 4; + timeout-sec = 10 5; + }; }; -- 2.4.3 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [lkp] [auxdisplay] 4edd70c133f: BUG: unable to handle kernel
On Thu, Aug 20, 2015 at 01:36:17PM +0800, kernel test robot wrote: FYI, we noticed the below changes on git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master commit 4edd70c133f3921c594883d8f9da31a7261f8b4f (auxdisplay: ks0108: use new parport device model) Sorry for the delay in replying. It has already been fixed by: 92f26189b181 (auxdisplay: ks0108: initialize local parport variable) regards sudip
scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations
Thanks Christoph for the answer! Apparently I missed a piece of the thread where the test patch was originally proposed. Now I have gone through it and I see how the patch was not meant to be a final correction. My (possibly naive) understanding is that: - Even if this might be due to hardware that does not fully conform to the standard (but we do not know right now), commit 74665016086615bbaa3fa6f83af410a0a4e029ee (scsi: convert host_busy to atomic_t) certainly breaks the kernel for some hardware configurations, causing a regression. - If the regression had been spotted immediately, the patch would probably have been revised right after it was proposed. Unfortunately, another bug, which got fixed only much later with 045065d8a300a37218c, hid the original issue for a long time. - Now that a lot of time has passed with the scsi: convert host_busy to atomic_t series in the kernel, going back to look into it is much more difficult. Libata people might not be very interested, as they have moved on to other topics and might need a lot of time to go through it (it has been known since November 2014, 9 months ago), possibly due to the race-like nature of the issue and the fact that the bug might not be reproducible on their hardware... Is this correct? Aren't commits that cause regressions confirmed by multiple users expected (at least in principle) to be reverted? If reverting is too costly, wouldn't your papering over, or making the scsi delay configurable, be an acceptable solution? Even better: can the libata people be helped in some way to find the real culprit, given that there are at least two hardware setups that are known to trigger the regression (mine and Barto's)? I have tried the linux-ide mailing list, but got silence.
Best, Sergio On 20/08/2015 10:08, Christoph Hellwig wrote: Hi Sergio, On Tue, Aug 18, 2015 at 09:44:28AM +0200, Sergio Callegari wrote: Hi, I have bisected the issue down to [045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem Bisecting has been a painful job due to the fact that the bug may show only many hours after the system boot. The commit above in fact is not the culprit, but a fix to an issue that was hiding the real bug on my system. See http://marc.info/?l=linux-kernel&m=143973820612978&w=2 The real issue is with sata host lock and seems to be biting a few other people as well https://bbs.archlinux.org/viewtopic.php?id=189324 A patch fixing the issue was sent to the LKML back in Nov 2014 by Christoph Hellwig (who is reading in CC) https://lkml.org/lkml/2014/11/20/581 I have tested the patch and it works for me. What is expected to happen now? As mentioned in that thread we need more input from the libata people on what kind of race this is papering over.
Re: [PATCH 2/2] ubifs: Allow O_DIRECT
Brian Norris computersforpe...@gmail.com writes: On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote: Now, some user-space fails when direct I/O is not supported. I think the whole argument rested on what it means when some user space fails; apparently that user space is just a test suite (which can/should be fixed). Even if it wasn't a test suite it should still fail. Either the fs supports O_DIRECT or it doesn't. Right now, the only way an application can figure this out is to try an open and see if it fails. Don't break that. Cheers, Jeff
Re: dom0 panic with Upstream Linux 4.1 tree
On 08/17/2015 09:32 AM, Zhenzhong Duan wrote: Hi Maintainers I found the panic below when booting OVM3.3.3 on HP PROLIANT DL980 G7 with dom0_mem=max:128G; it does not reproduce with dom0_mem=max:127G. Dom0 kernel is uek4 4.1.5-5.el6uek, which is based on the upstream Linux 4.1 tree. This looks like an upstream issue. Appreciate any patch/fix. Thanks I don't think there is an easy patch in 4.1 to fix that. Your system has half of the physical memory above the 512GB boundary, making it impossible for dom0 to use. Dom0 tries to use the memory layout of the physical host, so it can only use memory below 512GB. As you try to allocate 128GB for Dom0, some of the memory will end up above the magic boundary (there is only a little bit less than 128GB available below the boundary). For 4.3 I have posted a patch series which will eventually make it into the kernel, allowing Dom0 (and other pv-domains as well) to use memory above the 512GB boundary. Juergen
Re: [PATCH] Documentation/x86: Rename IRQSTACKSIZE to IRQ_STACK_SIZE
On Fri, 21 Aug 2015 15:19:06 +0600 Alexander Kuleshov kuleshovm...@gmail.com wrote: The IRQSTACKSIZE was renamed to the IRQ_STACK_SIZE in the (26f80bd6a9 x86-64: Convert irqstacks to per-cpu) commit, but it still named IRQSTACKSIZE in the documentation. This patch fixes this. Applied to the docs tree, thanks. jon
Re: [PATCH] input: gpio-keys: report error when invalid key number
On Mon, Aug 24, 2015 at 08:07:44PM +0800, Peng Fan wrote: When the input key number is not a valid one from '/sys/devices/soc0/gpio-keys/keys', we need to report an error rather than continue. See the following example: root@yocto:/sys/devices/soc0/gpio-keys# cat keys 114-116 root@yocto:/sys/devices/soc0/gpio-keys# echo 77 > keys root@yocto:/sys/devices/soc0/gpio-keys# We want 'echo 77 > keys' to report an error, rather than stay silent and give us the false impression that all is 'ok'. Signed-off-by: Peng Fan van.free...@gmail.com Cc: Dmitry Torokhov dmitry.torok...@gmail.com Cc: Linus Walleij linus.wall...@linaro.org Cc: Alexander Stein alexander.st...@systec-electronic.com Cc: Tejun Heo t...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Wolfram Sang w...@the-dreams.de Cc: Fabio Estevam fabio.este...@freescale.com Applied, thank you. --- drivers/input/keyboard/gpio_keys.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c index ddf4045..b98f3b4 100644 --- a/drivers/input/keyboard/gpio_keys.c +++ b/drivers/input/keyboard/gpio_keys.c @@ -239,6 +239,11 @@ static ssize_t gpio_keys_attr_store_helper(struct gpio_keys_drvdata *ddata, } } + if (i == ddata->pdata->nbuttons) { + error = -EINVAL; + goto out; + } + mutex_lock(&ddata->disable_lock); for (i = 0; i < ddata->pdata->nbuttons; i++) { -- 1.8.4.5 -- Dmitry
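The behaviour this patch asks for (reject a request that names a key code no button provides, before any state is changed) can be sketched in isolation. The function below is a hypothetical illustration, not the gpio_keys code: it works on plain arrays of key codes instead of the driver's bitmaps and button structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Return 0 if every requested key code belongs to some button, or
 * -EINVAL as soon as one does not. Nothing is modified, so a bad
 * request can be refused before any button state changes.
 */
static int validate_key_codes(const unsigned int *buttons, size_t nbuttons,
			      const unsigned int *requested, size_t nrequested)
{
	assert(buttons != NULL || nbuttons == 0);
	assert(requested != NULL || nrequested == 0);

	for (size_t r = 0; r < nrequested; r++) {
		size_t i;

		for (i = 0; i < nbuttons; i++)
			if (buttons[i] == requested[r])
				break;
		/* The inner loop ran to the end: no button has this code. */
		if (i == nbuttons)
			return -EINVAL;
	}
	return 0;
}
```

With buttons 114 to 116, as in the example session above, a request for 77 would return -EINVAL instead of being silently accepted.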
[PATCH] media: don't try to empty links list in media_entity_cleanup()
The media_entity_cleanup() function only cleans up the entity links list, but this operation is already done in media_device_unregister_entity(). In most cases this should be harmless (besides being duplicated code), since the links list would be empty and the iteration would not happen. However, the links list is initialized in media_device_register_entity(), so if a driver fails to register an entity with a media device and cleans up the entity in the error path, a NULL pointer dereference will happen. So don't try to empty the links list in media_entity_cleanup(), since that is either done already or the list has not been initialized yet. Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com --- drivers/media/media-entity.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/drivers/media/media-entity.c b/drivers/media/media-entity.c index fc6bb48027ab..acb65f734508 100644 --- a/drivers/media/media-entity.c +++ b/drivers/media/media-entity.c @@ -252,13 +252,6 @@ EXPORT_SYMBOL_GPL(media_entity_init); void media_entity_cleanup(struct media_entity *entity) { - struct media_link *link, *tmp; - - list_for_each_entry_safe(link, tmp, &entity->links, list) { - media_gobj_remove(&link->graph_obj); - list_del(&link->list); - kfree(link); - } } EXPORT_SYMBOL_GPL(media_entity_cleanup); -- 2.4.3
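The failure mode described in this patch comes down to the difference between an initialised, empty circular list and a list head that was never initialised. A minimal sketch of that difference follows, using a small local re-implementation of a doubly linked list head rather than the kernel's list.h; all names here are written for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for a kernel-style circular list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialised, empty list points back at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Count entries; only meaningful on an initialised head. */
static size_t list_count(const struct list_head *head)
{
	size_t n = 0;

	/* A zero-filled head has next == NULL; walking it would crash. */
	assert(head->next != NULL);
	for (const struct list_head *pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

On an initialised head the loop body never runs, which is the harmless case the commit message mentions. On a zero-filled head the first step already follows a NULL pointer, which is why iterating the list in a cleanup path that can run before registration is unsafe.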
Re: [PATCH] kernel/sysctl.c: If count including the terminating byte '\0' the write system call should retrun success.
On Mon, 24 Aug 2015 16:56:13 +0800 Sean Fu fxinr...@gmail.com wrote: When the input argument count includes the terminating byte \0, the write system call returns EINVAL on a proc file, but it returns success on a regular file. E.g. writing two bytes (1\0) to /proc/sys/net/ipv4/conf/eth0/rp_filter: write(fd, "1\0", 2) returns EINVAL. And what would do that? What tool broke because of this? echo 1 > /proc/sys/net/ipv4/conf/eth0/rp_filter works just fine. strlen(string) would not include the nul character. The only thing I could think of would be a sizeof(str), but then that would include someone hardcoding an integer in a string, like: char val[] = "1"; write(fd, val, sizeof(val)); Again, what tool does that? If there is a tool out in the wild that used to work on 2.6 (and was running on 2.6 then, and not something that was created after that change), then we can consider this fix. -- Steve
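The distinction Steve draws between strlen() and sizeof() is easy to check in a few lines. The sketch below also includes `trim_trailing_nul()`, a hypothetical helper that illustrates the idea behind the proposed patch (treat a count that includes one terminating NUL like the count without it); it is not the code from kernel/sysctl.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The hard-coded value from the example in the reply. */
static const char val[] = "1";

/* sizeof on the array counts the terminating '\0'. */
static size_t len_via_sizeof(void)
{
	return sizeof(val);
}

/* strlen stops before the terminating '\0'. */
static size_t len_via_strlen(void)
{
	return strlen(val);
}

/* Drop a single trailing NUL from the byte count, if there is one. */
static size_t trim_trailing_nul(const char *buf, size_t count)
{
	assert(buf != NULL);
	if (count > 0 && buf[count - 1] == '\0')
		return count - 1;
	return count;
}
```

A caller that passes sizeof(val) therefore writes two bytes, '1' and '\0', while echo and strlen-based callers write no NUL at all, which is why the reply asks which real tool actually sends the extra byte.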
[PATCH v7 1/8] Documentation: add sbsa-gwdt.txt documentation
From: Fu Wei fu@linaro.org

The sbsa-gwdt.txt documentation in devicetree/bindings/watchdog is for introducing SBSA(Server Base System Architecture) Generic Watchdog device node info into FDT.

Acked-by: Arnd Bergmann a...@arndb.de
Signed-off-by: Fu Wei fu@linaro.org
---
 .../devicetree/bindings/watchdog/sbsa-gwdt.txt | 32 ++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
new file mode 100644
index 000..8b43640
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
@@ -0,0 +1,32 @@
+* SBSA(Server Base System Architecture) Generic Watchdog
+
+The SBSA Generic Watchdog Timer is used for resetting the system after
+two stages of timeout.
+More details: ARM-DEN-0029 - Server Base System Architecture (SBSA)
+
+Required properties:
+- compatible : Should at least contain "arm,sbsa-gwdt".
+
+- reg : Specifies base physical address of the two register frames
+  and length of memory mapped region, order:
+  1: Watchdog control frame
+  2: Refresh frame.
+
+- interrupts : Should at least contain WS0 interrupt,
+  the WS1 interrupt is optional, order:
+  1: WS0 interrupt
+  2: WS1 interrupt
+
+Optional properties
+- timeout-sec : Watchdog pre-timeout and timeout values (in seconds).
+  The first is timeout values, then pre-timeout.
+
+Example for FVP Foundation Model v8:
+
+watchdog@2a44 {
+	compatible = "arm,sbsa-gwdt";
+	reg = <0x0 0x2a44 0 0x1000>,
+	      <0x0 0x2a45 0 0x1000>;
+	interrupts = <0 27 4>;
+	timeout-sec = <10 5>;
+};
-- 
2.4.3
[PATCH v7 0/8] Watchdog: introduce ARM SBSA watchdog driver
From: Fu Wei fu@linaro.org This patchset: (1)Introduce Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt for FDT info of SBSA Generic Watchdog, and give two examples of adding SBSA Generic Watchdog device node into the dts files: foundation-v8.dts and amd-seattle-soc.dtsi. (2)Introduce pretimeout into the watchdog framework, and update Documentation/watchdog/watchdog-kernel-api.txt to introduce: (1)the new elements in the watchdog_device and watchdog_ops struct; (2)the new API watchdog_init_timeouts. (3)Introduce ARM SBSA watchdog driver: a.Use linux kernel watchdog framework; b.Work with FDT on ARM64; c.Use pretimeout in watchdog framework; d.Support getting timeout and pretimeout from parameter and FDT at the driver init stage. e.In the first timeout, do panic to save system context; f.In the second stage, user can still feed the dog without cleaning WS0. By this feature, we can avoid the panic infinite loops, while backing up a large system context in a server. g.In the second stage, can trigger WS1 by setting pretimeout = 0 if necessary. (4)Introduce ACPI GTDT parser: drivers/acpi/gtdt.c Parse SBSA Generic Watchdog Structure in GTDT table of ACPI, and create a platform device with that information. This platform device can be used by This Watchdog driver. drivers/clocksource/arm_arch_timer.c is simplified by this GTDT support. This patchset has been tested with watchdog daemon (ACPI/FDT, module/build-in) on the following platforms: (1)ARM Foundation v8 model Changelog: v7: Rebase to latest kernel version(4.2-rc7). Improve FDT support: geting resource by order, instead of name. According to the FDT support, Update the example dts file, gtdt.c and sbsa_gwdt.c. Pass the sparse test, and fix the warning. Fix the max_pretimeout and max_timeout value overflow bug. Delete the WCV output value. v6: Improve the dtb example files: reduce the register frame size to 4K. 
Improve pretimeout support: (1) improve watchdog_init_timeouts function (2) rename watchdog_check_min_max_timeouts back to the original name (1) improve watchdog_timeout_invalid/watchdog_pretimeout_invalid Add the new features in the sbsa_gwdt driver: (1) In the second stage, user can feed the dog without cleaning WS0. (2) In the second stage, user can trigger WS1 by setting pretimeout = 0. (3) expand the max value of pretimeout, in case 10 second is not enough for a kdump kernel reboot in panic. v5: Improve pretimeout support: (1)fix typo in documentation and comments. (2)fix the timeout limits validation bug. Simplify sbsa_gwdt driver: (1)integrate all the registers access functions into caller. v4: Refactor GTDT support code: remove it from arch/arm64/kernel/acpi.c, put it into drivers/acpi/gtdt.c file. Integrate the GTDT code of drivers/clocksource/arm_arch_timer.c into drivers/acpi/gtdt.c. Improve pretimeout support, fix pretimeout == 0 problem. Simplify sbsa_gwdt driver: (1)timeout/pretimeout limits setup; (2)keepalive function; (3)delete clk == 0 check; (4)delete WS0 status bit check in interrupt routine; (5)sbsa_gwdt_set_wcv function. v3: Delete export arch_timer_get_rate patch. Driver back to use arch_timer_get_cntfrq. Improve watchdog_init_timeouts function and update relevant documentation. Improve watchdog_timeout_invalid and watchdog_pretimeout_invalid. Improve foundation-v8.dts: delete the unnecessary tag of device node. Remove ARM64 || COMPILE_TEST from Kconfig. Add comments in arch/arm64/kernel/acpi.c Fix typoes and incorrect comments. v2: Improve watchdog-kernel-api.txt documentation for pretimeout support. Export arch_timer_get_rate in arm_arch_timer.c. Add watchdog_init_timeouts API for pretimeout support in framework. Improve suspend and resume foundation in driver Improve timeout/pretimeout values init code in driver. Delete unnecessary items of the sbsa_gwdt struct and #define. Delete all unnecessary debug info in driver. 
Fix 64bit division bug. Use the arch_timer interface to get watchdog clock rate. Add MODULE_DEVICE_TABLE for platform device id. Fix typoes. v1: The first version upstream patchset to linux mailing list. Fu Wei (8): Documentation: add sbsa-gwdt.txt documentation ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi Watchdog: introdouce pretimeout into framework Watchdog: introduce ARM SBSA watchdog driver ACPI: add GTDT table parse driver into ACPI driver Watchdog: enable ACPI GTDT support for ARM SBSA watchdog driver clocksource: simplify ACPI code in arm_arch_timer.c
Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy
Hello, Austin. On Mon, Aug 24, 2015 at 11:47:02AM -0400, Austin S Hemmelgarn wrote: Just to learn more, what sort of hypervisor support threads are we talking about? They would have to consume considerable amount of cpu cycles for problems like this to be relevant and be dynamic in numbers in a way which letting them competing against vcpus makes sense. Do IO helpers meet these criteria? Depending on the configuration, yes they can. VirtualBox has some rather CPU intensive threads that aren't vCPU threads (their emulated APIC thread immediately comes to mind), and so does QEMU depending on the emulated And the number of those threads fluctuate widely and dynamically? hardware configuration (it gets more noticeable when the disk images are stored on a SAN and served through iSCSI, NBD, FCoE, or ATAoE, which is pretty typical usage for large virtualization deployments). I've seen cases first hand where the vCPU's can make no reasonable progress because they are constantly getting crowded out by other threads. That alone doesn't require hierarchical resource distribution tho. Setting nice levels reasonably is likely to alleviate most of the problem. The use of the term 'hypervisor support threads' for this is probably not the best way of describing the contention, as it's almost always a full system virtualization issue, and the contending threads are usually storage back-end access threads. I would argue that there are better ways to deal properly with this (Isolate the non vCPU threads on separate physical CPU's from the hardware emulation threads), but such methods require large systems to be practical at any scale, and many people don't have the budget for such large systems, and this way of doing things is much more flexible for small scale use cases (for example, someone running one or two VM's on a laptop under QEMU or VirtualBox). I don't know. 
Someone running one or two VM's on a laptop under QEMU doesn't really sound like the use case which absolutely requires hierarchical cpu cycle distribution. Thanks. -- tejun
Re: [PATCH] nfit, nd_blk: BLK status register is only 32 bits
Ross Zwisler ross.zwis...@linux.intel.com writes:

> Only read 32 bits for the BLK status register in read_blk_stat().
>
> The format and size of this register is defined in the
> NVDIMM Driver Writer's guide:
>
> http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
>
> Signed-off-by: Ross Zwisler ross.zwis...@linux.intel.com
> Reported-by: Nicholas Moulin nicholas.w.mou...@linux.intel.com

Looks fine,

Reviewed-by: Jeff Moyer jmo...@redhat.com

However, now that you've drawn attention to that code, I'll note that there is no checking of the pending or retry bits. In fact, ACPI_NFIT_CONTROL_BUFFERED isn't even checked upon loading the tables. Is this on a todo list somewhere?

Cheers,
Jeff

> ---
>  drivers/acpi/nfit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
> index 7c2638f..8689ee1 100644
> --- a/drivers/acpi/nfit.c
> +++ b/drivers/acpi/nfit.c
> @@ -1009,7 +1009,7 @@ static void wmb_blk(struct nfit_blk *nfit_blk)
>  		wmb_pmem();
>  }
>
> -static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
> +static u32 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
>  {
>  	struct nfit_blk_mmio *mmio = &nfit_blk->mmio[DCR];
>  	u64 offset = nfit_blk->stat_offset + mmio->size * bw;
> @@ -1017,7 +1017,7 @@ static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
>  	if (mmio->num_lines)
>  		offset = to_interleave_offset(offset, mmio);
>
> -	return readq(mmio->base + offset);
> +	return readl(mmio->base + offset);
>  }
>
>  static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,
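As a rough illustration of why the access width matters for a 32-bit register, the following userspace sketch models the register window as a byte buffer. model_read32() and model_read64() are hypothetical stand-ins for readl() and readq(); they are not the kernel accessors and say nothing about real MMIO ordering or side effects. The point is only that a 64-bit read of a 32-bit status field also pulls in the four bytes that follow it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for readl(): read 32 bits at `offset`. */
uint32_t model_read32(const uint8_t *base, size_t offset)
{
	uint32_t v;

	assert(base != NULL);
	memcpy(&v, base + offset, sizeof(v));
	return v;
}

/* Hypothetical stand-in for readq(): read 64 bits at `offset`. */
uint64_t model_read64(const uint8_t *base, size_t offset)
{
	uint64_t v;

	assert(base != NULL);
	memcpy(&v, base + offset, sizeof(v));
	return v;
}
```

If the four bytes after the status field hold unrelated data, model_read64() returns a value that no longer equals the status, while model_read32() returns the status alone.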
Re: [PATCH linux-next v4 5/5] mtd: atmel-quadspi: add driver for Atmel QSPI controller
Hi Marek, Le 24/08/2015 13:03, Marek Vasut a écrit : On Monday, August 24, 2015 at 12:14:00 PM, Cyrille Pitchen wrote: This driver add support to the new Atmel QSPI controller embedded into sama5d2x SoCs. It expects a NOR memory to be connected to the QSPI controller. Signed-off-by: Cyrille Pitchen cyrille.pitc...@atmel.com Acked-by: Nicolas Ferre nicolas.fe...@atmel.com Hi, [...] +/* Register access macros */ These are functions, not macros :) btw is there any reason for these ? I'd say, just put the read*() and write*() functions directly into the code and be done with it, it is much less confusing. Also, why do you use the _relaxed() versions of the functions ? +static inline u32 qspi_readl(struct atmel_qspi *aq, u32 reg) +{ +return readl_relaxed(aq-regs + reg); +} + +static inline void qspi_writel(struct atmel_qspi *aq, u32 reg, u32 value) +{ +writel_relaxed(value, aq-regs + reg); +} + +static inline u16 qspi_readw(struct atmel_qspi *aq, u32 reg) +{ +return readw_relaxed(aq-regs + reg); +} + +static inline void qspi_writew(struct atmel_qspi *aq, u32 reg, u16 value) +{ +writew_relaxed(value, aq-regs + reg); +} + +static inline u8 qspi_readb(struct atmel_qspi *aq, u32 reg) +{ +return readb_relaxed(aq-regs + reg); +} + +static inline void qspi_writeb(struct atmel_qspi *aq, u32 reg, u8 value) +{ +writeb_relaxed(value, aq-regs + reg); +} [...] +static int atmel_qspi_run_command(struct atmel_qspi *aq, + const struct atmel_qspi_command *cmd) +{ +u32 iar, icr, ifr, sr; +int err = 0; + +iar = 0; +icr = 0; +ifr = aq-ifr_width | cmd-ifr_tfrtyp; + +/* Compute instruction parameters */ +if (cmd-enable.bits.instruction) { +icr |= QSPI_ICR_INST(cmd-instruction); +ifr |= QSPI_IFR_INSTEN; +} + +/* Compute address parameters */ +switch (cmd-enable.bits.address) { +case 4: +ifr |= QSPI_IFR_ADDRL; +/*break;*/ /* fallback to the 24bit address case */ What's this commented out bit of code for ? :-) I just wanted to stress out there was no missing break;. 
I've reworded the comment to:
/* No break on purpose: fallback to the 24bit address case. */

>> +	case 3:
>> +		iar = (cmd->enable.bits.data) ? 0 : cmd->address;
>> +		ifr |= QSPI_IFR_ADDREN;
>> +		break;
>> +	case 0:
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>
> [...]
>
>> +no_data:
>> +	/* Poll INSTRuction End status */
>> +	sr = qspi_readl(aq, QSPI_SR);
>> +	if (sr & QSPI_SR_INSTRE)
>> +		return err;
>> +
>> +	/* Wait for INSTRuction End interrupt */
>> +	init_completion(&aq->completion);
>
> You should use reinit_completion() in the code. init_completion() should
> be used only in the probe() function and nowhere else.

Alright. In the next version I'll rename the "completion" member of struct atmel_qspi to "cmd_completion". I'll also add a "dma_completion" member to the same structure to replace the local "struct completion completion" in atmel_qspi_run_dma_transfer(). Then I'll call init_completion() on both cmd_completion and dma_completion only from atmel_qspi_probe(), and reinit_completion() elsewhere.

>> +	aq->pending = 0;
>> +	qspi_writel(aq, QSPI_IER, QSPI_SR_INSTRE);
>> +	if (!wait_for_completion_timeout(&aq->completion,
>> +					 msecs_to_jiffies(1000)))
>> +		err = -ETIMEDOUT;
>> +	qspi_writel(aq, QSPI_IDR, QSPI_SR_INSTRE);
>> +
>> +	return err;
>> +}
>
> [...]
>
> Hope this helps :)

Indeed, it does! I am still working on the next version of this series to take all your comments into account.

Best regards,

Cyrille
[PATCH] mtd: nand: pass page number to ecc-write_xxx() methods
The -read_xxx() methods are all passed the page number the NAND controller is supposed to read, but -write_xxx() do not have such a parameter. This is a problem if we want to properly implement data scrambling/randomization in order to mitigate MLC sensibility to repeated pattern: to prevent bitflips in adjacent pages in the same block we need to avoid repeating the same pattern at the same offset in those pages, hence the randomizer/scrambler engine need to be passed the page value in order to adapt its seed accordingly. Moreover, adding the page parameter the -write_xxx() methods add some consistency to the current model. Signed-off-by: Boris Brezillon boris.brezil...@free-electrons.com CC: Josh Wu josh...@atmel.com CC: Ezequiel Garcia ezequiel.gar...@free-electrons.com CC: Maxime Ripard maxime.rip...@free-electrons.com CC: Greg Kroah-Hartman gre...@linuxfoundation.org CC: Huang Shijie shijie.hu...@intel.com CC: Bryan Wu bryan...@analog.com CC: de...@driverdev.osuosl.org CC: linux-arm-ker...@lists.infradead.org CC: linux-kernel@vger.kernel.org --- drivers/mtd/nand/atmel_nand.c | 6 -- drivers/mtd/nand/bf5xx_nand.c | 3 ++- drivers/mtd/nand/brcmnand/brcmnand.c | 4 ++-- drivers/mtd/nand/cafe_nand.c | 3 ++- drivers/mtd/nand/denali.c | 5 +++-- drivers/mtd/nand/docg4.c | 4 ++-- drivers/mtd/nand/fsl_elbc_nand.c | 4 ++-- drivers/mtd/nand/fsl_ifc_nand.c | 2 +- drivers/mtd/nand/gpmi-nand/gpmi-nand.c| 6 +++--- drivers/mtd/nand/hisi504_nand.c | 3 ++- drivers/mtd/nand/lpc32xx_mlc.c| 3 ++- drivers/mtd/nand/lpc32xx_slc.c| 5 +++-- drivers/mtd/nand/nand_base.c | 31 ++- drivers/mtd/nand/omap2.c | 3 ++- drivers/mtd/nand/pxa3xx_nand.c| 3 ++- drivers/mtd/nand/sh_flctl.c | 3 ++- drivers/mtd/nand/sunxi_nand.c | 5 +++-- drivers/staging/mt29f_spinand/mt29f_spinand.c | 3 ++- include/linux/mtd/nand.h | 6 +++--- 19 files changed, 63 insertions(+), 39 deletions(-) diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c index 46010bd..d0f50c9 100644 --- 
a/drivers/mtd/nand/atmel_nand.c +++ b/drivers/mtd/nand/atmel_nand.c @@ -954,7 +954,8 @@ static int atmel_nand_pmecc_read_page(struct mtd_info *mtd, } static int atmel_nand_pmecc_write_page(struct mtd_info *mtd, - struct nand_chip *chip, const uint8_t *buf, int oob_required) + struct nand_chip *chip, const uint8_t *buf, int oob_required, + int page) { struct atmel_nand_host *host = chip-priv; uint32_t *eccpos = chip-ecc.layout-eccpos; @@ -2005,7 +2006,8 @@ static int nfc_sram_write_page(struct mtd_info *mtd, struct nand_chip *chip, if (likely(!raw)) /* Need to write ecc into oob */ - status = chip-ecc.write_page(mtd, chip, buf, oob_required); + status = chip-ecc.write_page(mtd, chip, buf, oob_required, + page); if (status 0) return status; diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c index 4d8d4ba..17b3727 100644 --- a/drivers/mtd/nand/bf5xx_nand.c +++ b/drivers/mtd/nand/bf5xx_nand.c @@ -566,7 +566,8 @@ static int bf5xx_nand_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip } static int bf5xx_nand_write_page_raw(struct mtd_info *mtd, - struct nand_chip *chip, const uint8_t *buf, int oob_required) + struct nand_chip *chip, const uint8_t *buf, int oob_required, + int page) { bf5xx_nand_write_buf(mtd, buf, mtd-writesize); bf5xx_nand_write_buf(mtd, chip-oob_poi, mtd-oobsize); diff --git a/drivers/mtd/nand/brcmnand/brcmnand.c b/drivers/mtd/nand/brcmnand/brcmnand.c index fddb795..9a4e345 100644 --- a/drivers/mtd/nand/brcmnand/brcmnand.c +++ b/drivers/mtd/nand/brcmnand/brcmnand.c @@ -1606,7 +1606,7 @@ out: } static int brcmnand_write_page(struct mtd_info *mtd, struct nand_chip *chip, - const uint8_t *buf, int oob_required) + const uint8_t *buf, int oob_required, int page) { struct brcmnand_host *host = chip-priv; void *oob = oob_required ? 
chip-oob_poi : NULL; @@ -1617,7 +1617,7 @@ static int brcmnand_write_page(struct mtd_info *mtd, struct nand_chip *chip, static int brcmnand_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip, const uint8_t *buf, - int oob_required) + int oob_required, int page) { struct brcmnand_host *host = chip-priv; void *oob = oob_required ? chip-oob_poi : NULL; diff --git a/drivers/mtd/nand/cafe_nand.c
Re: [PATCH] usb: phy: msm: Unregister driver interest for VBUS and ID events
On 08/18/2015 12:56 AM, Ivan T. Ivanov wrote:
> Right now even if driver failed to probe extcon framework will still
> deliver its VBUS and ID events, which will lead to random exception codes.
>
> Fix this by removing driver interest for VBUS and ID events when probe fail.
>
> Fixes: 591fc116f330 ("usb: phy: msm: Use extcon framework for VBUS and ID detection")
> Reported-by: Tim Bird tim.b...@sonymobile.com
> Signed-off-by: Ivan T. Ivanov ivan.iva...@linaro.org
> ---
>  drivers/usb/phy/phy-msm-usb.c | 26 +-
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/usb/phy/phy-msm-usb.c b/drivers/usb/phy/phy-msm-usb.c
> index 00c49bb1bd29..a9082567f114 100644
> --- a/drivers/usb/phy/phy-msm-usb.c
> +++ b/drivers/usb/phy/phy-msm-usb.c
> @@ -1581,6 +1581,8 @@ static int msm_otg_read_dt(struct platform_device *pdev, struct msm_otg *motg)
>  		ret = extcon_register_interest(&motg->id.conn, ext_id->name,
>  					       "USB-HOST", &motg->id.nb);
>  		if (ret < 0) {
> +			if (!IS_ERR(ext_vbus))
> +				extcon_unregister_interest(&motg->vbus.conn);
>  			dev_err(&pdev->dev, "register ID notifier failed\n");
>  			return ret;
>  		}
> ...

This patch is obsoleted by commit 83b7b67c7, which changes the extcon API a bit (from register_interest to register_notifier, among other things). But, in general, I would expect this approach to work.

Do you want me to re-spin this with the new API?
 -- Tim
Re: [PATCH v2] Documentation: add 'crashkernel=auto' entry into kernel-parameters.txt
On Mon, 24 Aug 2015 23:04:29 +0800 Yaowei Bai bywxiao...@163.com wrote: There is no 'crashkernel=auto' entry in kernel-parameters.txt, borrow it from kexec-kdump-howto.txt file in the kexec-tools-2.0.0 package. OK, so I did some digging here. As far as I can tell, there is no crashkernel=auto entry because the auto-reserve patch has never been merged into the mainline kernel. RHEL kernels appear to have it, but mainline doesn't. Thus, merging this patch would make the documentation incorrect, something I'd rather not do. I appreciate efforts to improve the kernel's documentation, but it is important to be sure that your proposed changes make the docs closer to reality, rather than further away. Thanks, jon
Re: [PATCH] docs: update HOWTO for 3.x - 4.x versioning
On Mon, 24 Aug 2015 09:33:09 -0500 Mario Carrillo mario.alfredo.c.arev...@intel.com wrote: The HOWTO document needed updating for the new kernel versioning. As with various others, this document would benefit from changes that would get it away from specific major version numbers. In the absence of that, though, we might as well at least make it current; patch applied to the docs tree. Thanks, jon
Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency
On Mon, Aug 24, 2015 at 9:41 AM, Suravee Suthikulpanit suravee.suthikulpa...@amd.com wrote: Hi, Ping. Does anyone have any comments or suggestions? Yes, I sent you some ideas a couple weeks ago. I'll resend them. On 8/13/15 16:58, Suravee Suthikulpanit wrote: This patch refactors of_pci_dma_configure() into a more generic pci_dma_configure(), which can be reused by non-OF code. Then, it adds support for setting up PCI device DMA coherency from ACPI _CCA object that should normally be specified in the DSDT node of its PCI host bridge.. Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com CC: Bjorn Helgaas bhelg...@google.com CC: Catalin Marinas catalin.mari...@arm.com CC: Will Deacon will.dea...@arm.com CC: Rafael J. Wysocki r...@rjwysocki.net CC: Rob Herring robh...@kernel.org CC: Murali Karicheri m-kariche...@ti.com --- Note: According to the ACPI spec, the _CCA attribute is required for ARM64. Therefore, this patch is a pre-req for ACPI PCI support for ARM64 which is currently in development. Also, this should not affect other architectures since if CCA is not required, the default value is coherent. Please see include/acpi/acpi_bus.h: acpi_check_dma() and drivers/acpi/scan.c: acpi_init_coherency() for more information drivers/of/of_pci.c| 20 drivers/pci/probe.c| 35 +-- include/linux/of_pci.h | 3 --- 3 files changed, 33 insertions(+), 25 deletions(-) diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c index 5751dc5..b66ee4e 100644 --- a/drivers/of/of_pci.c +++ b/drivers/of/of_pci.c @@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node) } EXPORT_SYMBOL_GPL(of_get_pci_domain_nr); -/** - * of_pci_dma_configure - Setup DMA configuration - * @dev: ptr to pci_dev struct of the PCI device - * - * Function to update PCI devices's DMA configuration using the same - * info from the OF node of host bridge's parent (if any). 
- */ -void of_pci_dma_configure(struct pci_dev *pci_dev) -{ - struct device *dev = pci_dev-dev; - struct device *bridge = pci_get_host_bridge_device(pci_dev); - - if (!bridge-parent) - return; - - of_dma_configure(dev, bridge-parent-of_node); - pci_put_host_bridge_device(bridge); -} -EXPORT_SYMBOL_GPL(of_pci_dma_configure); - #if defined(CONFIG_OF_ADDRESS) /** * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index cefd636..e2fcd3b 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -6,12 +6,14 @@ #include linux/delay.h #include linux/init.h #include linux/pci.h -#include linux/of_pci.h +#include linux/of_device.h #include linux/pci_hotplug.h #include linux/slab.h #include linux/module.h #include linux/cpumask.h #include linux/pci-aspm.h +#include linux/acpi.h +#include linux/property.h #include asm-generic/pci-bridge.h #include pci.h @@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev *dev) pci_enable_acs(dev); } +/** + * pci_dma_configure - Setup DMA configuration + * @pci_dev: ptr to pci_dev struct of the PCI device + * + * Function to update PCI devices's DMA configuration using the same + * info from the OF node or ACPI node of host bridge's parent (if any). 
+ */ +static void pci_dma_configure(struct pci_dev *pci_dev) +{ + struct device *dev = pci_dev-dev; + struct device *bridge = pci_get_host_bridge_device(pci_dev); + struct acpi_device *adev; + bool coherent; + + if (has_acpi_companion(bridge)) { + adev = to_acpi_node(bridge-fwnode); + if (acpi_check_dma(adev, coherent)) + arch_setup_dma_ops(dev, 0, 0, NULL, coherent); + } else { + struct device *host = bridge-parent; + if (!host) + return; + + of_dma_configure(dev, host-of_node); + } + + pci_put_host_bridge_device(bridge); +} + void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) { int ret; @@ -1557,7 +1588,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) dev-dev.dma_mask = dev-dma_mask; dev-dev.dma_parms = dev-dma_parms; dev-dev.coherent_dma_mask = 0xull; - of_pci_dma_configure(dev); + pci_dma_configure(dev); pci_set_dma_max_seg_size(dev, 65536); pci_set_dma_seg_boundary(dev, 0x); diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h index 29fd3fe..ce0e5ab 100644 --- a/include/linux/of_pci.h +++ b/include/linux/of_pci.h @@ -16,7 +16,6 @@ int of_pci_get_devfn(struct device_node *np); int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin); int
RE: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup sources
On Mon, 24 Aug 2015, Shenwei Wang wrote: +static int gpcv2_wakeup_source_save(void) { + struct gpcv2_irqchip_data *cd; + void __iomem *reg; + int i; + + cd = imx_gpcv2_instance; + if (!cd) + return 0; + + for (i = 0; i IMR_NUM; i++) { + reg = cd-gpc_base + cd-cpu2wakeup + i * 4; + cd-enabled_irqs[i] = readl_relaxed(reg); You read the full state of the register and restore the full state. So why enabled_irqs? There are two user scenarios: In CPU Idle state, the system need to be woke up by any enabled irqs, not just the ones that marked as wakeup sources. In Suspend State, they system will only be woke up by the one that marked as a wakeup source. Enabled_irqs are used to save the values before suspend, and restore them after resume. That's what you want achieve. Still you save the full content of the registers and restore the full content. That saves/restores the enabled and disabled interrupts. So enabled_irqs is a misnomer as you save the full state. + writel_relaxed(cd-wakeup_sources[i], reg); + } + + return 0; +} + +static void gpcv2_wakeup_source_restore(void) { + struct gpcv2_irqchip_data *cd; + void __iomem *reg; + int i; + + cd = imx_gpcv2_instance; + if (!cd) + return; + + for (i = 0; i IMR_NUM; i++) { + reg = cd-gpc_base + cd-cpu2wakeup + i * 4; + writel_relaxed(cd-enabled_irqs[i], reg); + cd-wakeup_sources[i] = ~0; Why are you clearing that info on resume? Drivers will clear that via set_wake() or leave it when they want to have resume functionality? Each time system goes into the suspend state, it will call set_wake (ON) again to configure the wakeup sources. Clearing wakeup_sources here can make sure the system work as expected no matter that a driver calls set_wake (OFF) during resume stage. We rather make sure that the drivers call set_wake(OFF) as they are supposed to, because if they do not then the set_wake(ON) logic in the core code will see the counter != 0 and not invoke the irq callback. 
Thanks, tglx
Re: [PATCH 1/2] crypto: KEYS: convert public key to the akcipher API
Hi Stephan,

On 08/15/2015 11:08 AM, Stephan Mueller wrote:
> On Wednesday, 12 August 2015, 20:54:39, Tadeusz Struk wrote:
>
> Hi Tadeusz,
>
>> @@ -41,7 +41,7 @@ struct pkcs7_parse_context {
>>  static void pkcs7_free_signed_info(struct pkcs7_signed_info *sinfo)
>>  {
>>  	if (sinfo) {
>> -		mpi_free(sinfo->sig.mpi[0]);
>> +		kfree(sinfo->sig.s);
>
> kzfree?
>
>>  		kfree(sinfo->sig.digest);
>
> kzfree?
>
>>  		kfree(sinfo->signing_cert_id);
>>  		kfree(sinfo);
>
> kzfree (due to ->msdigest)?

Sorry for the late response. I was on vacation.

All of the above are module signatures, which are not sensitive, so there is no need to zero the buffers on free. The only thing that is sensitive is the private key, which is only used for signing modules on make modules_install and is never included in the kernel.

Thanks,
T
Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
From: Eugene Shatokhin eugene.shatok...@rosalab.ru
Date: Wed, 19 Aug 2015 14:59:01 +0300

> So the following might be possible, although unlikely:
>
> CPU0                                    CPU1
> clear_bit: read dev->flags
> clear_bit: clear EVENT_RX_KILL in the
>            read value
>                                         dev->flags = 0;
> clear_bit: write updated dev->flags
>
> As a result, dev->flags may become non-zero again.

Is this really possible?

Stores really are atomic in the sense that they do their update in one indivisible operation. Atomic operations like clear_bit also behave that way.

If a clear_bit is in progress, the "dev->flags = 0" store will not be able to grab the cache line exclusively until the clear_bit is done.

So I think the above sequence of events is completely impossible. Once a clear_bit starts, a write by another foreign agent on the bus is absolutely impossible to legally occur until the clear_bit completes.

I think this is a non-issue.
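For readers who want to see the two operations side by side, here is a small C11 sketch using <stdatomic.h>. It is a userspace analogy only: demo_clear_bit() and demo_reset_flags() are hypothetical names, DEMO_EVENT_RX_KILL is an arbitrary bit number, and the sketch does not reproduce the kernel's clear_bit() implementation. Whether the kernel primitive gives the same guarantee against a plain store on every architecture is exactly what is being debated above, and this sketch does not settle it.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_EVENT_RX_KILL 10	/* hypothetical bit number */

/*
 * Clear bit `nr` with a single atomic read-modify-write, so the
 * read, the mask and the write-back cannot be split apart by
 * another atomic access to the same object.
 */
void demo_clear_bit(unsigned int nr, atomic_ulong *flags)
{
	assert(nr < sizeof(unsigned long) * 8);
	atomic_fetch_and(flags, ~(1UL << nr));
}

/* Reset all flags with one atomic store. */
void demo_reset_flags(atomic_ulong *flags)
{
	atomic_store(flags, 0UL);
}
```

In this model both accesses go through the atomic object, so the store is ordered either wholly before or wholly after the fetch-and; in both orders the cleared bit stays clear.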
Re: [BUG] arm: kgdb: patch_text() in kgdb_arch_set_breakpoint() may sleep
On Sun, Aug 23, 2015 at 7:45 PM, Doug Anderson diand...@chromium.org wrote: On Wed, Aug 5, 2015 at 8:50 AM, Aapo Vienamo avien...@nvidia.com wrote: Hi, The breakpoint setting code in arch/arm/kernel/kgdb.c calls patch_text(), which ends up trying to sleep while in interrupt context. The bug was introduced by commit: 23a4e40 arm: kgdb: Handle read-only text / modules. The resulting behavior is BUG: scheduling while atomic... when setting a breakpoint in kgdb. This was tested on an Nvidia Jetson TK1 board with 4.2.0-rc5-next-20150805 kernel. Regards, Aapo Vienamo Aapo, Including the stack trace with this would have been helpful, though it's not too hard to reproduce. Here it is: [ 416.510559] BUG: scheduling while atomic: swapper/0/0/0x00010007 [ 416.516554] Modules linked in: [ 416.519614] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc7-00133-geb63b34 #1073 [ 416.527341] Hardware name: Rockchip (Device Tree) [ 416.532042] [c0017a4c] (unwind_backtrace) from [c00133d4] (show_stack+0x20/0x24) [ 416.539772] [c00133d4] (show_stack) from [c05400e8] (dump_stack+0x84/0xb8) [ 416.546983] [c05400e8] (dump_stack) from [c004913c] (__schedule_bug+0x54/0x6c) [ 416.554540] [c004913c] (__schedule_bug) from [c054065c] (__schedule+0x80/0x668) [ 416.562183] [c054065c] (__schedule) from [c0540cfc] (schedule+0xb8/0xd4) [ 416.569219] [c0540cfc] (schedule) from [c0543a3c] (schedule_timeout+0x2c/0x234) [ 416.576861] [c0543a3c] (schedule_timeout) from [c05417c0] (wait_for_common+0xf4/0x188) [ 416.585109] [c05417c0] (wait_for_common) from [c0541874] (wait_for_completion+0x20/0x24) [ 416.593531] [c0541874] (wait_for_completion) from [c00a0104] (__stop_cpus+0x58/0x70) [ 416.601608] [c00a0104] (__stop_cpus) from [c00a0580] (stop_cpus+0x3c/0x54) [ 416.608817] [c00a0580] (stop_cpus) from [c00a06c4] (__stop_machine+0xcc/0xe8) [ 416.616286] [c00a06c4] (__stop_machine) from [c00a0714] (stop_machine+0x34/0x44) [ 416.624016] [c00a0714] (stop_machine) from [c00173e8] (patch_text+0x28/0x34) [ 
416.631399] [c00173e8] (patch_text) from [c001733c] (kgdb_arch_set_breakpoint+0x40/0x4c) [ 416.639823] [c001733c] (kgdb_arch_set_breakpoint) from [c00a0d68] (kgdb_validate_break_address+0x2c/0x60) [ 416.649719] [c00a0d68] (kgdb_validate_break_address) from [c00a0e90] (dbg_set_sw_break+0x1c/0xdc) [ 416.658922] [c00a0e90] (dbg_set_sw_break) from [c00a2e88] (gdb_serial_stub+0x9c4/0xba4) [ 416.667259] [c00a2e88] (gdb_serial_stub) from [c00a11cc] (kgdb_cpu_enter+0x1f8/0x60c) [ 416.675423] [c00a11cc] (kgdb_cpu_enter) from [c00a18cc] (kgdb_handle_exception+0x19c/0x1d0) [ 416.684106] [c00a18cc] (kgdb_handle_exception) from [c0016f7c] (kgdb_compiled_brk_fn+0x30/0x3c) [ 416.693135] [c0016f7c] (kgdb_compiled_brk_fn) from [c00091a4] (do_undefinstr+0x1a4/0x20c) [ 416.701643] [c00091a4] (do_undefinstr) from [c001400c] (__und_svc_finish+0x0/0x34) [ 416.709543] Exception stack(0xc07c1ce8 to 0xc07c1d30) [ 416.714584] 1ce0: c07c6504 c086e290 c086e294 c086e294 c086e290 [ 416.722745] 1d00: c07c6504 0067 0001 c07c2100 0027 c07c1d4c c07c1d50 c07c1d30 [ 416.730905] 1d20: c00a0990 c00a08d0 6193 [ 416.735947] [c001400c] (__und_svc_finish) from [c00a08d0] (kgdb_breakpoint+0x58/0x94) [ 416.744110] [c00a08d0] (kgdb_breakpoint) from [c00a0990] (sysrq_handle_dbg+0x58/0x6c) [ 416.752273] [c00a0990] (sysrq_handle_dbg) from [c02c230c] (__handle_sysrq+0xac/0x15c) [ 416.760437] [c02c230c] (__handle_sysrq) from [c02c23ec] (handle_sysrq+0x30/0x34) Kees: I think you've dealt with a lot more of these types of issues than I have. Any quick thoughts? If not I can put it on my long-term list of things to do, but until then we could always just post a Revert... I don't think a revert is in order here. CONFIG_DEBUG_RODATA could be turned off for builds where you need kgdb while this bug gets found. I don't actually see where we've gone wrong, though. Looks like scheduling happened while waiting for CPUs to stop? Where did we enter atomic? 
Perhaps we need to test if we're already atomic in patch_text, and only
call stop_machine if we need to? Untested (and likely mangled by gmail):

diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 69bda1a5707e..855696bfe072 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -124,5 +124,8 @@ void __kprobes patch_text(void *addr, unsigned int insn)
 		.insn = insn,
 	};
 
-	stop_machine(patch_text_stop_machine, &patch, NULL);
+	if (unlikely(in_atomic_preempt_off()))
+		patch_text_stop_machine(&patch);
+	else
+		stop_machine(patch_text_stop_machine, &patch, NULL);
 }

-Kees

--
Kees Cook
Chrome OS Security
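For readers who want to see the shape of that dispatch outside the kernel, here is a minimal user-space sketch. Everything in it is a hypothetical stand-in: `atomic_ctx` models the result of in_atomic_preempt_off(), `fake_stop_machine()` models stop_machine(), and `apply_patch()` models patch_text_stop_machine(). It only illustrates the control flow of the untested hunk and says nothing about whether patching directly is safe on SMP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of this is the kernel's API. */
struct patch {
	uint32_t *addr;
	uint32_t insn;
};

static bool atomic_ctx;        /* models in_atomic_preempt_off() */
static int stop_machine_calls; /* counts trips through the sleeping path */

/* Models patch_text_stop_machine(): performs the actual write. */
static int apply_patch(void *data)
{
	struct patch *p = data;

	*p->addr = p->insn;
	return 0;
}

/* Models stop_machine(): may sleep, so it must not run in atomic context. */
static int fake_stop_machine(int (*fn)(void *), void *data)
{
	stop_machine_calls++;
	return fn(data);
}

/* Dispatch as in the proposed hunk: patch directly when already atomic. */
static void patch_text_sketch(uint32_t *addr, uint32_t insn)
{
	struct patch p = { .addr = addr, .insn = insn };

	assert(addr != NULL);

	if (atomic_ctx)
		apply_patch(&p);
	else
		fake_stop_machine(apply_patch, &p);
}
```

With `atomic_ctx` set, `patch_text_sketch()` writes the instruction without touching the sleeping path, which is the behaviour the kgdb breakpoint code would need.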
Re: [PATCH linux-next v4 5/5] mtd: atmel-quadspi: add driver for Atmel QSPI controller
On Monday, August 24, 2015 at 07:04:38 PM, Cyrille Pitchen wrote:
> Hi Marek,

Hi!

> On 24/08/2015 13:03, Marek Vasut wrote:
> > On Monday, August 24, 2015 at 12:14:00 PM, Cyrille Pitchen wrote:
> > > This driver add support to the new Atmel QSPI controller embedded
> > > into sama5d2x SoCs. It expects a NOR memory to be connected to the
> > > QSPI controller.

[...]

> > > +	/* Compute address parameters */
> > > +	switch (cmd->enable.bits.address) {
> > > +	case 4:
> > > +		ifr |= QSPI_IFR_ADDRL;
> > > +		/*break;*/ /* fallback to the 24bit address case */
> >
> > What's this commented out bit of code for ? :-)
>
> I just wanted to stress out there was no missing break;. I've reworded
> the comment to:
> /* No break on purpose: fallback to the 24bit address case. */

Oh, the address is in bytes. I see, yes, it makes sense to be more
explicit here about the purpose of the fallback. I think this change in
the comment will make it easier for everyone who comes back in a few
years and reads this code.

> > > +	case 3:
> > > +		iar = (cmd->enable.bits.data) ? 0 : cmd->address;
> > > +		ifr |= QSPI_IFR_ADDREN;
> > > +		break;
> > > +	case 0:
> > > +		break;
> > > +	default:
> > > +		return -EINVAL;
> > > +	}

[...]

> > > +no_data:
> > > +	/* Poll INSTRuction End status */
> > > +	sr = qspi_readl(aq, QSPI_SR);
> > > +	if (sr & QSPI_SR_INSTRE)
> > > +		return err;
> > > +
> > > +	/* Wait for INSTRuction End interrupt */
> > > +	init_completion(&aq->completion);
> >
> > You should use reinit_completion() in the code. init_completion()
> > should be used only in the probe() function and nowhere else.
>
> Alright. In the next version I'll rename the completion member of
> struct atmel_qspi into cmd_completion. Also I'll add another
> dma_completion member in this very same structure to replace the local
> struct completion completion in atmel_qspi_run_dma_transfer().
>
> Then I'll call init_completion() on both cmd_completion and
> dma_completion only from atmel_qspi_probe() and reinit_completion()
> elsewhere.
>
> > > +	aq->pending = 0;
> > > +	qspi_writel(aq, QSPI_IER, QSPI_SR_INSTRE);
> > > +	if (!wait_for_completion_timeout(&aq->completion,
> > > +					 msecs_to_jiffies(1000)))
> > > +		err = -ETIMEDOUT;
> > > +	qspi_writel(aq, QSPI_IDR, QSPI_SR_INSTRE);
> > > +
> > > +	return err;
> > > +}

[...]

> > Hope this helps :)
>
> Indeed, it does! I still work on the next version of this series to
> take all your comments into account.

Thanks :)
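The deliberate fall-through discussed in this thread can be shown in isolation. The following is a minimal stand-alone sketch, not the driver's code: `sketch_addr_flags()` and the two `SKETCH_IFR_*` bit values are made up for illustration, and only the shape of the switch (4-byte case falling into the 3-byte case, with an explicit comment) follows the quoted hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SKETCH_IFR_ADDREN (1u << 5)
#define SKETCH_IFR_ADDRL  (1u << 10)

/*
 * addr_len is the address length in bytes (0, 3 or 4).
 * Returns 0 and ORs the needed flags into *ifr, or -EINVAL otherwise.
 */
static int sketch_addr_flags(unsigned int addr_len, uint32_t *ifr)
{
	assert(ifr != NULL);

	switch (addr_len) {
	case 4:
		*ifr |= SKETCH_IFR_ADDRL;
		/* No break on purpose: continue with the 24-bit address case. */
		/* fall through */
	case 3:
		*ifr |= SKETCH_IFR_ADDREN;
		break;
	case 0:
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

A 4-byte address therefore sets both flags, a 3-byte address sets only the enable flag, and any other non-zero length is rejected.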
Re: [PATCH v2 5/5] arm64: add KASan support
On Mon, Aug 24, 2015 at 05:15:22PM +0300, Andrey Ryabinin wrote:
> Yes, ~130Mb (3G/1G split) should work. 512Mb shadow is optional.
> The only advantage of 512Mb shadow is better handling of user memory
> accesses bugs (access to user memory without
> copy_from_user/copy_to_user/strlen_user etc API).

No need for that to be handled by KASan. I have patches in linux-next,
now acked by Will, which prevent the kernel accessing userspace with
zero memory footprint. No need for remapping, we have a way to quickly
turn off access to userspace mapped pages on non-LPAE 32-bit CPUs.
(LPAE is not supported yet - Catalin will be working on that using the
hooks I'm providing once he returns.)

This isn't a debugging thing, it's a security hardening thing. Some
use-after-free bugs are potentially exploitable from userspace. See the
recent blackhat conference paper.

--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH v7 8/8] clocksource: simplify ACPI code in arm_arch_timer.c
On Tue, 25 Aug 2015, fu@linaro.org wrote:

You Cc the world and some more on your patch, but you fail to add the
maintainers of the clocksource code to the Cc list. Sigh.

> From: Fu Wei fu@linaro.org
>
> The patch update arm_arch_timer driver to use the function provided by
> the new GTDT driver of ACPI. By this way, arm_arch_timer.c can be
> simplified, and separate all the ACPI GTDT knowledge from this timer
> driver.

That's not a proper changelog and this patch wants to be split in two:

 1) Implement the new ACPI function
 2) Make use of it

> index 0aa135d..99505bb 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -817,68 +817,30 @@ CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
>  		       arch_timer_mem_init);
>
>  #ifdef CONFIG_ACPI
> -static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
> -{
> -	int trigger, polarity;
> -
> -	if (!interrupt)
> -		return 0;
> -
> -	trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
> -			: ACPI_LEVEL_SENSITIVE;
> -
> -	polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
> -			: ACPI_ACTIVE_HIGH;
> -
> -	return acpi_register_gsi(NULL, interrupt, trigger, polarity);
> -}
> -
>  /* Initialize per-processor generic timer */
> -static int __init arch_timer_acpi_init(struct acpi_table_header *table)
> +void __init arch_timer_acpi_init(void)
>  {

And how is that supposed to work when we have next generation CPUs
which implement a different timer? You break multisystem kernels that
way.

Thanks,

	tglx
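As background for the helper being moved in the quoted hunk: map_generic_timer_interrupt() turns two GTDT flag bits into a trigger type and a polarity before registering the GSI. Below is a small stand-alone sketch of just that decoding step. The `SK_*` names and their values are local stand-ins assumed for this example, not the definitions from the ACPI headers, and the GSI registration itself is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins; names and bit values are assumptions for this sketch. */
#define SK_GTDT_INTERRUPT_MODE     (1u << 0)
#define SK_GTDT_INTERRUPT_POLARITY (1u << 1)

enum sk_trigger  { SK_LEVEL_SENSITIVE, SK_EDGE_SENSITIVE };
enum sk_polarity { SK_ACTIVE_HIGH, SK_ACTIVE_LOW };

struct sk_irq_cfg {
	enum sk_trigger trigger;
	enum sk_polarity polarity;
};

/* Same two ternaries as the quoted helper, minus the GSI registration. */
static struct sk_irq_cfg sk_decode_gtdt_flags(uint32_t flags)
{
	struct sk_irq_cfg cfg;

	cfg.trigger = (flags & SK_GTDT_INTERRUPT_MODE) ? SK_EDGE_SENSITIVE
						       : SK_LEVEL_SENSITIVE;
	cfg.polarity = (flags & SK_GTDT_INTERRUPT_POLARITY) ? SK_ACTIVE_LOW
							    : SK_ACTIVE_HIGH;
	return cfg;
}

/* Usage example: flags of 0 describe a level-triggered, active-high line. */
static void sk_decode_example(void)
{
	struct sk_irq_cfg cfg = sk_decode_gtdt_flags(0);

	assert(cfg.trigger == SK_LEVEL_SENSITIVE);
	assert(cfg.polarity == SK_ACTIVE_HIGH);
}
```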
Re: [PATCH v2] Input: elan_i2c - enable ELAN0100 acpi panels
On Sat, Aug 22, 2015 at 09:37:52AM +0200, Michele Curti wrote: Enable ELAN0100 touchpad driver, found on a Asus X205TA laptop, to gai 2,3 fingers tap and 2 fingers scroll. Signed-off-by: Michele Curti michele.cu...@gmail.com Applied, thank you (Duson, I put you as 'reviewed-by'). --- drivers/hid/hid-core.c | 1 + drivers/input/mouse/elan_i2c_core.c | 4 2 files changed, 5 insertions(+) diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index 22afab9..70a11ac 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -2294,6 +2294,7 @@ static const struct hid_device_id hid_ignore_list[] = { { HID_USB_DEVICE(USB_VENDOR_ID_DREAM_CHEEKY, 0x0004) }, { HID_USB_DEVICE(USB_VENDOR_ID_DREAM_CHEEKY, 0x000a) }, { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, 0x0400) }, + { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, 0x0401) }, { HID_USB_DEVICE(USB_VENDOR_ID_ESSENTIAL_REALITY, USB_DEVICE_ID_ESSENTIAL_REALITY_P5) }, { HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC5UH) }, { HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC4UM) }, diff --git a/drivers/input/mouse/elan_i2c_core.c b/drivers/input/mouse/elan_i2c_core.c index 67388f4..bbdaedc 100644 --- a/drivers/input/mouse/elan_i2c_core.c +++ b/drivers/input/mouse/elan_i2c_core.c @@ -98,6 +98,9 @@ static int elan_get_fwinfo(u8 ic_type, u16 *vaildpage_count, u16 *signature_address) { switch(ic_type) { + case 0x08: + *vaildpage_count = 512; + break; case 0x09: *vaildpage_count = 768; break; @@ -1165,6 +1168,7 @@ MODULE_DEVICE_TABLE(i2c, elan_id); #ifdef CONFIG_ACPI static const struct acpi_device_id elan_acpi_id[] = { { ELAN, 0 }, + { ELAN0100, 0 }, { ELAN0600, 0 }, { } }; -- 2.5.0 -- Dmitry -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH linux-next v4 3/5] mtd: spi-nor: allow to tune the number of dummy cycles
On Monday, August 24, 2015 at 06:42:46 PM, Cyrille Pitchen wrote: Hi Marek, Hi! [...] - * Dummy Cycle calculation for different type of read. - * It can be used to support more commands with - * different dummy cycle requirements. - */ -static inline int spi_nor_read_dummy_cycles(struct spi_nor *nor) -{ - switch (nor-flash_read) { - case SPI_NOR_FAST: - case SPI_NOR_DUAL: - case SPI_NOR_QUAD: - return 8; - case SPI_NOR_NORMAL: - return 0; - } - return 0; -} You can probably just soup up this function so that it sets the nor-read_dummy, no ? Actually, this is what the patch does: spi_nor_read_dummy_cycles() was reused and enhanced few lines below where you've pointed out the switch (nor-flash_read) block should be move after the else block. You know what? I'll go get some sleep, coffee doesn't cut it anymore :) I think when I wrote the code I've chosen to move the definition of this function instead of adding forward declarations of functions such as read_cr() or write_sr_cr(), which are now called by micron_set_dummy_cycles(). Yep, that's all right, sorry for the confusion. -/* * Write status register 1 byte * Returns negative if error occurred. */ @@ -1012,6 +994,81 @@ static int set_quad_mode(struct spi_nor *nor, struct flash_info *info) } } [...] +/* + * Dummy Cycle calculation for different type of read. + * It can be used to support more commands with + * different dummy cycle requirements. + */ +static int spi_nor_read_dummy_cycles(struct spi_nor *nor, + const struct flash_info *info) +{ + struct device_node *np = nor-dev-of_node; + u32 num_dummy_cycles; + + if (np !of_property_read_u32(np, m25p,num-dummy-cycles, + num_dummy_cycles)) { + nor-read_dummy = num_dummy_cycles; + + /* + * This switch block might be moved after the if...then...else + * statement but it was not tested with all Spansion or Micron + * memories. + * Now the m25p,num-dummy-cycles property needs to be + * explicitly set in the device tree so the switch statement is + * executed. 
This should avoid unwanted side effects and keep + * backward compatibility. + */ + switch (JEDEC_MFR(info)) { + case CFI_MFR_ST: + return micron_set_dummy_cycles(nor); + default: If you do have m25p,num-dummy-cycles set for non-micron flash, you have a problem here I believe. + break; + } + } else { The solution would be to drop this else {} bit here, so that if you fail in the DT-based configuration, you fall back to this old behavior. What do you think please ? :) Good idea! I also add a trace for the default case of switch (JEDEC_MFR(info)): dev_warn(dev, can't set the number of dummy cycles\n); Maybe change this to setting the number of dummy cycles not supported by chip, ignoring or something, to be explicit about the fallback and that this is not supported by the chip. But this is just an idea, feel free to ignore it. So the user is notified that the driver could not use the value of m25p,num-dummy-cycles from the DT before falling back to the legacy code. Yup. + switch (nor-flash_read) { + case SPI_NOR_FAST: + case SPI_NOR_DUAL: + case SPI_NOR_QUAD: + nor-read_dummy = 8; + case SPI_NOR_NORMAL: + nor-read_dummy = 0; + } + } + + return 0; +} [...] thanks for the review! Im glad it helped ;-) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
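The fallback agreed on above (try the device tree value first, and drop to the legacy defaults whenever that path cannot be used) can be sketched in a few lines of stand-alone C. This is only an illustration under assumptions: `struct nor_sketch`, `setup_dummy_cycles()` and the `chip_can_set_dummy` flag are hypothetical, the flag stands in for the manufacturer check, and the warning message is omitted. Note also that the legacy switch as quoted in this message has no break after the assignment of 8, so fast/dual/quad reads would end up with 0; the sketch returns the value from a helper to sidestep that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum read_mode { READ_NORMAL, READ_FAST, READ_DUAL, READ_QUAD };

/* Hypothetical reduced view of the flash state, for this sketch only. */
struct nor_sketch {
	enum read_mode flash_read;
	unsigned int read_dummy;
	bool chip_can_set_dummy; /* stands in for the manufacturer check */
};

/* Legacy defaults: 8 dummy cycles for fast/dual/quad reads, else 0. */
static unsigned int legacy_dummy_cycles(enum read_mode mode)
{
	switch (mode) {
	case READ_FAST:
	case READ_DUAL:
	case READ_QUAD:
		return 8;
	case READ_NORMAL:
	default:
		return 0;
	}
}

/*
 * has_dt_value/dt_value model the optional device tree property.
 * Returns true when the device tree value was applied.
 */
static bool setup_dummy_cycles(struct nor_sketch *nor, bool has_dt_value,
			       unsigned int dt_value)
{
	assert(nor != NULL);

	if (has_dt_value && nor->chip_can_set_dummy) {
		nor->read_dummy = dt_value;
		return true;
	}

	/* No else branch: any failure above lands on the legacy path. */
	nor->read_dummy = legacy_dummy_cycles(nor->flash_read);
	return false;
}
```

The boolean return value is where a driver could hang the suggested warning, so the user learns that the device tree value was ignored.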
Re: Next round: revised futex(2) man page for review
On Thu, Aug 20, 2015 at 12:40:46AM +0200, Thomas Gleixner wrote:
> On Wed, 5 Aug 2015, Darren Hart wrote:
> > On Mon, Jul 27, 2015 at 02:07:15PM +0200, Michael Kerrisk (man-pages) wrote:
> > > .\" FIXME XXX = Start of adapted Hart/Guniguntala text =
> > > .\" The following text is drawn from the Hart/Guniguntala paper
> > > .\" (listed in SEE ALSO), but I have reworded some pieces
> > > .\" significantly. Please check it.
> > >
> > > The PI futex operations described below differ from the other
> > > futex operations in that they impose policy on the use of the
> > > value of the futex word:
> > >
> > > * If the lock is not acquired, the futex word's value shall be 0.
> > >
> > > * If the lock is acquired, the futex word's value shall be the
> > >   thread ID (TID; see gettid(2)) of the owning thread.
> > >
> > > * If the lock is owned and there are threads contending for the
> > >   lock, then the FUTEX_WAITERS bit shall be set in the futex
> > >   word's value; in other words, this value is:
> > >
> > >       FUTEX_WAITERS | TID
> > >
> > > Note that a PI futex word never just has the value FUTEX_WAITERS,
> > > which is a permissible state for non-PI futexes.
> >
> > The second clause is inappropriate. I don't know if that was yours or
> > mine, but non-PI futexes do not have a kernel defined value policy, so
> > ==FUTEX_WAITERS cannot be a permissible state as any value is
> > permissible for non-PI futexes, and none have a kernel defined state.
>
> Depends. If the regular futex is configured as robust, then we have a
> kernel defined value policy as well.

Indeed, thanks for catching that.

--
Darren Hart
Intel Open Source Technology Center
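The three value rules quoted from the draft man page can be written down as small predicates. This is a sketch only: the `SK_` constants are local copies of the FUTEX_WAITERS and FUTEX_TID_MASK values from <linux/futex.h> so the example builds anywhere, the helper names are invented here, and the FUTEX_OWNER_DIED bit used with robust futexes is deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of values from <linux/futex.h>, so the sketch builds anywhere. */
#define SK_FUTEX_WAITERS  0x80000000u
#define SK_FUTEX_TID_MASK 0x3fffffffu

/* Rule 1: an unlocked PI futex word is exactly 0. */
static bool pi_word_is_unlocked(uint32_t word)
{
	return word == 0;
}

/* Rule 2: the owner's TID lives in the low bits of the word. */
static uint32_t pi_word_owner(uint32_t word)
{
	return word & SK_FUTEX_TID_MASK;
}

/* Rule 3: contention is signalled by FUTEX_WAITERS on top of the TID. */
static bool pi_word_has_waiters(uint32_t word)
{
	return (word & SK_FUTEX_WAITERS) != 0;
}

/* The word is 0, TID, or FUTEX_WAITERS | TID; never bare FUTEX_WAITERS. */
static bool pi_word_follows_policy(uint32_t word)
{
	return pi_word_is_unlocked(word) || pi_word_owner(word) != 0;
}

/* Usage example with a made-up TID. */
static void pi_word_example(void)
{
	uint32_t contended = SK_FUTEX_WAITERS | 1234u;

	assert(pi_word_owner(contended) == 1234u);
	assert(pi_word_has_waiters(contended));
	assert(!pi_word_follows_policy(SK_FUTEX_WAITERS));
}
```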
Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency
Hi, Ping. Does anyone have any comments or suggestions? Thanks, Suravee On 8/13/15 16:58, Suravee Suthikulpanit wrote: This patch refactors of_pci_dma_configure() into a more generic pci_dma_configure(), which can be reused by non-OF code. Then, it adds support for setting up PCI device DMA coherency from ACPI _CCA object that should normally be specified in the DSDT node of its PCI host bridge.. Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com CC: Bjorn Helgaas bhelg...@google.com CC: Catalin Marinas catalin.mari...@arm.com CC: Will Deacon will.dea...@arm.com CC: Rafael J. Wysocki r...@rjwysocki.net CC: Rob Herring robh...@kernel.org CC: Murali Karicheri m-kariche...@ti.com --- Note: According to the ACPI spec, the _CCA attribute is required for ARM64. Therefore, this patch is a pre-req for ACPI PCI support for ARM64 which is currently in development. Also, this should not affect other architectures since if CCA is not required, the default value is coherent. Please see include/acpi/acpi_bus.h: acpi_check_dma() and drivers/acpi/scan.c: acpi_init_coherency() for more information drivers/of/of_pci.c| 20 drivers/pci/probe.c| 35 +-- include/linux/of_pci.h | 3 --- 3 files changed, 33 insertions(+), 25 deletions(-) diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c index 5751dc5..b66ee4e 100644 --- a/drivers/of/of_pci.c +++ b/drivers/of/of_pci.c @@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node) } EXPORT_SYMBOL_GPL(of_get_pci_domain_nr); -/** - * of_pci_dma_configure - Setup DMA configuration - * @dev: ptr to pci_dev struct of the PCI device - * - * Function to update PCI devices's DMA configuration using the same - * info from the OF node of host bridge's parent (if any). 
- */ -void of_pci_dma_configure(struct pci_dev *pci_dev) -{ - struct device *dev = pci_dev-dev; - struct device *bridge = pci_get_host_bridge_device(pci_dev); - - if (!bridge-parent) - return; - - of_dma_configure(dev, bridge-parent-of_node); - pci_put_host_bridge_device(bridge); -} -EXPORT_SYMBOL_GPL(of_pci_dma_configure); - #if defined(CONFIG_OF_ADDRESS) /** * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index cefd636..e2fcd3b 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -6,12 +6,14 @@ #include linux/delay.h #include linux/init.h #include linux/pci.h -#include linux/of_pci.h +#include linux/of_device.h #include linux/pci_hotplug.h #include linux/slab.h #include linux/module.h #include linux/cpumask.h #include linux/pci-aspm.h +#include linux/acpi.h +#include linux/property.h #include asm-generic/pci-bridge.h #include pci.h @@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev *dev) pci_enable_acs(dev); } +/** + * pci_dma_configure - Setup DMA configuration + * @pci_dev: ptr to pci_dev struct of the PCI device + * + * Function to update PCI devices's DMA configuration using the same + * info from the OF node or ACPI node of host bridge's parent (if any). 
+ */ +static void pci_dma_configure(struct pci_dev *pci_dev) +{ + struct device *dev = pci_dev-dev; + struct device *bridge = pci_get_host_bridge_device(pci_dev); + struct acpi_device *adev; + bool coherent; + + if (has_acpi_companion(bridge)) { + adev = to_acpi_node(bridge-fwnode); + if (acpi_check_dma(adev, coherent)) + arch_setup_dma_ops(dev, 0, 0, NULL, coherent); + } else { + struct device *host = bridge-parent; + if (!host) + return; + + of_dma_configure(dev, host-of_node); + } + + pci_put_host_bridge_device(bridge); +} + void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) { int ret; @@ -1557,7 +1588,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) dev-dev.dma_mask = dev-dma_mask; dev-dev.dma_parms = dev-dma_parms; dev-dev.coherent_dma_mask = 0xull; - of_pci_dma_configure(dev); + pci_dma_configure(dev); pci_set_dma_max_seg_size(dev, 65536); pci_set_dma_seg_boundary(dev, 0x); diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h index 29fd3fe..ce0e5ab 100644 --- a/include/linux/of_pci.h +++ b/include/linux/of_pci.h @@ -16,7 +16,6 @@ int of_pci_get_devfn(struct device_node *np); int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin); int of_pci_parse_bus_range(struct device_node *node, struct resource *res); int of_get_pci_domain_nr(struct device_node *node); -void of_pci_dma_configure(struct pci_dev *pci_dev); #else static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq) { @@
Re: [PATCH 1/3] KVM: make halt_poll_ns per-VCPU
On Mon, Aug 24, 2015 at 5:53 AM, Wanpeng Li wanpeng...@hotmail.com wrote:
> Change halt_poll_ns into per-VCPU variable, seeded from module parameter,
> to allow greater flexibility.

You should also change kvm_vcpu_block to read halt_poll_ns from the
vcpu instead of the module parameter.

> Signed-off-by: Wanpeng Li wanpeng...@hotmail.com
> ---
>  include/linux/kvm_host.h | 1 +
>  virt/kvm/kvm_main.c      | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 81089cf..1bef9e2 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -242,6 +242,7 @@ struct kvm_vcpu {
>  	int sigset_active;
>  	sigset_t sigset;
>  	struct kvm_vcpu_stat stat;
> +	unsigned int halt_poll_ns;
>
>  #ifdef CONFIG_HAS_IOMEM
>  	int mmio_needed;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d8db2f8f..a122b52 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -217,6 +217,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>  	vcpu->kvm = kvm;
>  	vcpu->vcpu_id = id;
>  	vcpu->pid = NULL;
> +	vcpu->halt_poll_ns = halt_poll_ns;
>  	init_waitqueue_head(&vcpu->wq);
>  	kvm_async_pf_vcpu_init(vcpu);
>
> --
> 1.9.1
Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
24.08.2015 16:29, Bjørn Mork пишет: Eugene Shatokhin eugene.shatok...@rosalab.ru writes: 19.08.2015 15:31, Bjørn Mork пишет: Eugene Shatokhin eugene.shatok...@rosalab.ru writes: The problem is not in the reordering but rather in the fact that dev-flags = 0 is not necessarily atomic w.r.t. clear_bit(EVENT_RX_KILL, dev-flags), and vice versa. So the following might be possible, although unlikely: CPU0 CPU1 clear_bit: read dev-flags clear_bit: clear EVENT_RX_KILL in the read value dev-flags=0; clear_bit: write updated dev-flags As a result, dev-flags may become non-zero again. Ah, right. Thanks for explaining. I cannot prove yet that this is an impossible situation. If anyone can, please explain. If so, this part of the patch will not be needed. I wonder if we could simply move the dev-flags = 0 down a few lines to fix both issues? It doesn't seem to do anything useful except for resetting the flags to a sane initial state after the device is down. Stopping the tasklet rescheduling etc depends only on netif_running(), which will be false when usbnet_stop is called. There is no need to touch dev-flags for this to happen. That was one of the first ideas we discussed here. Unfortunately, it is probably not so simple. Setting dev-flags to 0 makes some delayed operations do nothing and, among other things, not to reschedule usbnet_bh(). Yes, but I believe that is merely a side effect. You should never need to clear multiple flags to get the desired behaviour. As you can see in drivers/net/usb/usbnet.c, usbnet_bh() can be called as a tasklet function and as a timer function in a number of situations (look for the usage of dev-bh and dev-delay there). netif_running() is indeed false when usbnet_stop() runs, usbnet_stop() also disables Tx. This seems to be enough for many cases where usbnet_bh() is scheduled, but I am not so sure about the remaining ones, namely: 1. A work function, usbnet_deferred_kevent(), may reschedule usbnet_bh(). 
Looks like the workqueue is only stopped in usbnet_disconnect(), so a work item might be processed while usbnet_stop() works. Setting dev-flags to 0 makes the work function do nothing, by the way. See also the comment in usbnet_stop() about this. A work item may be placed to this workqueue in a number of ways, by both usbnet module and the mini-drivers. It is not too easy to track all these situations. That's an understatement :) 2. rx_complete() and tx_complete() may schedule execution of usbnet_bh() as a tasklet or a timer function. These two are URB completion callbacks. It seems, new Rx and Tx URBs cannot be submitted when usbnet_stop() clears dev-flags, indeed. But it does not prevent the completion handlers for the previously submitted URBs from running concurrently with usbnet_stop(). The latter waits for them to complete (via usbnet_terminate_urbs(dev)) but only if FLAG_AVOID_UNLINK_URBS is not set in info-flags. rndis_wlan, however, sets this flag for a few hardware models. So - no guarantees here as well. FLAG_AVOID_UNLINK_URBS looks like it should be replaced by the newer ability to keep the status urb active. I believe that must have been the real reason for adding it, based on the commit message and the effect the flag will have: commit 1487cd5e76337555737cbc55d7d83f41460d198f Author: Jussi Kivilinna jussi.kivili...@mbnet.fi Date: Thu Jul 30 19:41:20 2009 +0300 usbnet: allow minidriver to prevent urb unlinking on usbnet_stop rndis_wlan devices freeze after running usbnet_stop several times. It appears that firmware freezes in state where it does not respond to any RNDIS commands and device have to be physically unplugged/replugged. This patch lets minidrivers to disable unlink_urbs on usbnet_stop through new info flag. Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi Cc: David Brownell dbrown...@users.sourceforge.net Signed-off-by: John W. 
Linville linvi...@tuxdriver.com The rx urbs will not be resubmitted in any case, and there are of course no tx urbs being submitted. So the only effect of this flag is on the status/interrupt urb, which I can imagine some RNDIS devices wants active all the time. So FLAG_AVOID_UNLINK_URBS should probably be removed and replaced calls to usbnet_status_start() and usbnet_status_stop(). This will require testing on some of the devices with the original firmware problem however. In any case: I do not think this flag should be considered when trying to make usbnet_stop behaviour saner. It's only purpose is to deliberately break usbnet_stop by not actually stopping. If someone could list the particular bits of dev-flags that should be cleared to make sure no deferred call could reschedule usbnet_bh(), etc... Well, it would be enough to clear these first and use dev-flags = 0 later, after tasklet_kill() and del_timer_sync(). I cannot point out these particular bits
Re: [PATCH 2/3] KVM: dynamise halt_poll_ns adjustment
On Mon, Aug 24, 2015 at 5:53 AM, Wanpeng Li wanpeng...@hotmail.com wrote:
> There are two new kernel parameters for changing the halt_poll_ns:
> halt_poll_ns_grow and halt_poll_ns_shrink. halt_poll_ns_grow affects
> halt_poll_ns when an interrupt arrives and halt_poll_ns_shrink
> does it when idle VCPU is detected.
>
>   halt_poll_ns_shrink/  |
>   halt_poll_ns_grow     | interrupt arrives    | idle VCPU is detected
>   ----------------------+----------------------+-----------------------
>   < 1                   |  = halt_poll_ns      |  = 0
>   < halt_poll_ns        | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
>   otherwise             | += halt_poll_ns_grow | -= halt_poll_ns_shrink
>
> A third new parameter, halt_poll_ns_max, controls the maximal
> halt_poll_ns; it is internally rounded down to a closest multiple of
> halt_poll_ns_grow.

I like the idea of growing and shrinking halt_poll_ns, but I'm not sure
we grow and shrink in the right places here. For example, if
vcpu->halt_poll_ns gets down to 0, I don't see how it can then grow
back up.

This might work better:

  if (poll successfully for interrupt): stay the same
  else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): shrink
  else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow

where halt_poll_ns_max is something reasonable, like 2 milliseconds.

You get diminishing returns from halt polling as the length of the
halt gets longer (halt polling only reduces halt latency by 10-15 us).
So there's little benefit to polling longer than a few milliseconds.
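The three-branch policy suggested in this review can be sketched as a pure function, which makes the "how does it grow back from 0" question easy to check. This is a user-space illustration under stated assumptions, not KVM code: `sk_adjust_halt_poll_ns()` is a made-up name, the 2 ms cap follows the "something reasonable" remark above, and the restart value, the doubling factor and the choice to shrink straight to 0 are arbitrary picks for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; none of them come from the patch. */
#define SK_POLL_NS_MAX   2000000u /* 2 ms cap, as floated in the review */
#define SK_POLL_NS_START 10000u   /* assumed restart value when growing from 0 */
#define SK_GROW_FACTOR   2u       /* assumed multiplicative growth */

/*
 * poll_ns:        current per-vcpu polling window
 * poll_succeeded: the interrupt arrived while polling
 * block_ns:       how long the vcpu ended up blocked
 * Returns the polling window to use next time.
 */
static uint32_t sk_adjust_halt_poll_ns(uint32_t poll_ns, bool poll_succeeded,
				       uint64_t block_ns)
{
	uint64_t grown;

	if (poll_succeeded)
		return poll_ns;			/* stay the same */

	if (block_ns > SK_POLL_NS_MAX)
		return 0;			/* shrink: polling would not have helped */

	if (block_ns < SK_POLL_NS_MAX) {	/* grow, clamped to the cap */
		if (poll_ns == 0)
			return SK_POLL_NS_START;
		grown = (uint64_t)poll_ns * SK_GROW_FACTOR;
		return grown > SK_POLL_NS_MAX ? SK_POLL_NS_MAX : (uint32_t)grown;
	}

	return poll_ns;				/* block_ns == cap: no branch applies */
}

/* Usage example: a window that was shrunk to 0 can grow again. */
static void sk_adjust_example(void)
{
	assert(sk_adjust_halt_poll_ns(0, false, 100000) == SK_POLL_NS_START);
}
```

Because growth is driven by the length of the halt rather than by the current window, a window of 0 is not a dead end in this variant.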
Signed-off-by: Wanpeng Li wanpeng...@hotmail.com --- virt/kvm/kvm_main.c | 81 - 1 file changed, 80 insertions(+), 1 deletion(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index a122b52..bcfbd35 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -66,9 +66,28 @@ MODULE_AUTHOR(Qumranet); MODULE_LICENSE(GPL); -static unsigned int halt_poll_ns; +#define KVM_HALT_POLL_NS 50 +#define KVM_HALT_POLL_NS_GROW 2 +#define KVM_HALT_POLL_NS_SHRINK 0 +#define KVM_HALT_POLL_NS_MAX \ + INT_MAX / KVM_HALT_POLL_NS_GROW + +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS; module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR); +/* Default doubles per-vcpu halt_poll_ns. */ +static int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW; +module_param(halt_poll_ns_grow, int, S_IRUGO); + +/* Default resets per-vcpu halt_poll_ns . */ +int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK; +module_param(halt_poll_ns_shrink, int, S_IRUGO); + +/* Default is to compute the maximum so we can never overflow. 
*/ +unsigned int halt_poll_ns_actual_max = KVM_HALT_POLL_NS_MAX; +unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX; +module_param(halt_poll_ns_max, int, S_IRUGO); + /* * Ordering of locks: * @@ -1907,6 +1926,62 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn) } EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty); +static unsigned int __grow_halt_poll_ns(unsigned int val) +{ + if (halt_poll_ns_grow 1) + return halt_poll_ns; + + val = min(val, halt_poll_ns_actual_max); + + if (val == 0) + return halt_poll_ns; + + if (halt_poll_ns_grow halt_poll_ns) + val *= halt_poll_ns_grow; + else + val += halt_poll_ns_grow; + + return val; +} + +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum) +{ + if (modifier 1) + return 0; + + if (modifier halt_poll_ns) + val /= modifier; + else + val -= modifier; + + return val; +} + +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu) +{ + vcpu-halt_poll_ns = __grow_halt_poll_ns(vcpu-halt_poll_ns); +} + +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu) +{ + vcpu-halt_poll_ns = __shrink_halt_poll_ns(vcpu-halt_poll_ns, + halt_poll_ns_shrink, halt_poll_ns); +} + +/* + * halt_poll_ns_actual_max is computed to be one grow_halt_poll_ns() below + * halt_poll_ns_max. (See __grow_halt_poll_ns for the reason.) + * This prevents overflows, because ple_halt_poll_ns is int. + * halt_poll_ns_max effectively rounded down to a multiple of halt_poll_ns_grow in + * this process. + */ +static void update_halt_poll_ns_actual_max(void) +{ + halt_poll_ns_actual_max = + __shrink_halt_poll_ns(max(halt_poll_ns_max, halt_poll_ns), + halt_poll_ns_grow, INT_MIN); +} + static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) { if (kvm_arch_vcpu_runnable(vcpu)) { @@ -1941,6 +2016,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu) */ if (kvm_vcpu_check_block(vcpu) 0) { ++vcpu-stat.halt_successful_poll; +
[PATCH v7 6/8] ACPI: add GTDT table parse driver into ACPI driver
From: Fu Wei fu@linaro.org

This driver adds support for parsing SBSA Generic Watchdog Structure
in GTDT, and creating a platform device with that information.
This allows the operating system to obtain device data from the
resource of platform device. The platform device named sbsa-gwdt
can be used by the ARM SBSA Generic Watchdog driver.

Signed-off-by: Fu Wei fu@linaro.org
Signed-off-by: Hanjun Guo hanjun@linaro.org
---
 drivers/acpi/Kconfig  |   9
 drivers/acpi/Makefile |   1 +
 drivers/acpi/gtdt.c   | 135 ++
 3 files changed, 145 insertions(+)

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 114cf48..2e7e162 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -479,4 +479,13 @@ config XPOWER_PMIC_OPREGION
 
 endif
 
+config ACPI_GTDT
+	bool "ACPI GTDT Support"
+	depends on ARM64
+	help
+	  GTDT (Generic Timer Description Table) provides information
+	  for per-processor timers and Platform (memory-mapped) timers
+	  for ARM platforms. Select this option to provide information
+	  needed for the timers init.
+
 endif	# ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 8321430..9a7966e 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -93,5 +93,6 @@ obj-$(CONFIG_ACPI_EXTLOG)	+= acpi_extlog.o
 obj-$(CONFIG_PMIC_OPREGION)	+= pmic/intel_pmic.o
 obj-$(CONFIG_CRC_PMIC_OPREGION) += pmic/intel_pmic_crc.o
 obj-$(CONFIG_XPOWER_PMIC_OPREGION) += pmic/intel_pmic_xpower.o
+obj-$(CONFIG_ACPI_GTDT)	+= gtdt.o
 
 video-objs	+= acpi_video.o video_detect.o
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
new file mode 100644
index 000..bbe3a2e
--- /dev/null
+++ b/drivers/acpi/gtdt.c
@@ -0,0 +1,135 @@
+/*
+ * ARM Specific GTDT table Support
+ *
+ * Copyright (C) 2015, Linaro Ltd.
+ * Author: Fu Wei fu@linaro.org
+ *         Hanjun Guo hanjun@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
+{
+	int trigger, polarity;
+
+	if (!interrupt)
+		return 0;
+
+	trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
+			: ACPI_LEVEL_SENSITIVE;
+
+	polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
+			: ACPI_ACTIVE_HIGH;
+
+	return acpi_register_gsi(NULL, interrupt, trigger, polarity);
+}
+
+/*
+ * Initialize a SBSA generic Watchdog platform device info from GTDT
+ * According to SBSA specification the size of refresh and control
+ * frames of SBSA Generic Watchdog is SZ_4K(Offset 0x000 – 0xFFF).
+ */
+static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
+					int index)
+{
+	struct platform_device *pdev;
+	int irq = map_generic_timer_interrupt(wd->timer_interrupt,
+					      wd->timer_flags);
+	struct resource res[] = {
+		DEFINE_RES_IRQ(irq),
+		DEFINE_RES_MEM(wd->control_frame_address, SZ_4K),
+		DEFINE_RES_MEM(wd->refresh_frame_address, SZ_4K),
+	};
+
+	pr_debug("GTDT: a Watchdog GT(0x%llx/0x%llx gsi:%u flags:0x%x)\n",
+		 wd->refresh_frame_address, wd->control_frame_address,
+		 wd->timer_interrupt, wd->timer_flags);
+
+	if (!(wd->refresh_frame_address &&
+	      wd->control_frame_address &&
+	      wd->timer_interrupt)) {
+		pr_err("GTDT: failed geting the device info.\n");
+		return -EINVAL;
+	}
+
+	if (irq < 0) {
+		pr_err("GTDT: failed to register GSI of the Watchdog GT.\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Add a platform device named sbsa-gwdt to match the platform driver.
+	 * sbsa-gwdt: SBSA(Server Base System Architecture) Generic Watchdog
+	 * The platform driver (like drivers/watchdog/sbsa_gwdt.c)can get device
+	 * info below by matching this name.
+	 */
+	pdev = platform_device_register_simple("sbsa-gwdt", index, res,
+					       ARRAY_SIZE(res));
+	if (IS_ERR(pdev)) {
+		acpi_unregister_gsi(wd->timer_interrupt);
+		return PTR_ERR(pdev);
+	}
+
+	return 0;
+}
+
+static int __init gtdt_platform_timer_parse(struct acpi_table_header *table)
+{
+	struct acpi_gtdt_header *header;
+	struct acpi_table_gtdt *gtdt;
+	void *gtdt_subtable;
+	int i, gwdt_index;
+	int ret = 0;
+
+	if (table->revision < 2) {
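The flag handling in map_generic_timer_interrupt() can be looked at in isolation. What follows is a minimal user-space sketch, not kernel code: the GT_* names and bit positions are stand-ins that are assumed to mirror ACPI_GTDT_INTERRUPT_MODE and ACPI_GTDT_INTERRUPT_POLARITY, and gt_decode_flags() is a hypothetical helper that does not appear in the patch.

```c
#include <stdint.h>

/* Stand-in flag bits, assumed to mirror the GTDT interrupt flags. */
#define GT_FLAG_MODE		(1u << 0)	/* set: edge, clear: level */
#define GT_FLAG_POLARITY	(1u << 1)	/* set: active low, clear: active high */

enum gt_trigger  { GT_LEVEL = 0, GT_EDGE = 1 };
enum gt_polarity { GT_ACTIVE_HIGH = 0, GT_ACTIVE_LOW = 1 };

struct gt_irq_cfg {
	enum gt_trigger trigger;
	enum gt_polarity polarity;
};

/* Decode the two flag bits the same way the patch does before it
 * hands trigger and polarity to acpi_register_gsi(). */
static struct gt_irq_cfg gt_decode_flags(uint32_t flags)
{
	struct gt_irq_cfg cfg;

	cfg.trigger = (flags & GT_FLAG_MODE) ? GT_EDGE : GT_LEVEL;
	cfg.polarity = (flags & GT_FLAG_POLARITY) ? GT_ACTIVE_LOW
						   : GT_ACTIVE_HIGH;
	return cfg;
}
```

Bits other than the two tested ones are ignored by this decoding, which matches the masking in the patch.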
[PATCH v7 4/8] Watchdog: introduce pretimeout into framework
From: Fu Wei fu@linaro.org

Also update Documentation/watchdog/watchdog-kernel-api.txt to
introduce:
(1) the new elements in the watchdog_device and watchdog_ops struct;
(2) the new API watchdog_init_timeouts

Reasons:
(1) the kernel already has two watchdog drivers using pretimeout:
	drivers/char/ipmi/ipmi_watchdog.c
	drivers/watchdog/kempld_wdt.c (but the definition is different)
(2) some other drivers are going to use this: ARM SBSA Generic Watchdog

Signed-off-by: Fu Wei fu@linaro.org
---
 Documentation/watchdog/watchdog-kernel-api.txt | 47 ++--
 drivers/watchdog/watchdog_core.c               | 98 ++
 drivers/watchdog/watchdog_dev.c                | 53 ++
 include/linux/watchdog.h                       | 39 --
 4 files changed, 200 insertions(+), 37 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
index d8b0d33..1fadeb9 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -53,6 +53,9 @@ struct watchdog_device {
 	unsigned int timeout;
 	unsigned int min_timeout;
 	unsigned int max_timeout;
+	unsigned int pretimeout;
+	unsigned int min_pretimeout;
+	unsigned int max_pretimeout;
 	void *driver_data;
 	struct mutex lock;
 	unsigned long status;
@@ -75,6 +78,9 @@ It contains following fields:
 * timeout: the watchdog timer's timeout value (in seconds).
 * min_timeout: the watchdog timer's minimum timeout value (in seconds).
 * max_timeout: the watchdog timer's maximum timeout value (in seconds).
+* pretimeout: the watchdog timer's pretimeout value (in seconds).
+* min_pretimeout: the watchdog timer's minimum pretimeout value (in seconds).
+* max_pretimeout: the watchdog timer's maximum pretimeout value (in seconds).
 * bootstatus: status of the device after booting (reported with watchdog
   WDIOF_* status bits).
 * driver_data: a pointer to the drivers private data of a watchdog device.
@@ -99,6 +105,7 @@ struct watchdog_ops {
 	int (*ping)(struct watchdog_device *);
 	unsigned int (*status)(struct watchdog_device *);
 	int (*set_timeout)(struct watchdog_device *, unsigned int);
+	int (*set_pretimeout)(struct watchdog_device *, unsigned int);
 	unsigned int (*get_timeleft)(struct watchdog_device *);
 	void (*ref)(struct watchdog_device *);
 	void (*unref)(struct watchdog_device *);
@@ -160,9 +167,19 @@ they are supported. These optional routines/operations are:
   and -EIO for could not write value to the watchdog. On success this
   routine should set the timeout value of the watchdog_device to the
   achieved timeout value (which may be different from the requested one
-  because the watchdog does not necessarily has a 1 second resolution).
+  because the watchdog does not necessarily has a 1 second resolution;
+  If the driver supports pretimeout, then the timeout value must be greater
+  than that).
   (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
   watchdog's info structure).
+* set_pretimeout: this routine checks and changes the pretimeout of the
+  watchdog timer device. It returns 0 on success, -EINVAL for parameter out of
+  range and -EIO for could not write value to the watchdog. On success this
+  routine should set the pretimeout value of the watchdog_device to the
+  achieved pretimeout value (which may be different from the requested one
+  because the watchdog does not necessarily has a 1 second resolution).
+  (Note: the WDIOF_PRETIMEOUT needs to be set in the options field of the
+  watchdog's info structure).
 * get_timeleft: this routines returns the time that's left before a reset.
 * ref: the operation that calls kref_get on the kref of a dynamically
   allocated watchdog_device struct.
@@ -226,8 +243,28 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
                                   unsigned int timeout_parm, struct device *dev);
 
 The watchdog_init_timeout function allows you to initialize the timeout field
-using the module timeout parameter or by retrieving the timeout-sec property from
-the device tree (if the module timeout parameter is invalid). Best practice is
-to set the default timeout value as timeout value in the watchdog_device and
-then use this function to set the user preferred timeout value.
+using the module timeout parameter or by retrieving the first element of
+the timeout-sec property from the device tree (if the module timeout parameter
+is invalid). Best practice is to set the default timeout value as timeout value
+in the watchdog_device and then use this function to set the user preferred
+timeout value.
+This routine returns zero on success and a negative errno code for failure.
+
+Some watchdog timers have two stage of timeouts (timeout and pretimeout),
+to initialize the timeout and pretimeout fields at the
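The documented contract for set_pretimeout (return -EINVAL when the value is out of range, and keep the timeout greater than the pretimeout) can be sketched as below. This is a stand-alone illustration under stated assumptions, not the code from the patch: struct wdd_sketch is a trimmed-down, hypothetical stand-in for struct watchdog_device, both helper names are invented here, and treating max_pretimeout == 0 as "no range configured" is an assumption rather than something the excerpt states.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct watchdog_device. */
struct wdd_sketch {
	unsigned int timeout;		/* seconds */
	unsigned int pretimeout;	/* seconds */
	unsigned int min_pretimeout;	/* seconds */
	unsigned int max_pretimeout;	/* seconds; 0 assumed to mean no range */
};

/* Return true when the requested pretimeout t has to be rejected. */
static bool pretimeout_invalid(const struct wdd_sketch *wdd, unsigned int t)
{
	/* Range check, only when a range has been configured. */
	if (wdd->max_pretimeout &&
	    (t < wdd->min_pretimeout || t > wdd->max_pretimeout))
		return true;

	/* The timeout must stay greater than the pretimeout. */
	return wdd->timeout && t >= wdd->timeout;
}

/* Check first, then record the value; a real driver would also
 * program the hardware and could return -EIO if that fails. */
static int set_pretimeout_sketch(struct wdd_sketch *wdd, unsigned int t)
{
	if (pretimeout_invalid(wdd, t))
		return -EINVAL;

	wdd->pretimeout = t;
	return 0;
}
```

A rejected request leaves the stored pretimeout untouched, so a caller can retry with another value without having to restore state.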
[PATCH v7 7/8] Watchdog: enable ACPI GTDT support for ARM SBSA watchdog driver
From: Fu Wei fu@linaro.org

This patch enables ACPI GTDT support for ARM SBSA watchdog driver
automatically, if ACPI support is enabled.

Signed-off-by: Fu Wei fu@linaro.org
---
 drivers/watchdog/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index b2734f0..2719093 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -178,6 +178,7 @@ config ARM_SBSA_WATCHDOG
 	depends on ARM64
 	depends on ARM_ARCH_TIMER
 	select WATCHDOG_CORE
+	select ACPI_GTDT if ACPI
 	help
 	  ARM SBSA Generic Watchdog. This watchdog has two Watchdog timeouts.
 	  The first timeout will trigger a panic; the second timeout will
--
2.4.3
[PATCH v7 3/8] ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi
From: Fu Wei fu@linaro.org

This can serve as an example of adding an SBSA Generic Watchdog device
node to the dts files of an SoC which contains an SBSA Generic Watchdog.

Acked-by: Arnd Bergmann a...@arndb.de
Acked-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Tested-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Signed-off-by: Fu Wei fu@linaro.org
---
 arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi | 8
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
index 2874d92..259430f 100644
--- a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
+++ b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
@@ -84,6 +84,14 @@
 			clock-names = "uartclk", "apb_pclk";
 		};
 
+		watchdog0: watchdog@e0bb {
+			compatible = "arm,sbsa-gwdt";
+			reg = <0x0 0xe0bc 0 0x1000>,
+			      <0x0 0xe0bb 0 0x1000>;
+			interrupts = <0 337 4>;
+			timeout-sec = <10 5>;
+		};
+
 		spi0: ssp@e102 {
 			status = "disabled";
 			compatible = "arm,pl022", "arm,primecell";
--
2.4.3
[PATCH v7 8/8] clocksource: simplify ACPI code in arm_arch_timer.c
From: Fu Wei fu@linaro.org

This patch updates the arm_arch_timer driver to use the function
provided by the new ACPI GTDT driver. This simplifies arm_arch_timer.c
and separates all the ACPI GTDT knowledge from this timer driver.

Signed-off-by: Fu Wei fu@linaro.org
Signed-off-by: Hanjun Guo hanjun@linaro.org
---
 arch/arm64/kernel/time.c             |  4 +--
 drivers/acpi/gtdt.c                  | 43 ++
 drivers/clocksource/Kconfig          |  1 +
 drivers/clocksource/arm_arch_timer.c | 60 +++-
 include/clocksource/arm_arch_timer.h |  8 +
 include/linux/acpi.h                 |  5 +++
 include/linux/clocksource.h          |  4 +--
 7 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index 42f9195..2cabea6 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -75,9 +75,9 @@ void __init time_init(void)
 
 	/*
 	 * Since ACPI or FDT will only one be available in the system,
-	 * we can use acpi_generic_timer_init() here safely
+	 * we can use arch_timer_acpi_init() here safely
 	 */
-	acpi_generic_timer_init();
+	arch_timer_acpi_init();
 
 	arch_timer_rate = arch_timer_get_rate();
 	if (!arch_timer_rate)
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
index bbe3a2e..3559babf 100644
--- a/drivers/acpi/gtdt.c
+++ b/drivers/acpi/gtdt.c
@@ -17,6 +17,8 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 
+#include <clocksource/arm_arch_timer.h>
+
 static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
 {
 	int trigger, polarity;
@@ -33,6 +35,47 @@ static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
 	return acpi_register_gsi(NULL, interrupt, trigger, polarity);
 }
 
+static struct arch_timer_data __initdata *arch_timer_data_p;
+
+static int __init arch_timer_data_init(struct acpi_table_header *table)
+{
+	struct acpi_table_gtdt *gtdt;
+
+	gtdt = container_of(table, struct acpi_table_gtdt, header);
+
+	arch_timer_data_p->phys_secure_ppi =
+		map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
+					    gtdt->secure_el1_flags);
+
+	arch_timer_data_p->phys_nonsecure_ppi =
+		map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
+					    gtdt->non_secure_el1_flags);
+
+	arch_timer_data_p->virt_ppi =
+		map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
+					    gtdt->virtual_timer_flags);
+
+	arch_timer_data_p->hyp_ppi =
+		map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
+					    gtdt->non_secure_el2_flags);
+
+	arch_timer_data_p->c3stop = !(gtdt->non_secure_el1_flags &
+				      ACPI_GTDT_ALWAYS_ON);
+
+	return 0;
+}
+
+/* Initialize the arch_timer_data struct for arm_arch_timer by GTDT info */
+int __init gtdt_arch_timer_data_init(struct arch_timer_data *data)
+{
+	if (acpi_disabled || !data)
+		return -EINVAL;
+
+	arch_timer_data_p = data;
+
+	return acpi_table_parse(ACPI_SIG_GTDT, arch_timer_data_init);
+}
+
 /*
  * Initialize a SBSA generic Watchdog platform device info from GTDT
  * According to SBSA specification the size of refresh and control
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 4e57730..e111025 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -119,6 +119,7 @@ config CLKSRC_STM32
 config ARM_ARCH_TIMER
 	bool
 	select CLKSRC_OF if OF
+	select ACPI_GTDT if ACPI
 
 config ARM_ARCH_TIMER_EVTSTREAM
 	bool "Support for ARM architected timer event stream generation"
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 0aa135d..99505bb 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -817,68 +817,30 @@ CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
 		       arch_timer_mem_init);
 
 #ifdef CONFIG_ACPI
-static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
-{
-	int trigger, polarity;
-
-	if (!interrupt)
-		return 0;
-
-	trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
-			: ACPI_LEVEL_SENSITIVE;
-
-	polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
-			: ACPI_ACTIVE_HIGH;
-
-	return acpi_register_gsi(NULL, interrupt, trigger, polarity);
-}
-
 /* Initialize per-processor generic timer */
-static int __init arch_timer_acpi_init(struct acpi_table_header *table)
+void __init arch_timer_acpi_init(void)
 {
-	struct acpi_table_gtdt *gtdt;
+	struct arch_timer_data data;
Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes
Hello,

On Mon, Aug 24, 2015 at 10:51:50AM -0400, Tejun Heo wrote:
> > Bah, I see the problem and indeed it was introduced by commit
> > e79729123f639 "writeback: don't issue wb_writeback_work if clean". The
> > problem is that we bail out of sync_inodes_sb() if there is no dirty
> > IO. Which is wrong because we have to wait for any outstanding IO
> > (i.e. call wait_sb_inodes()) regardless of dirty state! And that also
> > explains why Tejun's patch fixes the problem because it backs out the
> > change to the exit condition in sync_inodes_sb().
>
> Dang, I'm an idiot sandwich.

A question tho, so this means that an inode may contain dirty or
writeback pages w/o the inode being on one of the dirty lists.
Looking at the generic filesystem and writeback code, this doesn't
seem true in general. Is this something xfs specific?

Thanks.

--
tejun
Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi
[adding Mark Rutland, as this is heading straight into uncharted DT territory]

On 24/08/15 17:39, Qais Yousef wrote:
> On 08/24/2015 04:07 PM, Thomas Gleixner wrote:
> > On Mon, 24 Aug 2015, Qais Yousef wrote:
> > > On 08/24/2015 02:32 PM, Marc Zyngier wrote:
> > > > I'd rather see something more architected than this blind export, or
> > > > at least some level of filtering (the idea random drivers can access
> > > > such a low-level function doesn't make me feel very good).
> > >
> > > I don't know how to architect this better or how to perform the
> > > filtering, but I'm happy to hear suggestions and try them out.
> > >
> > > Keep in mind that detecting GIC and writing your own gic_send_ipi()
> > > is very simple. I have done this when the driver was out of tree. So
> > > restricting it by not exporting it will not prevent someone from
> > > really accessing the functionality, it's just they have to do it
> > > their own way.
> >
> > Keep in mind that we are not talking about out of tree hackery. We
> > talk about a kernel code submission and I doubt, that you will get
> > away with a GIC detection/fiddling buried in your driver code.
> >
> > Keep in mind that just slapping an export to some random function is
> > not much better than doing a GIC hack in the driver.
> >
> > Marc's concerns about blindly exposing IPI functionality to drivers is
> > well justified and that kind of coprocessor stuff is not unique to
> > your particular SoC. We're going to see such things more frequently in
> > the not so distant future, so we better think now about proper
> > solutions to that problem.
>
> Sure I'm not trying to argue against that.
>
> > There are a couple of issues to solve:
> >
> > 1) How is the IPI which is received by the coprocessor reserved in the
> >    system?
> >
> > 2) How is it associated to a particular driver?
>
> Shouldn't 'interrupts' property in DT take care of these 2 questions?
> Maybe we can give it an alias name to make it more readable that this
> interrupt is requested for external IPI.

The interrupts property has a rather different meaning, and isn't
designed to hardcode IPIs. Also, this property describes an interrupt
from a device to the CPU, not the other way around (I imagine you also
have an interrupt coming from the AXD to the CPU, possibly using an IPI
too).

We can deal with these issues, but that's not something we can
improvise. What I had in mind was something fairly generic:

- interrupt-source: something generating an interrupt
- interrupt-sink: something being targeted by an interrupt

You could then express things like:

	intc: interrupt-controller@1000 {
		interrupt-controller;
	};

	mydevice@f000 {
		interrupt-source = <&intc INT_SPEC 2 &inttarg1 &inttarg1>;
	};

	inttarg1: mydevice@f100 {
		interrupt-sink = <&intc HWAFFINITY1>;
	};

	inttarg2: cpu@1 {
		interrupt-sink = <&intc HWAFFINITY2>;
	};

You could also imagine having CPUs being both source and sink.

> > 3) How do we ensure that a driver cannot issue random IPIs and can
> >    only send the associated ones?
>
> If we get the irq number from DT then I'm not sure how feasible it is
> to implement a generic_send_ipi() function that takes this number to
> generate an IPI.
>
> Do you think this approach would work?

If you follow the above approach, it should be pretty easy to derive a
source identifier and a sink identifier from the DT, and have the core
code to route one to the other and do the right thing.

The source identifier could also be used to describe an IPI in a fairly
safe way (the target being fixed by DT, but the actual number used
dynamically allocated by the kernel).

This is just a 10 minutes braindump, so feel free to throw rocks at it
and to come up with a better solution! :-)

Thanks,

	M.
--
Jazz is not dead. It just smells funny...
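To make the source/sink idea a little more concrete, here is a small user-space model of the routing and of the restriction asked for in question 3. It is only a sketch under assumptions: none of these types or functions exist in the kernel, the names (irq_sink_sketch, irq_source_sketch, source_raise) are invented for illustration, and the "owner" token stands in for whatever the core code would use to identify the driver bound to a source.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_SINKS 4

/* Hypothetical model of an "interrupt-sink" node. */
struct irq_sink_sketch {
	int hw_affinity;	/* e.g. HWAFFINITY1 in the DT example */
	unsigned int received;	/* interrupts delivered to this sink */
};

/* Hypothetical model of an "interrupt-source" node: the sinks are
 * fixed when the source is built (by DT in the proposal). */
struct irq_source_sketch {
	const void *owner;	/* opaque token for the owning driver */
	struct irq_sink_sketch *sinks[SKETCH_MAX_SINKS];
	size_t nr_sinks;
};

/* Deliver an interrupt from src to its idx-th sink. Only the owner
 * may raise it, and only towards sinks listed for this source. */
static int source_raise(struct irq_source_sketch *src, const void *caller,
			size_t idx)
{
	if (!src || caller != src->owner)
		return -EPERM;
	if (idx >= src->nr_sinks || !src->sinks[idx])
		return -EINVAL;

	src->sinks[idx]->received++;
	return 0;
}
```

The point of the model is that a driver never names an IPI number itself: it can only ask for "my source, sink N", so random IPIs are rejected by construction.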
Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency
Here it is again.

On Thu, Aug 13, 2015 at 6:50 PM, Bjorn Helgaas bhelg...@google.com wrote:
> Hi Suravee,
>
> On Thu, Aug 13, 2015 at 04:58:45PM +0700, Suravee Suthikulpanit wrote:
> > This patch refactors of_pci_dma_configure() into a more generic
> > pci_dma_configure(), which can be reused by non-OF code.
> > Then, it adds support for setting up PCI device DMA coherency from
> > ACPI _CCA object that should normally be specified in the DSDT node
> > of its PCI host bridge..
>
> Since this does two things:
>
>   1) Rename of_pci_dma_configure() and move it to PCI
>   2) Add _CCA support,
>
> maybe it should be split into two patches?
>
> There are a couple more comments below.
>
> While looking at this, I thought some of the existing code could be
> made simpler and easier to follow. I appended a couple possible
> patches; you can incorporate them or ignore them, whatever seems best
> to you.
>
> Bjorn
>
> > Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
> > CC: Bjorn Helgaas bhelg...@google.com
> > CC: Catalin Marinas catalin.mari...@arm.com
> > CC: Will Deacon will.dea...@arm.com
> > CC: Rafael J. Wysocki r...@rjwysocki.net
> > CC: Rob Herring robh...@kernel.org
> > CC: Murali Karicheri m-kariche...@ti.com
> > ---
> > Note: According to the ACPI spec, the _CCA attribute is required for ARM64.
> >       Therefore, this patch is a pre-req for ACPI PCI support for ARM64 which
> >       is currently in development. Also, this should not affect other
> >       architectures since if CCA is not required, the default value is coherent.
> >       Please see include/acpi/acpi_bus.h: acpi_check_dma() and
> >       drivers/acpi/scan.c: acpi_init_coherency() for more information
> >
> >  drivers/of/of_pci.c    | 20
> >  drivers/pci/probe.c    | 35 +--
> >  include/linux/of_pci.h |  3 ---
> >  3 files changed, 33 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 5751dc5..b66ee4e 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node)
> >  }
> >  EXPORT_SYMBOL_GPL(of_get_pci_domain_nr);
> >
> > -/**
> > - * of_pci_dma_configure - Setup DMA configuration
> > - * @dev: ptr to pci_dev struct of the PCI device
> > - *
> > - * Function to update PCI devices's DMA configuration using the same
> > - * info from the OF node of host bridge's parent (if any).
> > - */
> > -void of_pci_dma_configure(struct pci_dev *pci_dev)
> > -{
> > -	struct device *dev = &pci_dev->dev;
> > -	struct device *bridge = pci_get_host_bridge_device(pci_dev);
> > -
> > -	if (!bridge->parent)
> > -		return;
> > -
> > -	of_dma_configure(dev, bridge->parent->of_node);
> > -	pci_put_host_bridge_device(bridge);
> > -}
> > -EXPORT_SYMBOL_GPL(of_pci_dma_configure);
> > -
> >  #if defined(CONFIG_OF_ADDRESS)
> >  /**
> >   * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index cefd636..e2fcd3b 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -6,12 +6,14 @@
> >  #include <linux/delay.h>
> >  #include <linux/init.h>
> >  #include <linux/pci.h>
> > -#include <linux/of_pci.h>
> > +#include <linux/of_device.h>
> >  #include <linux/pci_hotplug.h>
> >  #include <linux/slab.h>
> >  #include <linux/module.h>
> >  #include <linux/cpumask.h>
> >  #include <linux/pci-aspm.h>
> > +#include <linux/acpi.h>
> > +#include <linux/property.h>
> >  #include <asm-generic/pci-bridge.h>
> >  #include "pci.h"
> >
> > @@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev *dev)
> >  	pci_enable_acs(dev);
> >  }
> >
> > +/**
> > + * pci_dma_configure - Setup DMA configuration
> > + * @pci_dev: ptr to pci_dev struct of the PCI device
> > + *
> > + * Function to update PCI devices's DMA configuration using the same
> > + * info from the OF node or ACPI node of host bridge's parent (if any).
> > + */
> > +static void pci_dma_configure(struct pci_dev *pci_dev)
>
> Almost all pci_dev pointers in probe.c are named dev, so I would use
> that for this one, too. I probably would just drop the struct device
> *dev below and use &dev->dev the two places you need it. That's a
> common idiom in PCI.
>
> > +{
> > +	struct device *dev = &pci_dev->dev;
> > +	struct device *bridge = pci_get_host_bridge_device(pci_dev);
> > +	struct acpi_device *adev;
> > +	bool coherent;
> > +
> > +	if (has_acpi_companion(bridge)) {
> > +		adev = to_acpi_node(bridge->fwnode);
> > +		if (acpi_check_dma(adev, &coherent))
> > +			arch_setup_dma_ops(dev, 0, 0, NULL, coherent);
> > +	} else {
> > +		struct device *host = bridge->parent;
> > +		if (!host)
> > +			return;
> > +
> > +		of_dma_configure(dev, host->of_node);
> > +	}
>
> Why is this check reversed with respect to device_dma_is_coherent()?
> In device_dma_is_coherent(), we first look for an OF property, then
> look for ACPI _CCA. But here we check for _CCA, then for OF.
>
> > +
> > +	pci_put_host_bridge_device(bridge);
> > +}
> > +
> >  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
> >  {
> >  	int ret;
> > @@ -1557,7 +1588,7 @@ void