Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method

2015-08-24 Thread Vaibhav Hiremath



On Monday 24 August 2015 07:24 PM, Lee Jones wrote:

On Wed, 08 Jul 2015, Vaibhav Hiremath wrote:


As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
(page 0) controls the method of clearing interrupt
status of 88pm800 family of devices;

   0: clear on read
   1: clear on write

If pdata is not coming from a board file, then set the
default irq clear method to "irq clear on write".

Also, as suggested by Lee Jones, rename the variable field
to an appropriate name and remove the unnecessary field
pm80x_chip.irq_mode, using platform_data.irq_clr_method instead.

Signed-off-by: Zhao Ye zh...@marvell.com
Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com
---
  drivers/mfd/88pm800.c   | 15 ++-
  include/linux/mfd/88pm80x.h |  9 +++--
  2 files changed, 17 insertions(+), 7 deletions(-)


[...]


+#define PM800_WAKEUP2_INT_READ_CLEAR   (0 << 1)
+#define PM800_WAKEUP2_INT_WRITE_CLEAR  (1 << 1)


Use BIT().


+/* Used by irq_clr_method */
+#define PM800_IRQ_CLR_ON_READ  0
+#define PM800_IRQ_CLR_ON_WRITE 1



-   int irq_mode;   /* Clear interrupt by read/write(0/1) */
+   bool irq_clr_method;/* Clear interrupt by read/write(0/1) */



+   irq_clr_mode = pdata->irq_clr_method == PM800_IRQ_CLR_ON_WRITE ?
+   PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
+   ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);


This is pretty convoluted.

For starters you're abusing the 'bool' type here.  Bool is either
'true' or 'false', so at the very least you should rename
'irq_clr_method' to 'irq_clr_on_write'.

Then you can do:

irq_clr_mode = pdata->irq_clr_on_write ?
PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;



We have discussed this and went back and forth.
If I remember correctly, one of the versions used true/false,
and then we decided to rename it to the relevant macro.

If I am not wrong, V4 of this series is exactly the same as what you
are referring to.



However, what I suggest you really do is share
PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass
the value through directly.



I think we discussed this as well, and the reason, as I recall, is that
we may need to control this from DT in the future, so we decided to keep
it boolean in platform_data and have a simple check before writing to
the register.

And I think that was also another reason we introduced

/* Used by irq_clr_method */
#define PM800_IRQ_CLR_ON_READ   0
#define PM800_IRQ_CLR_ON_WRITE  1

(Earlier it was true/false in V4)
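
To make the two options in this exchange concrete, here is a minimal user-space sketch, not the driver code. BIT() is redefined locally, and both wakeup2_irq_clr_bits() and update_bits() are hypothetical helpers introduced only for illustration; update_bits() merely imitates the masked read-modify-write that regmap_update_bits() performs on the real register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1U << (n))

/* Per the patch description, bit 1 selects the interrupt clear method. */
#define PM800_WAKEUP2_INT_CLEAR_MODE   BIT(1)
#define PM800_WAKEUP2_INT_READ_CLEAR   0U
#define PM800_WAKEUP2_INT_WRITE_CLEAR  BIT(1)

/* Hypothetical helper: map the boolean platform-data field to register bits. */
static uint8_t wakeup2_irq_clr_bits(bool irq_clr_on_write)
{
	return irq_clr_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR
				: PM800_WAKEUP2_INT_READ_CLEAR;
}

/* Hypothetical model of a masked update: only bits in 'mask' may change. */
static uint8_t update_bits(uint8_t reg, uint8_t mask, uint8_t val)
{
	return (uint8_t)((reg & ~mask) | (val & mask));
}
```

With a boolean named irq_clr_on_write the mapping is a single conditional, which is the simplification Lee asks for; passing the register value straight through platform data would remove even that step, at the cost of exposing a register encoding to board code.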

Thanks,
Vaibhav
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] kernel/sysctl.c: If count including the terminating byte '\0' the write system call should retrun success.

2015-08-24 Thread Sean Fu
On Mon, Aug 24, 2015 at 8:27 PM, Eric W. Biederman
ebied...@xmission.com wrote:


 On August 24, 2015 1:56:13 AM PDT, Sean Fu fxinr...@gmail.com wrote:
When the input argument count includes the terminating byte '\0',
the write system call returns EINVAL on a proc file,
but it returns success on a regular file.

 Nonsense.  It will write the '\0' to a regular file because it is just data.

 Integers in proc are more than data.

 So I see no justification for this change.
In fact, write(fd, "1\0", 2) on an integer proc file returns success on
a 2.6 kernel. I already tested it on the 2.6.6.60 kernel.

So the latest behavior of write(fd, "1\0", 2) is different from the old
kernel (2.6).
This may impact the compatibility of some user-space programs.


 Eric

E.g. writing two bytes ("1\0") to
/proc/sys/net/ipv4/conf/eth0/rp_filter:
write(fd, "1\0", 2) returns EINVAL.

Signed-off-by: Sean Fu fxinr...@gmail.com
---
 kernel/sysctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 19b62b5..c2b0594 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2004,7 +2004,7 @@ static int do_proc_dointvec_conv(bool *negp,
unsigned long *lvalp,
return 0;
 }

-static const char proc_wspace_sep[] = { ' ', '\t', '\n' };
+static const char proc_wspace_sep[] = { ' ', '\t', '\n', '\0' };

 static int __do_proc_dointvec(void *tbl_data, struct ctl_table *table,
  int write, void __user *buffer,
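
The effect of the one-line change can be illustrated in user space. The following is a simplified sketch, not the kernel parser: int_write_ok() is a hypothetical helper that accepts a run of digits followed by at most one trailing byte, which must belong to the separator set. The two arrays mirror the old and proposed contents of proc_wspace_sep.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Separator sets before and after the proposed change. */
static const char wspace_sep_old[] = { ' ', '\t', '\n' };
static const char wspace_sep_new[] = { ' ', '\t', '\n', '\0' };

/*
 * Hypothetical, simplified check: digits, optionally followed by exactly
 * one byte that must be a member of the separator set.
 */
static bool int_write_ok(const char *buf, size_t count,
			 const char *seps, size_t nseps)
{
	size_t i = 0;

	while (i < count && buf[i] >= '0' && buf[i] <= '9')
		i++;
	if (i == 0)
		return false;	/* no digits at all */
	if (i == count)
		return true;	/* nothing follows the number */

	/* One trailing byte is allowed only if it is a known separator. */
	return i + 1 == count && memchr(seps, buf[i], nseps) != NULL;
}
```

Under this model a two-byte write of "1\0" is rejected with the old set and accepted with the new one, while "1\n" is accepted by both, which is the behavioural difference the thread argues about.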



[PATCH v3 1/1] USB:option:add ZTE PIDs

2015-08-24 Thread Liu.Zhao
This is intended to add ZTE device PIDs on kernel.

Signed-off-by: Liu.Zhao lzsos...@163.com
---
 drivers/usb/serial/option.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 876423b..6b4a766 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -285,6 +285,10 @@ static void option_instat_callback(struct urb *urb);
 #define ZTE_PRODUCT_MC2718 0xffe8
 #define ZTE_PRODUCT_AD3812 0xffeb
 #define ZTE_PRODUCT_MC2716 0xffed
+#define ZTE_PRODUCT_ZM8620_X   0x0396
+#define ZTE_PRODUCT_ME3620_MBIM    0x0426
+#define ZTE_PRODUCT_ME3620_X   0x1432
+#define ZTE_PRODUCT_ME3620_L   0x1433
 
 #define BENQ_VENDOR_ID 0x04a5
 #define BENQ_PRODUCT_H10   0x4068
@@ -544,6 +548,18 @@ static const struct option_blacklist_info zte_mc2716_z_blacklist = {
.sendsetup = BIT(1) | BIT(2) | BIT(3),
 };
 
+static const struct option_blacklist_info zte_zm8620_x_blacklist = {
+   .reserved = BIT(3) | BIT(4) | BIT(5),
+};
+
+static const struct option_blacklist_info zte_me3620_xl_blacklist = {
+   .reserved = BIT(3) | BIT(4) | BIT(5),
+};
+
+static const struct option_blacklist_info zte_me3620_mbim_blacklist = {
+   .reserved = BIT(2) | BIT(3) | BIT(4),
+};
+
 static const struct option_blacklist_info huawei_cdc12_blacklist = {
.reserved = BIT(1) | BIT(2),
 };
@@ -1591,6 +1607,14 @@ static const struct usb_device_id option_ids[] = {
 .driver_info = (kernel_ulong_t)&zte_ad3812_z_blacklist },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, ZTE_PRODUCT_MC2716, 0xff, 0xff, 0xff),
 .driver_info = (kernel_ulong_t)&zte_mc2716_z_blacklist },
+   { USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_L),
+.driver_info = (kernel_ulong_t)&zte_me3620_xl_blacklist },
+   { USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_X),
+.driver_info = (kernel_ulong_t)&zte_me3620_xl_blacklist },
+   { USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ZM8620_X),
+.driver_info = (kernel_ulong_t)&zte_zm8620_x_blacklist },
+   { USB_DEVICE(ZTE_VENDOR_ID, ZTE_PRODUCT_ME3620_MBIM),
+.driver_info = (kernel_ulong_t)&zte_me3620_mbim_blacklist },
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x01) },
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x05) },
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x86, 0x10) },
-- 
1.9.1
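
For readers unfamiliar with the blacklist tables, the sketch below shows how a .reserved bitmask can be read: bit n set means USB interface number n is left alone by the serial driver. It is a user-space illustration only; struct blacklist_info and interface_is_reserved() are hypothetical stand-ins, not the driver's own types or functions.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/* Hypothetical, reduced version of the driver's blacklist structure. */
struct blacklist_info {
	unsigned long reserved;	/* bit n set => interface n is reserved */
};

/* Masks copied from the patch above. */
static const struct blacklist_info zm8620_x    = { .reserved = BIT(3) | BIT(4) | BIT(5) };
static const struct blacklist_info me3620_mbim = { .reserved = BIT(2) | BIT(3) | BIT(4) };

/* Returns true when the given interface number is marked reserved. */
static bool interface_is_reserved(const struct blacklist_info *bl,
				  unsigned int ifnum)
{
	return bl && ifnum < 8 * sizeof(bl->reserved) &&
	       (bl->reserved & BIT(ifnum));
}
```

Read this way, the ZM8620_X entry reserves interfaces 3 to 5 and the ME3620 MBIM entry reserves interfaces 2 to 4, leaving the remaining interfaces available for binding.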




[PATCH] ipmi: add of_device_id in MODULE_DEVICE_TABLE

2015-08-24 Thread Brijesh Singh
Fix autoloading ipmi modules when using device tree.

Signed-off-by: Brijesh Singh brijeshkumar.si...@amd.com
---
 drivers/char/ipmi/ipmi_si_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 8a45e92..cddc7b0 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2785,6 +2785,7 @@ static struct platform_driver ipmi_driver = {
.probe  = ipmi_probe,
.remove = ipmi_remove,
 };
+MODULE_DEVICE_TABLE(of, ipmi_match);
 
 #ifdef CONFIG_PARISC
 static int ipmi_parisc_probe(struct parisc_device *dev)
-- 
1.9.1



[PATCH v3 5/6] ARCv2: perf: SMP support

2015-08-24 Thread Alexey Brodkin
* split off pmu info into singleton and per-cpu bits
* setup PMU on all cores

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

No changes since v2.

Compared to v1:
 [1] Rebase on top of previous patches hence changes in patch itself
 [2] Cosmetics

 arch/arc/kernel/perf_event.c | 69 ++--
 1 file changed, 54 insertions(+), 15 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 997ccbd..80f5a85 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -21,10 +21,22 @@
 
 struct arc_pmu {
struct pmu  pmu;
+   unsigned int    irq;
int n_counters;
-   unsigned long   used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
u64 max_period;
int ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
+};
+
+struct arc_pmu_cpu {
+   /*
+* A 1 bit for an index indicates that the counter is being used for
+* an event. A 0 means that the counter can be used.
+*/
+   unsigned long   used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
+
+   /*
+* The events that are active on the PMU for the given index.
+*/
struct perf_event *act_counter[ARC_PERF_MAX_COUNTERS];
 };
 
@@ -67,6 +79,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 }
 
 static struct arc_pmu *arc_pmu;
+static DEFINE_PER_CPU(struct arc_pmu_cpu, arc_pmu_cpu);
 
 /* read counter #idx; note that counter# != event# on ARC! */
 static uint64_t arc_pmu_read_counter(int idx)
@@ -304,10 +317,12 @@ static void arc_pmu_stop(struct perf_event *event, int flags)
 
 static void arc_pmu_del(struct perf_event *event, int flags)
 {
+   struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
+
    arc_pmu_stop(event, PERF_EF_UPDATE);
-   __clear_bit(event->hw.idx, arc_pmu->used_mask);
+   __clear_bit(event->hw.idx, pmu_cpu->used_mask);
 
-   arc_pmu->act_counter[event->hw.idx] = 0;
+   pmu_cpu->act_counter[event->hw.idx] = 0;
 
perf_event_update_userpage(event);
 }
@@ -315,22 +330,23 @@ static void arc_pmu_del(struct perf_event *event, int flags)
 /* allocate hardware counter and optionally start counting */
 static int arc_pmu_add(struct perf_event *event, int flags)
 {
+   struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
    struct hw_perf_event *hwc = &event->hw;
    int idx = hwc->idx;
 
-   if (__test_and_set_bit(idx, arc_pmu->used_mask)) {
-       idx = find_first_zero_bit(arc_pmu->used_mask,
+   if (__test_and_set_bit(idx, pmu_cpu->used_mask)) {
+       idx = find_first_zero_bit(pmu_cpu->used_mask,
                                  arc_pmu->n_counters);
        if (idx == arc_pmu->n_counters)
            return -EAGAIN;
 
-       __set_bit(idx, arc_pmu->used_mask);
+       __set_bit(idx, pmu_cpu->used_mask);
        hwc->idx = idx;
    }
 
    write_aux_reg(ARC_REG_PCT_INDEX, idx);
 
-   arc_pmu->act_counter[idx] = event;
+   pmu_cpu->act_counter[idx] = event;
 
if (is_sampling_event(event)) {
/* Mimic full counter overflow as other arches do */
@@ -357,7 +373,7 @@ static int arc_pmu_add(struct perf_event *event, int flags)
 static irqreturn_t arc_pmu_intr(int irq, void *dev)
 {
struct perf_sample_data data;
-   struct arc_pmu *arc_pmu = (struct arc_pmu *)dev;
+   struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
    struct pt_regs *regs;
    int active_ints;
    int idx;
@@ -369,7 +385,7 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
    regs = get_irq_regs();
 
    for (idx = 0; idx < arc_pmu->n_counters; idx++) {
-       struct perf_event *event = arc_pmu->act_counter[idx];
+       struct perf_event *event = pmu_cpu->act_counter[idx];
        struct hw_perf_event *hwc;
 
        if (!(active_ints & (1 << idx)))
@@ -412,6 +428,17 @@ static irqreturn_t arc_pmu_intr(int irq, void *dev)
 
 #endif /* CONFIG_ISA_ARCV2 */
 
+void arc_cpu_pmu_irq_init(void)
+{
+   struct arc_pmu_cpu *pmu_cpu = this_cpu_ptr(&arc_pmu_cpu);
+
+   arc_request_percpu_irq(arc_pmu->irq, smp_processor_id(), arc_pmu_intr,
+                          "ARC perf counters", pmu_cpu);
+
+   /* Clear all pending interrupt flags */
+   write_aux_reg(ARC_REG_PCT_INT_ACT, 0x);
+}
+
 static int arc_pmu_device_probe(struct platform_device *pdev)
 {
struct arc_reg_pct_build pct_bcr;
@@ -488,18 +515,30 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 
    if (has_interrupts) {
        int irq = platform_get_irq(pdev, 0);
+       unsigned long flags;
 
        if (irq < 0) {
            pr_err("Cannot get IRQ number for the platform\n");
return 
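
The key idea of the patch, moving the counter bookkeeping from the singleton into per-CPU state, can be modelled in plain C. This is a sketch under stated assumptions, not the kernel code: struct pmu_cpu_model, claim_counter() and release_counter() are hypothetical names, a 32-bit word replaces the kernel bitmap helpers, and -1 stands in for -EAGAIN.

```c
#include <stdint.h>

/* Hypothetical per-CPU state: one instance per simulated CPU. */
struct pmu_cpu_model {
	uint32_t used_mask;	/* bit n set => counter n is in use */
};

/*
 * Try the preferred counter first; otherwise take the first free one.
 * Returns the claimed index, or -1 when all n_counters (<= 32) are busy.
 */
static int claim_counter(struct pmu_cpu_model *cpu, int preferred,
			 int n_counters)
{
	int idx;

	if (preferred >= 0 && preferred < n_counters &&
	    !(cpu->used_mask & (UINT32_C(1) << preferred))) {
		cpu->used_mask |= UINT32_C(1) << preferred;
		return preferred;
	}

	for (idx = 0; idx < n_counters; idx++) {
		if (!(cpu->used_mask & (UINT32_C(1) << idx))) {
			cpu->used_mask |= UINT32_C(1) << idx;
			return idx;
		}
	}
	return -1;
}

/* Mark a counter as free again. */
static void release_counter(struct pmu_cpu_model *cpu, int idx)
{
	cpu->used_mask &= ~(UINT32_C(1) << idx);
}
```

Because each CPU owns its own mask, exhausting the counters on one CPU does not affect allocation on another, which a single shared used_mask could not express.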

[PATCH v3 4/6] ARCv2: perf: implement exclusion of event counting in user or kernel mode

2015-08-24 Thread Alexey Brodkin
Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

No changes since v2.

No changes since v1.

 arch/arc/include/asm/perf_event.h |  3 +++
 arch/arc/kernel/perf_event.c  | 16 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index 9ed593e..876e216 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -34,6 +34,9 @@
 #define ARC_REG_PCT_INT_CTRL   0x25E
 #define ARC_REG_PCT_INT_ACT    0x25F
 
+#define ARC_REG_PCT_CONFIG_USER    (1 << 18)   /* count in user mode */
+#define ARC_REG_PCT_CONFIG_KERN    (1 << 19)   /* count in kernel mode */
+
 #define ARC_REG_PCT_CONTROL_CC (1 << 16)   /* clear counts */
 #define ARC_REG_PCT_CONTROL_SN (1 << 17)   /* snapshot */
 
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index ce0fa60..997ccbd 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -147,13 +147,25 @@ static int arc_pmu_event_init(struct perf_event *event)
        local64_set(&hwc->period_left, hwc->sample_period);
    }
 
+   hwc->config = 0;
+
+   if (is_isa_arcv2()) {
+       /* exclude user means count only kernel */
+       if (event->attr.exclude_user)
+           hwc->config |= ARC_REG_PCT_CONFIG_KERN;
+
+       /* exclude kernel means count only user */
+       if (event->attr.exclude_kernel)
+           hwc->config |= ARC_REG_PCT_CONFIG_USER;
+   }
+
    switch (event->attr.type) {
    case PERF_TYPE_HARDWARE:
        if (event->attr.config >= PERF_COUNT_HW_MAX)
            return -ENOENT;
        if (arc_pmu->ev_hw_idx[event->attr.config] < 0)
            return -ENOENT;
-       hwc->config = arc_pmu->ev_hw_idx[event->attr.config];
+       hwc->config |= arc_pmu->ev_hw_idx[event->attr.config];
        pr_debug("init event %d with h/w %d \'%s\'\n",
                 (int) event->attr.config, (int) hwc->config,
                 arc_pmu_ev_hw_map[event->attr.config]);
@@ -163,7 +175,7 @@ static int arc_pmu_event_init(struct perf_event *event)
        ret = arc_pmu_cache_event(event->attr.config);
        if (ret < 0)
            return ret;
-       hwc->config = arc_pmu->ev_hw_idx[ret];
+       hwc->config |= arc_pmu->ev_hw_idx[ret];
        return 0;
    default:
        return -ENOENT;
-- 
2.4.3
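
The way the exclusion flags combine with the hardware event index can be checked in isolation. The sketch below is a user-space model of the logic in the hunk above; pct_config() is a hypothetical helper, and the two bit definitions are copied from the header change in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARC_REG_PCT_CONFIG_USER (UINT32_C(1) << 18)	/* count in user mode */
#define ARC_REG_PCT_CONFIG_KERN (UINT32_C(1) << 19)	/* count in kernel mode */

/*
 * Hypothetical helper mirroring the patch: start from zero, set the mode
 * bits requested by the exclude flags, then OR in the event index.
 */
static uint32_t pct_config(bool exclude_user, bool exclude_kernel,
			   uint32_t hw_event_idx)
{
	uint32_t config = 0;

	if (exclude_user)	/* exclude user means count only kernel */
		config |= ARC_REG_PCT_CONFIG_KERN;
	if (exclude_kernel)	/* exclude kernel means count only user */
		config |= ARC_REG_PCT_CONFIG_USER;

	return config | hw_event_idx;
}
```

This also shows why the patch changes the later assignments from "=" to "|=": a plain assignment of the event index would discard the mode bits set earlier.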



[PATCH v3 1/6] ARC: perf: cap the number of counters to hardware max of 32

2015-08-24 Thread Alexey Brodkin
From: Vineet Gupta vgu...@synopsys.com

The number of counters in PCT can never be more than 32 (while
countable conditions could be 100+) for both ARCompact and ARCv2

And while at it update copyright dates.

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Vineet Gupta vgu...@synopsys.com
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

Compared to v2:
 [1] Updated copyright date in arch/arc/kernel/perf_event.c

No changes since v1.

 arch/arc/include/asm/perf_event.h | 5 +++--
 arch/arc/kernel/perf_event.c  | 6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index 2b8880e..e7b16c2 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -1,6 +1,7 @@
 /*
  * Linux performance counter support for ARC
  *
+ * Copyright (C) 2014-2015 Synopsys, Inc. (www.synopsys.com)
  * Copyright (C) 2011-2013 Synopsys, Inc. (www.synopsys.com)
  *
  * This program is free software; you can redistribute it and/or modify
@@ -12,8 +13,8 @@
 #ifndef __ASM_PERF_EVENT_H
 #define __ASM_PERF_EVENT_H
 
-/* real maximum varies per CPU, this is the maximum supported by the driver */
-#define ARC_PMU_MAX_HWEVENTS   64
+/* Max number of counters that PCT block may ever have */
+#define ARC_PERF_MAX_COUNTERS  32
 
 #define ARC_REG_CC_BUILD   0xF6
 #define ARC_REG_CC_INDEX   0x240
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 1287388..d7ee5b2 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -1,7 +1,7 @@
 /*
  * Linux performance counter support for ARC700 series
  *
- * Copyright (C) 2013 Synopsys, Inc. (www.synopsys.com)
+ * Copyright (C) 2013-2015 Synopsys, Inc. (www.synopsys.com)
  *
  * This code is inspired by the perf support of various other architectures.
  *
@@ -22,7 +22,7 @@ struct arc_pmu {
struct pmu  pmu;
int counter_size;   /* in bits */
int n_counters;
-   unsigned long   used_mask[BITS_TO_LONGS(ARC_PMU_MAX_HWEVENTS)];
+   unsigned long   used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
int ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
 };
 
@@ -284,7 +284,7 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
        pr_err("This core does not have performance counters!\n");
        return -ENODEV;
    }
-   BUG_ON(pct_bcr.c > ARC_PMU_MAX_HWEVENTS);
+   BUG_ON(pct_bcr.c > ARC_PERF_MAX_COUNTERS);
 
READ_BCR(ARC_REG_CC_BUILD, cc_bcr);
BUG_ON(!cc_bcr.v); /* Counters exist but No countable conditions ? */
-- 
2.4.3



Re: [PATCH] clockevents/drivers/mtk: Fix spurious interrupt leading to crash

2015-08-24 Thread Yingjoe Chen
On Mon, 2015-08-24 at 15:30 +0200, Daniel Lezcano wrote:
 After analysis done by Yingjoe Chen, the timer appears to have a pending
 interrupt when it is enabled.

 Fix this by acknowledging the pending interrupt when enabling the timer
 interrupt.

 Signed-off-by: Daniel Lezcano daniel.lezc...@linaro.org

Hi Daniel,


Thanks for your patch, this can fix the boot issue.
Tested-by: Yingjoe Chen yingjoe.c...@mediatek.com

 ---
  drivers/clocksource/mtk_timer.c | 13 +++--
  1 file changed, 3 insertions(+), 10 deletions(-)
 
 diff --git a/drivers/clocksource/mtk_timer.c b/drivers/clocksource/mtk_timer.c
 index 4cd16fb..13543a8 100644
 --- a/drivers/clocksource/mtk_timer.c
 +++ b/drivers/clocksource/mtk_timer.c
 @@ -156,14 +156,6 @@ static irqreturn_t mtk_timer_interrupt(int irq, void 
 *dev_id)
   return IRQ_HANDLED;
  }
  
 -static void mtk_timer_global_reset(struct mtk_clock_event_device *evt)
 -{
 - /* Disable all interrupts */
 - writel(0x0, evt->gpt_base + GPT_IRQ_EN_REG);
 - /* Acknowledge all interrupts */
 - writel(0x3f, evt->gpt_base + GPT_IRQ_ACK_REG);
 -}
 -
  static void
  mtk_timer_setup(struct mtk_clock_event_device *evt, u8 timer, u8 option)
  {
 @@ -183,6 +175,9 @@ static void mtk_timer_enable_irq(struct 
 mtk_clock_event_device *evt, u8 timer)
  {
   u32 val;
  
 +/* Acknowledge all spurious pending interrupts */
 +writel(0x3f, evt->gpt_base + GPT_IRQ_ACK_REG);

This should use tab to indent.

 +
   val = readl(evt->gpt_base + GPT_IRQ_EN_REG);
   writel(val | GPT_IRQ_ENABLE(timer),
   evt->gpt_base + GPT_IRQ_EN_REG);
 @@ -232,8 +227,6 @@ static void __init mtk_timer_init(struct device_node 
 *node)
   }
   rate = clk_get_rate(clk);
  
 - mtk_timer_global_reset(evt);
 -

I think we should keep this one, or at least disable irq first in
mtk_timer_enable_irq. MT8173 firmware didn't use this GPT, but I think
it is a good idea to do it just in case firmware in some other platform
use it.
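
The ordering concern can be shown with a small simulation. This is a sketch with an in-memory model, not real MMIO: struct gpt_model, TIMER_BIT() and the three functions are hypothetical, and the model assumes an interrupt fires whenever a line is both enabled and pending.

```c
#include <stdint.h>

/* Hypothetical in-memory model of the GPT interrupt registers. */
struct gpt_model {
	uint32_t irq_en;	/* enabled interrupt lines */
	uint32_t irq_pending;	/* latched, not yet acknowledged */
};

#define GPT_IRQ_ALL	0x3fU
#define TIMER_BIT(t)	(1U << (t))	/* assumed bit layout for this sketch */

/* Nonzero when an enabled line has a pending interrupt. */
static int irq_would_fire(const struct gpt_model *g)
{
	return (g->irq_en & g->irq_pending) != 0;
}

/* Enable without cleaning up first: stale state can fire immediately. */
static void enable_irq_naive(struct gpt_model *g, unsigned int timer)
{
	g->irq_en |= TIMER_BIT(timer);
}

/* Disable everything, acknowledge everything, then enable one line. */
static void enable_irq_safe(struct gpt_model *g, unsigned int timer)
{
	g->irq_en = 0;
	g->irq_pending &= ~GPT_IRQ_ALL;
	g->irq_en |= TIMER_BIT(timer);
}
```

In this model the safe variant covers both cases raised in the review: a pending interrupt left over on the timer being enabled, and lines that firmware may have left enabled before the kernel took over.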

Joe.C




[PATCH v2 3/6] perf: Annotate some of the error codes with perf_err()

2015-08-24 Thread Alexander Shishkin
This patch annotates a few semi-random error paths in perf core to
illustrate the extended error reporting facility. Most of them can
be triggered from perf tools.

Signed-off-by: Alexander Shishkin alexander.shish...@linux.intel.com
---
 kernel/events/core.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3ff28fc8bd..7beab37ea6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3382,10 +3382,10 @@ find_lively_task_by_vpid(pid_t vpid)
    rcu_read_unlock();
 
    if (!task)
-       return ERR_PTR(-ESRCH);
+       return PERF_ERR_PTR(-ESRCH, "task not found");
 
    /* Reuse ptrace permission checks for now. */
-   err = -EACCES;
+   err = perf_err(-EACCES, "insufficient permissions for tracing this task");
    if (!ptrace_may_access(task, PTRACE_MODE_READ))
        goto errout;
 
@@ -3413,7 +3413,8 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
    if (!task) {
        /* Must be root to operate on a CPU event: */
        if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-           return ERR_PTR(-EACCES);
+           return PERF_ERR_PTR(-EACCES,
+                               "must be root to operate on a CPU event");
 
        /*
         * We could be clever and allow to attach a event to an
@@ -3421,7 +3422,7 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
         * that's for later.
         */
        if (!cpu_online(cpu))
-           return ERR_PTR(-ENODEV);
+           return PERF_ERR_PTR(-ENODEV, "cpu is offline");
 
        cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
        ctx = &cpuctx->ctx;
@@ -8134,15 +8135,16 @@ SYSCALL_DEFINE5(perf_event_open,
 
    if (!attr.exclude_kernel) {
        if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
-           return -EACCES;
+           return perf_err_sync(&attr, -EACCES,
+                                "kernel tracing forbidden for the unprivileged");
    }
 
    if (attr.freq) {
        if (attr.sample_freq > sysctl_perf_event_sample_rate)
-           return -EINVAL;
+           return perf_err_sync(&attr, -EINVAL, "sample_freq too high");
    } else {
        if (attr.sample_period & (1ULL << 63))
-           return -EINVAL;
+           return perf_err_sync(&attr, -EINVAL, "sample_period too high");
    }
 
    /*
@@ -8152,14 +8154,14 @@ SYSCALL_DEFINE5(perf_event_open,
     * cgroup.
     */
    if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
-       return -EINVAL;
+       return perf_err_sync(&attr, -EINVAL, "pid and cpu need to be set in cgroup mode");
 
    if (flags & PERF_FLAG_FD_CLOEXEC)
        f_flags |= O_CLOEXEC;
 
    event_fd = get_unused_fd_flags(f_flags);
    if (event_fd < 0)
-       return event_fd;
+       return perf_err_sync(&attr, event_fd, "can't obtain a file descriptor");
 
    if (group_fd != -1) {
        err = perf_fget_light(group_fd, &group);
-- 
2.5.0



[PATCH v2 4/6] perf/x86: Annotate some of the error codes with perf_err()

2015-08-24 Thread Alexander Shishkin
This patch annotates a few x86-specific error paths with perf's extended
error reporting facility.

Signed-off-by: Alexander Shishkin alexander.shish...@linux.intel.com
---
 arch/x86/kernel/cpu/perf_event.c   | 8 ++--
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f56cf074d0..b3b531beee 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -12,6 +12,8 @@
  *  For licencing details see kernel-base/COPYING
  */
 
+#define PERF_MODNAME   "perf/x86"
+
 #include <linux/perf_event.h>
 #include <linux/capability.h>
 #include <linux/notifier.h>
@@ -426,11 +428,13 @@ int x86_setup_perfctr(struct perf_event *event)
 
        /* BTS is currently only allowed for user-mode. */
        if (!attr->exclude_kernel)
-           return -EOPNOTSUPP;
+           return perf_err(-EOPNOTSUPP,
+                           "BTS sampling not allowed for kernel space");
 
        /* disallow bts if conflicting events are present */
        if (x86_add_exclusive(x86_lbr_exclusive_lbr))
-           return -EBUSY;
+           return perf_err(-EBUSY,
+                           "LBR conflicts with active events");
 
        event->destroy = hw_perf_lbr_event_destroy;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index b2c9475b7f..222b259c5e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -607,7 +607,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 * no LBR on this PMU
 */
    if (!x86_pmu.lbr_nr)
-       return -EOPNOTSUPP;
+       return perf_err(-EOPNOTSUPP, "LBR is not supported by this cpu");
 
/*
 * setup SW LBR filter
-- 
2.5.0



Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT

2015-08-24 Thread Vlastimil Babka

On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:

On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:

On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:



I am in the middle of implementing lock on fault this way, but I cannot
see how we will handle mremap of a lock on fault region.  Say we have
the following:

  addr = mmap(len, MAP_ANONYMOUS, ...);
  mlock(addr, len, MLOCK_ONFAULT);
  ...
  mremap(addr, len, 2 * len, ...)

There is no way for mremap to know that the area being remapped was lock
on fault so it will be locked and prefaulted by remap.  How can we avoid
this without tracking per vma if it was locked with lock or lock on
fault?



remap can count filled ptes and prefault only completely populated areas.



Does (and should) mremap really prefault non-present pages? Shouldn't it
just prepare the page tables and that's it?


As I see it, mremap prefaults pages when it extends an mlocked area.

Also quote from manpage
: If  the memory segment specified by old_address and old_size is locked
: (using mlock(2) or similar), then this lock is maintained when the segment is
: resized and/or relocated.  As a  consequence, the amount of memory locked
: by the process may change.


Oh, right... Well that looks like a convincing argument for having a 
sticky VM_LOCKONFAULT after all. Having mremap guess by scanning 
existing pte's would slow it down, and be unreliable (was the area 
completely populated because MLOCK_ONFAULT was not used or because the 
process faulted it already? Was it not populated because MLOCK_ONFAULT 
was used, or because mmap(MAP_LOCKED) failed to populate it all?).


The only sane alternative is to populate always for mremap() of 
VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information as 
a limitation of mlock2(MLOCK_ONFAULT). Which might or might not be 
enough for Eric's usecase, but it's somewhat ugly.
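
The argument for a sticky flag can be reduced to a small decision function. The sketch below is a model only: the flag values are illustrative and not taken from the patch, mremap_should_populate() is a hypothetical helper, and it assumes the on-fault flag is stored as a modifier alongside VM_LOCKED.

```c
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
#define VM_LOCKED	0x1UL
#define VM_LOCKONFAULT	0x2UL

/*
 * Hypothetical decision made when mremap() grows a mapping: prefault the
 * new range only if the VMA is locked and was not locked "on fault".
 */
static bool mremap_should_populate(unsigned long vm_flags)
{
	if (!(vm_flags & VM_LOCKED))
		return false;	/* not locked: nothing to populate */

	return !(vm_flags & VM_LOCKONFAULT);
}
```

With the information kept in the VMA flags, the decision is a constant-time check and does not depend on how many ptes happen to be populated, which is the unreliability described above for the scanning approach.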





There might be a problem after failed populate: remap will handle them
as lock on fault. In this case we can fill ptes with swap-like non-present
entries to remember that fact and count them as should-be-locked pages.



I don't think we should strive to have mremap try to fix the inherent
unreliability of mmap (MAP_POPULATE)?


I don't think so. MAP_POPULATE works only when mmap happens.
Flag MREMAP_POPULATE might be a good idea. Just for symmetry.


Maybe, but please do it as a separate series.


[PATCH v11 02/20] x86/asm: Add C versions of frame pointer macros

2015-08-24 Thread Josh Poimboeuf
Add C versions of the frame pointer macros which can be used to create a
stack frame in inline assembly.

Signed-off-by: Josh Poimboeuf jpoim...@redhat.com
---
 arch/x86/include/asm/frame.h | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/frame.h b/arch/x86/include/asm/frame.h
index 8a6cd26..9a30ec7 100644
--- a/arch/x86/include/asm/frame.h
+++ b/arch/x86/include/asm/frame.h
@@ -1,10 +1,10 @@
 #ifndef _ASM_X86_FRAME_H
 #define _ASM_X86_FRAME_H
 
-#ifdef __ASSEMBLY__
-
 #include <asm/asm.h>
 
+#ifdef __ASSEMBLY__
+
 /*
  * These are stack frame creation macros.  They should be used by every
  * callable non-leaf asm function to make kernel stack traces more reliable.
@@ -22,5 +22,21 @@
 #endif
 .endm
 
+#else /* !__ASSEMBLY__ */
+
+#ifdef CONFIG_FRAME_POINTER
+
+#define FRAME_BEGIN                            \
+   "push %" _ASM_BP "\n"                       \
+   _ASM_MOV "%" _ASM_SP ", %" _ASM_BP "\n"
+
+#define FRAME_END "pop %" _ASM_BP "\n"
+
+#else /* !CONFIG_FRAME_POINTER */
+
+#define FRAME_BEGIN ""
+#define FRAME_END ""
+
+#endif /* CONFIG_FRAME_POINTER */
 #endif  /*  __ASSEMBLY__  */
 #endif /* _ASM_X86_FRAME_H */
-- 
2.4.3
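
Because the C versions of the macros are built purely from adjacent string literals, their expansion can be inspected without running any assembly. The sketch below is a user-space illustration: the three _ASM_* definitions are assumed stand-ins for what <asm/asm.h> provides on x86-64, and the two arrays exist only to expose the resulting text.

```c
/* Assumed stand-ins for the kernel's <asm/asm.h> definitions on x86-64. */
#define _ASM_BP  "rbp"
#define _ASM_SP  "rsp"
#define _ASM_MOV "mov "

/* Same shape as the macros added by the patch (CONFIG_FRAME_POINTER case). */
#define FRAME_BEGIN				\
	"push %" _ASM_BP "\n"			\
	_ASM_MOV "%" _ASM_SP ", %" _ASM_BP "\n"

#define FRAME_END "pop %" _ASM_BP "\n"

/* Adjacent string literals are concatenated by the compiler. */
static const char frame_begin_text[] = FRAME_BEGIN;
static const char frame_end_text[]   = FRAME_END;
```

Under these assumptions FRAME_BEGIN expands to a push of the frame pointer followed by a move of the stack pointer into it, and FRAME_END to the matching pop, so an inline asm statement can wrap its body between the two strings.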



[PATCH v11 00/20] Compile-time stack validation

2015-08-24 Thread Josh Poimboeuf
This is v11 of the compile-time stack validation patch set, along with
proposed fixes for many of the warnings it found.  It's based on the
tip/master branch.

The only real change since v10 is some improvements in patch 3 to the
documentation and changelog which attempt to better describe why stack
validation is needed.

v10 can be found here:

  https://lkml.kernel.org/r/cover.1439521412.git.jpoim...@redhat.com

For more information about the motivation behind this patch set, and
more details about what it does, please see the changelog in patch 3.
Patch 3 also has Documentation/stack-validation.txt which has further
details.

Patches 1-5 are the stackvalidate tool and some related macros.

Patches 6-20 are some proposed fixes for several of the warnings
reported by stackvalidate.  They've been compile-tested and boot tested
in a VM, but I haven't attempted any meaningful testing for many of
them.

v11:
- attempt to answer the "why" question better in the documentation and
  commit message
- s/FP_SAVE/FRAME_BEGIN/ in documentation

v10:
- add scripts/mod to directory ignores
- remove circular dependencies for ignored objects which are built
  before stackvalidate
- fix CONFIG_MODVERSIONS incompatibility

v9:
- rename FRAME/ENDFRAME -> FRAME_BEGIN/FRAME_END
- fix jump table issue for when the original instruction is a jump
- drop paravirt thunk alignment patch
- add maintainers to CC for proposed warning fixes

v8:
- add proposed fixes for warnings
- fix all memory leaks
- process ignores earlier and add more ignore checks
- always assume POPCNT alternative is enabled
- drop hweight inline asm fix
- drop __schedule() ignore patch
- change .Ltemp_\@ to .Lstackvalidate_ignore_\@ in asm macro
- fix CONFIG_* checks in asm macros
- add C versions of ignore macros and frame macros
- change ";" to "\n" in C macros
- add ifdef CONFIG_STACK_VALIDATION checks in C ignore macros
- use numbered label in C ignore macro
- add missing break in switch case statement in arch-x86.c

v7:
- sibling call support
- document proposed solution for inline asm() frame pointer issues
- say "kernel entry/exit" instead of "context switch"
- clarify the checking of switch statement jump tables
- discard __stackvalidate_ignore_* sections in linker script
- use .Ltemp_\@ to get a unique label instead of static 3-digit number
- change STACKVALIDATE_IGNORE_FUNC variable to a static
- move STACKVALIDATE_IGNORE_INSN to arch-specific .h file

v6:
- rename asmvalidate -> stackvalidate (again)
- gcc-generated object file support
- recursive branch state analysis
- external jump support
- fixup/exception table support
- jump label support
- switch statement jump table support
- added documentation
- detection of noreturn dead end functions
- added a Kbuild mechanism for skipping files and dirs
- moved frame pointer macros to arch/x86/include/asm/frame.h
- moved ignore macros to include/linux/stackvalidate.h

v5:
- stackvalidate -> asmvalidate
- frame pointers only required for non-leaf functions
- check for the use of the FP_SAVE/RESTORE macros instead of manually
  analyzing code to detect frame pointer usage
- additional checks to ensure each function doesn't leave its boundaries
- make the macros simpler and more flexible
- support for analyzing ALTERNATIVE macros
- simplified the arch interfaces in scripts/asmvalidate/arch.h
- fixed some asmvalidate warnings
- rebased onto latest tip asm cleanups
- many more small changes

v4:
- Changed the default to CONFIG_STACK_VALIDATION=n, until all the asm
  code can get cleaned up.
- Fixed a stackvalidate error path exit code issue found by Michal
  Marek.

v3:
- Added a patch to make the push/pop CFI macros arch-independent, as
  suggested by H. Peter Anvin

v2:
- Fixed memory leaks reported by Petr Mladek

Cc: linux-kernel@vger.kernel.org
Cc: live-patch...@vger.kernel.org
Cc: Michal Marek mma...@suse.cz
Cc: Peter Zijlstra pet...@infradead.org
Cc: Andy Lutomirski l...@kernel.org
Cc: Borislav Petkov b...@alien8.de
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Andi Kleen a...@firstfloor.org
Cc: Pedro Alves pal...@redhat.com
Cc: Namhyung Kim namhy...@gmail.com
Cc: Bernd Petrovitsch be...@petrovitsch.priv.at
Cc: Chris J Arges chris.j.ar...@canonical.com
Cc: Andrew Morton a...@linux-foundation.org

Josh Poimboeuf (20):
  x86/asm: Frame pointer macro cleanup
  x86/asm: Add C versions of frame pointer macros
  x86/stackvalidate: Compile-time stack validation
  x86/stackvalidate: Add file and directory ignores
  x86/stackvalidate: Add ignore macros
  x86/xen: Add stack frame dependency to hypercall inline asm calls
  x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  x86/amd: Set ELF function type for vide()
  x86/reboot: Add ljmp instructions to stackvalidate whitelist
  x86/xen: Add xen_cpuid() and xen_setup_gdt() to stackvalidate
    whitelists
  x86/asm/crypto: Create stack frames in aesni-intel_asm.S
  x86/asm/crypto: Move 

[PATCH v11 05/20] x86/stackvalidate: Add ignore macros

2015-08-24 Thread Josh Poimboeuf
Add new stackvalidate ignore macros: STACKVALIDATE_IGNORE_INSN and
STACKVALIDATE_IGNORE_FUNC.  These can be used to tell stackvalidate to
skip validation of an instruction or a function, respectively.

Signed-off-by: Josh Poimboeuf jpoim...@redhat.com
---
 arch/x86/include/asm/stackvalidate.h | 45 
 arch/x86/kernel/vmlinux.lds.S|  5 +++-
 include/linux/stackvalidate.h| 28 ++
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/stackvalidate.h
 create mode 100644 include/linux/stackvalidate.h

diff --git a/arch/x86/include/asm/stackvalidate.h b/arch/x86/include/asm/stackvalidate.h
new file mode 100644
index 000..95db052
--- /dev/null
+++ b/arch/x86/include/asm/stackvalidate.h
@@ -0,0 +1,45 @@
+#ifndef _ASM_X86_STACKVALIDATE_H
+#define _ASM_X86_STACKVALIDATE_H
+
+#include <asm/asm.h>
+
+#ifdef __ASSEMBLY__
+
+/*
+ * This asm macro tells the stack validation script to ignore the instruction
+ * immediately after the macro.  It should only be used in special cases where
+ * you're 100% sure it won't affect the reliability of frame pointers and
+ * kernel stack traces.
+ *
+ * For more information, see Documentation/stack-validation.txt.
+ */
+.macro STACKVALIDATE_IGNORE_INSN
+#ifdef CONFIG_STACK_VALIDATION
+   .Lstackvalidate_ignore_\@:
+   .pushsection __stackvalidate_ignore_insn, "a"
+   _ASM_ALIGN
+   .long .Lstackvalidate_ignore_\@ - .
+   .popsection
+#endif
+.endm
+
+#else /* !__ASSEMBLY__ */
+
+#ifdef CONFIG_STACK_VALIDATION
+
+#define STACKVALIDATE_IGNORE_INSN                          \
+   "1:\n"                                                  \
+   ".pushsection __stackvalidate_ignore_insn, \"a\"\n"     \
+   _ASM_ALIGN "\n"                                         \
+   ".long 1b - .\n"                                        \
+   ".popsection\n"
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACKVALIDATE_IGNORE_INSN ""
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_STACKVALIDATE_H */
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 00bf300..f2f8d7a 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -332,7 +332,10 @@ SECTIONS
 
    /* Sections to be discarded */
    DISCARDS
-   /DISCARD/ : { *(.eh_frame) }
+   /DISCARD/ : {
+       *(.eh_frame)
+       *(__stackvalidate_ignore_*)
+   }
 }
 
 
diff --git a/include/linux/stackvalidate.h b/include/linux/stackvalidate.h
new file mode 100644
index 000..4ae242c
--- /dev/null
+++ b/include/linux/stackvalidate.h
@@ -0,0 +1,28 @@
+#ifndef _LINUX_STACKVALIDATE_H
+#define _LINUX_STACKVALIDATE_H
+
+#include <asm/stackvalidate.h>
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_STACK_VALIDATION
+/*
+ * This C macro tells the stack validation script to ignore the function.  It
+ * should only be used in special cases where you're 100% sure it won't affect
+ * the reliability of frame pointers and kernel stack traces.
+ *
+ * For more information, see Documentation/stack-validation.txt.
+ */
+#define STACKVALIDATE_IGNORE_FUNC(_func) \
+   static void __used __section(__stackvalidate_ignore_func) \
+   *__stackvalidate_ignore_func_##_func = _func
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACKVALIDATE_IGNORE_FUNC(_func)
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _LINUX_STACKVALIDATE_H */
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v11 18/20] x86/asm: Create stack frames in rwsem functions

2015-08-24 Thread Josh Poimboeuf
rwsem.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf jpoim...@redhat.com
---
 arch/x86/lib/rwsem.S | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index 40027db..be110ef 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -15,6 +15,7 @@
 
 #include <linux/linkage.h>
 #include <asm/alternative-asm.h>
+#include <asm/frame.h>
 
 #define __ASM_HALF_REG(reg)    __ASM_SEL(reg, e##reg)
 #define __ASM_HALF_SIZE(inst)  __ASM_SEL(inst##w, inst##l)
@@ -84,24 +85,29 @@
 
 /* Fix up special calling conventions */
 ENTRY(call_rwsem_down_read_failed)
+   FRAME_BEGIN
    save_common_regs
    __ASM_SIZE(push,) %__ASM_REG(dx)
    movq %rax,%rdi
    call rwsem_down_read_failed
    __ASM_SIZE(pop,) %__ASM_REG(dx)
    restore_common_regs
+   FRAME_END
    ret
 ENDPROC(call_rwsem_down_read_failed)
 
 ENTRY(call_rwsem_down_write_failed)
+   FRAME_BEGIN
    save_common_regs
    movq %rax,%rdi
    call rwsem_down_write_failed
    restore_common_regs
+   FRAME_END
    ret
 ENDPROC(call_rwsem_down_write_failed)
 
 ENTRY(call_rwsem_wake)
+   FRAME_BEGIN
    /* do nothing if still outstanding active readers */
    __ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
    jnz 1f
@@ -109,15 +115,18 @@ ENTRY(call_rwsem_wake)
    movq %rax,%rdi
    call rwsem_wake
    restore_common_regs
-1: ret
+1: FRAME_END
+   ret
 ENDPROC(call_rwsem_wake)
 
 ENTRY(call_rwsem_downgrade_wake)
+   FRAME_BEGIN
    save_common_regs
    __ASM_SIZE(push,) %__ASM_REG(dx)
    movq %rax,%rdi
    call rwsem_downgrade_wake
    __ASM_SIZE(pop,) %__ASM_REG(dx)
    restore_common_regs
+   FRAME_END
    ret
 ENDPROC(call_rwsem_downgrade_wake)
-- 
2.4.3



Re: [PATCH-v3 1/2] mfd: devicetree: bindings: 88pm800: Add DT property for dual phase enable

2015-08-24 Thread Vaibhav Hiremath



On Monday 24 August 2015 06:32 PM, Lee Jones wrote:

On Mon, 24 Aug 2015, Vaibhav Hiremath wrote:


88PM860 family of device supports dual phase mode on BUCK1 supply
providing total 6A capacity.
Note that by default they operate independently with 3A capacity.

This patch updates the devicetree binding with DT property
to enable dual-phase mode on BUCK1.

Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
---
  Documentation/devicetree/bindings/mfd/88pm800.txt | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/mfd/88pm800.txt b/Documentation/devicetree/bindings/mfd/88pm800.txt
index dec842f..2c82fcb 100644
--- a/Documentation/devicetree/bindings/mfd/88pm800.txt
+++ b/Documentation/devicetree/bindings/mfd/88pm800.txt
@@ -9,6 +9,12 @@ Required parent device properties:
  - #interrupt-cells: should be 1.
  The cell is the 88pm80x local IRQ number

+Optional properties :
+- marvell,88pm860-buck1-dualphase-en  : If set, enable dual phase on BUCK1,
+  providing 6A capacity.
+  Without this both BUCK1A and BUCK1B operates independently with 3A capacity.
+  (This property is only applicable to 88PM860)


This will require a Regulator Ack.

My suggestion would be to remove the 'buck' number, as the same
property could be used on any Buck, and remove the '-en' part, as
this is implied.



Ok, Will do it in next version.

Mark,

Any comments here before I spin V4?

Thanks,
Vaibhav


Re: [PATCH v2] scripts/checkkconfigsymbols.py: support default statements

2015-08-24 Thread Valentin Rothberg
Hi Michal,

On Mon, Aug 24, 2015 at 4:49 PM, Michal Marek mma...@suse.cz wrote:
 On 2015-07-27 12:33, Valentin Rothberg wrote:
 Until now, checkkconfigsymbols.py did not check default statements for
 references to missing Kconfig symbols (i.e., undefined Kconfig options).
 Hence, add support to parse and check the Kconfig default statement.

 Signed-off-by: Valentin Rothberg valentinrothb...@gmail.com
 ---
 Changelog:
 v2 (thanks to Stefan Hengelein):
 - update NUMERIC regex (Kconfig accepts 'X' and 'A-F')
 - remove mistakenly added blank line from v1

  scripts/checkkconfigsymbols.py | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

 Applied to kbuild.git#kconfig.

 Michal


The patch above already went through Greg's tree to linux-next (see
commit 0bd38ae35522).

Kind regards,
 Valentin


[PATCH v2] Documentation: add 'crashkernel=auto' entry into kernel-parameters.txt

2015-08-24 Thread Yaowei Bai
There is no 'crashkernel=auto' entry in kernel-parameters.txt, so borrow
it from the kexec-kdump-howto.txt file in the kexec-tools-2.0.0 package.

Signed-off-by: Yaowei Bai bywxiao...@163.com
---
 Documentation/kernel-parameters.txt | 9 +
 1 file changed, 9 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f045..9e5913e 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -797,6 +797,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
 
+   crashkernel=auto
+   This specification allows the kernel to decide how much
+   memory to reserve for the purposes of kdump. It will make
+   this determination based on the amount of memory you have
+   in your system, and scale the allocation accordingly.
+   Note that if you have less than 4Gb of memory in your system,
+   this specification will opt to not allocate any memory for
+   the purposes of kdump.
+
cs89x0_dma= [HW,NET]
Format: dma
 
-- 
1.9.1




Re: [PATCH] mmc/sdhci-acpi: enable sdhci-acpi device to suspend/resume asynchronously

2015-08-24 Thread Fu, Zhonghui


On 2015/8/17 14:51, Adrian Hunter wrote:
 On 17/08/15 06:38, Fu, Zhonghui wrote:
 Hi,

 Any comments are welcome.
 Same comments as here:

   http://marc.info/?l=linux-kernel&m=143979428424353&w=2

The PM core now supports an asynchronous device suspend/resume mode. If a
device has been set to support asynchronous PM, its suspend/resume operations
can be performed in a separate kernel thread, taking advantage of multiple
cores to improve overall system suspend/resume speed. In the worst case all
device suspend/resume threads are scheduled on the same CPU, but that hardly
ever occurs.

The PM core enforces all the suspend/resume dependencies related to a device.
Async suspend/resume is a feature of the PM core, and each device subsystem
may choose whether to use it. Once a device subsystem chooses to use this
feature, its safety is up to the PM core, as long as the subsystem has fully
initialized the device.


Thanks,
Zhonghui




 Thanks,
 Zhonghui

 On 2015/8/3 21:10, Fu, Zhonghui wrote:
 Enable sdhci-acpi device to suspend/resume asynchronously.
 This can improve system suspend/resume speed.

 Signed-off-by: Zhonghui Fu zhonghui...@linux.intel.com
 ---
  drivers/mmc/host/sdhci-acpi.c |2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

 diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c
 index 22d929f..67e6263 100644
 --- a/drivers/mmc/host/sdhci-acpi.c
 +++ b/drivers/mmc/host/sdhci-acpi.c
 @@ -379,6 +379,8 @@ static int sdhci_acpi_probe(struct platform_device *pdev)
 pm_runtime_enable(dev);
 }
  
 +   device_enable_async_suspend(dev);
 +
 return 0;
  
  err_free:
 -- 1.7.1





Re: [PATCH] i2c: enable i2c adapter to suspend/resume asynchronously

2015-08-24 Thread Fu, Zhonghui
Hi,

Any comments are welcome.


Thanks,
Zhonghui



On 2015/8/18 0:17, Fu, Zhonghui wrote:
 Enable i2c adapter to suspend/resume asynchronously. This can improve
 system suspend/resume speed.

 Signed-off-by: Zhonghui Fu zhonghui...@linux.intel.com
 ---
  drivers/i2c/i2c-core.c |2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

 diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
 index c83e4d1..90251be 100644
 --- a/drivers/i2c/i2c-core.c
 +++ b/drivers/i2c/i2c-core.c
 @@ -1439,6 +1439,8 @@ static int i2c_register_adapter(struct i2c_adapter *adap)
  
     pm_runtime_no_callbacks(&adap->dev);
  
 +   device_enable_async_suspend(&adap->dev);
 +
  #ifdef CONFIG_I2C_COMPAT
     res = class_compat_create_link(i2c_adapter_compat_class, &adap->dev,
                                    adap->dev.parent);
 -- 1.7.1




RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket

2015-08-24 Thread Liang, Kan
 
 On Fri, Aug 21, 2015 at 08:25:24PM +, Liang, Kan wrote:
 
 SNIP
 
  
   we need global topology information in perf.data and use the mapping
   from there, we can't use current server info
  
   we currently store core_siblings_list and thread_siblings_list, in
   topology FEATURE, which is probably not enough
  
 
  core_siblings_list  includes the cpu list in the same socket.
  thread_siblings_list includes the cpu list in the same core.
  numa_nodes includes the cpu list for each node.
 
  It looks like we have enough data from the topology FEATURE.
 
 hum, haven't checked deeply.. how will you get core id for cpu?


From thread_siblings_list.
I just noticed that svg_build_topology_map does a similar thing to
get the topology map for timechart from the perf header.
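
To make the lookup concrete: the sibling lists are stored as CPU-list
strings, and finding the core or socket of a CPU amounts to finding the
list that names it. Below is a minimal sketch in C, assuming the usual
sysfs list syntax such as "0-3,8-11". cpu_in_list() is a hypothetical
helper for illustration, not an existing perf function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a perf API): return true if 'cpu' is named in a
 * CPU list string such as "0-3,8-11". Malformed input yields false.
 */
bool cpu_in_list(const char *list, long cpu)
{
	const char *p = list;

	assert(list != NULL);

	while (*p) {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)		/* no digits where a number was expected */
			return false;
		if (*end == '-') {	/* a range of the form "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return false;
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		if (*end == ',')
			p = end + 1;	/* move on to the next entry */
		else if (*end == '\0')
			p = end;	/* end of the list */
		else
			return false;	/* unexpected character */
	}
	return false;
}
```

With one such list per core (or per socket), the index of the first list
that contains the CPU can serve as its core (or socket) id.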

 
  What do you think about the function as below?
  It gets the socket id from env.
 
 some sort of caching would be nice, I guess we could store those cpumap
 objects within perf_session_env

Yes it will be stored in perf_session_env.

Thanks,
Kan


[PATCH v3 3/6] ARCv2: perf: Support sampling events using overflow interrupts

2015-08-24 Thread Alexey Brodkin
In the days of ARC 700, performance counters had no interrupt support,
so on ARC we supported only non-sampling events.

Put simply, only perf stat was functional.

ARC HS performance counters can raise interrupts, and this change adds
support for them.

ARC performance counters generate interrupts in the following way:
 [1] A counter counts starting from the value set in the PCT_COUNT
     register pair
 [2] Once the counter reaches the value set in PCT_INT_CNT, an interrupt
     is raised

Basic setup looks like this:
 [1] PCT_COUNT = 0;
 [2] PCT_INT_CNT = __limit_value__;
 [3] Enable interrupts for that counter and let it run
 [4] Let counter reach its limit
 [5] Handle interrupt when it happens

Note that the PCT HW block is built into the CPU core, so its interrupt
line (which is basically the OR of all counters' IRQs) is wired directly to
the top-level IRQC. That means that to de-assert the PCT interrupt it is
required to reset the IRQs of all counters that have reached their limit
values.
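
The enable, raise and acknowledge sequence can be modeled in plain C.
This is only a sketch under stated assumptions: pct_int_ctrl and
pct_int_act are simulated variables standing in for ARC_REG_PCT_INT_CTRL
and ARC_REG_PCT_INT_ACT, all helper names are hypothetical, and the
status register is assumed to be write-1-to-clear. It is not the driver
code.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated registers (hypothetical stand-ins for the aux registers). */
static uint32_t pct_int_ctrl;	/* bit n set: counter n may raise an IRQ */
static uint32_t pct_int_act;	/* bit n set: counter n has an IRQ pending */

/* Enable the interrupt of one counter with a read-modify-write. */
void pct_irq_enable(unsigned int idx)
{
	assert(idx < 32);
	pct_int_ctrl |= UINT32_C(1) << idx;
}

/* Clear any pending flag first, then disable the counter's interrupt. */
void pct_irq_disable(unsigned int idx)
{
	assert(idx < 32);
	pct_int_act &= ~(UINT32_C(1) << idx);	/* models write-1-to-clear */
	pct_int_ctrl &= ~(UINT32_C(1) << idx);
}

/* Model of the hardware: counter idx has reached its limit value. */
void pct_counter_hit_limit(unsigned int idx)
{
	assert(idx < 32);
	if (pct_int_ctrl & (UINT32_C(1) << idx))
		pct_int_act |= UINT32_C(1) << idx;
}

/* The shared interrupt line is the OR of all pending bits. */
int pct_irq_line_asserted(void)
{
	return pct_int_act != 0;
}

/*
 * Handler model: acknowledge every pending counter, because the line only
 * de-asserts once all of them are cleared. Returns the mask handled.
 */
uint32_t pct_irq_handle(void)
{
	uint32_t pending = pct_int_act;

	pct_int_act &= ~pending;
	return pending;
}
```

Acknowledging only one counter would leave the shared line asserted while
any other counter is still pending, which is why the handler model clears
the whole pending mask.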

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Cc: Vineet Gupta vgu...@synopsys.com
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

Compared to v2:
 [1] Moved interrupts enabling from arc_pmu_add() to arc_pmu_start()

Compared to v1:
 [1] Added commit message
 [2] Removed check for is_sampling_event() because we already set
 PERF_PMU_CAP_NO_INTERRUPT in probe()
 [3] Minor cosmetics

 arch/arc/include/asm/perf_event.h |   8 ++-
 arch/arc/kernel/perf_event.c  | 128 +++---
 2 files changed, 126 insertions(+), 10 deletions(-)

diff --git a/arch/arc/include/asm/perf_event.h b/arch/arc/include/asm/perf_event.h
index e7b16c2..9ed593e 100644
--- a/arch/arc/include/asm/perf_event.h
+++ b/arch/arc/include/asm/perf_event.h
@@ -29,15 +29,19 @@
 #define ARC_REG_PCT_CONFIG     0x254
 #define ARC_REG_PCT_CONTROL    0x255
 #define ARC_REG_PCT_INDEX      0x256
+#define ARC_REG_PCT_INT_CNTL   0x25C
+#define ARC_REG_PCT_INT_CNTH   0x25D
+#define ARC_REG_PCT_INT_CTRL   0x25E
+#define ARC_REG_PCT_INT_ACT    0x25F
 
 #define ARC_REG_PCT_CONTROL_CC (1 << 16)   /* clear counts */
 #define ARC_REG_PCT_CONTROL_SN (1 << 17)   /* snapshot */
 
 struct arc_reg_pct_build {
 #ifdef CONFIG_CPU_BIG_ENDIAN
-   unsigned int m:8, c:8, r:6, s:2, v:8;
+   unsigned int m:8, c:8, r:5, i:1, s:2, v:8;
 #else
-   unsigned int v:8, s:2, r:6, c:8, m:8;
+   unsigned int v:8, s:2, i:1, r:5, c:8, m:8;
 #endif
 };
 
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index db53af7..ce0fa60 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -11,6 +11,7 @@
  *
  */
 #include <linux/errno.h>
+#include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/perf_event.h>
@@ -24,6 +25,7 @@ struct arc_pmu {
unsigned long   used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
u64 max_period;
int ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
+   struct perf_event *act_counter[ARC_PERF_MAX_COUNTERS];
 };
 
 struct arc_callchain_trace {
@@ -139,9 +141,11 @@ static int arc_pmu_event_init(struct perf_event *event)
    struct hw_perf_event *hwc = &event->hw;
    int ret;
 
-   hwc->sample_period  = arc_pmu->max_period;
-   hwc->last_period = hwc->sample_period;
-   local64_set(&hwc->period_left, hwc->sample_period);
+   if (!is_sampling_event(event)) {
+       hwc->sample_period  = arc_pmu->max_period;
+       hwc->last_period = hwc->sample_period;
+       local64_set(&hwc->period_left, hwc->sample_period);
+   }
 
    switch (event->attr.type) {
    case PERF_TYPE_HARDWARE:
@@ -243,6 +247,11 @@ static void arc_pmu_start(struct perf_event *event, int flags)
 
    arc_pmu_event_set_period(event);
 
+   /* Enable interrupt for this counter */
+   if (is_sampling_event(event))
+       write_aux_reg(ARC_REG_PCT_INT_CTRL,
+                     read_aux_reg(ARC_REG_PCT_INT_CTRL) | (1 << idx));
+
    /* enable ARC pmu here */
    write_aux_reg(ARC_REG_PCT_INDEX, idx);
    write_aux_reg(ARC_REG_PCT_CONFIG, hwc->config);
@@ -253,6 +262,17 @@ static void arc_pmu_stop(struct perf_event *event, int flags)
    struct hw_perf_event *hwc = &event->hw;
    int idx = hwc->idx;
 
+   /* Disable interrupt for this counter */
+   if (is_sampling_event(event)) {
+       /*
+        * Reset interrupt flag by writing of 1. This is required
+        * to make sure pending interrupt was not left.
+        */
+       write_aux_reg(ARC_REG_PCT_INT_ACT, 1 << idx);
+       write_aux_reg(ARC_REG_PCT_INT_CTRL,
+                     read_aux_reg(ARC_REG_PCT_INT_CTRL) & ~(1 << idx));
+   }
+
    if (!(event->hw.state & PERF_HES_STOPPED)) {
        /* stop ARC pmu here */
        write_aux_reg(ARC_REG_PCT_INDEX, idx);
@@ -275,6 +295,8 @@ 

[PATCH v3 2/6] ARCv2: perf: implement event_set_period

2015-08-24 Thread Alexey Brodkin
This generalization prepares for support of overflow interrupts.

Hardware event counters on ARC work this way:
Each counter counts from a programmed start value (set in
ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT), and
once the limit value is reached the counter generates an interrupt.

Even though this hardware implementation allows for more flexibility,
in the Linux kernel we decided to mimic the behavior of other
architectures this way:

 [1] Set the limit value to half of the counter's max value (to allow the
     counter to run after reaching its limit, see below for more
     explanation):
 ---------->8-----------
 arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL;
 ---------->8-----------

 [2] Set the start value to arc_pmu->max_period - sample_period and then
     count up to the limit

Our event counters don't stop on reaching max value (the one we set in
ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly
stops each of them.

And setting a limit as half of counter capacity is done to allow
capturing of additional events in between moment when interrupt was
triggered until we're actually processing PMU interrupts. That way
we're trying to be more precise.

For example if we count CPU cycles we keep track of cycles while
running through generic IRQ handling code:

 [1] We set counter period as say 100_000 events of type crun
 [2] Counter reaches that limit and raises its interrupt
 [3] Once we get into the PMU IRQ handler we read the current counter value
     from ARC_REG_PCT_SNAP and see something like 105_000 there.

If counters stopped on reaching the limit value, we would miss those
additional 5000 cycles.
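
The arithmetic can be checked with a small standalone C sketch. The
helper names are hypothetical and the 48-bit counter width used in the
usage note is only an assumed example value, not something stated in
this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Usable max period: half of the counter range minus one, as in [1] above. */
uint64_t pct_max_period(unsigned int counter_size)
{
	assert(counter_size >= 2 && counter_size < 64);
	return (UINT64_C(1) << counter_size) / 2 - 1;
}

/* Start value chosen so the limit is hit after sample_period events ([2]). */
uint64_t pct_start_value(uint64_t max_period, uint64_t sample_period)
{
	assert(sample_period <= max_period);
	return max_period - sample_period;
}

/*
 * Events seen since the counter was started. Because the counter keeps
 * running past the limit, a snapshot taken in the IRQ handler also covers
 * the events that occurred between the interrupt and its handling.
 */
uint64_t pct_events_since_start(uint64_t start, uint64_t snapshot)
{
	assert(snapshot >= start);
	return snapshot - start;
}
```

For an assumed 48-bit counter, pct_max_period(48) is 2^47 - 1. With a
sample period of 100000, a snapshot taken 5000 events after the limit was
reached gives 105000 events, matching the example in the commit message.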

Cc: Peter Zijlstra pet...@infradead.org
Cc: Arnaldo Carvalho de Melo a...@kernel.org
Signed-off-by: Vineet Gupta vgu...@synopsys.com
Signed-off-by: Alexey Brodkin abrod...@synopsys.com
---

Compared to v2:
 [1] ARCv2: perf: set usable max period as a half of real max period
 was merged in this one so we may have complete and valid commit message
 that covers basics of ARC PCTs.
 [2] Fixed arc_pmu_event_set_period() in regard of incorrect
 hwc->period_left setup.

Compared to v1:
 [1] Added verbose commit message with explanation of how PCT HW works on ARC
 [2] Simplified arc_perf_event_update()
 [3] Removed check for is_sampling_event() because we already set
 PERF_PMU_CAP_NO_INTERRUPT in probe()
 [4] Minor cosmetics

 arch/arc/kernel/perf_event.c | 79 +++-
 1 file changed, 63 insertions(+), 16 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index d7ee5b2..db53af7 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -20,9 +20,9 @@
 
 struct arc_pmu {
struct pmu  pmu;
-   int counter_size;   /* in bits */
int n_counters;
unsigned long   used_mask[BITS_TO_LONGS(ARC_PERF_MAX_COUNTERS)];
+   u64 max_period;
int ev_hw_idx[PERF_COUNT_ARC_HW_MAX];
 };
 
@@ -88,18 +88,15 @@ static uint64_t arc_pmu_read_counter(int idx)
 static void arc_perf_event_update(struct perf_event *event,
                                   struct hw_perf_event *hwc, int idx)
 {
-   uint64_t prev_raw_count, new_raw_count;
-   int64_t delta;
-
-   do {
-       prev_raw_count = local64_read(&hwc->prev_count);
-       new_raw_count = arc_pmu_read_counter(idx);
-   } while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-                            new_raw_count) != prev_raw_count);
-
-   delta = (new_raw_count - prev_raw_count) &
-       ((1ULL << arc_pmu->counter_size) - 1ULL);
+   uint64_t prev_raw_count = local64_read(&hwc->prev_count);
+   uint64_t new_raw_count = arc_pmu_read_counter(idx);
+   int64_t delta = new_raw_count - prev_raw_count;
 
+   /*
+    * We aren't afraid of hwc->prev_count changing beneath our feet
+    * because there's no way for us to re-enter this function anytime.
+    */
+   local64_set(&hwc->prev_count, new_raw_count);
    local64_add(delta, &event->count);
    local64_sub(delta, &hwc->period_left);
 }
@@ -142,6 +139,10 @@ static int arc_pmu_event_init(struct perf_event *event)
    struct hw_perf_event *hwc = &event->hw;
    int ret;
 
+   hwc->sample_period  = arc_pmu->max_period;
+   hwc->last_period = hwc->sample_period;
+   local64_set(&hwc->period_left, hwc->sample_period);
+
    switch (event->attr.type) {
    case PERF_TYPE_HARDWARE:
        if (event->attr.config >= PERF_COUNT_HW_MAX)
@@ -153,6 +154,7 @@ static int arc_pmu_event_init(struct perf_event *event)
             (int) event->attr.config, (int) hwc->config,
             arc_pmu_ev_hw_map[event->attr.config]);
        return 0;
+
    case PERF_TYPE_HW_CACHE:
        ret = arc_pmu_cache_event(event->attr.config);
        if (ret < 0)
@@ -180,6 +182,47 @@ static void arc_pmu_disable(struct pmu *pmu)

[PATCH v3 0/6] ARCv2 port to Linux - (C) perf

2015-08-24 Thread Alexey Brodkin
Hi Peter,

This mini-series adds perf support for ARCv2 based cores, which brings in
overflow interrupts and SMP. Additionally now raw events are supported as well.

Please review!

Compared to v2 this series has:
 [1] Removed patch with raw-events support.
 It needs some rework and let's then discuss it separately.
 Still I plan to send it shortly.
 [2] Merged set usable max period as a half of real max period into
 implement event_set_period.
 [3] Fixed arc_pmu_event_set_period() in regard of incorrect
 hwc->period_left setup.
 [4] Moved interrupts enabling from arc_pmu_add() to arc_pmu_start()

Compared to v1 this series has:
 [1] Addressed review comments
 [2] More verbose commit messages and comments in sources
 [3] Minor cosmetics

Thanks,
Alexey

Alexey Brodkin (4):
  ARCv2: perf: implement event_set_period
  ARCv2: perf: Support sampling events using overflow interrupts
  ARCv2: perf: implement exclusion of event counting in user or kernel
    mode
  ARCv2: perf: SMP support

Vineet Gupta (2):
  ARC: perf: cap the number of counters to hardware max of 32
  ARCv2: perf: Finally introduce HS perf unit

 .../devicetree/bindings/arc/archs-pct.txt  |  17 ++
 MAINTAINERS|   2 +-
 arch/arc/include/asm/perf_event.h  |  21 +-
 arch/arc/kernel/perf_event.c   | 271 ++---
 4 files changed, 275 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arc/archs-pct.txt

-- 
2.4.3



Re: [PATCH] scripts/kernel-doc: Improve Markdown results

2015-08-24 Thread Graham Whaley
On Fri, 2015-08-21 at 16:39 -0300, Danilo Cesar Lemes de Paula wrote:
 Using pandoc as the Markdown engine causes some minor side effects, as
 pandoc includes main <para> tags for almost everything.
 The original Markdown support approach removes those main tags, but it
 caused some inconsistencies when that tag is not the main one, like:
 <something>..</something>
 <para>...</para>
 
 As kernel-doc was already including a <para> tag, it causes the
 presence of double <para> tags (<para><para>), which is not supported
 by the DocBook spec.
 
 Html target gets away with it, so it causes no harm, although other
 targets might not be so lucky (pdf as example).
 
 We're now delegating the inclusion of the main <para> tag to pandoc
 only, as it knows when it's necessary or not.
 
 That behavior causes a corner case, the only situation where we're
 certain that <para> is not needed, which is the <refpurpose> content.
 For those cases, we're using a $output_markdown_nopara = 1 control
 var.
 
 Signed-off-by: Danilo Cesar Lemes de Paula 


Feel free to add my:
Tested-by: Graham Whaley graham.wha...@linux.intel.com

 Graham
 danilo.ce...@collabora.co.uk
 Cc: Randy Dunlap rdun...@infradead.org
 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
 Cc: Jonathan Corbet cor...@lwn.net
 Cc: Herbert Xu herb...@gondor.apana.org.au
 Cc: Stephan Mueller smuel...@chronox.de
 Cc: Michal Marek mma...@suse.cz
 Cc: linux-kernel@vger.kernel.org
 Cc: linux-...@vger.kernel.org
 Cc: intel-gfx intel-...@lists.freedesktop.org
 Cc: dri-devel dri-de...@lists.freedesktop.org
 Cc: Graham Whaley graham.wha...@linux.intel.com
 ---
  Thanks to Graham Whaley who helped me to debug this.
 
  scripts/kernel-doc | 48 ++--
 
  1 file changed, 34 insertions(+), 14 deletions(-)
 
 diff --git a/scripts/kernel-doc b/scripts/kernel-doc
 index 3850c1e..12a106c 100755
 --- a/scripts/kernel-doc
 +++ b/scripts/kernel-doc
 @@ -288,6 +288,7 @@ my $use_markdown = 0;
  my $verbose = 0;
  my $output_mode = "man";
  my $output_preformatted = 0;
 +my $output_markdown_nopara = 0;
  my $no_doc_sections = 0;
  my @highlights = @highlights_man;
  my $blankline = $blankline_man;
 @@ -529,8 +530,11 @@ sub markdown_to_docbook {
     close(CHLD_OUT);
     close(CHLD_ERR);
  
 -   # pandoc insists in adding Main <para></para>, we should remove them.
 -   $content =~ s:\A\s*<para>\s*\n(.*)\n</para>\Z$:$1:egsm;
 +   if ($output_markdown_nopara) {
 +       # pandoc insists in adding Main <para></para>, sometimes we
 +       # want to remove them.
 +       $content =~ s:\A\s*<para>\s*\n(.*)\n</para>\Z$:$1:egsm;
 +   }
  
     return $content;
  }
 @@ -605,7 +609,7 @@ sub output_highlight {
         $line =~ s/^\s*//;
     }
     if ($line eq ""){
 -       if (! $output_preformatted) {
 +       if (! $output_preformatted && ! $use_markdown) {
             print $lineprefix, local_unescape($blankline);
         }
     } else {
 @@ -1026,7 +1030,7 @@ sub output_section_xml(%) {
         # programlisting is already included by pandoc
         print "<programlisting>\n" unless $use_markdown;
         $output_preformatted = 1;
 -   } else {
 +   } elsif (! $use_markdown) {
         print "<para>\n";
     }
     output_highlight($args{'sections'}{$section});
 @@ -1034,7 +1038,7 @@ sub output_section_xml(%) {
     if ($section =~ m/EXAMPLE/i) {
         print "</programlisting>\n" unless $use_markdown;
         print "</informalexample>\n";
 -   } else {
 +   } elsif (! $use_markdown) {
         print "</para>\n";
     }
     print "</refsect1>\n";
 @@ -1066,7 +1070,9 @@ sub output_function_xml(%) {
  print " <refname>" . $args{'function'} . "</refname>\n";
  print " <refpurpose>\n";
  print "  ";
 +$output_markdown_nopara = 1;
  output_highlight ($args{'purpose'});
 +$output_markdown_nopara = 0;
  print " </refpurpose>\n";
  print "</refnamediv>\n";
  
 @@ -1104,10 +1110,12 @@ sub output_function_xml(%) {
   $parameter_name =~ s/\[.*//;
  
   print "  <varlistentry>\n   <term><parameter>$parameter</parameter></term>\n";
 - print "   <listitem>\n    <para>\n";
 + print "   <listitem>\n";
 + print "    <para>\n" unless $use_markdown;
   $lineprefix="     ";
  
  output_highlight($args{'parameterdescs'}{$parameter_name});
 - print "    </para>\n   </listitem>\n  </varlistentry>\n";
 + print "    </para>\n" unless $use_markdown;
 + print "   </listitem>\n  </varlistentry>\n";
   }
   print " </variablelist>\n";
  } else {
 @@ -1143,7 +1151,9 @@ sub output_struct_xml(%) {
  print " <refname>" . $args{'type'} . " " . $args{'struct'} . "</refname>\n";
  print " <refpurpose>\n";
  print "  ";
 +$output_markdown_nopara = 1;
  output_highlight ($args{'purpose'});
 +$output_markdown_nopara = 0;
  print " </refpurpose>\n";
  print "</refnamediv>\n";
  
 @@ -1196,9 +1206,11 @@ sub output_struct_xml(%) {

Re: [PATCH 0/2] kbuild: Minor cleanups of fixdep

2015-08-24 Thread Michal Marek
On 2015-07-24 07:18, Masahiro Yamada wrote:
 Masahiro Yamada (2):
   kbuild: fixdep: optimize code slightly
   kbuild: fixdep: drop meaningless hash table initialization
 
  scripts/basic/fixdep.c | 26 --
  1 file changed, 4 insertions(+), 22 deletions(-)

Applied to kbuild.git#kbuild.

Michal

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi

2015-08-24 Thread Thomas Gleixner
On Mon, 24 Aug 2015, Qais Yousef wrote:
 On 08/24/2015 02:32 PM, Marc Zyngier wrote:
  I'd rather see something more architected than this blind export, or
  at least some level of filtering (the idea random drivers can access
  such a low-level function doesn't make me feel very good).
 
 I don't know how to architect this better or how to perform  the filtering,
 but I'm happy to hear suggestions and try them out.
 Keep in mind that detecting GIC and writing your own gic_send_ipi() is very
 simple. I have done this when the driver was out of tree. So restricting it by
 not exporting it will not prevent someone from really accessing the
 functionality, it's just they have to do it their own way.

Keep in mind that we are not talking about out of tree hackery. We
talk about a kernel code submission and I doubt that you will get
away with a GIC detection/fiddling buried in your driver code.

Keep in mind that just slapping an export to some random function is
not much better than doing a GIC hack in the driver.

Marc's concerns about blindly exposing IPI functionality to drivers are
well justified and that kind of coprocessor stuff is not unique to
your particular SoC. We're going to see such things more frequently in
the not so distant future, so we better think now about proper
solutions to that problem.

There are a couple of issues to solve:

1) How is the IPI which is received by the coprocessor reserved in the
   system?

2) How is it associated to a particular driver?

3) How do we ensure that a driver cannot issue random IPIs and can
   only send the associated ones?

None of these issues are handled by your export.

So we need a core infrastructure which allows us to do that. The
requirements are pretty clear from the above and Marc might have some
further restrictions in mind.
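
To make the three points above concrete, here is a minimal user-space sketch of the kind of bookkeeping such a core infrastructure would need. It is only an illustration under stated assumptions: struct ipi_table, ipi_reserve() and ipi_send() are hypothetical names, not an existing kernel API, and a real implementation would sit in the irq core and talk to the irqchip.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IPIS 8   /* hypothetical number of reservable IPIs */

/* Hypothetical table: which owner (driver instance) holds which IPI. */
struct ipi_table {
	const void *owner[MAX_IPIS];   /* NULL means the IPI is free */
};

/*
 * Points 1) and 2): reserve a free IPI and associate it with an owner.
 * Returns the IPI number, or -1 when none is left.
 */
static int ipi_reserve(struct ipi_table *t, const void *owner)
{
	assert(owner != NULL);
	for (int i = 0; i < MAX_IPIS; i++) {
		if (t->owner[i] == NULL) {
			t->owner[i] = owner;
			return i;
		}
	}
	return -1;
}

/*
 * Point 3): only the owner of an IPI may send it.
 * Returns 0 on success, -1 when the request is refused.
 */
static int ipi_send(const struct ipi_table *t, const void *owner, int ipi)
{
	if (ipi < 0 || ipi >= MAX_IPIS || t->owner[ipi] != owner)
		return -1;
	/* A real implementation would hand off to the irqchip here. */
	return 0;
}
```

With this shape a driver never sees a raw "send any IPI" function; it can only trigger what it reserved.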

Thanks,

tglx


Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT

2015-08-24 Thread Konstantin Khlebnikov
On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
 On Mon, 24 Aug 2015, Vlastimil Babka wrote:

 On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
 On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
 On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:
 
 
 I am in the middle of implementing lock on fault this way, but I cannot
 see how we will handle mremap of a lock on fault region.  Say we have
 the following:
 
   addr = mmap(len, MAP_ANONYMOUS, ...);
   mlock(addr, len, MLOCK_ONFAULT);
   ...
   mremap(addr, len, 2 * len, ...)
 
 There is no way for mremap to know that the area being remapped was lock
 on fault so it will be locked and prefaulted by remap.  How can we avoid
 this without tracking per vma if it was locked with lock or lock on
 fault?
 
 
 remap can count filled ptes and prefault only completely populated areas.
 
 
 Does (and should) mremap really prefault non-present pages? Shouldn't it
 just prepare the page tables and that's it?
 
 As I see mremap prefaults pages when it extends mlocked area.
 
 Also quote from manpage
 : If  the memory segment specified by old_address and old_size is locked
 : (using mlock(2) or similar), then this lock is maintained when the segment is
 : resized and/or relocated.  As a  consequence, the amount of memory locked
 : by the process may change.

 Oh, right... Well that looks like a convincing argument for having a
 sticky VM_LOCKONFAULT after all. Having mremap guess by scanning
 existing pte's would slow it down, and be unreliable (was the area
 completely populated because MLOCK_ONFAULT was not used or because
 the process faulted it already? Was it not populated because
 MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to
 populate it all?).

 Given this, I am going to stop working in v8 and leave the vma flag in
 place.


 The only sane alternative is to populate always for mremap() of
 VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information
 as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not
 be enough for Eric's usecase, but it's somewhat ugly.


 I don't think that this is the right solution, I would be really
 surprised as a user if an area I locked with MLOCK_ONFAULT was then
 fully locked and prepopulated after mremap().

If mremap is the only problem then we can add opposite flag for it:

MREMAP_NOPOPULATE
- do not populate new segment of locked areas
- do not copy normal areas if possible (anonymous/special must be copied)

addr = mmap(len, MAP_ANONYMOUS, ...);
mlock(addr, len, MLOCK_ONFAULT);
...
addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
...


 
 There might be a problem after failed populate: remap will handle them
 as lock on fault. In this case we can fill ptes with swap-like non-present
 entries to remember that fact and count them as should-be-locked pages.
 
 
 I don't think we should strive to have mremap try to fix the inherent
 unreliability of mmap (MAP_POPULATE)?
 
 I don't think so. MAP_POPULATE works only when mmap happens.
 Flag MREMAP_POPULATE might be a good idea. Just for symmetry.

 Maybe, but please do it as a separate series.

 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .
 Don't email: <a href=mailto:"d...@kvack.org"> em...@kvack.org </a>


Re: [PATCHv3 4/5] mm: make compound_head() robust

2015-08-24 Thread Vlastimil Babka

On 08/21/2015 09:34 PM, Andrew Morton wrote:

On Fri, 21 Aug 2015 22:31:09 +0300 Kirill A. Shutemov kir...@shutemov.name wrote:


On Fri, Aug 21, 2015 at 11:11:27AM -0500, Christoph Lameter wrote:

On Fri, 21 Aug 2015, Kirill A. Shutemov wrote:


Is this really true?  For example if it's a slab page, will that page
ever be inspected by code which is looking for the PageTail bit?


+Christoph.

What we know for sure is that space is not used in tail pages, otherwise
it would collide with current compound_dtor.


Sl*b allocators only do a virt_to_head_page on tail pages.


The question was whether it's safe to assume that the bit 0 is always zero
in the word as this bit will encode PageTail().


That wasn't my question actually...

What I'm wondering is: if this page is being used for slab, will any
code path ever run PageTail() against it?  If not, we don't need to be
concerned about that bit.


Pfn scanners such as compaction might inspect such pages and run 
compound_head() (and thus PageTail) on them. I think no kind of page 
within a zone (slab or otherwise) is protected from this, which is why 
it needs to be robust.
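
A minimal user-space sketch of the encoding under discussion, assuming the scheme described in this thread: a tail page stores a pointer to its head with bit 0 set, so bit 0 doubles as PageTail(). The struct layout and helper names below are simplified stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct page: one word shared with other users. */
struct page {
	uintptr_t compound_head;   /* bit 0 set => tail page, rest => head */
};

/* Mark 'tail' as a tail page of 'head'. */
static void set_compound_head(struct page *tail, struct page *head)
{
	/* The scheme only works if pointers never have bit 0 set. */
	assert(((uintptr_t)head & 1) == 0);
	tail->compound_head = (uintptr_t)head | 1;
}

/* PageTail() reduces to a test of bit 0 of the shared word. */
static bool page_tail(const struct page *page)
{
	return page->compound_head & 1;
}

/* Safe to call on any page: non-tail pages are returned unchanged. */
static struct page *compound_head(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct page *)(head - 1);
	return page;
}
```

This is why every other user of the same word has to keep bit 0 clear: a pfn scanner may call compound_head() on any page, and a stray set bit would be misread as a head pointer.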



And slab was just the example I chose.  The same question pertains to
all other uses of that union.





Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info

2015-08-24 Thread John Stoffel

George> John Stoffel j...@stoffel.org wrote:
 vmap_info_gen should be initialized to 1 to force an initial
 cache update.

 Blech, it should be initialized with a proper #define
 VMAP_CACHE_NEEDS_UPDATE 1, instead of more magic numbers.

George> Er... this is a joke, right?

Not really.  The comment made before was that by setting this variable
to zero, it wasn't properly initialized.  Which implies that either
the API is wrong... or we should be documenting it better.   I just
went in the direction of the #define instead of a comment. 

George> First, this number is used exactly once, and it's not part of
George> a collection of similar numbers.  And the definition would be
George> adjacent to the use.

George> We have easier ways of accomplishing that, called comments.

Sure, that would be the better solution in this case.  

George> Second, your proposed name is misleading.  "needs update" is defined
George> as vmap_info_gen != vmap_info_cache_gen.  There is no particular value
George> of either that has this meaning.

George> For example, initializing vmap_info_cache_gen to -1 would do just as well.
George> (I actually considered that before deciding that +1 was simpler than -1.)

See, I just threw out a dumb suggestion without reading the patch
properly.  My fault.
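
For readers who have not seen the patch, a minimal single-threaded sketch of the generation-counter idea George describes: the cache is stale whenever the two counters differ, so no particular value means "needs update". The struct, helper names and the slow_compute() stand-in are assumptions for illustration; the real patch also has to deal with locking, which is left out here.

```c
#include <stdbool.h>

struct vmap_cache {
	unsigned int gen;         /* bumped on every change to the data */
	unsigned int cache_gen;   /* generation the cached value belongs to */
	unsigned long cached;     /* the cached result */
};

/* gen = 1 with cache_gen = 0 forces a refresh on the first lookup. */
#define VMAP_CACHE_INIT { .gen = 1, .cache_gen = 0, .cached = 0 }

/* "Needs update" is a relation between the counters, not a magic value. */
static bool cache_needs_update(const struct vmap_cache *c)
{
	return c->gen != c->cache_gen;
}

/* Called by whoever modifies the underlying data. */
static void cache_invalidate(struct vmap_cache *c)
{
	c->gen++;
}

/* Return the cached value, recomputing it only when it is stale. */
static unsigned long cache_get(struct vmap_cache *c,
			       unsigned long (*compute)(void))
{
	if (cache_needs_update(c)) {
		unsigned int gen = c->gen;

		c->cached = compute();
		c->cache_gen = gen;
	}
	return c->cached;
}

/* Stand-in for the expensive walk; counts how often it actually runs. */
static unsigned long recompute_count;

static unsigned long slow_compute(void)
{
	return ++recompute_count;
}
```

Initializing cache_gen to (unsigned int)-1 and gen to 0 would work just as well, which is George's point about the proposed #define name.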

George> (John, my apologies if I went over the top and am contributing to LKML's
George> reputation for flaming.  I *did* actually laugh, and *do* think it's a
George> dumb idea, but my annoyance is really directed at unpleasant memories of
George> mindless application of coding style guidelines.  In this case, I suspect
George> you just posted before reading carefully enough to see the subtle logic.)

Nope, I'm in the wrong here.  And your comment here is wonderful, I
really do appreciate how you handled my ham fisted attempt to
contribute.  But I've got thick skin and I'll keep trying in my free
time to comment on patches when I can.

John


Re: [RFC] sdhci: fix DMA leaks [was: [SHDCI] Heavy (thousands) DMA leaks]

2015-08-24 Thread Laura Abbott

On 08/06/2015 02:17 AM, Chen Bough wrote:

I will format a patch based on your diff file first. I will test this on my side,
and if there is any issue, like a dma issue or a performance issue, I will add some modification.
Then I will send the patch for review, and you can test the patch on your platform.

Best Regards
Haibo Chen



Did I miss the follow up patch or is this still pending? If it's still pending,
would you mind Ccing me when it's available for testing?

Thanks,
Laura
 



-----Original Message-----
From: Jiri Slaby [mailto:jsl...@suse.cz]
Sent: Thursday, August 06, 2015 5:07 PM
To: Chen Haibo-B51421; Ulf Hansson
Cc: linux-...@vger.kernel.org; Linux kernel mailing list
Subject: Re: [RFC] sdhci: fix DMA leaks [was: [SHDCI] Heavy (thousands)
DMA leaks]

On 08/06/2015, 09:42 AM, Chen Bough wrote:

I read your attached log and patch, yes, dma memory leak will happen
when more than one pre_request execute. The method of ++next->cookie
is not good, your patch seems good, but I still need some time to test
the patch, because you unmap the dma in sdhci_finish_data rather than
the sdhci_post_req.

Hi,

yes, this is not correct. We can perhaps differentiate according to the
COOKIE value. Should I fix it or are you going to prepare a patch based
on my RFC?

thanks,
--
js
suse labs




Re: [PATCH linux-next v4 3/5] mtd: spi-nor: allow to tune the number of dummy cycles

2015-08-24 Thread Cyrille Pitchen
Hi Marek,

Le 24/08/2015 12:48, Marek Vasut a écrit :
 On Monday, August 24, 2015 at 12:13:58 PM, Cyrille Pitchen wrote:
 The number of dummy cycles used during Fast Read commands can be reduced
 to improve transfer performances. Each manufacturer has a dedicated set of
 registers to provide the memory with the exact number of dummy cycles it
 should expect. Both the memory and the (Q)SPI controller must agree on
 this number of dummy cycles.

 The number of dummy cycles can be found into the memory datasheet and
 mostly depends on the SPI clock frequency, the Fast Read op code and the
 Single/Dual Data Rate mode.

 Probing JEDEC Serial Flash Discoverable Parameters (SFDP) tables would
 only provide the driver with a high enough number of dummy cycles for each
 Fast Read command to be used for all clock frequencies: this solution
 would not be optimized.

 Signed-off-by: Cyrille Pitchen cyrille.pitc...@atmel.com
 
 Hi!
 
  drivers/mtd/spi-nor/spi-nor.c | 97 ++-
  include/linux/mtd/spi-nor.h   |  2 +
  2 files changed, 80 insertions(+), 19 deletions(-)

 diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
 index e2a6029dc056..869e098a6841 100644
 --- a/drivers/mtd/spi-nor/spi-nor.c
 +++ b/drivers/mtd/spi-nor/spi-nor.c
 @@ -119,24 +119,6 @@ static int read_cr(struct spi_nor *nor)
  }

  /*
 - * Dummy Cycle calculation for different type of read.
 - * It can be used to support more commands with
 - * different dummy cycle requirements.
 - */
 -static inline int spi_nor_read_dummy_cycles(struct spi_nor *nor)
 -{
 -switch (nor->flash_read) {
 -case SPI_NOR_FAST:
 -case SPI_NOR_DUAL:
 -case SPI_NOR_QUAD:
 -return 8;
 -case SPI_NOR_NORMAL:
 -return 0;
 -}
 -return 0;
 -}
 
 You can probably just soup up this function so that it sets the
 nor->read_dummy, no ?


Actually, this is what the patch does: spi_nor_read_dummy_cycles() was reused
and enhanced a few lines below, where you pointed out that the
switch (nor->flash_read) block should be moved after the else block.

I think when I wrote the code I chose to move the definition of this
function instead of adding forward declarations of functions such as read_cr()
or write_sr_cr(), which are now called by micron_set_dummy_cycles().
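
As a side note for readers, the register update done by micron_set_dummy_cycles() in the quoted patch is a plain read-modify-write of bits 7:4 (the GENMASK(7, 4) in the diff). A minimal stand-alone sketch of just that bit manipulation follows; the macro and function names are made up for the example and the actual register I/O is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 7:4 hold the dummy-cycle count, as in the quoted GENMASK(7, 4). */
#define VCR_DUMMY_SHIFT 4
#define VCR_DUMMY_MASK  (0xFu << VCR_DUMMY_SHIFT)

/*
 * Return 'vcr' with its dummy-cycle field replaced by 'cycles',
 * leaving the low four bits untouched.
 */
static uint8_t vcr_set_dummy_cycles(uint8_t vcr, unsigned int cycles)
{
	assert(cycles <= 0xF);   /* only four bits are available */
	vcr &= (uint8_t)~VCR_DUMMY_MASK;
	vcr |= (uint8_t)((cycles << VCR_DUMMY_SHIFT) & VCR_DUMMY_MASK);
	return vcr;
}
```

Clearing the field before OR-ing in the new value is what keeps stale bits of the previous setting from leaking into the result.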

 -/*
   * Write status register 1 byte
   * Returns negative if error occurred.
   */
 @@ -1012,6 +994,81 @@ static int set_quad_mode(struct spi_nor *nor, struct flash_info *info)
  }
  }

 +static int micron_set_dummy_cycles(struct spi_nor *nor)
 +{
 +int ret;
 +u8 val, mask;
 +
 +/* read the Volatile Configuration Register (VCR) */
 
 NIT: If this is a sentence, start it with capital letter and end it with fullstop :)
 

done for the next version

 +ret = nor->read_reg(nor, SPINOR_OP_RD_VCR, &val, 1);
 +if (ret < 0) {
 +dev_err(nor->dev, "error %d reading VCR\n", ret);
 +return ret;
 +}
 +
 +write_enable(nor);
 +
 +/* update the number of dummy into the VCR */
 
 DTTO
 

done for the next version

 +mask = GENMASK(7, 4);
 +val &= ~mask;
 +val |= (nor->read_dummy << 4) & mask;
 +ret = nor->write_reg(nor, SPINOR_OP_WR_VCR, &val, 1, 0);
 +if (ret < 0) {
 +dev_err(nor->dev, "error while writing VCR register\n");
 +return ret;
 +}
 +
 +ret = spi_nor_wait_till_ready(nor);
 +if (ret)
 +return ret;
 +
 +return 0;
 +}
 +
 +/*
 + * Dummy Cycle calculation for different type of read.
 + * It can be used to support more commands with
 + * different dummy cycle requirements.
 + */
 +static int spi_nor_read_dummy_cycles(struct spi_nor *nor,
 + const struct flash_info *info)
 +{
 +struct device_node *np = nor->dev->of_node;
 +u32 num_dummy_cycles;
 +
 +if (np && !of_property_read_u32(np, "m25p,num-dummy-cycles",
 +&num_dummy_cycles)) {
 +nor->read_dummy = num_dummy_cycles;
 +
 +/*
 + * This switch block might be moved after the if...then...else
 + * statement but it was not tested with all Spansion or Micron
 + * memories.
 + * Now the m25p,num-dummy-cycles property needs to be
 + * explicitly set in the device tree so the switch statement is
 + * executed. This should avoid unwanted side effects and keep
 + * backward compatibility.
 + */
 +switch (JEDEC_MFR(info)) {
 +case CFI_MFR_ST:
 +return micron_set_dummy_cycles(nor);
 +default:
 
 If you do have m25p,num-dummy-cycles set for non-micron flash, you have a 
 problem here I believe.
 
 +break;
 +}
 +} else {
 
 The solution would be to drop this else {} bit here, so that if you fail in
 the DT-based configuration, you fall back to this old behavior. What do you 
 think please ? :)
 

Good idea!
I 

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Austin S Hemmelgarn

On 2015-08-22 14:29, Tejun Heo wrote:

Hello, Paul.

On Fri, Aug 21, 2015 at 12:26:30PM -0700, Paul Turner wrote:
...

A very concrete example of the above is a virtual machine in which you
want to guarantee scheduling for the vCPU threads which must schedule
beside many hypervisor support threads.   A hierarchy is the only way
to fix the ratio at which these compete.


Just to learn more, what sort of hypervisor support threads are we
talking about?  They would have to consume considerable amount of cpu
cycles for problems like this to be relevant and be dynamic in numbers
in a way which letting them competing against vcpus makes sense.  Do
IO helpers meet these criteria?

Depending on the configuration, yes they can.  VirtualBox has some 
rather CPU intensive threads that aren't vCPU threads (their emulated 
APIC thread immediately comes to mind), and so does QEMU depending on 
the emulated hardware configuration (it gets more noticeable when the 
disk images are stored on a SAN and served through iSCSI, NBD, FCoE, or 
ATAoE, which is pretty typical usage for large virtualization 
deployments).  I've seen cases first hand where the vCPU's can make no 
reasonable progress because they are constantly getting crowded out by 
other threads.


The use of the term 'hypervisor support threads' for this is probably 
not the best way of describing the contention, as it's almost always a 
full system virtualization issue, and the contending threads are usually 
storage back-end access threads.


I would argue that there are better ways to deal properly with this 
(Isolate the non vCPU threads on separate physical CPU's from the 
hardware emulation threads), but such methods require large systems to 
be practical at any scale, and many people don't have the budget for 
such large systems, and this way of doing things is much more flexible 
for small scale use cases (for example, someone running one or two VM's 
on a laptop under QEMU or VirtualBox).






Re: [PATCH v2 5/5] arm64: add KASan support

2015-08-24 Thread Andrey Ryabinin
2015-08-24 19:16 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:
 On 24/08/15 17:00, Andrey Ryabinin wrote:
 2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:

 Another option would be having sparse shadow memory based on page
 extension. I did play with that some time ago based on ideas from
 original v1 KASan support for x86/arm - it is how 614be38 "irqchip:
 gic-v3: Fix out of bounds access to cpu_logical_map" was caught.
 It doesn't require any VA reservations, only some contiguous memory for
 the page_ext itself, which serves as indirection level for the 0-order
 shadow pages.

 We won't be able to use inline instrumentation (I could live with that),
 and most importantly, we won't be able to use stack instrumentation.
 GCC needs to know shadow address for inline and/or stack instrumentation
 to generate correct code.

 It's definitely a trade-off ;)

 Just for my understanding, is that stack instrumentation controlled
 via -asan-stack?


Yup.


Re: [PATCH v3 4/6] ARCv2: perf: implement exclusion of event counting in user or kernel mode

2015-08-24 Thread Vineet Gupta
On Monday 24 August 2015 08:00 PM, Vineet Gupta wrote:
 On Monday 24 August 2015 07:50 PM, Alexey Brodkin wrote:
  Cc: Peter Zijlstra pet...@infradead.org
  Cc: Arnaldo Carvalho de Melo a...@kernel.org
  Signed-off-by: Alexey Brodkin abrod...@synopsys.com
  ---
 
  No changes since v2.
 
  No changes since v1.
 
  
 }
   
  +  hwc->config = 0;
  +
  +  if (is_isa_arcv2()) {
  +  /* exclude user means count only kernel */
  +  if (event->attr.exclude_user)
  +  hwc->config |= ARC_REG_PCT_CONFIG_KERN;
  +
  +  /* exclude kernel means count only user */
  +  if (event->attr.exclude_kernel)
  +  hwc->config |= ARC_REG_PCT_CONFIG_USER;
  +  }
  +
 switch (event->attr.type) {
 case PERF_TYPE_HARDWARE:
 if (event->attr.config >= PERF_COUNT_HW_MAX)
 return -ENOENT;
 if (arc_pmu->ev_hw_idx[event->attr.config] < 0)
 return -ENOENT;
  -  hwc->config = arc_pmu->ev_hw_idx[event->attr.config];
  +  hwc->config |= arc_pmu->ev_hw_idx[event->attr.config];
 With raw events patch dropped - this hunk need not be present.

Please ignore this stupid comment - this was written when I was presumably
smoking pot !

 
 pr_debug("init event %d with h/w %d \'%s\'\n",
  (int) event->attr.config, (int) hwc->config,
  arc_pmu_ev_hw_map[event->attr.config]);
  @@ -163,7 +175,7 @@ static int arc_pmu_event_init(struct perf_event *event)
 ret = arc_pmu_cache_event(event->attr.config);
 if (ret < 0)
 return ret;
  -  hwc->config = arc_pmu->ev_hw_idx[ret];
  +  hwc->config |= arc_pmu->ev_hw_idx[ret];
 return 0;
 default:
 return -ENOENT;



Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method

2015-08-24 Thread Lee Jones
On Mon, 24 Aug 2015, Vaibhav Hiremath wrote:

 
 
 On Monday 24 August 2015 07:24 PM, Lee Jones wrote:
 On Wed, 08 Jul 2015, Vaibhav Hiremath wrote:
 
 As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
 (page 0) controls the method of clearing interrupt
 status of 88pm800 family of devices;
 
0: clear on read
1: clear on write
 
 If pdata is not coming from board file, then set the
 default irq clear method to irq clear on write
 
 Also, as suggested by Lee Jones renaming variable field
 to appropriate name and removed unnecessary field
 pm80x_chip.irq_mode, using platform_data.irq_clr_method.
 
 Signed-off-by: Zhao Ye zh...@marvell.com
 Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
 Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com
 ---
   drivers/mfd/88pm800.c   | 15 ++-
   include/linux/mfd/88pm80x.h |  9 +++--
   2 files changed, 17 insertions(+), 7 deletions(-)
 
 [...]
 
 +#define PM800_WAKEUP2_INT_READ_CLEAR   (0 << 1)
 +#define PM800_WAKEUP2_INT_WRITE_CLEAR  (1 << 1)
 
 Use BIT().
 
 +/* Used by irq_clr_method */
 +#define PM800_IRQ_CLR_ON_READ  0
 +#define PM800_IRQ_CLR_ON_WRITE 1
 
 -   int irq_mode;   /* Clear interrupt by read/write(0/1) */
 +   bool irq_clr_method;/* Clear interrupt by read/write(0/1) */
 
 +   irq_clr_mode = pdata->irq_clr_method == PM800_IRQ_CLR_ON_WRITE ?
 +   PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
 +   ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);
 
 This is pretty convoluted.
 
 For starters you're abusing the 'bool' type here.  Bool is either
 'true' or 'false', so at the very least you should rename
 'irq_clr_method' to 'irq_clr_on_write'.
 
 Then you can do:
 
  irq_clr_mode = pdata->irq_clr_on_write ?
  PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
 
 
 We have discussed on this, and went back-n-forth.
 I think if I remember correctly, one of the version was using
 true/false then we decided to rename it to relevant macro.
 
 If I am not wrong V4 version of this series is exactly same as what you
 are referring to.

Right.  I made a few suggestions which vary in usefulness depending on
how you plan to implement all of this.  Unfortunately this is a bit of
a bastardised version where some of it makes sense and other parts
could do with some improvement.

 However, what I suggest you really do is share
 PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass
 the value through directly.
 
 
 I think we discussed about this also, and the reason I recall here is,
 
 we may need to control this from DT in the future so we decided to keep
 it boolean in platform_data and have simple check before writing to
 register.
 
 And I think that was also another reason we introduced
 
 /* Used by irq_clr_method */
 #define PM800_IRQ_CLR_ON_READ   0
 #define PM800_IRQ_CLR_ON_WRITE  1

I think these are still required.  So it would look like this:

== Platform data ==

struct pdata {
  bool clear_irq_on_write;
};

pdata->clear_irq_on_write = PM800_IRQ_CLR_ON_{READ,WRITE};

== Driver ==

irq_clr_mode = pdata->clear_irq_on_write ?
 PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);
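
The outline above can be tried in user space with a small stand-in for the register access. This is only a sketch: BIT() is redefined locally, update_bits() merely mimics the masked-update behaviour of regmap_update_bits() on a plain byte, and the INT_* names and apply_irq_clear_mode() are invented for the example. The bit position follows the spec text quoted earlier (bit 1 selects the clear method).

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))             /* local stand-in for the kernel macro */

#define INT_CLEAR_MODE_MASK  BIT(1)    /* bit 1: 0 = clear on read, 1 = on write */
#define INT_READ_CLEAR       0u
#define INT_WRITE_CLEAR      BIT(1)

struct pdata {
	bool clear_irq_on_write;
};

/* Stand-in for a masked register update: touch only the bits in 'mask'. */
static uint8_t update_bits(uint8_t reg, uint8_t mask, uint8_t val)
{
	return (uint8_t)((reg & ~mask) | (val & mask));
}

/* Translate the boolean from platform data into the register field. */
static uint8_t apply_irq_clear_mode(uint8_t reg, const struct pdata *pdata)
{
	uint8_t mode = (uint8_t)(pdata->clear_irq_on_write ?
				 INT_WRITE_CLEAR : INT_READ_CLEAR);

	return update_bits(reg, INT_CLEAR_MODE_MASK, mode);
}
```

Naming the field clear_irq_on_write keeps the bool honest: true and false read naturally at the call site, and the translation to register bits happens in exactly one place.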

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT

2015-08-24 Thread Eric B Munson
On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote:

 On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
  On Mon, 24 Aug 2015, Vlastimil Babka wrote:
 
  On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
  On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
  On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:
  
  
  I am in the middle of implementing lock on fault this way, but I cannot
  see how we will handle mremap of a lock on fault region.  Say we have
  the following:
  
addr = mmap(len, MAP_ANONYMOUS, ...);
mlock(addr, len, MLOCK_ONFAULT);
...
mremap(addr, len, 2 * len, ...)
  
  There is no way for mremap to know that the area being remapped was lock
  on fault so it will be locked and prefaulted by remap.  How can we avoid
  this without tracking per vma if it was locked with lock or lock on
  fault?
  
  
  remap can count filled ptes and prefault only completely populated areas.
  
  
  Does (and should) mremap really prefault non-present pages? Shouldn't it
  just prepare the page tables and that's it?
  
  As I see mremap prefaults pages when it extends mlocked area.
  
  Also quote from manpage
  : If  the memory segment specified by old_address and old_size is locked
  : (using mlock(2) or similar), then this lock is maintained when the segment is
  : resized and/or relocated.  As a  consequence, the amount of memory locked
  : by the process may change.
 
  Oh, right... Well that looks like a convincing argument for having a
  sticky VM_LOCKONFAULT after all. Having mremap guess by scanning
  existing pte's would slow it down, and be unreliable (was the area
  completely populated because MLOCK_ONFAULT was not used or because
  the process faulted it already? Was it not populated because
  MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to
  populate it all?).
 
  Given this, I am going to stop working in v8 and leave the vma flag in
  place.
 
 
  The only sane alternative is to populate always for mremap() of
  VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information
  as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not
  be enough for Eric's usecase, but it's somewhat ugly.
 
 
  I don't think that this is the right solution, I would be really
  surprised as a user if an area I locked with MLOCK_ONFAULT was then
  fully locked and prepopulated after mremap().
 
 If mremap is the only problem then we can add opposite flag for it:
 
 MREMAP_NOPOPULATE
 - do not populate new segment of locked areas
 - do not copy normal areas if possible (anonymous/special must be copied)
 
 addr = mmap(len, MAP_ANONYMOUS, ...);
 mlock(addr, len, MLOCK_ONFAULT);
 ...
 addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
 ...
 

But with this, the user must remember which areas are locked with
MLOCK_ONFAULT and which are locked with prepopulate so the
correct mremap flags can be used.
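
For comparison, a minimal sketch of what the sticky per-vma flag buys: the kernel can decide on its own whether a grown region should be prefaulted, so the caller does not have to remember anything. The flag values and the helper name are illustrative assumptions, not the kernel's definitions, and the sketch treats VM_LOCKONFAULT as a modifier that accompanies VM_LOCKED.

```c
#include <stdbool.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define VM_LOCKED       0x1u
#define VM_LOCKONFAULT  0x2u

/*
 * Decide whether the new part of an mremap()ed region should be
 * prefaulted, based purely on what the vma remembers.
 */
static bool mremap_should_populate(unsigned int vm_flags)
{
	if (!(vm_flags & VM_LOCKED))
		return false;                   /* not locked at all */
	return !(vm_flags & VM_LOCKONFAULT);    /* lock-on-fault: leave unpopulated */
}
```

With the flag stored in the vma, the existing mremap() call keeps working unchanged for both kinds of locked area.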





Re: [PATCH 7/7] ipmi/kcs: Don't run the KCS state machine when it is KCS_IDLE

2015-08-24 Thread Corey Minyard
On 08/23/2015 08:52 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
 From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard

 On 08/17/2015 09:54 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
 From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard

 This patch will break ATN handling on the interfaces.  So we can't do this.
 I understand.  So how about doing like this:

 /* All states wait for ibf, so just do it here. */
 -   if (!check_ibf(kcs, status, time))
 +   if (kcs->state != KCS_IDLE && !check_ibf(kcs, status, time))
 return SI_SM_CALL_WITH_DELAY;

 I think it is not necessary to wait IBF when the state is IDLE.
 In this way, we can also handle the ATN case.
 I think it would be more reliable to go up a level and add a timeout.
 It may be so, but we should address this issue separately (at least
 I think above solution reasonably solves the issue).

 This issue happens after all queued messages are processed or dropped
 by timeout.  There is no current message.  So what should we set
 a timeout against?  We can add a timeout into my new flush_messages(),
 but that is meaningful only in panic context.  That doesn't help
 in normal context; we would perform a busy loop of smi_event_handler()
 and schedule() in ipmi_thread().

I'm a little confused here.  Is the problem that the ATN bit is stuck
high?  If so, it's going to be really hard to work around this without
breaking ATN handling.
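
To make the two behaviours easier to compare, here is a stripped-down model of the check proposed above (skip the IBF wait only while idle). It is a sketch, not the driver code: the state and result enums are reduced to what the example needs, and the status bit values are assumptions chosen for the illustration.

```c
/* Reduced model of the KCS state machine; not the driver's definitions. */
enum kcs_state { KCS_IDLE, KCS_WAIT_WRITE };
enum sm_result { SM_CALL_WITH_DELAY, SM_IDLE, SM_ATTN, SM_CONTINUE };

#define STATUS_IBF 0x02   /* input buffer full (assumed bit position) */
#define STATUS_ATN 0x04   /* BMC requests attention (assumed bit position) */

static enum sm_result kcs_step(enum kcs_state state, unsigned char status)
{
	/* Proposed change: only wait for IBF to clear when not idle. */
	if (state != KCS_IDLE && (status & STATUS_IBF))
		return SM_CALL_WITH_DELAY;

	/* While idle, ATN is still noticed even if IBF is stuck high. */
	if (state == KCS_IDLE)
		return (status & STATUS_ATN) ? SM_ATTN : SM_IDLE;

	return SM_CONTINUE;
}
```

In this model a stuck IBF no longer keeps an idle state machine spinning, while a stuck ATN would still be reported on every call, which is the case asked about above.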

-corey


 Regards,

 Hidehiro Kawai

 One should
 be there, anyway.  I thought they were all covered, but I may have missed
 something.

 -corey

 Regards,

 Hidehiro Kawai
 Hitachi, Ltd. Research & Development Group

 It's going to be extremely hard to recover if the BMC is not working
 correctly when a panic happens.  I'm not sure what can be done, but if
 you can fix it another way it would be good.

 -corey

 On 07/27/2015 12:55 AM, Hidehiro Kawai wrote:
 If a BMC is unresponsive for some reason, it ends up completing
 the requested message as an error, then kcs_event() is called once
 to advance the state machine.  However, since the BMC is
 unresponsive now, the status of the KCS interface may not be
 idle.  As the result, the state machine can continue to run and
 consume CPU time indefinitely even if there is no more request
 message.  Moreover, if this happens in run-to-completion mode
 (i.e. context of panic_event()), the kernel hangs up.

 To fix this problem, this patch ignores kcs_event() call if there
 is no request message to be processed.

 Signed-off-by: Hidehiro Kawai hidehiro.kawai...@hitachi.com
 ---
  drivers/char/ipmi/ipmi_kcs_sm.c |4 
  1 file changed, 4 insertions(+)

 diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
 index 8c25f59..0e187fb 100644
 --- a/drivers/char/ipmi/ipmi_kcs_sm.c
 +++ b/drivers/char/ipmi/ipmi_kcs_sm.c
 @@ -353,6 +353,10 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
   if (kcs_debug & KCS_DEBUG_STATES)
   printk(KERN_DEBUG "KCS: State = %d, %x\n", kcs->state, status);

 + /* We don't want to run the state machine when the state is IDLE */
 + if (kcs->state == KCS_IDLE)
 + return SI_SM_IDLE;
 +
   /* All states wait for ibf, so just do it here. */
   if (!check_ibf(kcs, status, time))
   return SI_SM_CALL_WITH_DELAY;
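
For readers comparing the two variants discussed in this thread, here is a small userspace model, not driver code. All names, the status bit values, and the reduced state set are assumptions made for illustration; the real check_ibf() also does timeout handling that is left out here. It only shows why returning early on IDLE hides ATN, while skipping just the IBF wait does not.

```c
/* Simplified stand-ins for the driver's states and results (hypothetical). */
enum model_state { MODEL_IDLE, MODEL_BUSY };
enum model_result { RES_IDLE, RES_CALL_WITH_DELAY, RES_ATTN, RES_CONTINUE };

/* Status bits of the model; the values are assumptions for illustration. */
#define MODEL_STATUS_IBF 0x02u
#define MODEL_STATUS_ATN 0x04u

/* Common tail: what the machine reports once it is allowed to run. */
static enum model_result run_idle_or_continue(enum model_state st,
					      unsigned int status)
{
	if (st == MODEL_IDLE)
		return (status & MODEL_STATUS_ATN) ? RES_ATTN : RES_IDLE;
	return RES_CONTINUE;
}

/* Unpatched behaviour: every state, including idle, waits for IBF. */
static enum model_result event_unpatched(enum model_state st,
					 unsigned int status)
{
	if (status & MODEL_STATUS_IBF)
		return RES_CALL_WITH_DELAY;
	return run_idle_or_continue(st, status);
}

/* First patch: return early when idle, so ATN is never examined. */
static enum model_result event_early_return(enum model_state st,
					    unsigned int status)
{
	if (st == MODEL_IDLE)
		return RES_IDLE;
	if (status & MODEL_STATUS_IBF)
		return RES_CALL_WITH_DELAY;
	return RES_CONTINUE;
}

/* Second proposal: skip only the IBF wait when idle, keep the ATN check. */
static enum model_result event_skip_ibf_when_idle(enum model_state st,
						  unsigned int status)
{
	if (st != MODEL_IDLE && (status & MODEL_STATUS_IBF))
		return RES_CALL_WITH_DELAY;
	return run_idle_or_continue(st, status);
}
```

In this model a stuck IBF keeps the unpatched variant polling forever in the idle state, the early-return variant drops ATN, and the second proposal avoids both. Whether a stuck ATN bit is the actual failure is a separate question that the model does not answer.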



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 5/5] arm64: add KASan support

2015-08-24 Thread Andrey Ryabinin
2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:

 Another option would be having sparse shadow memory based on page
 extension. I did play with that some time ago based on ideas from
 original v1 KASan support for x86/arm - it is how 614be38 "irqchip:
 gic-v3: Fix out of bounds access to cpu_logical_map" was caught.
 It doesn't require any VA reservations, only some contiguous memory for
 the page_ext itself, which serves as indirection level for the 0-order
 shadow pages.

We won't be able to use inline instrumentation (I could live with that),
and most importantly, we won't be able to use stack instrumentation.
GCC needs to know shadow address for inline and/or stack instrumentation
to generate correct code.
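
To illustrate why the compiler needs a fixed shadow address, here is a minimal sketch of the address computation that inline instrumentation has to emit, assuming the usual KASan scheme of one shadow byte per 8 bytes of memory. The function and macro names are made up for this example, and the offset is passed as a parameter only so the sketch can be tested; with inline instrumentation it would be a build-time constant.

```c
#include <stdint.h>

/* One shadow byte covers 2^3 = 8 bytes of memory (assumed scale). */
#define MODEL_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: translate a memory address to its shadow address. */
static inline uint64_t model_mem_to_shadow(uint64_t addr, uint64_t shadow_offset)
{
	return (addr >> MODEL_SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With a page_ext based sparse shadow the second term is no longer a constant, which is why the check would have to go through an out-of-line call instead.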

 In theory such a design can be reused by other 32-bit arches and, I
 think, nommu too. Additionally, the shadow pages might be movable with
 help of driver-page migration patch series [1].
 The cost is obvious - performance drop, although I didn't bother
 measuring it.

 [1] https://lwn.net/Articles/650917/

 Cheers
 Vladimir



Re: [PATCH 3.12 00/82] 3.12.47-stable review

2015-08-24 Thread Guenter Roeck

On 08/24/2015 02:09 AM, Jiri Slaby wrote:

This is the start of the stable review cycle for the 3.12.47 release.
There are 82 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Aug 26 11:08:59 CEST 2015.
Anything received after that time might be too late.



Build results:
total: 124 pass: 124 fail: 0
Qemu test results:
total: 70 pass: 70 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



RE: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup sources

2015-08-24 Thread Shenwei Wang


 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: August 23, 2015 5:58
 To: Wang Shenwei-B38339
 Cc: shawn@linaro.org; ja...@lakedaemon.net;
 linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; Huang
 Yongcai-B20788
 Subject: Re: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup
 sources
 
 On Fri, 31 Jul 2015, Shenwei Wang wrote:
  +struct gpcv2_irqchip_data {
  +   struct raw_spinlock rlock;
  +   void __iomem *gpc_base;
  +   u32 wakeup_sources[IMR_NUM];
  +   u32 enabled_irqs[IMR_NUM];
  +   u32 cpu2wakeup;
 
 Can you please format that in a readable way?
 
   struct raw_spinlock    rlock;
   void __iomem           *gpc_base;
   

I did try to be careful about the format, but did not notice this one. Will 
change it in the new version.:)
 
  +};
  +
  +static struct gpcv2_irqchip_data *imx_gpcv2_instance;
  +
  +u32 imx_gpcv2_get_wakeup_source(u32 **sources) {
  +   if (!imx_gpcv2_instance)
  +   return 0;
  +
  +   if (sources)
  +   *sources = imx_gpcv2_instance->wakeup_sources;
  +
  +   return IMR_NUM;
  +}
  +
  +static int gpcv2_wakeup_source_save(void) {
  +   struct gpcv2_irqchip_data *cd;
  +   void __iomem *reg;
  +   int i;
  +
  +   cd = imx_gpcv2_instance;
  +   if (!cd)
  +   return 0;
  +
  +   for (i = 0; i < IMR_NUM; i++) {
  +   reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
  +   cd->enabled_irqs[i] = readl_relaxed(reg);
 
 You read the full state of the register and restore the full state. So why
 enabled_irqs?

There are two usage scenarios:
In CPU idle state, the system needs to be woken up by any enabled IRQ, not just
the ones marked as wakeup sources.
In suspend state, the system will only be woken up by IRQs marked as wakeup
sources.
enabled_irqs is used to save the register values before suspend and restore
them after resume.

  +   writel_relaxed(cd->wakeup_sources[i], reg);
  +   }
  +
  +   return 0;
  +}
  +
  +static void gpcv2_wakeup_source_restore(void) {
  +   struct gpcv2_irqchip_data *cd;
  +   void __iomem *reg;
  +   int i;
  +
  +   cd = imx_gpcv2_instance;
  +   if (!cd)
  +   return;
  +
  +   for (i = 0; i < IMR_NUM; i++) {
  +   reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
  +   writel_relaxed(cd->enabled_irqs[i], reg);
  +   cd->wakeup_sources[i] = ~0;
 
 Why are you clearing that info on resume? Drivers will clear that via
 set_wake() or leave it when they want to have resume functionality?
 
Each time the system enters the suspend state, set_wake(ON) is called again to
configure the wakeup sources. Clearing wakeup_sources here makes sure the
system works as expected whether or not a driver calls set_wake(OFF) during
the resume stage.
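
A minimal userspace sketch of the save/restore scheme described above, with plain arrays standing in for the hardware mask registers. The struct and function names and the register count of 4 are assumptions for this example; a set bit is taken to mean "masked", so an all-ones wakeup_sources word means no wakeup source is configured.

```c
#include <stdint.h>

#define MODEL_IMR_NUM 4	/* assumed number of mask registers */

struct model_gpc {
	uint32_t imr[MODEL_IMR_NUM];		/* stands in for the hardware IMRs */
	uint32_t wakeup_sources[MODEL_IMR_NUM];	/* mask to apply while suspended */
	uint32_t enabled_irqs[MODEL_IMR_NUM];	/* run-time mask saved on suspend */
};

/* Suspend: remember the run-time masks, then program the wakeup masks. */
static void model_suspend(struct model_gpc *g)
{
	int i;

	for (i = 0; i < MODEL_IMR_NUM; i++) {
		g->enabled_irqs[i] = g->imr[i];
		g->imr[i] = g->wakeup_sources[i];
	}
}

/* Resume: put the run-time masks back and forget the wakeup setup. */
static void model_resume(struct model_gpc *g)
{
	int i;

	for (i = 0; i < MODEL_IMR_NUM; i++) {
		g->imr[i] = g->enabled_irqs[i];
		g->wakeup_sources[i] = ~0u;
	}
}
```

The model makes the open question visible: after model_resume() the wakeup configuration is gone, so it only works if set_wake(ON) really is called again before every suspend.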

  +static int __init imx_gpcv2_irqchip_init(struct device_node *node,
  +  struct device_node *parent) {
  +   struct irq_domain *parent_domain, *domain;
  +   struct gpcv2_irqchip_data *cd;
  +   int i;
  +
  +   if (!parent) {
  +   pr_err("%s: no parent, giving up\n", node->full_name);
  +   return -ENODEV;
  +   }
  +
  +   parent_domain = irq_find_host(parent);
  +   if (!parent_domain) {
  +   pr_err("%s: unable to get parent domain\n", node->full_name);
  +   return -ENXIO;
  +   }
  +
  +   cd = kzalloc(sizeof(struct gpcv2_irqchip_data), GFP_KERNEL);
  +   BUG_ON(!cd);
 
 You return an error code for all other failures. Why BUG here?

Good point. To be consistent, I will change it to return an error code.
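
As a rough userspace sketch of that change, the allocation failure can be reported like the other failures in the init path. The names below are hypothetical, calloc() stands in for kzalloc(), and the allocator is passed in only so that the failure path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data. */
struct model_chip_data {
	unsigned int wakeup_sources[4];
};

typedef void *(*model_alloc_fn)(size_t size);

/* Return -ENOMEM on allocation failure instead of halting the system. */
static int model_chip_init(struct model_chip_data **out, model_alloc_fn alloc)
{
	struct model_chip_data *cd = alloc(sizeof(*cd));

	if (!cd)
		return -ENOMEM;

	*out = cd;
	return 0;
}

/* Allocators used to exercise both paths. */
static void *model_alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}

static void *model_alloc_ok(size_t size)
{
	return calloc(1, size);
}
```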

Thanks,
Shenwei
 
 Otherwise this looks very clean now. Can you please resend ASAP with these
 minor points addressed?
 
 Thanks,
 
   tglx
 



Re: [RESEND][PATCH 4/4] ARM: dts: keystone: Add ti,keystone-spi for SPI

2015-08-24 Thread santosh shilimkar

On 8/24/2015 6:36 AM, Franklin S Cooper Jr. wrote:

Hi Santosh,

All the patches except this one are in linux-next.


Yes I noticed it. I will queue this up for next merge window.
Thanks for reminder.

Regards,
Santosh


Re: [PATCH v2 5/5] arm64: add KASan support

2015-08-24 Thread Vladimir Murzin
On 24/08/15 17:00, Andrey Ryabinin wrote:
 2015-08-24 18:44 GMT+03:00 Vladimir Murzin vladimir.mur...@arm.com:

 Another option would be having sparse shadow memory based on page
 extension. I did play with that some time ago based on ideas from
 original v1 KASan support for x86/arm - it is how 614be38 "irqchip:
 gic-v3: Fix out of bounds access to cpu_logical_map" was caught.
 It doesn't require any VA reservations, only some contiguous memory for
 the page_ext itself, which serves as indirection level for the 0-order
 shadow pages.
 
 We won't be able to use inline instrumentation (I could live with that),
 and most importantly, we won't be able to use stack instrumentation.
 GCC needs to know shadow address for inline and/or stack instrumentation
 to generate correct code.

It's definitely a trade-off ;)

Just for my understanding: is stack instrumentation controlled
via -asan-stack?

Thanks
Vladimir

 
 In theory such a design can be reused by other 32-bit arches and, I
 think, nommu too. Additionally, the shadow pages might be movable with
 help of driver-page migration patch series [1].
 The cost is obvious - performance drop, although I didn't bother
 measuring it.

 [1] https://lwn.net/Articles/650917/

 Cheers
 Vladimir

 



Re: [PATCH v3 06/14] Documentation: drm/bridge: add document for analogix_dp

2015-08-24 Thread Heiko Stuebner
Am Montag, 24. August 2015, 09:48:27 schrieb Rob Herring:
 On Mon, Aug 24, 2015 at 7:57 AM, Russell King - ARM Linux
  When we adopted the graph bindings for iMX DRM, I thought exactly at that
  time it would be nice if this could become the standard for binding DRM
  components together but I don't have the authority from either the DT
  perspective or the DRM perspective to mandate that.  Neither does anyone
  else.  That's the _real_ problem here.
  
  I've seen several DRM bindings go by which don't use the of-graph stuff,
  which means that they'll never be compatible with generic components
  which do use the of-graph stuff.
 
 It goes beyond bindings IMO. The use of the component framework or not
 has been at the whim of driver writers as well. It is either used or
 private APIs are created. I'm using components and my need for it
 boils down to passing the struct drm_device pointer to the encoder.
 Other components like panels and bridges have different ways to attach
 to the DRM driver.

but that is then simply implementation specific. Panels and bridges can very 
well be part of and created from an of_graph description without needing to be 
a (linux-)component - see patch 7.


Re: [PATCH 2/2] ubifs: Allow O_DIRECT

2015-08-24 Thread Brian Norris
On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote:
 Now, some user-space fails when direct I/O is not supported.

I think the whole argument rested on what it means when some user space
fails; apparently that user space is just a test suite (which
can/should be fixed).

 We can
 choose to fake direct I/O or fix user-space. The latter seems to be the
 preferred course of action, and you are correctly pointing to the man
 page.
 
 However, if
 
 1. we are the only FS erroring out on O_DIRECT
 2. other file-systems not supporting direct IO just fake it
 
 we may just follow the crowd and fake it too.
 
 I am kind of trusting Richard here - I assume he did the research and
 the above is the case, this is why I am fine with his patch.
 
 Does this logic seem acceptable to you? Other folk's opinion would be
 great to hear.

Could work for me, though that doesn't seem ideal. Anyway, it now seems
Christopher and Richard agree with me.

Regards,
Brian


[PATCH v9] dmaengine: Add Xilinx AXI Direct Memory Access Engine driver support

2015-08-24 Thread Kedareswara rao Appana
This is the driver for the AXI Direct Memory Access (AXI DMA)
core, which is a soft Xilinx IP core that provides high-
bandwidth direct memory access between memory and AXI4-Stream
type target peripherals.

Signed-off-by: Kedareswara rao Appana appa...@xilinx.com
---
The device tree doc got applied in the slave-dmaengine.git.

Changes in v9:
- Used the readl_poll_timeout instead of do while loops
  in the driver as suggested by Moritz Fischer.
- Initialize the residue variable to get rid of a compilation warning.
Changes in v8:
- Updated the SG handling as suggested by Nicolae Rosia.
- Removed the unnecessary xilinx_dma_channel_set_config API, since the
  properties in this API are not used by the driver.
Changes in v7:
- Updated license in the driver as suggested by Paul.
- Corrected return value in is_idle function.
Changes in v6:
- Fixed odd indentation in the Kconfig.
- Used GFP_NOWAIT instead of GFP_KERNEL during the desc allocation.
- Calculated residue in the tx_status instead of complete_descriptor.
- Updated copyright to 2015.
- Modified spin_lock handling: moved the spin_lock to the appropriate functions
  (the xilinx_dma_issue_pending API takes it instead of xilinx_dma_start_transfer).
- Updated device_control and declared slave caps as per the newer APIs.
Changes in v5:
- Modified the xilinx_dma.h header file location to the 
  include/linux/dma/xilinx_dma.h
Changes in v4:
- Add direction field to DMA descriptor structure and removed from
  channel structure to avoid duplication.
- Check for DMA idle condition before changing the configuration.
- Residue is being calculated in complete_descriptor() and is reported
  to slave driver.
Changes in v3:
- Rebased on 3.16-rc7
Changes in v2:
- Simplified the logic to set SOP and APP words in prep_slave_sg().
- Corrected function description comments to match the return type.
- Fixed some minor comments as suggested by Andy.
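
For readers unfamiliar with the v9 change from open-coded do/while loops to a poll helper, here is a userspace sketch of the general pattern only. It bounds the wait by a poll count rather than by time, reads through a callback instead of an MMIO address, and all names in it are invented for this example; it is not the kernel's readl_poll_timeout() and not code from this driver.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t (*model_read_fn)(void *ctx);

/*
 * Read until any bit in 'mask' is set or 'max_polls' reads have been made.
 * The last value read is left in *val, as callers usually want to inspect it.
 */
static int model_poll_until_set(model_read_fn rd, void *ctx, uint32_t mask,
				unsigned int max_polls, uint32_t *val)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		*val = rd(ctx);
		if (*val & mask)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake register for demonstration: bit 0 reads as set from the Nth read on. */
struct model_fake_reg {
	unsigned int reads;
	unsigned int ready_after;
};

static uint32_t model_fake_read(void *ctx)
{
	struct model_fake_reg *r = ctx;

	r->reads++;
	return r->reads >= r->ready_after ? 1u : 0u;
}
```

Keeping the bound and the error code in one helper is the point of the change: each call site states only the condition it waits for.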
---
 drivers/dma/Kconfig |   13 +
 drivers/dma/xilinx/Makefile |1 +
 drivers/dma/xilinx/xilinx_dma.c | 1178 +++
 3 files changed, 1192 insertions(+)
 create mode 100644 drivers/dma/xilinx/xilinx_dma.c

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 88d474b..5e95f07 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -507,4 +507,17 @@ config QCOM_BAM_DMA
  Enable support for the QCOM BAM DMA controller.  This controller
  provides DMA capabilities for a variety of on-chip devices.
 
+config XILINX_DMA
+tristate "Xilinx AXI DMA Engine"
+depends on (ARCH_ZYNQ || MICROBLAZE)
+select DMA_ENGINE
+help
+  Enable support for Xilinx AXI DMA Soft IP.
+
+  This engine provides high-bandwidth direct memory access
+  between memory and AXI4-Stream type target peripherals.
+  It has two stream interfaces/channels, Memory Mapped to
+  Stream (MM2S) and Stream to Memory Mapped (S2MM) for the
+  data transfers.
+
 endif
diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
index 3c4e9f2..6224a49 100644
--- a/drivers/dma/xilinx/Makefile
+++ b/drivers/dma/xilinx/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_XILINX_VDMA) += xilinx_vdma.o
+obj-$(CONFIG_XILINX_DMA) += xilinx_dma.o
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
new file mode 100644
index 000..d19009e
--- /dev/null
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -0,0 +1,1178 @@
+/*
+ * DMA driver for Xilinx DMA Engine
+ *
+ * Copyright (C) 2010 - 2015 Xilinx, Inc. All rights reserved.
+ *
+ * Based on the Freescale DMA driver.
+ *
+ * Description:
+ *  The AXI DMA, is a soft IP, which provides high-bandwidth Direct Memory
+ *  Access between memory and AXI4-Stream-type target peripherals. It can be
+ *  configured to have one channel or two channels and if configured as two
+ *  channels, one is to transmit data from memory to a device and another is
+ *  to receive from a device.
+ *
+ * This is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/bitops.h>
+#include <linux/dma/xilinx_dma.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_dma.h>
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+#include <linux/slab.h>
+
+#include "../dmaengine.h"
+
+/* Register Offsets */
+#define XILINX_DMA_REG_CONTROL 0x00
+#define XILINX_DMA_REG_STATUS  0x04
+#define XILINX_DMA_REG_CURDESC 0x08
+#define XILINX_DMA_REG_TAILDESC 0x10
+#define XILINX_DMA_REG_SRCADDR 0x18
+#define XILINX_DMA_REG_DSTADDR 0x20
+#define XILINX_DMA_REG_BTT 0x28
+
+/* Channel/Descriptor Offsets */
+#define XILINX_DMA_MM2S_CTRL_OFFSET 0x00
+#define 

Re: [PATCH v2 5/5] arm64: add KASan support

2015-08-24 Thread Vladimir Murzin
On 24/08/15 15:15, Andrey Ryabinin wrote:
 2015-08-24 16:45 GMT+03:00 Linus Walleij linus.wall...@linaro.org:
 On Mon, Aug 24, 2015 at 3:15 PM, Russell King - ARM Linux
 li...@arm.linux.org.uk wrote:
 On Tue, Jul 21, 2015 at 11:27:56PM +0200, Linus Walleij wrote:
 On Tue, Jul 21, 2015 at 4:27 PM, Andrey Ryabinin a.ryabi...@samsung.com 
 wrote:

 I used vexpress. Anyway, it doesn't matter now, since I have an update
 with a lot of stuff fixed, and it works on hardware.
 I still need to do some work on it and tomorrow, probably, I will share.

 Ah awesome. I have a stash of ARM boards so I can test it on a
 range of hardware once you feel it's ready.

 Sorry for pulling stuff out of your hands, people are excited about
 KASan ARM32 as it turns out.

 People may be excited about it because it's a new feature, but we really
 need to consider whether gobbling up 512MB of userspace for it is a good
 idea or not.  There are programs around which like to map large amounts
 of memory into their process space, and the more we steal from them, the
 more likely these programs are to fail.

 I looked at some different approaches over the last weeks for this
 when playing around with KASan.

 It seems since KASan was developed on 64bit systems, this was
 not much of an issue for them as they could take their shadow
 memory from the vmalloc space.

 I think it is possible to actually just steal as much memory as is
 needed to cover the kernel, and not 1/8 of the entire addressable
 32bit space. So instead of covering all from 0x0-0x
 at least just MODULES_VADDR thru 0x should be enough.
 So if that is 0xbf00-0x in most cases, 0x4100
 bytes, then 1/8 of that, 0x820, 130MB should be enough.
 (Andrey needs to say if this is possible.)

 
 Yes, ~130Mb (3G/1G split) should work. 512Mb shadow is optional.
 The only advantage of a 512Mb shadow is better handling of bad user memory
 accesses (access to user memory without the
 copy_from_user/copy_to_user/strlen_user etc. API).
 With a 512Mb shadow we could leave the shadow for user addresses unmapped,
 so such a bug would be guaranteed to crash the kernel.
 With 130Mb, the behavior will depend on the memory layout of the
 current process.
 So, I think it's fine to keep shadow only for kernel addresses.
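
The sizes quoted in this exchange follow from the 1/8 shadow ratio. A small sketch of the arithmetic, assuming a 3G/1G split where the covered range starts at 0xbf000000 and ends at 0xffffffff (both values are assumptions here, as is the helper name):

```c
#include <stdint.h>

/* Shadow needed for [start, end_inclusive] at one shadow byte per 8 bytes. */
static uint64_t model_shadow_bytes(uint64_t start, uint64_t end_inclusive)
{
	return (end_inclusive - start + 1) >> 3;
}
```

Under those assumptions, model_shadow_bytes(0xbf000000, 0xffffffff) is 0x8200000 bytes, i.e. 130 MiB, while covering the whole 32-bit space from 0 to 0xffffffff takes 512 MiB.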

Another option would be having sparse shadow memory based on page
extension. I did play with that some time ago based on ideas from
original v1 KASan support for x86/arm - it is how 614be38 "irqchip:
gic-v3: Fix out of bounds access to cpu_logical_map" was caught.
It doesn't require any VA reservations, only some contiguous memory for
the page_ext itself, which serves as indirection level for the 0-order
shadow pages.
In theory such a design can be reused by other 32-bit arches and, I
think, nommu too. Additionally, the shadow pages might be movable with
help of driver-page migration patch series [1].
The cost is obvious - performance drop, although I didn't bother
measuring it.

[1] https://lwn.net/Articles/650917/

Cheers
Vladimir

 
 That will probably miss some usecases I'm not familiar with, where
 the kernel is actually executing something below 0xbf00...

 I looked at taking memory from vmalloc instead, but ran into
 problems since this is subject to the highmem split and KASan
 needs to have its address offset at compile time. On
 Ux500 I managed to remove all the static maps and steal memory
 from the top of the vmalloc area instead of the beginning, but
 that is probably not generally feasible.

 I suspect you have better ideas than what I can come up
 with though.

 Yours,
 Linus Walleij
 
 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .
 Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a
 



Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Paul E. McKenney
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
 On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
  On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
   
   * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
   
 here it's fully set - triggering the bug I'm worried about. So what am I
 missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which
is called from tick_init(), which is called from start_kernel() shortly after
rcu_init():

cpu = smp_processor_id();

if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
	pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
	cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}

This happens after the call to tick_nohz_init_all() that does the 
cpumask_setall() that you called out above.
   
   Ah, indeed - I somehow missed that.
   
   This brings up two other questions:
   
   1)
   
   the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full
   mask to it (housekeeping_mask - a variable mask to begin with) and recover
   the housekeeping CPU via:
   
   +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
   
   which can be pretty expensive, and which gets executed in two hotpaths:
   
   kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, get_nohz_timer_target());
   kernel/time/timer.c:    return per_cpu_ptr(tvec_bases, get_nohz_timer_target());
   
   ... why not just use a single housekeeping_cpu which would be way faster to
   pass down to the timer code?
  
  The housekeeping_cpu came later, but that does seem like a good 
  optimization.
 
 Well nohz full is likely to be used for HPC and that can involve big machines.
 Having the housekeeping duty spread per node is a likely future evolution 
 there,
 if it isn't already used that way.
 
 So we need to keep it a cpumask.

Fair point!

Thanx, Paul
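
A toy model of the mask handling discussed in this thread, using a 64-bit word as a CPU mask. It assumes the housekeeping mask is simply the complement of the nohz_full mask, and it returns -1 for an empty intersection, which is a choice made for this sketch rather than the kernel's convention. All names are hypothetical.

```c
#include <stdint.h>

/* Model of clearing one CPU from a mask, as done for the boot CPU. */
static uint64_t model_clear_cpu(uint64_t mask, unsigned int cpu)
{
	return mask & ~(UINT64_C(1) << cpu);
}

/* Lowest CPU present in both masks, or -1 if the intersection is empty. */
static int model_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}
```

On a 4-CPU system with every CPU requested as nohz_full, clearing CPU 0 leaves it as the only housekeeper, so the lookup returns 0 as long as CPU 0 is online. Keeping a mask rather than a single CPU number is what would allow one housekeeper per node later.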

   2)
   
   What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
   
   I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
   boot CPU is offlined.
  
  The tick_nohz_cpu_down_callback() function does this, though in a less
  than obvious way.  The tick_do_timer_cpu variable is the housekeeping
  CPU that is currently handling timing, and it is not permitted to go
  offline.
 
 Indeed, more specifically tick-common.c makes sure to set the timekeeping
 duty to a housekeeper and that housekeeper is always the boot CPU due to
 early device initialization.
 
 But I should find a way to simplify that code and make it obvious it's always
 set to the boot CPU.
 



[Internal PATCH] ipmi: add of_device_id in MODULE_DEVICE_TABLE

2015-08-24 Thread Brijesh Singh
Fix autoloading ipmi modules when using device tree.

Signed-off-by: Brijesh Singh brijeshkumar.si...@amd.com
---
 drivers/char/ipmi/ipmi_si_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 8a45e92..cddc7b0 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2785,6 +2785,7 @@ static struct platform_driver ipmi_driver = {
.probe  = ipmi_probe,
.remove = ipmi_remove,
 };
+MODULE_DEVICE_TABLE(of, ipmi_match);
 
 #ifdef CONFIG_PARISC
 static int ipmi_parisc_probe(struct parisc_device *dev)
-- 
1.9.1



Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT

2015-08-24 Thread Konstantin Khlebnikov
On Mon, Aug 24, 2015 at 6:55 PM, Eric B Munson emun...@akamai.com wrote:
 On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote:

 On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
  On Mon, 24 Aug 2015, Vlastimil Babka wrote:
 
  On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
  On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
  On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:
  
  
  I am in the middle of implementing lock on fault this way, but I cannot
  see how we will handle mremap of a lock on fault region.  Say we have
  the following:
  
addr = mmap(len, MAP_ANONYMOUS, ...);
mlock(addr, len, MLOCK_ONFAULT);
...
mremap(addr, len, 2 * len, ...)
  
  There is no way for mremap to know that the area being remapped was lock
  on fault so it will be locked and prefaulted by remap.  How can we avoid
  this without tracking per vma if it was locked with lock or lock on
  fault?
  
  
  remap can count filled ptes and prefault only completely populated areas.
  
  
  Does (and should) mremap really prefault non-present pages? Shouldn't it
  just prepare the page tables and that's it?
  
  As I see mremap prefaults pages when it extends mlocked area.
  
  Also quote from manpage
  : If the memory segment specified by old_address and old_size is locked
  : (using mlock(2) or similar), then this lock is maintained when the segment
  : is resized and/or relocated.  As a consequence, the amount of memory locked
  : by the process may change.
 
  Oh, right... Well that looks like a convincing argument for having a
  sticky VM_LOCKONFAULT after all. Having mremap guess by scanning
  existing pte's would slow it down, and be unreliable (was the area
  completely populated because MLOCK_ONFAULT was not used or because
  the process faulted it already? Was it not populated because
  MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to
  populate it all?).
 
  Given this, I am going to stop working on v8 and leave the vma flag in
  place.
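
To make the argument for a sticky per-vma flag concrete, here is a toy model of the decision mremap has to take when a locked area grows. The flag values, the struct, and the byte-count bookkeeping are all invented for this sketch; it is not the kernel's vma handling.

```c
#include <stddef.h>

#define MODEL_VM_LOCKED      0x1u	/* hypothetical flag values */
#define MODEL_VM_LOCKONFAULT 0x2u

struct model_vma {
	unsigned int flags;
	size_t len;		/* size of the mapping in bytes */
	size_t populated;	/* bytes currently backed by pages */
};

/*
 * Grow a mapping. A plainly locked area is populated up to its new size;
 * a lock-on-fault area keeps whatever has been faulted in so far.
 */
static void model_mremap_grow(struct model_vma *vma, size_t new_len)
{
	int locked = (vma->flags & MODEL_VM_LOCKED) != 0;
	int on_fault = (vma->flags & MODEL_VM_LOCKONFAULT) != 0;

	if (locked && !on_fault)
		vma->populated = new_len;
	vma->len = new_len;
}
```

Without the second flag the function would have to guess the caller's intent from vma->populated alone, which is the unreliable pte scan objected to above.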
 
 
  The only sane alternative is to populate always for mremap() of
  VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information
  as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not
  be enough for Eric's usecase, but it's somewhat ugly.
 
 
  I don't think that this is the right solution, I would be really
  surprised as a user if an area I locked with MLOCK_ONFAULT was then
  fully locked and prepopulated after mremap().

 If mremap is the only problem then we can add opposite flag for it:

 MREMAP_NOPOPULATE
 - do not populate new segment of locked areas
 - do not copy normal areas if possible (anonymous/special must be copied)

 addr = mmap(len, MAP_ANONYMOUS, ...);
 mlock(addr, len, MLOCK_ONFAULT);
 ...
 addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
 ...


 But with this, the user must remember which areas are locked with
 MLOCK_ONFAULT and which are locked with prepopulate so the
 correct mremap flags can be used.


Yep. Shouldn't be hard. You anyway have to do some changes in user-space.


A much simpler solution for user space is an mm-wide flag which turns all
further mlocks and MAP_LOCKED mappings into lock-on-fault. Something like
mlockall(MCL_NOPOPULATE_LOCKED).


Re: [PATCH v4] pinctrl: mediatek: Implement wake handler and suspend resume

2015-08-24 Thread Sudeep Holla



On 14/08/15 09:38, maoguang.m...@mediatek.com wrote:

From: Maoguang Meng maoguang.m...@mediatek.com

This patch implements irq_set_wake to track which interrupts are wakeup
sources and to set them up on suspend and resume.

Signed-off-by: Maoguang Meng maoguang.m...@mediatek.com

---
changes since v3:
-add a comment in mtk_eint_chip_read_mask.
-delete ALIGN when allocate eint_offsets.ports.
-fix unrelated change.

changes since v2:
-modify irq_wake to handle irq wakeup source.
-allocate two buffers separately.
-fix some codestyle.

Changes since v1:
-implement irq_wake handler.
---

  drivers/pinctrl/mediatek/pinctrl-mt8173.c |  1 +
  drivers/pinctrl/mediatek/pinctrl-mtk-common.c | 91 ++-
  drivers/pinctrl/mediatek/pinctrl-mtk-common.h |  4 ++
  3 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8173.c b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
index d0c811d..ad27184 100644
--- a/drivers/pinctrl/mediatek/pinctrl-mt8173.c
+++ b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
@@ -385,6 +385,7 @@ static struct platform_driver mtk_pinctrl_driver = {
.driver = {
.name = "mediatek-mt8173-pinctrl",
.of_match_table = mt8173_pctrl_match,
+   .pm = &mtk_eint_pm_ops,
},
  };

diff --git a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
index ad1ea16..fe34ce9 100644
--- a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
+++ b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c
@@ -33,6 +33,7 @@
  #include <linux/mfd/syscon.h>
  #include <linux/delay.h>
  #include <linux/interrupt.h>
+#include <linux/pm.h>
  #include <dt-bindings/pinctrl/mt65xx.h>

  #include "../core.h"
@@ -1062,6 +1063,77 @@ static int mtk_eint_set_type(struct irq_data *d,
return 0;
  }

+static int mtk_eint_irq_set_wake(struct irq_data *d, unsigned int on)
+{
+   struct mtk_pinctrl *pctl = irq_data_get_irq_chip_data(d);
+   int shift = d->hwirq & 0x1f;
+   int reg = d->hwirq >> 5;
+
+   if (on)
+   pctl->wake_mask[reg] |= BIT(shift);
+   else
+   pctl->wake_mask[reg] &= ~BIT(shift);
+
+   return 0;
+}
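
The bookkeeping in the quoted handler can be tried out in userspace with the sketch below. It assumes 32 interrupts per mask word, so the word index is hwirq >> 5 and the bit position is hwirq & 0x1f; the function name and the plain array are stand-ins for the driver's data.

```c
#include <stdint.h>

/* Set or clear the wake bit for one hardware interrupt number. */
static void model_set_wake(uint32_t *wake_mask, unsigned int hwirq, int on)
{
	unsigned int shift = hwirq & 0x1f;	/* bit within the 32-bit word */
	unsigned int reg = hwirq >> 5;		/* which 32-bit word */

	if (on)
		wake_mask[reg] |= UINT32_C(1) << shift;
	else
		wake_mask[reg] &= ~(UINT32_C(1) << shift);
}
```

For example, hwirq 37 lands in word 1, bit 5. Note that this only records the request; it says nothing about whether the controller needs it, which is what the questions in the reply are about.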


Does this pinmux controller:

1. Support wake-up configuration ? If not, you need to use
   IRQCHIP_SKIP_SET_WAKE. I don't see any value in writing the
   mask_{set,clear} if the same registers are used for {en,dis}able

2. Is in always on domain ? If not, you need save/restore only to
   resume back the functionality. Generally we can set  
   IRQCHIP_MASK_ON_SUSPEND to ensure non-wake-up interrupts are
   disabled during suspend and re-enabled in resume path. You just
   save/restore raw values without tracking the wake-up source.

Also I see that no care is taken to set the port irq as wake enable
source. It may work with current mainline, but won't with -next. Please
ensure the port irq to the parent interrupt controller remains
enabled(i.e set as wake).

Regards,
Sudeep


Re: [PATCH] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144

2015-08-24 Thread Ganapatrao Kulkarni
Hi Marc,

thanks for the suggestions.

On Mon, Aug 24, 2015 at 7:17 PM, Marc Zyngier marc.zyng...@arm.com wrote:
 On 24/08/15 14:27, Ganapatrao Kulkarni wrote:
 On Mon, Aug 24, 2015 at 6:15 PM, Marc Zyngier marc.zyng...@arm.com wrote:

  static void its_enable_cavium_thunderx(void *data)
  {
 - struct its_node *its = data;
 + struct its_node __maybe_unused *its = data;

 - its->flags |= ITS_FLAGS_CAVIUM_THUNDERX;
 +#ifdef CONFIG_CAVIUM_ERRATUM_22375
 + its->flags |= ITS_WORKAROUND_CAVIUM_22375;
 + pr_info("ITS: Enabling workaround for 22375, 24313\n");
 +#endif
 +
 +#ifdef CONFIG_CAVIUM_ERRATUM_23144
 + if (num_possible_nodes() > 1) {
 + its->numa_node = its_get_node_thunderx(its);

 I'd rather see numa_node being always initialized to something useful.
 If you're adding numa support, why can't this be initialized via
 standard topology bindings?
 IIUC, topology defines only cpu topology.

 Well, welcome to a much more complex system where both your CPUs and
 your IOs have some degree of affinity. This needs to be described
 properly, and not hacked on the side.
 ok, will add description for the function.

 I sense that you misunderstood what I meant. What I'd like to see is
 some topology information coming from DT, showing the relationship
 between a device (your ITS) and a given node (your socket). This can
 then be used for two purposes:
sure, will post next version with changes as per your comments.

 - find the optimal affinity for a MSI so that it doesn't default to a
 foreign node (a reasonable performance expectation),
this can be done by adding dt associativity property to its node.
 i can send in next version of patch.
 - work around implementation bugs where an LPI cannot be routed to a
 redistributor that is on a foreign node.



 I really don't feel like adding a hack just for the second point, and
 I'd rather get the big picture right so that your workaround is just a
 special case of the generic one.
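
The two purposes listed above can be read as one selection routine with a strictness switch. The sketch below is a userspace illustration only: pick_irq_target(), the cpu_to_node table and NO_CPU are hypothetical and are not part of the ITS driver or of any DT binding.

```c
#include <stddef.h>

#define NO_CPU (-1)

/*
 * Pick a target CPU for an interrupt raised by a device on dev_node.
 * cpu_to_node[i] is the node of CPU i. With strict != 0 (the erratum
 * case) a CPU on a foreign node is never returned; with strict == 0 a
 * local CPU is merely preferred for performance.
 */
static int pick_irq_target(const int *cpu_to_node, size_t nr_cpus,
			   int dev_node, int strict)
{
	size_t i;

	for (i = 0; i < nr_cpus; i++)
		if (cpu_to_node[i] == dev_node)
			return (int)i;

	if (strict || nr_cpus == 0)
		return NO_CPU;

	return 0;	/* no local CPU: fall back to the first one */
}
```

Seen this way, the workaround is just the generic affinity lookup with the fallback disabled.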

 Thanks,

 M.
 --
 Jazz is not dead. It just smells funny...

thanks
Ganapat


Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi

2015-08-24 Thread Qais Yousef

On 08/24/2015 04:07 PM, Thomas Gleixner wrote:

On Mon, 24 Aug 2015, Qais Yousef wrote:

On 08/24/2015 02:32 PM, Marc Zyngier wrote:

I'd rather see something more architected than this blind export, or
at least some level of filtering (the idea random drivers can access
such a low-level function doesn't make me feel very good).

I don't know how to architect this better or how to perform  the filtering,
but I'm happy to hear suggestions and try them out.
Keep in mind that detecting GIC and writing your own gic_send_ipi() is very
simple. I have done this when the driver was out of tree. So restricting it by
not exporting it will not prevent someone from really accessing the
functionality, it's just they have to do it their own way.

Keep in mind that we are not talking about out of tree hackery. We
talk about a kernel code submission and I doubt that you will get
away with a GIC detection/fiddling buried in your driver code.

Keep in mind that just slapping an export to some random function is
not much better than doing a GIC hack in the driver.

Marc's concerns about blindly exposing IPI functionality to drivers are
well justified and that kind of coprocessor stuff is not unique to
your particular SoC. We're going to see such things more frequently in
the not so distant future, so we better think now about proper
solutions to that problem.


Sure I'm not trying to argue against that.



There are a couple of issues to solve:

1) How is the IPI which is received by the coprocessor reserved in the
system?

2) How is it associated to a particular driver?


Shouldn't 'interrupts' property in DT take care of these 2 questions? 
Maybe we can give it an alias name to make it more readable that this 
interrupt is requested for external IPI.




3) How do we ensure that a driver cannot issue random IPIs and can
only send the associated ones?


If we get the irq number from DT then I'm not sure how feasible it is to 
implement a generic_send_ipi() function that takes this number to 
generate an IPI.


Do you think this approach would work?



None of these issues are handled by your export.

So we need a core infrastructure which allows us to do that. The
requirements are pretty clear from the above and Marc might have some
further restrictions in mind.
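
One way to picture the three requirements (reservation, association with a driver, and no random IPIs) is a small ownership table. This is a userspace sketch under stated assumptions: struct ipi_pool, ipi_reserve() and ipi_send() are hypothetical names, not an existing kernel interface, and the counter stands in for the real hardware write.

```c
#include <errno.h>
#include <stddef.h>

#define NR_IPIS 8	/* hypothetical pool size */

struct ipi_pool {
	const void *owner[NR_IPIS];	/* NULL means the slot is free */
	unsigned int sent[NR_IPIS];	/* stand-in for the hardware trigger */
};

/* Reserve a free IPI for 'owner'. Returns its index or a negative errno. */
static int ipi_reserve(struct ipi_pool *pool, const void *owner)
{
	int i;

	if (!owner)
		return -EINVAL;

	for (i = 0; i < NR_IPIS; i++) {
		if (!pool->owner[i]) {
			pool->owner[i] = owner;
			return i;
		}
	}
	return -ENOSPC;
}

/* Send 'ipi', but only if 'owner' is the one that reserved it. */
static int ipi_send(struct ipi_pool *pool, const void *owner, int ipi)
{
	if (ipi < 0 || ipi >= NR_IPIS)
		return -EINVAL;
	if (!owner || pool->owner[ipi] != owner)
		return -EPERM;

	pool->sent[ipi]++;
	return 0;
}
```

A plain export of gic_send_ipi() has no equivalent of the ownership check in ipi_send(), which is the gap the questions above point at.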


Another issue I'm having which is related is that I need to communicate 
these GIC irq numbers to AXD core when it starts up. So the logic is 
that these IPIs are not hardwired and it's up to the system designer to 
allocate 2 free GIC irqs to be used for that purpose. At the moment I 
have my own DT property to take these numbers. Hopefully this link would 
explain the issue. See the question about gic-irq property.


https://lkml.org/lkml/2015/8/24/459

From what I know there's no generic way for the driver to get the hw
irq number from the linux irq number, unless I missed something. Is it
possible to add something to support this? Or maybe there's something
already that I failed to find?


Thanks,
Qais



Thanks,

tglx




Re: [PATCH-v6 5/6] mfd: 88pm800: Set default interrupt clear method

2015-08-24 Thread Vaibhav Hiremath



On Monday 24 August 2015 09:21 PM, Lee Jones wrote:

On Mon, 24 Aug 2015, Vaibhav Hiremath wrote:




On Monday 24 August 2015 07:24 PM, Lee Jones wrote:

On Wed, 08 Jul 2015, Vaibhav Hiremath wrote:


As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
(page 0) controls the method of clearing interrupt
status of 88pm800 family of devices;

   0: clear on read
   1: clear on write

If pdata is not coming from board file, then set the
default irq clear method to irq clear on write

Also, as suggested by Lee Jones renaming variable field
to appropriate name and removed unnecessary field
pm80x_chip.irq_mode, using platform_data.irq_clr_method.

Signed-off-by: Zhao Ye zh...@marvell.com
Signed-off-by: Vaibhav Hiremath vaibhav.hirem...@linaro.org
Reviewed-by: Krzysztof Kozlowski k.kozlow...@samsung.com
---
  drivers/mfd/88pm800.c   | 15 ++-
  include/linux/mfd/88pm80x.h |  9 +++--
  2 files changed, 17 insertions(+), 7 deletions(-)


[...]


+#define PM800_WAKEUP2_INT_READ_CLEAR   (0 << 1)
+#define PM800_WAKEUP2_INT_WRITE_CLEAR  (1 << 1)


Use BIT().


+/* Used by irq_clr_method */
+#define PM800_IRQ_CLR_ON_READ  0
+#define PM800_IRQ_CLR_ON_WRITE 1



-   int irq_mode;   /* Clear interrupt by read/write(0/1) */
+   bool irq_clr_method;/* Clear interrupt by read/write(0/1) */



+   irq_clr_mode = pdata->irq_clr_method == PM800_IRQ_CLR_ON_WRITE ?
+   PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
+   ret = regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);


This is pretty convoluted.

For starters you're abusing the 'bool' type here.  Bool is either
'true' or 'false', so at the very least you should rename
'irq_clr_method' to 'irq_clr_on_write'.

Then you can do:

irq_clr_mode = pdata->irq_clr_on_write ?
PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;



We have discussed this and went back and forth.
If I remember correctly, one of the versions was using
true/false, then we decided to rename it to the relevant macro.

If I am not wrong, the V4 version of this series is exactly the same as
what you are referring to.


Right.  I made a few suggestions which vary in usefulness depending on
how you plan to implement all of this.  Unfortunately this is a bit of
a bastardised version where some of it makes sense and other parts
could do with some improvement.



This so-called bastardised version could have been avoided :)

V2 version itself was clean and ready. It just got dragged into
multiple iterations.


However, what I suggest you really do is share
PM800_WAKEUP2_INT_{READ,WRITE}_CLEAR with platform data and just pass
the value through directly.



I think we discussed this as well, and the reason I recall is that
we may need to control this from DT in the future, so we decided to keep
it boolean in platform_data and have a simple check before writing to
the register.

And I think that was also another reason we introduced

/* Used by irq_clr_method */
#define PM800_IRQ_CLR_ON_READ   0
#define PM800_IRQ_CLR_ON_WRITE  1


I think these are still required.  So it would look like this:



No. I think you are confused here.
We have two different sets of macros in play:


+/* Used by irq_clr_method */
+#define PM800_IRQ_CLR_ON_READ  0
+#define PM800_IRQ_CLR_ON_WRITE 1

/* Used to write to register */
+#define PM800_WAKEUP2_INT_READ_CLEAR   (0 << 1)
+#define PM800_WAKEUP2_INT_WRITE_CLEAR  (1 << 1)




== Platform data ==

struct pdata {
   bool clear_irq_on_write;
};

pdata->clear_irq_on_write = PM800_IRQ_CLR_ON_{READ,WRITE};

== Driver ==

irq_clr_mode = pdata->clear_irq_on_write ?
  PM800_WAKEUP2_INT_WRITE_CLEAR : PM800_WAKEUP2_INT_READ_CLEAR;
regmap_update_bits(map, PM800_WAKEUP2, mask, irq_clr_mode);



Please check V2, which is exactly the same as above.

https://patchwork.kernel.org/patch/6627781/
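
The difference between the two macro sets is easy to check outside the kernel. In this sketch, update_bits() is a userspace stand-in for what regmap_update_bits() does to the register value, and PM800_WAKEUP2_INT_CLEAR_MODE is a hypothetical name for the mask on bit 1; only the two PM800_WAKEUP2_INT_* values come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))

#define PM800_WAKEUP2_INT_CLEAR_MODE	BIT(1)	/* hypothetical mask name */
#define PM800_WAKEUP2_INT_READ_CLEAR	0u	/* bit 1 = 0: clear on read */
#define PM800_WAKEUP2_INT_WRITE_CLEAR	BIT(1)	/* bit 1 = 1: clear on write */

/* Read-modify-write of the masked bits only; all other bits are preserved. */
static uint8_t update_bits(uint8_t old, uint8_t mask, uint8_t val)
{
	return (uint8_t)((old & ~mask) | (val & mask));
}

/* Map the boolean from platform data onto the register field. */
static uint8_t pm800_apply_irq_clr(uint8_t wakeup2, bool clear_irq_on_write)
{
	uint8_t mode = clear_irq_on_write ? PM800_WAKEUP2_INT_WRITE_CLEAR
					  : PM800_WAKEUP2_INT_READ_CLEAR;

	return update_bits(wakeup2, PM800_WAKEUP2_INT_CLEAR_MODE, mode);
}
```

The platform-data value (0 or 1) and the register value (0 or 2) differ only by the shift, which is why the two macro sets are easy to mix up.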


If you are OK with it, I will spin another version and submit it.

Thanks,
Vaibhav


RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket

2015-08-24 Thread Liang, Kan


 On Mon, Aug 24, 2015 at 02:22:08PM +, Liang, Kan wrote:
  
   On Fri, Aug 21, 2015 at 08:25:24PM +, Liang, Kan wrote:
  
   SNIP
  

 we need global topology information in perf.data and use the
 mapping from there, we can't use current server info

 we currently store core_siblings_list and thread_siblings_list,
 in topology FEATURE, which is probably not enough

   
core_siblings_list  includes the cpu list in the same socket.
thread_siblings_list includes the cpu list in the same core.
numa_nodes includes the cpu list for each node.
   
It looks we have enough data from topology FEATURE.
  
   hum, haven't checked deeply.. how will you get core id for cpu?
  
 
  from thread_siblings_list.
   I just noticed that svg_build_topology_map did the similar thing to
  get topology map for timechart from perf header.
 
 could you please provide both functions then cpu -> core, cpu -> socket
 

Do you mean something like this?
Store cpu->socket and cpu->core in perf_session_env.

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 179b2bd..a01c603 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1590,10 +1596,17 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 	u32 nr, i;
 	char *str;
 	struct strbuf sb;
+	int cpu_nr = ph->env.nr_cpus_online;
+	struct cpu_map *map;
+	int j;
+
+	ph->env.cpu = calloc(cpu_nr, sizeof(*ph->env.cpu));
+	if (!ph->env.cpu)
+		return -1;
 
 	ret = readn(fd, &nr, sizeof(nr));
 	if (ret != sizeof(nr))
-		return -1;
+		goto free_cpu;
 
 	if (ph->needs_swap)
 		nr = bswap_32(nr);
@@ -1608,6 +1621,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 		/* include a NULL character at the end */
 		strbuf_add(&sb, str, strlen(str) + 1);
+
+		map = cpu_map__new(str);
+		if (!map)
+			goto error;
+		for (j = 0; j < map->nr; j++) {
+			ph->env.cpu[map->map[j]].socket_id = i;
+		}
+		cpu_map__put(map);
 		free(str);
 	}
 	ph->env.sibling_cores = strbuf_detach(&sb, NULL);
@@ -1628,6 +1649,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 		/* include a NULL character at the end */
 		strbuf_add(&sb, str, strlen(str) + 1);
+
+		map = cpu_map__new(str);
+		if (!map)
+			goto error;
+		for (j = 0; j < map->nr; j++) {
+			ph->env.cpu[map->map[j]].core_id = i;
+		}
+		cpu_map__put(map);
 		free(str);
 	}
 	ph->env.sibling_threads = strbuf_detach(&sb, NULL);
@@ -1635,6 +1664,8 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 error:
 	strbuf_release(&sb);
+free_cpu:
+	free(ph->env.cpu);
 	return -1;
 }
 
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 9b53b65..8b8c4fc 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -66,6 +66,11 @@ struct perf_header;
 int perf_file_header__read(struct perf_file_header *header,
 			   struct perf_header *ph, int fd);
 
+struct cpu_topology_map {
+	int	socket_id;
+	int	core_id;
+};
+
 struct perf_session_env {
 	char			*hostname;
 	char			*os_release;
@@ -89,6 +94,7 @@ struct perf_session_env {
 	char			*sibling_threads;
 	char			*numa_nodes;
 	char			*pmu_mappings;
+	struct cpu_topology_map	*cpu;
 };
 
 struct perf_header {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 18722e7..51b4d5a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -185,6 +185,7 @@ static void perf_session_env__exit(struct perf_session_env *env)
 	zfree(&env->sibling_threads);
 	zfree(&env->numa_nodes);
 	zfree(&env->pmu_mappings);
+	zfree(&env->cpu);
 }
 
 void perf_session__delete(struct perf_session *session)
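
The mapping the patch builds can be tried without the perf tree. The sketch below is an assumption-laden stand-in: apply_sibling_list() is a hypothetical helper that replaces the cpu_map__new() plus loop pair, it accepts lists in the "0-3,8-11" form, and like the patch it uses the index of the sibling list as the socket or core id.

```c
#include <stdlib.h>

struct cpu_topology_map {
	int socket_id;
	int core_id;
};

/*
 * Parse a sibling list such as "0-3,8-11" and record 'id' for every CPU
 * named in it. set_core selects core_id, otherwise socket_id is written.
 * Returns 0 on success, -1 on malformed input or an out-of-range CPU.
 */
static int apply_sibling_list(const char *list, struct cpu_topology_map *cpu,
			      int nr_cpus, int id, int set_core)
{
	const char *p = list;

	while (*p && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi, c;

		if (end == p)
			return -1;	/* no digits where a CPU number is expected */

		hi = lo;
		if (*end == '-') {	/* range "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return -1;
		}

		if (lo < 0 || hi < lo || hi >= nr_cpus)
			return -1;

		for (c = lo; c <= hi; c++) {
			if (set_core)
				cpu[c].core_id = id;
			else
				cpu[c].socket_id = id;
		}

		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		p = end;
	}
	return 0;
}
```

Calling it once per entry of core_siblings_list with set_core == 0, and once per entry of thread_siblings_list with set_core == 1, gives the cpu -> socket and cpu -> core lookups asked for above.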



Re: [PATCH v6 3/3] qe_common: add qe_muram_ functions to manage muram

2015-08-24 Thread Scott Wood
On Mon, 2015-08-24 at 17:31 +0800, Zhao Qiang wrote:
 muram is used for qe, add qe_muram_ functions to manage
 muram.
 
 Signed-off-by: Zhao Qiang qiang.z...@freescale.com
 ---
 Changes for v2:
   - no changes
 Changes for v3:
   - no changes
 Changes for v4:
   - no changes
 Changes for v5:
   - no changes
 Changes for v5:
   - using genalloc instead rheap to manage QE MURAM
   - remove qe_reset from platform file, using 
   - subsys_initcall to call qe_init function.

This patch should come before the one that moves the code.

 diff --git a/drivers/soc/fsl/qe/qe_common.c b/drivers/soc/fsl/qe/qe_common.c
 new file mode 100644
 index 000..7f1762c
 --- /dev/null
 +++ b/drivers/soc/fsl/qe/qe_common.c
 @@ -0,0 +1,193 @@
 +/*
 + * common qe code
 + *
 + * author: scott wood scottw...@freescale.com
 + *
 + * copyright 2007-2008,2010 freescale Semiconductor, Inc.
 + *
 + * some parts derived from commproc.c/qe2_common.c, which is:
 + * copyright (c) 1997 dan error_act (dma...@jlc.net)
 + * copyright (c) 1999-2001 dan Malek d...@embeddedalley.com
 + * copyright (c) 2000 montavista Software, Inc (sou...@mvista.com)
 + * 2006 (c) montavista software, Inc.
 + * vitaly bordug vbor...@ru.mvista.com

Why did you lowercase everyone's names?  Why is this copying code rather than 
moving it?


 diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
 index 55da07e..aaf3dc2 100644
 --- a/include/linux/genalloc.h
 +++ b/include/linux/genalloc.h
 @@ -30,6 +30,7 @@
  #ifndef __GENALLOC_H__
  #define __GENALLOC_H__
  
 +#include <linux/types.h>
  #include <linux/spinlock_types.h>
  
  struct device;

This does not belong in this patch.


 @@ -187,12 +190,41 @@ static inline int qe_alive_during_sleep(void)
  }
  
  /* we actually use cpm_muram implementation, define this for convenience */
 -#define qe_muram_init cpm_muram_init
 -#define qe_muram_alloc cpm_muram_alloc
 -#define qe_muram_alloc_fixed cpm_muram_alloc_fixed
 -#define qe_muram_free cpm_muram_free
 -#define qe_muram_addr cpm_muram_addr
 -#define qe_muram_offset cpm_muram_offset
 +int qe_muram_init(void);
 +
 +#if defined(CONFIG_QUICC_ENGINE)
 +unsigned long qe_muram_alloc(unsigned long size, unsigned long align);
 +int qe_muram_free(unsigned long offset);
 +void __iomem *qe_muram_addr(unsigned long offset);
 +unsigned long qe_muram_offset(void __iomem *addr);
 +dma_addr_t qe_muram_dma(void __iomem *addr);
 +#else
 +static inline unsigned long qe_muram_alloc(unsigned long size,
 + unsigned long align)
 +{
 + return -ENOSYS;
 +}

What code calls these functions without CONFIG_QUICC_ENGINE?

Are you converting qe without cpm?  Why?
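
A side note on the quoted stub: it returns -ENOSYS through an unsigned long, so a caller cannot test the result with "< 0" and needs a range check instead. The sketch below shows that in userspace; is_err_value() mirrors the idea behind the kernel's IS_ERR_VALUE() but is a local stand-in, and muram_alloc_stub() is a hypothetical copy of the fallback's shape.

```c
#include <errno.h>

#define MAX_ERRNO 4095

/* Local stand-in: error codes live in the top MAX_ERRNO values of the range. */
static int is_err_value(unsigned long x)
{
	return x >= (unsigned long)-MAX_ERRNO;
}

/* Same shape as the quoted !CONFIG_QUICC_ENGINE fallback. */
static unsigned long muram_alloc_stub(unsigned long size, unsigned long align)
{
	(void)size;
	(void)align;
	return (unsigned long)-ENOSYS;	/* wraps to a very large offset */
}
```

A caller that only compares the returned offset against zero would treat the stub's result as a valid, very large MURAM offset.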

-Scott



Re: [PATCH 2/2] f2fs: fix to release inode correctly

2015-08-24 Thread Jaegeuk Kim
On Mon, Aug 24, 2015 at 05:40:45PM +0800, Chao Yu wrote:
 In following call stack, if unfortunately we lose all chances to truncate
 inode page in remove_inode_page, eventually we will add the nid allocated
 previously into free nid cache, this nid is with NID_NEW status and with
 NEW_ADDR in its blkaddr pointer:
 
  - f2fs_create
   - f2fs_add_link
    - __f2fs_add_link
     - init_inode_metadata
      - new_inode_page
       - new_node_page
        - set_node_addr(, NEW_ADDR)
      - f2fs_init_acl   failed
      - remove_inode_page  failed
   - handle_failed_inode
    - remove_inode_page  failed
    - iput
     - f2fs_evict_inode
      - remove_inode_page  failed
      - alloc_nid_failed   cache a nid with valid blkaddr: NEW_ADDR
 
 This may not only cause resource leak of previous inode, but also may cause
 incorrect use of the previous blkaddr which is located in NO.nid node entry
 when this nid is reused by others.
 
 This patch tries to add this inode to orphan list if we fail to truncate
 inode, so that we can obtain a second chance to release it in orphan
 recovery flow.
 
 Signed-off-by: Chao Yu chao2...@samsung.com
 ---
  fs/f2fs/f2fs.h  |  2 +-
  fs/f2fs/inode.c | 53 ++---
  fs/f2fs/node.c  | 14 +-
  3 files changed, 56 insertions(+), 13 deletions(-)
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 index 806439f..69827ee 100644
 --- a/fs/f2fs/f2fs.h
 +++ b/fs/f2fs/f2fs.h
 @@ -1687,7 +1687,7 @@ int get_dnode_of_data(struct dnode_of_data *, pgoff_t, int);
  int truncate_inode_blocks(struct inode *, pgoff_t);
  int truncate_xattr_node(struct inode *, struct page *);
  int wait_on_node_pages_writeback(struct f2fs_sb_info *, nid_t);
 -void remove_inode_page(struct inode *);
 +int remove_inode_page(struct inode *);
  struct page *new_inode_page(struct inode *);
  struct page *new_node_page(struct dnode_of_data *, unsigned int, struct page *);
  void ra_node_page(struct f2fs_sb_info *, nid_t);
 diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
 index d1b03d0..35aae65 100644
 --- a/fs/f2fs/inode.c
 +++ b/fs/f2fs/inode.c
 @@ -317,6 +317,7 @@ void f2fs_evict_inode(struct inode *inode)
   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
   struct f2fs_inode_info *fi = F2FS_I(inode);
   nid_t xnid = fi->i_xattr_nid;
 + int err = 0;
  
   /* some remained atomic pages should discarded */
   if (f2fs_is_atomic_file(inode))
 @@ -342,11 +343,13 @@ void f2fs_evict_inode(struct inode *inode)
   i_size_write(inode, 0);
  
   if (F2FS_HAS_BLOCKS(inode))
 - f2fs_truncate(inode, true);
 + err = f2fs_truncate(inode, true);
  
 - f2fs_lock_op(sbi);
 - remove_inode_page(inode);
 - f2fs_unlock_op(sbi);
 + if (!err) {
 + f2fs_lock_op(sbi);
 + err = remove_inode_page(inode);
 + f2fs_unlock_op(sbi);
 + }
  
   sb_end_intwrite(inode->i_sb);
  no_delete:
 @@ -362,9 +365,26 @@ no_delete:
   if (is_inode_flag_set(fi, FI_UPDATE_WRITE))
   add_dirty_inode(sbi, inode->i_ino, UPDATE_INO);
   if (is_inode_flag_set(fi, FI_FREE_NID)) {
 - alloc_nid_failed(sbi, inode->i_ino);
 + if (err && err != -ENOENT)
 + alloc_nid_done(sbi, inode->i_ino);
 + else
 + alloc_nid_failed(sbi, inode->i_ino);
   clear_inode_flag(fi, FI_FREE_NID);
   }
 +
 + if (err && err != -ENOENT) {
 + if (!exist_written_data(sbi, inode->i_ino, ORPHAN_INO)) {
 + /*
 +  * get here because we failed to release resource
 +  * of inode previously, reminder our user to run fsck
 +  * for fixing.
 +  */
 + set_sbi_flag(sbi, SBI_NEED_FSCK);
 + f2fs_msg(sbi->sb, KERN_WARNING,
 + "inode (ino:%lu) resource leak, run fsck "
 + "to fix this issue!", inode->i_ino);
 + }
 + }
  out_clear:
  #ifdef CONFIG_F2FS_FS_ENCRYPTION
   if (fi->i_crypt_info)
 @@ -377,6 +397,7 @@ out_clear:
  void handle_failed_inode(struct inode *inode)
  {
   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 + int err = 0;
  
   clear_nlink(inode);
   make_bad_inode(inode);
 @@ -384,9 +405,27 @@ void handle_failed_inode(struct inode *inode)
  
   i_size_write(inode, 0);
   if (F2FS_HAS_BLOCKS(inode))
 - f2fs_truncate(inode, false);
 + err = f2fs_truncate(inode, false);
 +
 + if (!err)
 + err = remove_inode_page(inode);
  
 - remove_inode_page(inode);
 + /*
 +  * if we skip truncate_node in remove_inode_page bacause we failed
 +  * before, it's better to find another way to release resource of
 +  * this inode (e.g. valid block count, node block or nid). Here we
 +  * choose to add this inode to orphan list, so that we can call iput
 +   

Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT

2015-08-24 Thread Eric B Munson
On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote:

 On Mon, Aug 24, 2015 at 6:55 PM, Eric B Munson emun...@akamai.com wrote:
  On Mon, 24 Aug 2015, Konstantin Khlebnikov wrote:
 
  On Mon, Aug 24, 2015 at 6:09 PM, Eric B Munson emun...@akamai.com wrote:
   On Mon, 24 Aug 2015, Vlastimil Babka wrote:
  
   On 08/24/2015 03:50 PM, Konstantin Khlebnikov wrote:
   On Mon, Aug 24, 2015 at 4:30 PM, Vlastimil Babka vba...@suse.cz wrote:
   On 08/24/2015 12:17 PM, Konstantin Khlebnikov wrote:
   
   
   I am in the middle of implementing lock on fault this way, but I cannot
   see how we will handle mremap of a lock on fault region.  Say we have
   the following:
   
 addr = mmap(len, MAP_ANONYMOUS, ...);
 mlock(addr, len, MLOCK_ONFAULT);
 ...
 mremap(addr, len, 2 * len, ...)
   
   There is no way for mremap to know that the area being remapped was lock
   on fault so it will be locked and prefaulted by remap.  How can we avoid
   this without tracking per vma if it was locked with lock or lock on
   fault?
   
   
   remap can count filled ptes and prefault only completely populated areas.
   
   
   Does (and should) mremap really prefault non-present pages? Shouldn't it
   just prepare the page tables and that's it?
   
   As I see mremap prefaults pages when it extends mlocked area.
   
   Also quote from manpage
   : If  the memory segment specified by old_address and old_size is locked
   : (using mlock(2) or similar), then this lock is maintained when the
   : segment is resized and/or relocated.  As a  consequence, the amount of
   : memory locked by the process may change.
  
   Oh, right... Well that looks like a convincing argument for having a
   sticky VM_LOCKONFAULT after all. Having mremap guess by scanning
   existing pte's would slow it down, and be unreliable (was the area
   completely populated because MLOCK_ONFAULT was not used or because
   the process faulted it already? Was it not populated because
   MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to
   populate it all?).
  
   Given this, I am going to stop working on v8 and leave the vma flag in
   place.
  
  
   The only sane alternative is to populate always for mremap() of
   VM_LOCKED areas, and document this loss of MLOCK_ONFAULT information
   as a limitation of mlock2(MLOCK_ONFAULT). Which might or might not
   be enough for Eric's usecase, but it's somewhat ugly.
  
  
   I don't think that this is the right solution, I would be really
   surprised as a user if an area I locked with MLOCK_ONFAULT was then
   fully locked and prepopulated after mremap().
 
  If mremap is the only problem then we can add opposite flag for it:
 
  MREMAP_NOPOPULATE
  - do not populate new segment of locked areas
  - do not copy normal areas if possible (anonymous/special must be copied)
 
  addr = mmap(len, MAP_ANONYMOUS, ...);
  mlock(addr, len, MLOCK_ONFAULT);
  ...
  addr2 = mremap(addr, len, 2 * len, MREMAP_NOPOPULATE);
  ...
 
 
  But with this, the user must remember which areas are locked with
  MLOCK_LOCKONFAULT and which are locked with prepopulate, so the
  correct mremap flags can be used.
 
 
 Yep. Shouldn't be hard. You anyway have to do some changes in user-space.
 

Sorry if I wasn't clear enough in my last reply, I think forcing
userspace to track this is the wrong choice.  The VM system is
responsible for tracking these attributes and should continue to be.
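
The argument for keeping the attribute in the VMA can be shown with a toy model. Everything here is illustrative: struct vma_model, vma_grow() and should_populate() are hypothetical, the flag values are not the kernel's, and treating VM_LOCKONFAULT as a modifier set alongside VM_LOCKED is an assumption of the sketch.

```c
#include <stdbool.h>

#define VM_LOCKED	0x1u	/* illustrative values, not the kernel's */
#define VM_LOCKONFAULT	0x2u

struct vma_model {
	unsigned long start;
	unsigned long end;
	unsigned int flags;
};

/* Populate eagerly only for areas that are locked but not lock-on-fault. */
static bool should_populate(const struct vma_model *vma)
{
	return (vma->flags & VM_LOCKED) && !(vma->flags & VM_LOCKONFAULT);
}

/* Model of growing a mapping: the resized area inherits the flags. */
static struct vma_model vma_grow(const struct vma_model *old,
				 unsigned long new_len)
{
	struct vma_model grown = *old;

	grown.end = grown.start + new_len;
	return grown;
}
```

Because the flag travels with the area, the resize path can decide on its own whether to prefault, and userspace does not need to remember how each range was locked.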

 
 A much simpler solution for user-space is an mm-wide flag which turns all
 further mlocks and MAP_LOCKED into lock-on-fault. Something like
 mlockall(MCL_NOPOPULATE_LOCKED).

This set certainly adds the foundation for such a change if you think it
would be useful.  That particular behavior was not part of my initial use
case though.





[PATCH v7 5/8] Watchdog: introduce ARM SBSA watchdog driver

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This driver is based on the Linux kernel watchdog framework and uses
the pretimeout support in that framework. It supports getting the timeout
and pretimeout from a module parameter and from FDT at the driver init stage.
On the first timeout, the interrupt routine calls panic to save the
system context.

Signed-off-by: Fu Wei fu@linaro.org
---
 drivers/watchdog/Kconfig |  14 ++
 drivers/watchdog/Makefile|   1 +
 drivers/watchdog/sbsa_gwdt.c | 459 +++
 3 files changed, 474 insertions(+)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 241fafd..b2734f0 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -173,6 +173,20 @@ config ARM_SP805_WATCHDOG
 	  ARM Primecell SP805 Watchdog timer. This will reboot your system when
 	  the timeout is reached.
 
+config ARM_SBSA_WATCHDOG
+	tristate "ARM SBSA Generic Watchdog"
+	depends on ARM64
+	depends on ARM_ARCH_TIMER
+	select WATCHDOG_CORE
+	help
+	  ARM SBSA Generic Watchdog. This watchdog has two Watchdog timeouts.
+	  The first timeout will trigger a panic; the second timeout will
+	  trigger a system reset.
+	  More details: ARM DEN0029B - Server Base System Architecture (SBSA)
+
+	  To compile this driver as module, choose M here: The module
+	  will be called sbsa_gwdt.
+
 config AT91RM9200_WATCHDOG
 	tristate "AT91RM9200 watchdog"
 	depends on SOC_AT91RM9200 && MFD_SYSCON
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 59ea9a1..be8e7c5 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
 
 # ARM Architecture
 obj-$(CONFIG_ARM_SP805_WATCHDOG) += sp805_wdt.o
+obj-$(CONFIG_ARM_SBSA_WATCHDOG) += sbsa_gwdt.o
 obj-$(CONFIG_AT91RM9200_WATCHDOG) += at91rm9200_wdt.o
 obj-$(CONFIG_AT91SAM9X_WATCHDOG) += at91sam9_wdt.o
 obj-$(CONFIG_CADENCE_WATCHDOG) += cadence_wdt.o
diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c
new file mode 100644
index 000..7ae45cc
--- /dev/null
+++ b/drivers/watchdog/sbsa_gwdt.c
@@ -0,0 +1,459 @@
+/*
+ * SBSA(Server Base System Architecture) Generic Watchdog driver
+ *
+ * Copyright (c) 2015, Linaro Ltd.
+ * Author: Fu Wei fu@linaro.org
+ * Suravee Suthikulpanit suravee.suthikulpa...@amd.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * The SBSA Generic watchdog driver is compatible with the pretimeout
+ * concept of Linux kernel.
+ * The timeout and pretimeout are determined by WCV or WOR.
+ * The first watch period is set by writing WCV directly, that can
+ * support more than 10s timeout at the maximum system counter
+ * frequency (400MHz).
+ * When WS0 is triggered, the second watch period (pretimeout) is
+ * determined by one of these registers:
+ * (1)WOR: 32bit register, this gives a maximum watch period of
+ * around 10s at the maximum system counter frequency. It's loaded
+ * automatically by hardware.
+ * (2)WCV: If the pretimeout value is greater then max_wor_timeout,
+ * it will be loaded in WS0 interrupt routine. If system is in
+ * ws0_mode (reboot by kexec/kdump in panic with watchdog enabled
+ * and WS0 == true), the ping operation will only reload WCV.
+ * More details about the hardware specification of this device:
+ * ARM DEN0029B - Server Base System Architecture (SBSA)
+ *
+ * Kernel/API: P--| pretimeout
+ *   |T timeout
+ * SBSA GWDT:  P---WOR (or WCV)---WS1 pretimeout
+ *   |---WCV--WS0~~~(ws0_mode)T timeout
+ */
+
+#include <linux/io.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/uaccess.h>
+#include <linux/watchdog.h>
+#include <asm/arch_timer.h>
+
+/* SBSA Generic Watchdog register definitions */
+/* refresh frame */
+#define SBSA_GWDT_WRR  0x000
+
+/* control frame */
+#define SBSA_GWDT_WCS  0x000
+#define SBSA_GWDT_WOR  0x008
+#define SBSA_GWDT_WCV_LO   0x010
+#define SBSA_GWDT_WCV_HI   0x014
+
+/* refresh/control frame */
+#define SBSA_GWDT_W_IIDR   0xfcc
+#define SBSA_GWDT_IDR  0xfd0
+
+/* Watchdog Control and Status Register */
+#define SBSA_GWDT_WCS_EN   BIT(0)
+#define 
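
The "around 10s" figure in the driver's header comment follows from the width of WOR: a 32-bit count at 400MHz covers 2^32 / 400000000, which is roughly 10.7 seconds. A small sketch of that arithmetic, where max_wor_timeout() and pretimeout_needs_wcv() are hypothetical helper names and the clock rate is passed in as a parameter:

```c
#include <stdint.h>

/* Longest period, in whole seconds, that a 32-bit WOR can hold at clk_hz. */
static uint32_t max_wor_timeout(uint32_t clk_hz)
{
	return clk_hz ? UINT32_MAX / clk_hz : 0;
}

/*
 * A pretimeout longer than WOR can express has to be programmed through
 * the 64-bit WCV from the WS0 interrupt routine instead.
 */
static int pretimeout_needs_wcv(uint32_t pretimeout_s, uint32_t clk_hz)
{
	return pretimeout_s > max_wor_timeout(clk_hz);
}
```

At a slower counter, such as 100MHz, the same register covers about 42 seconds, so the WCV path is only needed for longer pretimeouts.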

[PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This can be an example of adding the SBSA Generic Watchdog device node
to the dts files of an SoC which contains an SBSA Generic Watchdog.

Acked-by: Arnd Bergmann a...@arndb.de
Signed-off-by: Fu Wei fu@linaro.org
---
 arch/arm64/boot/dts/arm/foundation-v8.dts | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/arm/foundation-v8.dts b/arch/arm64/boot/dts/arm/foundation-v8.dts
index 4eac8dc..824431f 100644
--- a/arch/arm64/boot/dts/arm/foundation-v8.dts
+++ b/arch/arm64/boot/dts/arm/foundation-v8.dts
@@ -237,4 +237,11 @@
 			};
 		};
 	};
+	watchdog@2a44 {
+		compatible = "arm,sbsa-gwdt";
+		reg = <0x0 0x2a44 0 0x1000>,
+			<0x0 0x2a45 0 0x1000>;
+		interrupts = <0 27 4>;
+		timeout-sec = <10 5>;
+	};
 };
-- 
2.4.3



Re: [lkp] [auxdisplay] 4edd70c133f: BUG: unable to handle kernel

2015-08-24 Thread Sudip Mukherjee
On Thu, Aug 20, 2015 at 01:36:17PM +0800, kernel test robot wrote:
 FYI, we noticed the below changes on
 
 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
 commit 4edd70c133f3921c594883d8f9da31a7261f8b4f (auxdisplay: ks0108: use new parport device model)
Sorry for the delay in replying. It has already been fixed by:
92f26189b181 (auxdisplay: ks0108: initialize local parport variable)

regards
sudip


scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations

2015-08-24 Thread Sergio Callegari

Thanks Christoph for the answer!

Apparently I missed a piece of the thread where the test patch was originally
proposed. Now I have gone through it and I see that the patch was not meant to
be a final correction.


My (possibly naive) understanding is that:

- Even if this might be due to hardware that does not fully conform to the
standard (we do not know right now), commit
74665016086615bbaa3fa6f83af410a0a4e029ee ("scsi: convert host_busy to
atomic_t") certainly breaks the kernel for some hardware configurations,
causing a regression.


- If the regression had been spotted immediately, the patch would probably
have been revised right after it was proposed. Unfortunately, another bug
(fixed only much later by 045065d8a300a37218c) hid the original issue for a
long time.


- Now that a lot of time has passed with the "scsi: convert host_busy to
atomic_t" series in the kernel, going back to look into it is much more
difficult. The libata people might not be very interested, as they have moved
on to other topics, and might need a lot of time to go through it (it has been
known since November 2014, 9 months ago), possibly due to the race-like nature
of the issue and the fact that the bug might not be reproducible on their
hardware...


Is this correct?

Aren't commits that cause regressions confirmed by multiple users expected (at 
least in principle) to be reverted?


If reverting is too costly, wouldn't your "papering over" or making the scsi
delay configurable be an acceptable solution?


Even better: can the libata people be helped in some way to find the real
culprit, given that there are at least two hardware setups that are known to
trigger the regression (mine and Barto's)?


I have tried the linux-ide mailing list, but got silence.

Best,

Sergio



On 20/08/2015 10:08, Christoph Hellwig wrote:

Hi Sergio,

On Tue, Aug 18, 2015 at 09:44:28AM +0200, Sergio Callegari wrote:

Hi,

I have bisected the issue down to

[045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem

Bisecting has been a painful job due to the fact that the bug may show only
many hours after the system boot.

The commit above in fact is not the culprit, but a fix to an issue that was
hiding the real bug on my system.  See

http://marc.info/?l=linux-kernel&m=143973820612978&w=2

The real issue is with sata host lock and seems to be biting a few other
people as well

https://bbs.archlinux.org/viewtopic.php?id=189324

A patch fixing the issue was sent to the LKML back in Nov 2014 by Christoph
Hellwig (who is reading in CC)

https://lkml.org/lkml/2014/11/20/581

I have tested the patch and it works for me.

What is expected to happen now?

As mentioned in that thread we need more input from the libata people
on what kind of race this is papering over.




Re: [PATCH 2/2] ubifs: Allow O_DIRECT

2015-08-24 Thread Jeff Moyer
Brian Norris computersforpe...@gmail.com writes:

 On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote:
 Now, some user-space fails when direct I/O is not supported.

 I think the whole argument rested on what it means when some user space
 fails; apparently that user space is just a test suite (which
 can/should be fixed).

Even if it wasn't a test suite it should still fail.  Either the fs
supports O_DIRECT or it doesn't.  Right now, the only way an application
can figure this out is to try an open and see if it fails.  Don't break
that.

Cheers,
Jeff
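
As an illustration of the probe Jeff describes, here is a minimal userspace
sketch; it is not from the thread. The helper name probe_o_direct() is made up
for this example, and it assumes a Linux/glibc system where open(2) rejects
O_DIRECT with EINVAL when the filesystem does not support it:

```c
#define _GNU_SOURCE		/* glibc only exposes O_DIRECT with this defined */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open 'path' with O_DIRECT.
 *
 * Returns  1 if the open succeeds (direct I/O is accepted),
 *          0 if the open fails with EINVAL (assumed here to mean the
 *            filesystem does not support O_DIRECT),
 *         -1 on any other error (missing file, permissions, ...).
 */
static int probe_o_direct(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd >= 0) {
		close(fd);
		return 1;
	}

	return errno == EINVAL ? 0 : -1;
}
```

A caller would typically fall back to buffered I/O when the helper returns 0.
If a filesystem accepted the flag without honoring it, a probe like this would
no longer tell the application anything, which is the concern raised above.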


Re: dom0 panic with Upstream Linux 4.1 tree

2015-08-24 Thread Juergen Groß

On 08/17/2015 09:32 AM, Zhenzhong Duan wrote:

Hi Maintainers

I found the panic below when booting OVM3.3.3 on HP PROLIANT DL980 G7 with
dom0_mem=max:128G; it does not reproduce with dom0_mem=max:127G.
Dom0 kernel is uek4 4.1.5-5.el6uek which is based on Upstream Linux 4.1
tree. This looks like an upstream issue.
I would appreciate any patch/fix. Thanks


I don't think there is an easy patch in 4.1 to fix that. Your system has
half of the physical memory above the 512GB boundary making it
impossible for dom0 to use. Dom0 tries to use the memory layout of the
physical host, so it can only use memory below 512GB.

As you try to allocate 128GB for Dom0 some of the memory will end above
the magic boundary (there is only a little bit less than 128GB below
the boundary available).

For 4.3 I have posted a patch series which will eventually make it into
the kernel allowing Dom0 (and other pv-domains as well) to use memory
above the 512GB boundary.


Juergen


Re: [PATCH] Documentation/x86: Rename IRQSTACKSIZE to IRQ_STACK_SIZE

2015-08-24 Thread Jonathan Corbet
On Fri, 21 Aug 2015 15:19:06 +0600
Alexander Kuleshov kuleshovm...@gmail.com wrote:

 The IRQSTACKSIZE was renamed to IRQ_STACK_SIZE in commit
 26f80bd6a9 ("x86-64: Convert irqstacks to per-cpu"),
 but it is still named IRQSTACKSIZE in the documentation. This
 patch fixes this.

Applied to the docs tree, thanks.

jon


Re: [PATCH] input: gpio-keys: report error when invalid key number

2015-08-24 Thread Dmitry Torokhov
On Mon, Aug 24, 2015 at 08:07:44PM +0800, Peng Fan wrote:
 When the input key number is not a valid one listed in
 '/sys/devices/soc0/gpio-keys/keys', we need to report
 an error rather than continue.
 
 See the following example:
 root@yocto:/sys/devices/soc0/gpio-keys# cat keys
 114-116
 root@yocto:/sys/devices/soc0/gpio-keys# echo 77 > keys
 root@yocto:/sys/devices/soc0/gpio-keys#
 
 We want 'echo 77 > keys' to report an error rather than
 silently giving us the illusion that all is 'ok'.
 
 Signed-off-by: Peng Fan van.free...@gmail.com
 Cc: Dmitry Torokhov dmitry.torok...@gmail.com
 Cc: Linus Walleij linus.wall...@linaro.org
 Cc: Alexander Stein alexander.st...@systec-electronic.com
 Cc: Tejun Heo t...@kernel.org
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Wolfram Sang w...@the-dreams.de
 Cc: Fabio Estevam fabio.este...@freescale.com

Applied, thank you.

 ---
  drivers/input/keyboard/gpio_keys.c | 5 +
  1 file changed, 5 insertions(+)
 
 diff --git a/drivers/input/keyboard/gpio_keys.c 
 b/drivers/input/keyboard/gpio_keys.c
 index ddf4045..b98f3b4 100644
 --- a/drivers/input/keyboard/gpio_keys.c
 +++ b/drivers/input/keyboard/gpio_keys.c
 @@ -239,6 +239,11 @@ static ssize_t gpio_keys_attr_store_helper(struct 
 gpio_keys_drvdata *ddata,
   }
   }
  
 + if (i == ddata->pdata->nbuttons) {
 + error = -EINVAL;
 + goto out;
 + }
 +
   mutex_lock(&ddata->disable_lock);
  
   for (i = 0; i < ddata->pdata->nbuttons; i++) {
 -- 
 1.8.4.5
 

-- 
Dmitry


[PATCH] media: don't try to empty links list in media_entity_cleanup()

2015-08-24 Thread Javier Martinez Canillas
The media_entity_cleanup() function only cleans up the entity links list
but this operation is already done in media_device_unregister_entity().

In most cases this should be harmless (besides being duplicated code)
since the links list would be empty, so the iteration would not happen.
But the links list is initialized in media_device_register_entity(), so
if a driver fails to register an entity with a media device and cleans up
the entity in the error path, a NULL pointer dereference will happen.

So don't try to empty the links list in media_entity_cleanup(), since
it has either been done already or the list hasn't been initialized yet.

Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com

---

 drivers/media/media-entity.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/media/media-entity.c b/drivers/media/media-entity.c
index fc6bb48027ab..acb65f734508 100644
--- a/drivers/media/media-entity.c
+++ b/drivers/media/media-entity.c
@@ -252,13 +252,6 @@ EXPORT_SYMBOL_GPL(media_entity_init);
 void
 media_entity_cleanup(struct media_entity *entity)
 {
-   struct media_link *link, *tmp;
-
-   list_for_each_entry_safe(link, tmp, &entity->links, list) {
-   media_gobj_remove(&link->graph_obj);
-   list_del(&link->list);
-   kfree(link);
-   }
 }
 EXPORT_SYMBOL_GPL(media_entity_cleanup);
 
-- 
2.4.3



Re: [PATCH] kernel/sysctl.c: If count including the terminating byte '\0' the write system call should retrun success.

2015-08-24 Thread Steven Rostedt
On Mon, 24 Aug 2015 16:56:13 +0800
Sean Fu fxinr...@gmail.com wrote:

 When the input argument count includes the terminating byte "\0",
 the write system call returns EINVAL on a proc file,
 but it returns success on a regular file.
 
 E.g. writing two bytes ("1\0") to /proc/sys/net/ipv4/conf/eth0/rp_filter:
 write(fd, "1\0", 2) returns EINVAL.

And what would do that? What tool broke because of this?

 echo 1 > /proc/sys/net/ipv4/conf/eth0/rp_filter

works just fine. strlen(string) would not include the nul character.
The only thing I could think of would be a sizeof(str), but then that
would include someone hardcoding an integer in a string, like:

char val[] = "1";

write(fd, val, sizeof(val));

Again, what tool does that?

If there is a tool out in the wild that used to work on 2.6 (and was
running on 2.6 then, and not something that was created after that
change), then we can consider this fix.

-- Steve
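
To make the strlen()/sizeof distinction above concrete, here is a small
userspace sketch; it is not from the thread. The helper names are made up for
this example. It only shows which byte count each idiom would hand to
write(), and makes no claim about what the sysctl code does with it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The hard-coded value from the example above, stored as a C string. */
static const char val[] = "1";

/* Count as computed by strlen(): the terminating NUL is not included. */
static size_t count_with_strlen(const char *s)
{
	assert(s != NULL);
	return strlen(s);
}

/* Count as computed by sizeof on the array: the NUL is included. */
static size_t count_with_sizeof(void)
{
	return sizeof(val);
}
```

With strlen() the caller would pass a count of 1; with sizeof(val) it would
pass 2, which is the case the patch description talks about.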


[PATCH v7 1/8] Documentation: add sbsa-gwdt.txt documentation

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

The sbsa-gwdt.txt documentation in devicetree/bindings/watchdog is for
introducing SBSA(Server Base System Architecture) Generic Watchdog
device node info into FDT.

Acked-by: Arnd Bergmann a...@arndb.de
Signed-off-by: Fu Wei fu@linaro.org
---
 .../devicetree/bindings/watchdog/sbsa-gwdt.txt | 32 ++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt 
b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
new file mode 100644
index 000..8b43640
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
@@ -0,0 +1,32 @@
+* SBSA(Server Base System Architecture) Generic Watchdog
+
+The SBSA Generic Watchdog Timer is used for resetting the system after
+two stages of timeout.
+More details: ARM-DEN-0029 - Server Base System Architecture (SBSA)
+
+Required properties:
+- compatible : Should at least contain "arm,sbsa-gwdt".
+
+- reg : Specifies base physical address of the two register frames
+  and length of memory mapped region, order:
+  1: Watchdog control frame
+  2: Refresh frame.
+
+- interrupts : Should at least contain WS0 interrupt,
+  the WS1 interrupt is optional, order:
+  1: WS0 interrupt
+  2: WS1 interrupt
+
+Optional properties
+- timeout-sec : Watchdog pre-timeout and timeout values (in seconds).
+  The first is timeout values, then pre-timeout.
+
+Example for FVP Foundation Model v8:
+
+watchdog@2a44 {
+   compatible = "arm,sbsa-gwdt";
+   reg = <0x0 0x2a44 0 0x1000>,
+         <0x0 0x2a45 0 0x1000>;
+   interrupts = <0 27 4>;
+   timeout-sec = <10 5>;
+};
-- 
2.4.3



[PATCH v7 0/8] Watchdog: introduce ARM SBSA watchdog driver

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This patchset:
(1)Introduce Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
for FDT info of SBSA Generic Watchdog, and give two examples of
adding SBSA Generic Watchdog device node into the dts files:
foundation-v8.dts and amd-seattle-soc.dtsi.

(2)Introduce pretimeout into the watchdog framework, and update
Documentation/watchdog/watchdog-kernel-api.txt to introduce:
(1)the new elements in the watchdog_device and watchdog_ops struct;
(2)the new API watchdog_init_timeouts.

(3)Introduce ARM SBSA watchdog driver:
a.Use linux kernel watchdog framework;
b.Work with FDT on ARM64;
c.Use pretimeout in watchdog framework;
d.Support getting timeout and pretimeout from parameter and FDT
  at the driver init stage.
e.In the first timeout, do panic to save system context;
f.In the second stage, user can still feed the dog without
  cleaning WS0. By this feature, we can avoid the panic infinite
  loops, while backing up a large system context in a server.
g.In the second stage, can trigger WS1 by setting pretimeout = 0
  if necessary.

(4)Introduce ACPI GTDT parser: drivers/acpi/gtdt.c
Parse SBSA Generic Watchdog Structure in GTDT table of ACPI,
and create a platform device with that information.
This platform device can be used by this watchdog driver.
drivers/clocksource/arm_arch_timer.c is simplified by this GTDT support.

This patchset has been tested with watchdog daemon
(ACPI/FDT, module/built-in) on the following platforms:
(1)ARM Foundation v8 model

Changelog:
v7: Rebase to latest kernel version(4.2-rc7).
Improve FDT support: getting resource by order, instead of name.
According to the FDT support, Update the example dts file, gtdt.c
and sbsa_gwdt.c.
Pass the sparse test, and fix the warning.
Fix the max_pretimeout and max_timeout value overflow bug.
Delete the WCV output value.


v6: Improve the dtb example files: reduce the register frame size to 4K.
Improve pretimeout support:
(1) improve watchdog_init_timeouts function
(2) rename watchdog_check_min_max_timeouts back to the original name
(3) improve watchdog_timeout_invalid/watchdog_pretimeout_invalid
Add the new features in the sbsa_gwdt driver:
(1) In the second stage, user can feed the dog without cleaning WS0.
(2) In the second stage, user can trigger WS1 by setting pretimeout = 0.
(3) expand the max value of pretimeout, in case 10 second is not enough
for a kdump kernel reboot in panic.

v5: Improve pretimeout support:
(1)fix typo in documentation and comments.
(2)fix the timeout limits validation bug.
Simplify sbsa_gwdt driver:
(1)integrate all the registers access functions into caller.

v4: Refactor GTDT support code: remove it from arch/arm64/kernel/acpi.c,
put it into drivers/acpi/gtdt.c file.
Integrate the GTDT code of drivers/clocksource/arm_arch_timer.c into
drivers/acpi/gtdt.c.
Improve pretimeout support, fix pretimeout == 0 problem.
Simplify sbsa_gwdt driver:
(1)timeout/pretimeout limits setup;
(2)keepalive function;
(3)delete clk == 0 check;
(4)delete WS0 status bit check in interrupt routine;
(5)sbsa_gwdt_set_wcv function.

v3: Delete export arch_timer_get_rate patch.
Driver back to use arch_timer_get_cntfrq.
Improve watchdog_init_timeouts function and update relevant documentation.
Improve watchdog_timeout_invalid and watchdog_pretimeout_invalid.
Improve foundation-v8.dts: delete the unnecessary tag of device node.
Remove ARM64 || COMPILE_TEST from Kconfig.
Add comments in arch/arm64/kernel/acpi.c
Fix typos and incorrect comments.

v2: Improve watchdog-kernel-api.txt documentation for pretimeout support.
Export arch_timer_get_rate in arm_arch_timer.c.
Add watchdog_init_timeouts API for pretimeout support in framework.
Improve suspend and resume foundation in driver
Improve timeout/pretimeout values init code in driver.
Delete unnecessary items of the sbsa_gwdt struct and #define.
Delete all unnecessary debug info in driver.
Fix 64bit division bug.
Use the arch_timer interface to get watchdog clock rate.
Add MODULE_DEVICE_TABLE for platform device id.
Fix typos.

v1: The first version upstream patchset to linux mailing list.

Fu Wei (8):
  Documentation: add sbsa-gwdt.txt documentation
  ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts
  ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi
  Watchdog: introdouce pretimeout into framework
  Watchdog: introduce ARM SBSA watchdog driver
  ACPI: add GTDT table parse driver into ACPI driver
  Watchdog: enable ACPI GTDT support for ARM SBSA watchdog driver
  clocksource: simplify ACPI code in arm_arch_timer.c

 

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Tejun Heo
Hello, Austin.

On Mon, Aug 24, 2015 at 11:47:02AM -0400, Austin S Hemmelgarn wrote:
 Just to learn more, what sort of hypervisor support threads are we
 talking about?  They would have to consume considerable amount of cpu
 cycles for problems like this to be relevant and be dynamic in numbers
 in a way which letting them competing against vcpus makes sense.  Do
 IO helpers meet these criteria?
 
 Depending on the configuration, yes they can.  VirtualBox has some rather
 CPU intensive threads that aren't vCPU threads (their emulated APIC thread
 immediately comes to mind), and so does QEMU depending on the emulated

And does the number of those threads fluctuate widely and dynamically?

 hardware configuration (it gets more noticeable when the disk images are
 stored on a SAN and served through iSCSI, NBD, FCoE, or ATAoE, which is
 pretty typical usage for large virtualization deployments).  I've seen cases
 first hand where the vCPU's can make no reasonable progress because they are
 constantly getting crowded out by other threads.

That alone doesn't require hierarchical resource distribution tho.
Setting nice levels reasonably is likely to alleviate most of the
problem.

 The use of the term 'hypervisor support threads' for this is probably not
 the best way of describing the contention, as it's almost always a full
 system virtualization issue, and the contending threads are usually storage
 back-end access threads.
 
 I would argue that there are better ways to deal properly with this (Isolate
 the non vCPU threads on separate physical CPU's from the hardware emulation
 threads), but such methods require large systems to be practical at any
 scale, and many people don't have the budget for such large systems, and
 this way of doing things is much more flexible for small scale use cases
 (for example, someone running one or two VM's on a laptop under QEMU or
 VirtualBox).

I don't know.  Someone running one or two VM's on a laptop under
QEMU doesn't really sound like the use case which absolutely requires
hierarchical cpu cycle distribution.

Thanks.

-- 
tejun


Re: [PATCH] nfit, nd_blk: BLK status register is only 32 bits

2015-08-24 Thread Jeff Moyer
Ross Zwisler ross.zwis...@linux.intel.com writes:

 Only read 32 bits for the BLK status register in read_blk_stat().

 The format and size of this register is defined in the
 NVDIMM Driver Writer's guide:

 http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

 Signed-off-by: Ross Zwisler ross.zwis...@linux.intel.com
 Reported-by: Nicholas Moulin nicholas.w.mou...@linux.intel.com

Looks fine,

Reviewed-by: Jeff Moyer jmo...@redhat.com

However, now that you've drawn attention to that code, I'll note that
there is no checking of the pending or retry bits.  In fact,
ACPI_NFIT_CONTROL_BUFFERED isn't even checked upon loading the tables.
Is this on a todo list somewhere?

Cheers,
Jeff

 ---
  drivers/acpi/nfit.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
 index 7c2638f..8689ee1 100644
 --- a/drivers/acpi/nfit.c
 +++ b/drivers/acpi/nfit.c
 @@ -1009,7 +1009,7 @@ static void wmb_blk(struct nfit_blk *nfit_blk)
   wmb_pmem();
  }
  
 -static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
 +static u32 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
  {
   struct nfit_blk_mmio *mmio = &nfit_blk->mmio[DCR];
   u64 offset = nfit_blk->stat_offset + mmio->size * bw;
 @@ -1017,7 +1017,7 @@ static u64 read_blk_stat(struct nfit_blk *nfit_blk, 
 unsigned int bw)
   if (mmio->num_lines)
   offset = to_interleave_offset(offset, mmio);
  
 - return readq(mmio->base + offset);
 + return readl(mmio->base + offset);
  }
  
  static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,


Re: [PATCH linux-next v4 5/5] mtd: atmel-quadspi: add driver for Atmel QSPI controller

2015-08-24 Thread Cyrille Pitchen
Hi Marek,

Le 24/08/2015 13:03, Marek Vasut a écrit :
 On Monday, August 24, 2015 at 12:14:00 PM, Cyrille Pitchen wrote:
 This driver adds support for the new Atmel QSPI controller embedded into
 sama5d2x SoCs. It expects a NOR memory to be connected to the QSPI
 controller.

 Signed-off-by: Cyrille Pitchen cyrille.pitc...@atmel.com
 Acked-by: Nicolas Ferre nicolas.fe...@atmel.com
 
 Hi,
 
 [...]
 
 +/* Register access macros */
 
 These are functions, not macros :)
 
 btw is there any reason for these ? I'd say, just put the read*() and
 write*() functions directly into the code and be done with it, it is
 much less confusing.
 
 Also, why do you use the _relaxed() versions of the functions ?
 
 +static inline u32 qspi_readl(struct atmel_qspi *aq, u32 reg)
 +{
 +return readl_relaxed(aq->regs + reg);
 +}
 +
 +static inline void qspi_writel(struct atmel_qspi *aq, u32 reg, u32 value)
 +{
 +writel_relaxed(value, aq->regs + reg);
 +}
 +
 +static inline u16 qspi_readw(struct atmel_qspi *aq, u32 reg)
 +{
 +return readw_relaxed(aq->regs + reg);
 +}
 +
 +static inline void qspi_writew(struct atmel_qspi *aq, u32 reg, u16 value)
 +{
 +writew_relaxed(value, aq->regs + reg);
 +}
 +
 +static inline u8 qspi_readb(struct atmel_qspi *aq, u32 reg)
 +{
 +return readb_relaxed(aq->regs + reg);
 +}
 +
 +static inline void qspi_writeb(struct atmel_qspi *aq, u32 reg, u8 value)
 +{
 +writeb_relaxed(value, aq->regs + reg);
 +}
 
 [...]
 
 +static int atmel_qspi_run_command(struct atmel_qspi *aq,
 +  const struct atmel_qspi_command *cmd)
 +{
 +u32 iar, icr, ifr, sr;
 +int err = 0;
 +
 +iar = 0;
 +icr = 0;
 +ifr = aq->ifr_width | cmd->ifr_tfrtyp;
 +
 +/* Compute instruction parameters */
 +if (cmd->enable.bits.instruction) {
 +icr |= QSPI_ICR_INST(cmd->instruction);
 +ifr |= QSPI_IFR_INSTEN;
 +}
 +
 +/* Compute address parameters */
 +switch (cmd->enable.bits.address) {
 +case 4:
 +ifr |= QSPI_IFR_ADDRL;
 +/*break;*/ /* fallback to the 24bit address case */
 
 What's this commented out bit of code for ? :-)

I just wanted to stress that there was no missing break;.
I've reworded the comment to:
/* No break on purpose: fallback to the 24bit address case. */

 
 +case 3:
 +iar = (cmd->enable.bits.data) ? 0 : cmd->address;
 +ifr |= QSPI_IFR_ADDREN;
 +break;
 +case 0:
 +break;
 +default:
 +return -EINVAL;
 +}
 
 [...]
 
 +no_data:
 +/* Poll INSTRuction End status */
 +sr = qspi_readl(aq, QSPI_SR);
 +if (sr & QSPI_SR_INSTRE)
 +return err;
 +
 +/* Wait for INSTRuction End interrupt */
 +init_completion(&aq->completion);
 
 You should use reinit_completion() in the code. init_completion()
 should be used only in the probe() function and nowhere else.

Alright. In the next version I'll rename the completion member of
struct atmel_qspi into cmd_completion. Also I'll add another dma_completion
member in this very same structure to replace the local
struct completion completion in atmel_qspi_run_dma_transfer().

Then I'll call init_completion() on both cmd_completion and dma_completion only
from atmel_qspi_probe() and reinit_completion() elsewhere.

 
 +aq->pending = 0;
 +qspi_writel(aq, QSPI_IER, QSPI_SR_INSTRE);
 +if (!wait_for_completion_timeout(&aq->completion,
 + msecs_to_jiffies(1000)))
 +err = -ETIMEDOUT;
 +qspi_writel(aq, QSPI_IDR, QSPI_SR_INSTRE);
 +
 +return err;
 +}
 
 [...]
 
 Hope this helps :)
 

Indeed, it does! I am still working on the next version of this series to take
all your comments into account.

Best regards,

Cyrille


[PATCH] mtd: nand: pass page number to ecc->write_xxx() methods

2015-08-24 Thread Boris Brezillon
The ->read_xxx() methods are all passed the page number the NAND controller
is supposed to read, but the ->write_xxx() methods do not have such a
parameter.

This is a problem if we want to properly implement data
scrambling/randomization in order to mitigate MLC sensitivity to repeated
patterns: to prevent bitflips in adjacent pages in the same block we need
to avoid repeating the same pattern at the same offset in those pages,
hence the randomizer/scrambler engine needs to be passed the page value
in order to adapt its seed accordingly.

Moreover, adding the page parameter to the ->write_xxx() methods adds some
consistency to the current model.
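
To illustrate why a scrambler wants the page number, here is a small
standalone sketch; it is not part of this patch and does not describe any
existing driver. The names page_seed(), lfsr_step() and scramble_page(), the
seed formula and the choice of a 15-bit LFSR are all assumptions made for the
example. The point is only that seeding from the page number gives each page a
different keystream, so identical data no longer produces the same pattern at
the same offset:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical seed derivation: mix the page number into a 15-bit value so
 * that neighbouring pages normally start from different LFSR states.
 */
static uint16_t page_seed(int page)
{
	uint16_t seed = (uint16_t)(((unsigned int)page * 0x3a5u + 0x576au) & 0x7fffu);

	return seed ? seed : 1;		/* the LFSR state must never be zero */
}

/* One step of a 15-bit Fibonacci LFSR (polynomial x^15 + x^14 + 1). */
static uint16_t lfsr_step(uint16_t state)
{
	uint16_t bit = ((state >> 14) ^ (state >> 13)) & 1u;

	return (uint16_t)(((state << 1) | bit) & 0x7fffu);
}

/*
 * XOR 'buf' in place with a keystream seeded from 'page'. Calling it a
 * second time with the same page number restores the original data.
 */
static void scramble_page(uint8_t *buf, size_t len, int page)
{
	uint16_t state = page_seed(page);
	size_t i;
	int b;

	assert(buf != NULL);

	for (i = 0; i < len; i++) {
		uint8_t key = 0;

		/* Collect eight LFSR output bits into one key byte. */
		for (b = 0; b < 8; b++) {
			state = lfsr_step(state);
			key = (uint8_t)((key << 1) | (state & 1u));
		}
		buf[i] ^= key;
	}
}
```

In a write path, a helper of this kind could only be called if the page number
is available there, which is what the extra parameter provides.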

Signed-off-by: Boris Brezillon boris.brezil...@free-electrons.com
CC: Josh Wu josh...@atmel.com
CC: Ezequiel Garcia ezequiel.gar...@free-electrons.com
CC: Maxime Ripard maxime.rip...@free-electrons.com
CC: Greg Kroah-Hartman gre...@linuxfoundation.org
CC: Huang Shijie shijie.hu...@intel.com
CC: Bryan Wu bryan...@analog.com
CC: de...@driverdev.osuosl.org
CC: linux-arm-ker...@lists.infradead.org
CC: linux-kernel@vger.kernel.org

---
 drivers/mtd/nand/atmel_nand.c |  6 --
 drivers/mtd/nand/bf5xx_nand.c |  3 ++-
 drivers/mtd/nand/brcmnand/brcmnand.c  |  4 ++--
 drivers/mtd/nand/cafe_nand.c  |  3 ++-
 drivers/mtd/nand/denali.c |  5 +++--
 drivers/mtd/nand/docg4.c  |  4 ++--
 drivers/mtd/nand/fsl_elbc_nand.c  |  4 ++--
 drivers/mtd/nand/fsl_ifc_nand.c   |  2 +-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c|  6 +++---
 drivers/mtd/nand/hisi504_nand.c   |  3 ++-
 drivers/mtd/nand/lpc32xx_mlc.c|  3 ++-
 drivers/mtd/nand/lpc32xx_slc.c|  5 +++--
 drivers/mtd/nand/nand_base.c  | 31 ++-
 drivers/mtd/nand/omap2.c  |  3 ++-
 drivers/mtd/nand/pxa3xx_nand.c|  3 ++-
 drivers/mtd/nand/sh_flctl.c   |  3 ++-
 drivers/mtd/nand/sunxi_nand.c |  5 +++--
 drivers/staging/mt29f_spinand/mt29f_spinand.c |  3 ++-
 include/linux/mtd/nand.h  |  6 +++---
 19 files changed, 63 insertions(+), 39 deletions(-)

diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 46010bd..d0f50c9 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -954,7 +954,8 @@ static int atmel_nand_pmecc_read_page(struct mtd_info *mtd,
 }
 
 static int atmel_nand_pmecc_write_page(struct mtd_info *mtd,
-   struct nand_chip *chip, const uint8_t *buf, int oob_required)
+   struct nand_chip *chip, const uint8_t *buf, int oob_required,
+   int page)
 {
struct atmel_nand_host *host = chip->priv;
uint32_t *eccpos = chip->ecc.layout->eccpos;
@@ -2005,7 +2006,8 @@ static int nfc_sram_write_page(struct mtd_info *mtd, 
struct nand_chip *chip,
 
if (likely(!raw))
/* Need to write ecc into oob */
-   status = chip->ecc.write_page(mtd, chip, buf, oob_required);
+   status = chip->ecc.write_page(mtd, chip, buf, oob_required,
+ page);
 
if (status < 0)
return status;
diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c
index 4d8d4ba..17b3727 100644
--- a/drivers/mtd/nand/bf5xx_nand.c
+++ b/drivers/mtd/nand/bf5xx_nand.c
@@ -566,7 +566,8 @@ static int bf5xx_nand_read_page_raw(struct mtd_info *mtd, 
struct nand_chip *chip
 }
 
 static int bf5xx_nand_write_page_raw(struct mtd_info *mtd,
-   struct nand_chip *chip, const uint8_t *buf, int oob_required)
+   struct nand_chip *chip, const uint8_t *buf, int oob_required,
+   int page)
 {
bf5xx_nand_write_buf(mtd, buf, mtd->writesize);
bf5xx_nand_write_buf(mtd, chip->oob_poi, mtd->oobsize);
diff --git a/drivers/mtd/nand/brcmnand/brcmnand.c 
b/drivers/mtd/nand/brcmnand/brcmnand.c
index fddb795..9a4e345 100644
--- a/drivers/mtd/nand/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/brcmnand/brcmnand.c
@@ -1606,7 +1606,7 @@ out:
 }
 
 static int brcmnand_write_page(struct mtd_info *mtd, struct nand_chip *chip,
-  const uint8_t *buf, int oob_required)
+  const uint8_t *buf, int oob_required, int page)
 {
struct brcmnand_host *host = chip->priv;
void *oob = oob_required ? chip->oob_poi : NULL;
@@ -1617,7 +1617,7 @@ static int brcmnand_write_page(struct mtd_info *mtd, 
struct nand_chip *chip,
 
 static int brcmnand_write_page_raw(struct mtd_info *mtd,
   struct nand_chip *chip, const uint8_t *buf,
-  int oob_required)
+  int oob_required, int page)
 {
struct brcmnand_host *host = chip->priv;
void *oob = oob_required ? chip->oob_poi : NULL;
diff --git a/drivers/mtd/nand/cafe_nand.c 

Re: [PATCH] usb: phy: msm: Unregister driver interest for VBUS and ID events

2015-08-24 Thread Tim Bird
On 08/18/2015 12:56 AM, Ivan T. Ivanov wrote:
 Right now, even if the driver fails to probe, the extcon framework will
 still deliver its VBUS and ID events, which will lead to random
 exception codes.
 
 Fix this by removing driver interest for VBUS and ID events when
 probe fails.
 
 Fixes: 591fc116f330 (usb: phy: msm: Use extcon framework for VBUS and ID 
 detection)
 
 Reported-by: Tim Bird tim.b...@sonymobile.com
 Signed-off-by: Ivan T. Ivanov ivan.iva...@linaro.org
 ---
  drivers/usb/phy/phy-msm-usb.c | 26 +-
  1 file changed, 17 insertions(+), 9 deletions(-)
 
 diff --git a/drivers/usb/phy/phy-msm-usb.c b/drivers/usb/phy/phy-msm-usb.c
 index 00c49bb1bd29..a9082567f114 100644
 --- a/drivers/usb/phy/phy-msm-usb.c
 +++ b/drivers/usb/phy/phy-msm-usb.c
 @@ -1581,6 +1581,8 @@ static int msm_otg_read_dt(struct platform_device 
 *pdev, struct msm_otg *motg)
   ret = extcon_register_interest(&motg->id.conn, ext_id->name,
  "USB-HOST", &motg->id.nb);
   if (ret < 0) {
 + if (!IS_ERR(ext_vbus))
 + extcon_unregister_interest(&motg->vbus.conn);
   dev_err(&pdev->dev, "register ID notifier failed\n");
   return ret;
   }
...

This patch is obsoleted by commit 83b7b67c7, which changes the extcon API
a bit (from register_interest to register_notifier, among other things).

But, in general, I would expect this approach to work.

Do you want me to re-spin this with the new API?
 -- Tim




Re: [PATCH v2] Documentation: add 'crashkernel=auto' entry into kernel-parameters.txt

2015-08-24 Thread Jonathan Corbet
On Mon, 24 Aug 2015 23:04:29 +0800
Yaowei Bai bywxiao...@163.com wrote:

 There is no 'crashkernel=auto' entry in kernel-parameters.txt, borrow it
 from kexec-kdump-howto.txt file in the kexec-tools-2.0.0 package.

OK, so I did some digging here.  As far as I can tell, there is no
crashkernel=auto entry because the auto-reserve patch has never been
merged into the mainline kernel.  RHEL kernels appear to have it, but
mainline doesn't.

Thus, merging this patch would make the documentation incorrect,
something I'd rather not do.  I appreciate efforts to improve the
kernel's documentation, but it is important to be sure that your proposed
changes make the docs closer to reality, rather than further away.

Thanks,

jon


Re: [PATCH] docs: update HOWTO for 3.x - 4.x versioning

2015-08-24 Thread Jonathan Corbet
On Mon, 24 Aug 2015 09:33:09 -0500
Mario Carrillo mario.alfredo.c.arev...@intel.com wrote:

 The HOWTO document needed updating for the new kernel versioning.

As with various others, this document would benefit from changes that
would get it away from specific major version numbers.  In the absence of
that, though, we might as well at least make it current; patch applied to
the docs tree.

Thanks,

jon


Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency

2015-08-24 Thread Bjorn Helgaas
On Mon, Aug 24, 2015 at 9:41 AM, Suravee Suthikulpanit
suravee.suthikulpa...@amd.com wrote:
 Hi,

 Ping. Does anyone have any comments or suggestions?

Yes, I sent you some ideas a couple weeks ago.  I'll resend them.

 On 8/13/15 16:58, Suravee Suthikulpanit wrote:

 This patch refactors of_pci_dma_configure() into a more generic
 pci_dma_configure(), which can be reused by non-OF code.
 Then, it adds support for setting up PCI device DMA coherency from
 ACPI _CCA object that should normally be specified in the DSDT node
 of its PCI host bridge..

 Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
 CC: Bjorn Helgaas bhelg...@google.com
 CC: Catalin Marinas catalin.mari...@arm.com
 CC: Will Deacon will.dea...@arm.com
 CC: Rafael J. Wysocki r...@rjwysocki.net
 CC: Rob Herring robh...@kernel.org
 CC: Murali Karicheri m-kariche...@ti.com
 ---
 Note: According to the ACPI spec, the _CCA attribute is required
for ARM64. Therefore, this patch is a pre-req for ACPI PCI
support for ARM64 which is currently in development.

Also, this should not affect other architectures since
if CCA is not required, the default value is coherent.
Please see include/acpi/acpi_bus.h: acpi_check_dma() and
drivers/acpi/scan.c: acpi_init_coherency() for more information

   drivers/of/of_pci.c| 20 
   drivers/pci/probe.c| 35 +--
   include/linux/of_pci.h |  3 ---
   3 files changed, 33 insertions(+), 25 deletions(-)

 diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
 index 5751dc5..b66ee4e 100644
 --- a/drivers/of/of_pci.c
 +++ b/drivers/of/of_pci.c
 @@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node)
   }
   EXPORT_SYMBOL_GPL(of_get_pci_domain_nr);

 -/**
 - * of_pci_dma_configure - Setup DMA configuration
 - * @dev: ptr to pci_dev struct of the PCI device
 - *
 - * Function to update PCI devices's DMA configuration using the same
 - * info from the OF node of host bridge's parent (if any).
 - */
 -void of_pci_dma_configure(struct pci_dev *pci_dev)
 -{
 -	struct device *dev = &pci_dev->dev;
 -	struct device *bridge = pci_get_host_bridge_device(pci_dev);
 -
 -	if (!bridge->parent)
 -		return;
 -
 -	of_dma_configure(dev, bridge->parent->of_node);
 -	pci_put_host_bridge_device(bridge);
 -}
 -EXPORT_SYMBOL_GPL(of_pci_dma_configure);
 -
   #if defined(CONFIG_OF_ADDRESS)
   /**
* of_pci_get_host_bridge_resources - Parse PCI host bridge resources
 from DT
 diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
 index cefd636..e2fcd3b 100644
 --- a/drivers/pci/probe.c
 +++ b/drivers/pci/probe.c
 @@ -6,12 +6,14 @@
   #include <linux/delay.h>
   #include <linux/init.h>
   #include <linux/pci.h>
 -#include <linux/of_pci.h>
 +#include <linux/of_device.h>
   #include <linux/pci_hotplug.h>
   #include <linux/slab.h>
   #include <linux/module.h>
   #include <linux/cpumask.h>
   #include <linux/pci-aspm.h>
 +#include <linux/acpi.h>
 +#include <linux/property.h>
   #include <asm-generic/pci-bridge.h>
   #include "pci.h"

 @@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev
 *dev)
 pci_enable_acs(dev);
   }

 +/**
 + * pci_dma_configure - Setup DMA configuration
 + * @pci_dev: ptr to pci_dev struct of the PCI device
 + *
 + * Function to update PCI devices's DMA configuration using the same
 + * info from the OF node or ACPI node of host bridge's parent (if any).
 + */
 +static void pci_dma_configure(struct pci_dev *pci_dev)
 +{
 +	struct device *dev = &pci_dev->dev;
 +	struct device *bridge = pci_get_host_bridge_device(pci_dev);
 +	struct acpi_device *adev;
 +	bool coherent;
 +
 +	if (has_acpi_companion(bridge)) {
 +		adev = to_acpi_node(bridge->fwnode);
 +		if (acpi_check_dma(adev, &coherent))
 +			arch_setup_dma_ops(dev, 0, 0, NULL, coherent);
 +	} else {
 +		struct device *host = bridge->parent;
 +		if (!host)
 +			return;
 +
 +		of_dma_configure(dev, host->of_node);
 +	}
 +
 +	pci_put_host_bridge_device(bridge);
 +}
 +
   void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
   {
 int ret;
 @@ -1557,7 +1588,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	dev->dev.dma_mask = &dev->dma_mask;
 	dev->dev.dma_parms = &dev->dma_parms;
 	dev->dev.coherent_dma_mask = 0xffffffffull;
 -	of_pci_dma_configure(dev);
 +	pci_dma_configure(dev);

 	pci_set_dma_max_seg_size(dev, 65536);
 	pci_set_dma_seg_boundary(dev, 0xffffffff);
 diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
 index 29fd3fe..ce0e5ab 100644
 --- a/include/linux/of_pci.h
 +++ b/include/linux/of_pci.h
 @@ -16,7 +16,6 @@ int of_pci_get_devfn(struct device_node *np);
   int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8
 pin);
   int 

RE: [PATCH v8 1/2] irqchip: imx-gpcv2: IMX GPCv2 driver for wakeup sources

2015-08-24 Thread Thomas Gleixner
On Mon, 24 Aug 2015, Shenwei Wang wrote:
   +static int gpcv2_wakeup_source_save(void) {
   +	struct gpcv2_irqchip_data *cd;
   +	void __iomem *reg;
   +	int i;
   +
   +	cd = imx_gpcv2_instance;
   +	if (!cd)
   +		return 0;
   +
   +	for (i = 0; i < IMR_NUM; i++) {
   +		reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
   +		cd->enabled_irqs[i] = readl_relaxed(reg);
  
  You read the full state of the register and restore the full state. So why
  enabled_irqs?
 
 There are two user scenarios:
 In the CPU idle state, the system needs to be woken up by any enabled
 irq, not just the ones that are marked as wakeup sources.
 In the suspend state, the system will only be woken up by the ones that
 are marked as wakeup sources.  enabled_irqs is used to save the values
 before suspend, and restore them after resume.

That's what you want to achieve. Still you save the full content of the
registers and restore the full content. That saves/restores the
enabled and disabled interrupts. So enabled_irqs is a misnomer as you
save the full state.

   +		writel_relaxed(cd->wakeup_sources[i], reg);
   +	}
   +
   +	return 0;
   +}
   +
   +static void gpcv2_wakeup_source_restore(void) {
   +	struct gpcv2_irqchip_data *cd;
   +	void __iomem *reg;
   +	int i;
   +
   +	cd = imx_gpcv2_instance;
   +	if (!cd)
   +		return;
   +
   +	for (i = 0; i < IMR_NUM; i++) {
   +		reg = cd->gpc_base + cd->cpu2wakeup + i * 4;
   +		writel_relaxed(cd->enabled_irqs[i], reg);
   +		cd->wakeup_sources[i] = ~0;
  
  Why are you clearing that info on resume? Drivers will clear that via
  set_wake() or leave it when they want to have resume functionality?
  
 Each time system goes into the suspend state, it will call set_wake
 (ON) again to configure the wakeup sources. Clearing wakeup_sources
 here can make sure the system work as expected no matter that a
 driver calls set_wake (OFF) during resume stage.

We rather make sure that the drivers call set_wake(OFF) as they are
supposed to, because if they do not then the set_wake(ON) logic in the
core code will see the counter != 0 and not invoke the irq callback.
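
The counter behaviour described above can be modelled in a few lines of
userspace C. This is only a sketch: struct irq_model, model_set_wake() and
model_chip_set_wake() are hypothetical names made up for illustration and
are not kernel API; the real nesting logic lives in the core's
irq_set_irq_wake().

```c
#include <stdbool.h>

/* Hypothetical model of one interrupt line; not a kernel structure. */
struct irq_model {
	unsigned int wake_depth;	/* nesting count of set_wake(ON) requests */
	unsigned int chip_calls;	/* how often the irqchip callback ran */
	bool chip_wake_enabled;		/* state the irqchip last saw */
};

/* Stand-in for the irqchip's set_wake callback. */
void model_chip_set_wake(struct irq_model *irq, bool on)
{
	irq->chip_wake_enabled = on;
	irq->chip_calls++;
}

/*
 * Only the 0 -> 1 and 1 -> 0 transitions of the counter reach the chip.
 * Returns 0 on success, -1 on an unbalanced OFF.
 */
int model_set_wake(struct irq_model *irq, bool on)
{
	if (on) {
		if (irq->wake_depth++ == 0)
			model_chip_set_wake(irq, true);
		return 0;
	}

	if (irq->wake_depth == 0)
		return -1;	/* OFF without a matching ON */

	if (--irq->wake_depth == 0)
		model_chip_set_wake(irq, false);
	return 0;
}
```

In this model, a driver that skips set_wake(OFF) leaves the counter at 1, so
the next set_wake(ON) only bumps it to 2 and the chip callback is not
invoked. A wakeup mask that was cleared behind the core's back on resume
would then stay cleared.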

Thanks,

tglx


Re: [PATCH 1/2] crypto: KEYS: convert public key to the akcipher API

2015-08-24 Thread Tadeusz Struk
Hi Stephan,

On 08/15/2015 11:08 AM, Stephan Mueller wrote:
 Am Mittwoch, 12. August 2015, 20:54:39 schrieb Tadeusz Struk:
 
 Hi Tadeusz,
 
 @@ -41,7 +41,7 @@ struct pkcs7_parse_context {
 static void pkcs7_free_signed_info(struct pkcs7_signed_info *sinfo)
 {
 	if (sinfo) {
 -		mpi_free(sinfo->sig.mpi[0]);
 +		kfree(sinfo->sig.s);
 
 kzfree?
 
 		kfree(sinfo->sig.digest);
 
 kzfree?
 
 		kfree(sinfo->signing_cert_id);
 		kfree(sinfo);
 
 kzfree (due to ->msdigest)?
 

Sorry for the late response. I was on vacation.
All these above are module signatures, which are not sensitive,
so no need to zero the buffers on free.
The only thing that is sensitive is the private key,
which is only used for signing modules on make modules_install
and never included in the kernel.
Thanks,
T
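
For readers following the kzfree question: the idea is to wipe a buffer
before handing it back to the allocator. A rough userspace analogue is
sketched below; secure_zero() and zero_and_free() are hypothetical helper
names, and this is not how the kernel implements kzfree().

```c
#include <stddef.h>
#include <stdlib.h>

/* Zero a buffer through a volatile pointer so the stores are not elided. */
void secure_zero(void *p, size_t len)
{
	volatile unsigned char *b = p;

	while (len--)
		*b++ = 0;
}

/* Wipe len bytes at p, then free it. A NULL pointer is ignored. */
void zero_and_free(void *p, size_t len)
{
	if (!p)
		return;
	secure_zero(p, len);
	free(p);
}
```

The extra pass only pays off for buffers that hold secrets, which is the
distinction drawn above between signatures and the private key.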


Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH

2015-08-24 Thread David Miller
From: Eugene Shatokhin eugene.shatok...@rosalab.ru
Date: Wed, 19 Aug 2015 14:59:01 +0300

 So the following might be possible, although unlikely:
 
 CPU0                                    CPU1
  clear_bit: read dev->flags
  clear_bit: clear EVENT_RX_KILL in the read value
 
                                         dev->flags=0;
 
  clear_bit: write updated dev->flags
 
 As a result, dev->flags may become non-zero again.

Is this really possible?

Stores really are atomic in the sense that they do their update
in one indivisible operation.

Atomic operations like clear_bit also will behave that way.

If a clear_bit is in progress, the dev->flags=0 store will not be
able to grab the cache line exclusively until the clear_bit is done.

So I think the above sequence of events is completely impossible.  Once
a clear_bit starts, a write by another foreign agent on the bus is
absolutely impossible to legally occur until the clear_bit completes.

I think this is a non-issue.
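
The argument above rests on clear_bit() being an atomic read-modify-write.
The difference from an open-coded load/modify/store can be sketched in
userspace with C11 atomics instead of the kernel's bitops. The bit number
and both helper names are assumptions chosen for illustration.

```c
#include <stdatomic.h>

#define EVENT_RX_KILL_BIT 10u	/* illustrative bit number, an assumption */

/*
 * Atomic RMW: the read, the clear and the write-back form one indivisible
 * step, so no other store can land in between. Returns the new value.
 */
unsigned long clear_flag_atomic(atomic_ulong *flags, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	return atomic_fetch_and(flags, ~mask) & ~mask;
}

/*
 * Open-coded variant: each access is atomic on its own, but another CPU's
 * store may land between the load and the store and then be overwritten.
 */
unsigned long clear_flag_open_coded(atomic_ulong *flags, unsigned int bit)
{
	unsigned long v = atomic_load(flags);

	v &= ~(1UL << bit);
	atomic_store(flags, v);
	return v;
}
```

Only the second form has a window in which a concurrent "flags = 0" could
be lost; the first corresponds to what is claimed for clear_bit() above.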


Re: [BUG] arm: kgdb: patch_text() in kgdb_arch_set_breakpoint() may sleep

2015-08-24 Thread Kees Cook
On Sun, Aug 23, 2015 at 7:45 PM, Doug Anderson diand...@chromium.org wrote:
 On Wed, Aug 5, 2015 at 8:50 AM, Aapo Vienamo avien...@nvidia.com wrote:
 Hi,

 The breakpoint setting code in arch/arm/kernel/kgdb.c calls
 patch_text(), which ends up trying to sleep while in interrupt context.
 The bug was introduced by commit 23a4e40 ("arm: kgdb: Handle read-only
 text / modules"). The resulting behavior is "BUG: scheduling while
 atomic..." when setting a breakpoint in kgdb. This was tested on an
 Nvidia Jetson TK1 board with a 4.2.0-rc5-next-20150805 kernel.

 Regards,
 Aapo Vienamo

 Aapo,

 Including the stack trace with this would have been helpful, though
 it's not too hard to reproduce.  Here it is:

 [  416.510559] BUG: scheduling while atomic: swapper/0/0/0x00010007
 [  416.516554] Modules linked in:
 [  416.519614] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc7-00133-geb63b34 #1073
 [  416.527341] Hardware name: Rockchip (Device Tree)
 [  416.532042] [<c0017a4c>] (unwind_backtrace) from [<c00133d4>] (show_stack+0x20/0x24)
 [  416.539772] [<c00133d4>] (show_stack) from [<c05400e8>] (dump_stack+0x84/0xb8)
 [  416.546983] [<c05400e8>] (dump_stack) from [<c004913c>] (__schedule_bug+0x54/0x6c)
 [  416.554540] [<c004913c>] (__schedule_bug) from [<c054065c>] (__schedule+0x80/0x668)
 [  416.562183] [<c054065c>] (__schedule) from [<c0540cfc>] (schedule+0xb8/0xd4)
 [  416.569219] [<c0540cfc>] (schedule) from [<c0543a3c>] (schedule_timeout+0x2c/0x234)
 [  416.576861] [<c0543a3c>] (schedule_timeout) from [<c05417c0>] (wait_for_common+0xf4/0x188)
 [  416.585109] [<c05417c0>] (wait_for_common) from [<c0541874>] (wait_for_completion+0x20/0x24)
 [  416.593531] [<c0541874>] (wait_for_completion) from [<c00a0104>] (__stop_cpus+0x58/0x70)
 [  416.601608] [<c00a0104>] (__stop_cpus) from [<c00a0580>] (stop_cpus+0x3c/0x54)
 [  416.608817] [<c00a0580>] (stop_cpus) from [<c00a06c4>] (__stop_machine+0xcc/0xe8)
 [  416.616286] [<c00a06c4>] (__stop_machine) from [<c00a0714>] (stop_machine+0x34/0x44)
 [  416.624016] [<c00a0714>] (stop_machine) from [<c00173e8>] (patch_text+0x28/0x34)
 [  416.631399] [<c00173e8>] (patch_text) from [<c001733c>] (kgdb_arch_set_breakpoint+0x40/0x4c)
 [  416.639823] [<c001733c>] (kgdb_arch_set_breakpoint) from [<c00a0d68>] (kgdb_validate_break_address+0x2c/0x60)
 [  416.649719] [<c00a0d68>] (kgdb_validate_break_address) from [<c00a0e90>] (dbg_set_sw_break+0x1c/0xdc)
 [  416.658922] [<c00a0e90>] (dbg_set_sw_break) from [<c00a2e88>] (gdb_serial_stub+0x9c4/0xba4)
 [  416.667259] [<c00a2e88>] (gdb_serial_stub) from [<c00a11cc>] (kgdb_cpu_enter+0x1f8/0x60c)
 [  416.675423] [<c00a11cc>] (kgdb_cpu_enter) from [<c00a18cc>] (kgdb_handle_exception+0x19c/0x1d0)
 [  416.684106] [<c00a18cc>] (kgdb_handle_exception) from [<c0016f7c>] (kgdb_compiled_brk_fn+0x30/0x3c)
 [  416.693135] [<c0016f7c>] (kgdb_compiled_brk_fn) from [<c00091a4>] (do_undefinstr+0x1a4/0x20c)
 [  416.701643] [<c00091a4>] (do_undefinstr) from [<c001400c>] (__und_svc_finish+0x0/0x34)
 [  416.709543] Exception stack(0xc07c1ce8 to 0xc07c1d30)
 [  416.714584] 1ce0:    c07c6504 c086e290 c086e294 c086e294 c086e290
 [  416.722745] 1d00: c07c6504 0067 0001 c07c2100 0027 c07c1d4c c07c1d50 c07c1d30
 [  416.730905] 1d20: c00a0990 c00a08d0 6193 
 [  416.735947] [<c001400c>] (__und_svc_finish) from [<c00a08d0>] (kgdb_breakpoint+0x58/0x94)
 [  416.744110] [<c00a08d0>] (kgdb_breakpoint) from [<c00a0990>] (sysrq_handle_dbg+0x58/0x6c)
 [  416.752273] [<c00a0990>] (sysrq_handle_dbg) from [<c02c230c>] (__handle_sysrq+0xac/0x15c)
 [  416.760437] [<c02c230c>] (__handle_sysrq) from [<c02c23ec>] (handle_sysrq+0x30/0x34)


 Kees: I think you've dealt with a lot more of these types of issues
 than I have.  Any quick thoughts?  If not I can put it on my long-term
 list of things to do, but until then we could always just post a
 Revert...

I don't think a revert is in order here. CONFIG_DEBUG_RODATA could be
turned off for builds where you need kgdb while this bug gets found. I
don't actually see where we've gone wrong, though. Looks like
scheduling happened while waiting for CPUs to stop? Where did we enter
atomic?

Perhaps we need to test if we're already atomic in patch_text, and
only call stop_machine if we need to?

Untested (and likely mangled by gmail):

diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 69bda1a5707e..855696bfe072 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -124,5 +124,8 @@ void __kprobes patch_text(void *addr, unsigned int insn)
 		.insn = insn,
 	};

-	stop_machine(patch_text_stop_machine, &patch, NULL);
+	if (unlikely(in_atomic_preempt_off()))
+		patch_text_stop_machine(&patch);
+	else
+		stop_machine(patch_text_stop_machine, &patch, NULL);
 }


-Kees

-- 
Kees Cook
Chrome OS Security

Re: [PATCH linux-next v4 5/5] mtd: atmel-quadspi: add driver for Atmel QSPI controller

2015-08-24 Thread Marek Vasut
On Monday, August 24, 2015 at 07:04:38 PM, Cyrille Pitchen wrote:
 Hi Marek,

Hi!

 Le 24/08/2015 13:03, Marek Vasut a écrit :
  On Monday, August 24, 2015 at 12:14:00 PM, Cyrille Pitchen wrote:
  This driver add support to the new Atmel QSPI controller embedded into
  sama5d2x SoCs. It expects a NOR memory to be connected to the QSPI
  controller.

[...]

  +  /* Compute address parameters */
  +  switch (cmd->enable.bits.address) {
  +  case 4:
  +  ifr |= QSPI_IFR_ADDRL;
  +  /*break;*/ /* fallback to the 24bit address case */
  
  What's this commented out bit of code for ? :-)
 
 I just wanted to stress out there was no missing break;.
 I've reworded the comment to:
 /* No break on purpose: fallback to the 24bit address case. */

Oh, the address is in bytes. I see, yes, it makes sense to be more
explicit here about the purpose of the fallback. I think this change
in the comment will make it easier for everyone who comes back in a
few years and reads this code.
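
For anyone unfamiliar with the pattern, a standalone sketch of the
intentional fall-through follows. The flag values and the function name
addr_flags() are made up for illustration and are not the driver's
QSPI_IFR_* register definitions.

```c
#include <errno.h>

/* Illustrative flag values only; not the real QSPI_IFR_* bit positions. */
#define FLAG_ADDR_EN   0x1u
#define FLAG_ADDR_LONG 0x2u

/* Map an address width in bytes to flags; returns 0 or -EINVAL. */
int addr_flags(unsigned int addr_bytes, unsigned int *flags)
{
	*flags = 0;

	switch (addr_bytes) {
	case 4:
		*flags |= FLAG_ADDR_LONG;
		/* No break on purpose: a 4-byte address also needs the enable bit. */
		/* fall through */
	case 3:
		*flags |= FLAG_ADDR_EN;
		break;
	case 0:
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

Spelling the comment out this way documents the intent for readers, and
compilers that warn about implicit fall-through generally recognise the
trailing comment form.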

  +  case 3:
  +  iar = (cmd->enable.bits.data) ? 0 : cmd->address;
  +  ifr |= QSPI_IFR_ADDREN;
  +  break;
  +  case 0:
  +  break;
  +  default:
  +  return -EINVAL;
  +  }
  
  [...]
  
  +no_data:
  +  /* Poll INSTRuction End status */
  +  sr = qspi_readl(aq, QSPI_SR);
  +  if (sr & QSPI_SR_INSTRE)
  +  return err;
  +
  +  /* Wait for INSTRuction End interrupt */
  +  init_completion(&aq->completion);
  
  You should use reinit_completion() in the code. init_completion()
  should be used only in the probe() function and nowhere else.
 
 Alright. In the next version I'll rename the completion member of
 struct atmel_qspi into cmd_completion. Also I'll add another
 dma_completion member in this very same structure to replace the local
 struct completion completion in atmel_qspi_run_dma_transfer().
 
 Then I'll call init_completion() on both cmd_completion and dma_completion
 only from atmel_qspi_probe() and reinit_completion() elsewhere.
 
  +  aq->pending = 0;
  +  qspi_writel(aq, QSPI_IER, QSPI_SR_INSTRE);
  +  if (!wait_for_completion_timeout(&aq->completion,
  +   msecs_to_jiffies(1000)))
  +  err = -ETIMEDOUT;
  +  qspi_writel(aq, QSPI_IDR, QSPI_SR_INSTRE);
  +
  +  return err;
  +}
  
  [...]
  
  Hope this helps :)
 
 Indeed, it does! I still work on the next version of this series to take
 all your comments into account.

Thanks :)


Re: [PATCH v2 5/5] arm64: add KASan support

2015-08-24 Thread Russell King - ARM Linux
On Mon, Aug 24, 2015 at 05:15:22PM +0300, Andrey Ryabinin wrote:
 Yes, ~130Mb (3G/1G split) should work. 512Mb shadow is optional.
 The only advantage of 512Mb shadow is better handling of user memory
 accesses bugs
 (access to user memory without copy_from_user/copy_to_user/strlen_user etc 
 API).

No need for that to be handled by KASan.  I have patches in linux-next,
now acked by Will, which prevent the kernel accessing userspace with
zero memory footprint.  No need for remapping, we have a way to quickly
turn off access to userspace mapped pages on non-LPAE 32-bit CPUs.
(LPAE is not supported yet - Catalin will be working on that using the
hooks I'm providing once he returns.)

This isn't a debugging thing, it's a security hardening thing.  Some
use-after-free bugs are potentially exploitable from userspace.  See
the recent blackhat conference paper.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v7 8/8] clocksource: simplify ACPI code in arm_arch_timer.c

2015-08-24 Thread Thomas Gleixner
On Tue, 25 Aug 2015, fu@linaro.org wrote:

You Cc the world and some more on your patch, but you fail to add the
maintainers of the clocksource code to the Cc list. Sigh.

 From: Fu Wei fu@linaro.org
 
 The patch update arm_arch_timer driver to use the function
 provided by the new GTDT driver of ACPI.
 By this way, arm_arch_timer.c can be simplified, and separate
 all the ACPI GTDT knowledge from this timer driver.

That's not a proper changelog and this patch wants to be split in two:

1) Implement the new ACPI function

2) Make use of it
 
 index 0aa135d..99505bb 100644
 --- a/drivers/clocksource/arm_arch_timer.c
 +++ b/drivers/clocksource/arm_arch_timer.c
 @@ -817,68 +817,30 @@ CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
  arch_timer_mem_init);
  
  #ifdef CONFIG_ACPI
 -static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
 -{
 - int trigger, polarity;
 -
 - if (!interrupt)
 - return 0;
 -
 - trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
 - : ACPI_LEVEL_SENSITIVE;
 -
 - polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
 - : ACPI_ACTIVE_HIGH;
 -
 - return acpi_register_gsi(NULL, interrupt, trigger, polarity);
 -}
 -
  /* Initialize per-processor generic timer */
 -static int __init arch_timer_acpi_init(struct acpi_table_header *table)
 +void __init arch_timer_acpi_init(void)
  {

And how is that supposed to work when we have next generation CPUs
which implement a different timer? You break multisystem kernels that
way.

Thanks,

tglx


Re: [PATCH v2] Input: elan_i2c - enable ELAN0100 acpi panels

2015-08-24 Thread Dmitry Torokhov
On Sat, Aug 22, 2015 at 09:37:52AM +0200, Michele Curti wrote:
 Enable ELAN0100 touchpad driver, found on an Asus X205TA laptop, to
 gain 2,3 fingers tap and 2 fingers scroll.
 
 Signed-off-by: Michele Curti michele.cu...@gmail.com

Applied, thank you (Duson, I put you as 'reviewed-by').

 ---
  drivers/hid/hid-core.c  | 1 +
  drivers/input/mouse/elan_i2c_core.c | 4 
  2 files changed, 5 insertions(+)
 
 diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
 index 22afab9..70a11ac 100644
 --- a/drivers/hid/hid-core.c
 +++ b/drivers/hid/hid-core.c
 @@ -2294,6 +2294,7 @@ static const struct hid_device_id hid_ignore_list[] = {
   { HID_USB_DEVICE(USB_VENDOR_ID_DREAM_CHEEKY, 0x0004) },
   { HID_USB_DEVICE(USB_VENDOR_ID_DREAM_CHEEKY, 0x000a) },
   { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, 0x0400) },
 + { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, 0x0401) },
   { HID_USB_DEVICE(USB_VENDOR_ID_ESSENTIAL_REALITY, 
 USB_DEVICE_ID_ESSENTIAL_REALITY_P5) },
   { HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC5UH) },
   { HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC4UM) },
 diff --git a/drivers/input/mouse/elan_i2c_core.c 
 b/drivers/input/mouse/elan_i2c_core.c
 index 67388f4..bbdaedc 100644
 --- a/drivers/input/mouse/elan_i2c_core.c
 +++ b/drivers/input/mouse/elan_i2c_core.c
 @@ -98,6 +98,9 @@ static int elan_get_fwinfo(u8 ic_type, u16 *vaildpage_count,
  u16 *signature_address)
  {
   switch(ic_type) {
 + case 0x08:
 + *vaildpage_count = 512;
 + break;
   case 0x09:
   *vaildpage_count = 768;
   break;
 @@ -1165,6 +1168,7 @@ MODULE_DEVICE_TABLE(i2c, elan_id);
  #ifdef CONFIG_ACPI
  static const struct acpi_device_id elan_acpi_id[] = {
   { ELAN, 0 },
 + { ELAN0100, 0 },
   { ELAN0600, 0 },
   { }
  };
 -- 
 2.5.0
 

-- 
Dmitry


Re: [PATCH linux-next v4 3/5] mtd: spi-nor: allow to tune the number of dummy cycles

2015-08-24 Thread Marek Vasut
On Monday, August 24, 2015 at 06:42:46 PM, Cyrille Pitchen wrote:
 Hi Marek,

Hi!

[...]

  - * Dummy Cycle calculation for different type of read.
  - * It can be used to support more commands with
  - * different dummy cycle requirements.
  - */
  -static inline int spi_nor_read_dummy_cycles(struct spi_nor *nor)
  -{
  -  switch (nor->flash_read) {
  -  case SPI_NOR_FAST:
  -  case SPI_NOR_DUAL:
  -  case SPI_NOR_QUAD:
  -  return 8;
  -  case SPI_NOR_NORMAL:
  -  return 0;
  -  }
  -  return 0;
  -}
  
  You can probably just soup up this function so that it sets the
  nor->read_dummy, no ?
 
 Actually, this is what the patch does: spi_nor_read_dummy_cycles() was
 reused and enhanced a few lines below where you've pointed out the
 switch (nor->flash_read) block should be moved after the else block.

You know what? I'll go get some sleep, coffee doesn't cut it anymore :)

 I think when I wrote the code I've chosen to move the definition of this
 function instead of adding forward declarations of functions such as
 read_cr() or write_sr_cr(), which are now called by
 micron_set_dummy_cycles().

Yep, that's all right, sorry for the confusion.

  -/*
  
* Write status register 1 byte
* Returns negative if error occurred.
*/
  
  @@ -1012,6 +994,81 @@ static int set_quad_mode(struct spi_nor *nor,
  struct flash_info *info) }
  
   }

[...]

  +/*
  + * Dummy Cycle calculation for different type of read.
  + * It can be used to support more commands with
  + * different dummy cycle requirements.
  + */
  +static int spi_nor_read_dummy_cycles(struct spi_nor *nor,
  +   const struct flash_info *info)
  +{
  +  struct device_node *np = nor->dev->of_node;
  +  u32 num_dummy_cycles;
  +
  +  if (np && !of_property_read_u32(np, "m25p,num-dummy-cycles",
  +  &num_dummy_cycles)) {
  +  nor->read_dummy = num_dummy_cycles;
  +
  +  /*
  +   * This switch block might be moved after the if...then...else
  +   * statement but it was not tested with all Spansion or Micron
  +   * memories.
  +   * Now the m25p,num-dummy-cycles property needs to be
  +   * explicitly set in the device tree so the switch statement is
  +   * executed. This should avoid unwanted side effects and keep
  +   * backward compatibility.
  +   */
  +  switch (JEDEC_MFR(info)) {
  +  case CFI_MFR_ST:
  +  return micron_set_dummy_cycles(nor);
  
  +  default:
  If you do have m25p,num-dummy-cycles set for non-micron flash, you have a
  problem here I believe.
  
  +  break;
  +  }
  +  } else {
  
  The solution would be to drop this else {} bit here, so that if you fail
  in the DT-based configuration, you fall back to this old behavior. What
  do you think please ? :)
 
 Good idea!
 I also add a trace for the default case of switch (JEDEC_MFR(info)):
 
 dev_warn(dev, "can't set the number of dummy cycles\n");

Maybe change this to "setting the number of dummy cycles not supported by chip,
ignoring" or something, to be explicit about the fallback and that this is not
supported by the chip. But this is just an idea, feel free to ignore it.

 So the user is notified that the driver could not use the value of
 m25p,num-dummy-cycles from the DT before falling back to the legacy
 code.

Yup.
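
The control flow agreed on above could look roughly like the standalone
model below. Everything in it (struct nor_model, vendor_set_dummy(),
set_read_dummy(), the enum names) is a hypothetical stand-in rather than
the spi-nor API; the point is only that a device-tree value the chip
specific code cannot apply drops through to the legacy per-read-mode
default, with a warning.

```c
#include <stdbool.h>

enum read_mode { READ_NORMAL, READ_FAST, READ_DUAL, READ_QUAD };

/* Hypothetical stand-in for the driver state discussed in the thread. */
struct nor_model {
	enum read_mode mode;
	bool vendor_can_tune;	/* chip family knows how to set dummy cycles */
	unsigned int read_dummy;
	bool warned;		/* models the dev_warn() on fallback */
};

/* Stand-in for a vendor hook such as the Micron one; 0 on success. */
int vendor_set_dummy(struct nor_model *nor, unsigned int cycles)
{
	if (!nor->vendor_can_tune)
		return -1;
	nor->read_dummy = cycles;
	return 0;
}

void set_read_dummy(struct nor_model *nor, bool have_dt, unsigned int dt_cycles)
{
	/* Try the device-tree value first, if one was given. */
	if (have_dt) {
		if (vendor_set_dummy(nor, dt_cycles) == 0)
			return;
		nor->warned = true;	/* not supported by chip, ignoring */
	}

	/* Legacy behaviour: fixed count per read mode. */
	switch (nor->mode) {
	case READ_FAST:
	case READ_DUAL:
	case READ_QUAD:
		nor->read_dummy = 8;
		break;
	case READ_NORMAL:
		nor->read_dummy = 0;
		break;
	}
}
```

Each case group in the legacy switch ends in an explicit break, so the
value chosen for the fast, dual and quad modes is not overwritten by the
normal-read case.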

  +  switch (nor->flash_read) {
  +  case SPI_NOR_FAST:
  +  case SPI_NOR_DUAL:
  +  case SPI_NOR_QUAD:
  +  nor->read_dummy = 8;
  +  case SPI_NOR_NORMAL:
  +  nor->read_dummy = 0;
  +  }
  +  }
  +
  +  return 0;
  +}
  
  [...]
 
 thanks for the review!

I'm glad it helped ;-)


Re: Next round: revised futex(2) man page for review

2015-08-24 Thread Darren Hart
On Thu, Aug 20, 2015 at 12:40:46AM +0200, Thomas Gleixner wrote:
 On Wed, 5 Aug 2015, Darren Hart wrote:
  On Mon, Jul 27, 2015 at 02:07:15PM +0200, Michael Kerrisk (man-pages) wrote:
   .\" FIXME XXX = Start of adapted Hart/Guniguntala text =
   .\"   The following text is drawn from the Hart/Guniguntala paper
   .\"   (listed in SEE ALSO), but I have reworded some pieces
   .\"   significantly. Please check it.
   
  The PI futex operations described below  differ  from  the  other
  futex  operations  in  that  they impose policy on the use of the
  value of the futex word:
   
  *  If the lock is not acquired, the futex word's value  shall  be
 0.
   
  *  If  the  lock is acquired, the futex word's value shall be the
 thread ID (TID; see gettid(2)) of the owning thread.
   
  *  If the lock is owned and there are threads contending for  the
 lock,  then  the  FUTEX_WAITERS  bit shall be set in the futex
 word's value; in other words, this value is:
   
 FUTEX_WAITERS | TID
   
   
  Note that a PI futex word never just has the value FUTEX_WAITERS,
  which is a permissible state for non-PI futexes.
  
  The second clause is inappropriate. I don't know if that was yours or
  mine, but non-PI futexes do not have a kernel defined value policy, so
  ==FUTEX_WAITERS cannot be a permissible state as any value is
  permissible for non-PI futexes, and none have a kernel defined state.
 
 Depends. If the regular futex is configured as robust, then we have a
 kernel defined value policy as well.

Indeed, thanks for catching that.
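
As a concrete reading of the PI value policy quoted above, here is a small
decoding sketch. The two constants mirror FUTEX_WAITERS and FUTEX_TID_MASK
from <linux/futex.h> but are defined locally so the snippet builds anywhere;
the pi_futex_*() helper names are hypothetical, and the FUTEX_OWNER_DIED
bit used by robust futexes is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror <linux/futex.h>; defined locally to keep this portable. */
#define PI_FUTEX_WAITERS  0x80000000u
#define PI_FUTEX_TID_MASK 0x3fffffffu

/* TID of the owning thread, or 0 if the lock is not acquired. */
uint32_t pi_futex_owner(uint32_t word)
{
	return word & PI_FUTEX_TID_MASK;
}

/* True if the FUTEX_WAITERS bit is set in the futex word. */
bool pi_futex_has_waiters(uint32_t word)
{
	return (word & PI_FUTEX_WAITERS) != 0;
}

/* Policy check: a PI futex word never has waiters without an owner. */
bool pi_futex_word_valid(uint32_t word)
{
	return !(pi_futex_has_waiters(word) && pi_futex_owner(word) == 0);
}
```

Under this reading, 0 is "unlocked", a bare TID is "locked, uncontended",
and FUTEX_WAITERS | TID is "locked with waiters"; the bare FUTEX_WAITERS
value is the one state the policy rules out.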

-- 
Darren Hart
Intel Open Source Technology Center


Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency

2015-08-24 Thread Suravee Suthikulpanit

Hi,

Ping. Does anyone have any comments or suggestions?

Thanks,
Suravee

On 8/13/15 16:58, Suravee Suthikulpanit wrote:

This patch refactors of_pci_dma_configure() into a more generic
pci_dma_configure(), which can be reused by non-OF code.
Then, it adds support for setting up PCI device DMA coherency from
ACPI _CCA object that should normally be specified in the DSDT node
of its PCI host bridge..

Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
CC: Bjorn Helgaas bhelg...@google.com
CC: Catalin Marinas catalin.mari...@arm.com
CC: Will Deacon will.dea...@arm.com
CC: Rafael J. Wysocki r...@rjwysocki.net
CC: Rob Herring robh...@kernel.org
CC: Murali Karicheri m-kariche...@ti.com
---
Note: According to the ACPI spec, the _CCA attribute is required
   for ARM64. Therefore, this patch is a pre-req for ACPI PCI
   support for ARM64 which is currently in development.

   Also, this should not affect other architectures since
   if CCA is not required, the default value is coherent.
   Please see include/acpi/acpi_bus.h: acpi_check_dma() and
   drivers/acpi/scan.c: acpi_init_coherency() for more information

  drivers/of/of_pci.c| 20 
  drivers/pci/probe.c| 35 +--
  include/linux/of_pci.h |  3 ---
  3 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 5751dc5..b66ee4e 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node)
  }
  EXPORT_SYMBOL_GPL(of_get_pci_domain_nr);

-/**
- * of_pci_dma_configure - Setup DMA configuration
- * @dev: ptr to pci_dev struct of the PCI device
- *
- * Function to update PCI devices's DMA configuration using the same
- * info from the OF node of host bridge's parent (if any).
- */
-void of_pci_dma_configure(struct pci_dev *pci_dev)
-{
-	struct device *dev = &pci_dev->dev;
-	struct device *bridge = pci_get_host_bridge_device(pci_dev);
-
-	if (!bridge->parent)
-		return;
-
-	of_dma_configure(dev, bridge->parent->of_node);
-	pci_put_host_bridge_device(bridge);
-}
-EXPORT_SYMBOL_GPL(of_pci_dma_configure);
-
  #if defined(CONFIG_OF_ADDRESS)
  /**
   * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cefd636..e2fcd3b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -6,12 +6,14 @@
  #include <linux/delay.h>
  #include <linux/init.h>
  #include <linux/pci.h>
-#include <linux/of_pci.h>
+#include <linux/of_device.h>
  #include <linux/pci_hotplug.h>
  #include <linux/slab.h>
  #include <linux/module.h>
  #include <linux/cpumask.h>
  #include <linux/pci-aspm.h>
+#include <linux/acpi.h>
+#include <linux/property.h>
  #include <asm-generic/pci-bridge.h>
  #include "pci.h"

@@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_enable_acs(dev);
  }

+/**
+ * pci_dma_configure - Setup DMA configuration
+ * @pci_dev: ptr to pci_dev struct of the PCI device
+ *
+ * Function to update PCI devices's DMA configuration using the same
+ * info from the OF node or ACPI node of host bridge's parent (if any).
+ */
+static void pci_dma_configure(struct pci_dev *pci_dev)
+{
+   struct device *dev = &pci_dev->dev;
+   struct device *bridge = pci_get_host_bridge_device(pci_dev);
+   struct acpi_device *adev;
+   bool coherent;
+
+   if (has_acpi_companion(bridge)) {
+           adev = to_acpi_node(bridge->fwnode);
+           if (acpi_check_dma(adev, &coherent))
+                   arch_setup_dma_ops(dev, 0, 0, NULL, coherent);
+   } else {
+           struct device *host = bridge->parent;
+           if (!host)
+                   return;
+
+           of_dma_configure(dev, host->of_node);
+   }
+
+   pci_put_host_bridge_device(bridge);
+}
+
  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
  {
int ret;
@@ -1557,7 +1588,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
    dev->dev.dma_mask = &dev->dma_mask;
    dev->dev.dma_parms = &dev->dma_parms;
    dev->dev.coherent_dma_mask = 0xffffffffull;
-   of_pci_dma_configure(dev);
+   pci_dma_configure(dev);

    pci_set_dma_max_seg_size(dev, 65536);
    pci_set_dma_seg_boundary(dev, 0xffffffff);
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 29fd3fe..ce0e5ab 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -16,7 +16,6 @@ int of_pci_get_devfn(struct device_node *np);
  int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
  int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
  int of_get_pci_domain_nr(struct device_node *node);
-void of_pci_dma_configure(struct pci_dev *pci_dev);
  #else
  static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct 
of_phandle_args *out_irq)
  {
@@ 

Re: [PATCH 1/3] KVM: make halt_poll_ns per-VCPU

2015-08-24 Thread David Matlack
On Mon, Aug 24, 2015 at 5:53 AM, Wanpeng Li wanpeng...@hotmail.com wrote:
 Change halt_poll_ns into per-VCPU variable, seeded from module parameter,
 to allow greater flexibility.

You should also change kvm_vcpu_block to read halt_poll_ns from
the vcpu instead of the module parameter.


 Signed-off-by: Wanpeng Li wanpeng...@hotmail.com
 ---
  include/linux/kvm_host.h | 1 +
  virt/kvm/kvm_main.c  | 1 +
  2 files changed, 2 insertions(+)

 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index 81089cf..1bef9e2 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -242,6 +242,7 @@ struct kvm_vcpu {
 int sigset_active;
 sigset_t sigset;
 struct kvm_vcpu_stat stat;
 +   unsigned int halt_poll_ns;

  #ifdef CONFIG_HAS_IOMEM
 int mmio_needed;
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index d8db2f8f..a122b52 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -217,6 +217,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
     vcpu->kvm = kvm;
     vcpu->vcpu_id = id;
     vcpu->pid = NULL;
 +   vcpu->halt_poll_ns = halt_poll_ns;
     init_waitqueue_head(&vcpu->wq);
     kvm_async_pf_vcpu_init(vcpu);

 --
 1.9.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH

2015-08-24 Thread Eugene Shatokhin

24.08.2015 16:29, Bjørn Mork wrote:

Eugene Shatokhin eugene.shatok...@rosalab.ru writes:


19.08.2015 15:31, Bjørn Mork wrote:

Eugene Shatokhin eugene.shatok...@rosalab.ru writes:


The problem is not in the reordering but rather in the fact that
dev->flags = 0 is not necessarily atomic
w.r.t. clear_bit(EVENT_RX_KILL, &dev->flags), and vice versa.

So the following might be possible, although unlikely:

CPU0                    CPU1
                        clear_bit: read dev->flags
                        clear_bit: clear EVENT_RX_KILL in the read value

dev->flags=0;

                        clear_bit: write updated dev->flags

As a result, dev->flags may become non-zero again.
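
The interleaving above can be replayed deterministically in plain
userspace C. This is only a sketch under stated assumptions: it is not
kernel code, it models clear_bit() as three separate steps (the very
case the mail worries about), and the bit numbers and all helper names
are made up for illustration.

```c
/* Userspace model only; nothing here is taken from usbnet.c. */

#define EVENT_RX_KILL 10   /* bit number assumed for illustration */
#define EVENT_OTHER    3   /* hypothetical second flag bit */

/* A read-modify-write split into its three steps, so that another
 * "CPU" can be replayed between them. */
struct split_clear_bit {
	unsigned long snapshot;
};

static void step_read(struct split_clear_bit *op, const unsigned long *flags)
{
	op->snapshot = *flags;
}

static void step_clear(struct split_clear_bit *op, int bit)
{
	op->snapshot &= ~(1UL << bit);
}

static void step_write(const struct split_clear_bit *op, unsigned long *flags)
{
	*flags = op->snapshot;
}

/* Replays the interleaving from the table and returns the final flags. */
unsigned long replay_race(unsigned long initial_flags)
{
	unsigned long flags = initial_flags;
	struct split_clear_bit cpu1;

	step_read(&cpu1, &flags);          /* CPU1: read dev->flags */
	step_clear(&cpu1, EVENT_RX_KILL);  /* CPU1: clear the bit in the copy */
	flags = 0;                         /* CPU0: dev->flags = 0 */
	step_write(&cpu1, &flags);         /* CPU1: write the stale copy back */

	return flags;
}
```

If any bit other than EVENT_RX_KILL was set before the replay, it
survives the zeroing, which is the "non-zero again" outcome described
above. If only EVENT_RX_KILL was set, the final value happens to be 0
and the race goes unnoticed.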


Ah, right.  Thanks for explaining.


I cannot prove yet that this is an impossible situation. If anyone
can, please explain. If so, this part of the patch will not be needed.


I wonder if we could simply move the dev->flags = 0 down a few lines to
fix both issues?  It doesn't seem to do anything useful except for
resetting the flags to a sane initial state after the device is down.

Stopping the tasklet rescheduling etc depends only on netif_running(),
which will be false when usbnet_stop is called.  There is no need to
touch dev->flags for this to happen.


That was one of the first ideas we discussed here. Unfortunately, it
is probably not so simple.

Setting dev->flags to 0 makes some delayed operations do nothing and,
among other things, keeps them from rescheduling usbnet_bh().


Yes, but I believe that is merely a side effect.  You should never need
to clear multiple flags to get the desired behaviour.


As you can see in drivers/net/usb/usbnet.c, usbnet_bh() can be called
as a tasklet function and as a timer function in a number of
situations (look for the usage of dev->bh and dev->delay there).

netif_running() is indeed false when usbnet_stop() runs, usbnet_stop()
also disables Tx. This seems to be enough for many cases where
usbnet_bh() is scheduled, but I am not so sure about the remaining
ones, namely:

1. A work function, usbnet_deferred_kevent(), may reschedule
usbnet_bh(). Looks like the workqueue is only stopped in
usbnet_disconnect(), so a work item might be processed while
usbnet_stop() works. Setting dev->flags to 0 makes the work function
do nothing, by the way. See also the comment in usbnet_stop() about
this.

A work item may be placed to this workqueue in a number of ways, by
both usbnet module and the mini-drivers. It is not too easy to track
all these situations.


That's an understatement :)




2. rx_complete() and tx_complete() may schedule execution of
usbnet_bh() as a tasklet or a timer function. These two are URB
completion callbacks.

It seems, new Rx and Tx URBs cannot be submitted when usbnet_stop()
clears dev->flags, indeed. But it does not prevent the completion
handlers for the previously submitted URBs from running concurrently
with usbnet_stop(). The latter waits for them to complete (via
usbnet_terminate_urbs(dev)) but only if FLAG_AVOID_UNLINK_URBS is not
set in info->flags. rndis_wlan, however, sets this flag for a few
hardware models. So - no guarantees here as well.


FLAG_AVOID_UNLINK_URBS looks like it should be replaced by the newer
ability to keep the status urb active. I believe that must have been the
real reason for adding it, based on the commit message and the effect
the flag will have:

  commit 1487cd5e76337555737cbc55d7d83f41460d198f
  Author: Jussi Kivilinna jussi.kivili...@mbnet.fi
  Date:   Thu Jul 30 19:41:20 2009 +0300

 usbnet: allow minidriver to prevent urb unlinking on usbnet_stop

 rndis_wlan devices freeze after running usbnet_stop several times. It appears
 that firmware freezes in state where it does not respond to any RNDIS commands
 and device have to be physically unplugged/replugged. This patch lets
 minidrivers to disable unlink_urbs on usbnet_stop through new info flag.

 Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
 Cc: David Brownell dbrown...@users.sourceforge.net
 Signed-off-by: John W. Linville linvi...@tuxdriver.com



The rx urbs will not be resubmitted in any case, and there are of course
no tx urbs being submitted.  So the only effect of this flag is on the
status/interrupt urb, which I can imagine some RNDIS devices want
active all the time.

So FLAG_AVOID_UNLINK_URBS should probably be removed and replaced with
calls to usbnet_status_start() and usbnet_status_stop().  This will require
testing on some of the devices with the original firmware problem
however.

In any case: I do not think this flag should be considered when trying
to make usbnet_stop behaviour saner.  Its only purpose is to
deliberately break usbnet_stop by not actually stopping.



If someone could list the particular bits of dev->flags that should be
cleared to make sure no deferred call could reschedule usbnet_bh(),
etc... Well, it would be enough to clear these first and use
dev->flags = 0 later, after tasklet_kill() and del_timer_sync(). I
cannot point out these particular bits 

Re: [PATCH 2/3] KVM: dynamise halt_poll_ns adjustment

2015-08-24 Thread David Matlack
On Mon, Aug 24, 2015 at 5:53 AM, Wanpeng Li wanpeng...@hotmail.com wrote:
 There are two new kernel parameters for changing the halt_poll_ns:
 halt_poll_ns_grow and halt_poll_ns_shrink. halt_poll_ns_grow affects
 halt_poll_ns when an interrupt arrives and halt_poll_ns_shrink
 does it when idle VCPU is detected.

   halt_poll_ns_shrink/ |
   halt_poll_ns_grow    | interrupt arrives    | idle VCPU is detected
   ---------------------+----------------------+----------------------
   < 1                  |  = halt_poll_ns      |  = 0
   < halt_poll_ns       | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
   otherwise            | += halt_poll_ns_grow | -= halt_poll_ns_shrink

 A third new parameter, halt_poll_ns_max, controls the maximal halt_poll_ns;
 it is internally rounded down to a closest multiple of halt_poll_ns_grow.

I like the idea of growing and shrinking halt_poll_ns, but I'm not sure
we grow and shrink in the right places here. For example, if vcpu->halt_poll_ns
gets down to 0, I don't see how it can then grow back up.

This might work better:
  if (poll successfully for interrupt): stay the same
  else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): shrink
  else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow

where halt_poll_ns_max is something reasonable, like 2 milliseconds.

You get diminishing returns from halt polling as the length of the
halt gets longer (halt polling only reduces halt latency by 10-15 us).
So there's little benefit to polling longer than a few milliseconds.
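
A minimal sketch of the three-way policy suggested above, written as
standalone C rather than as a patch against kvm_main.c. Everything
named here (struct halt_poll_cfg, example_cfg, adjust_halt_poll_ns(),
the start value and the exact grow/shrink arithmetic) is an assumption
made for illustration; only the stay/shrink/grow decision comes from
the text.

```c
#include <stdint.h>

/* All names and values below are hypothetical. */
struct halt_poll_cfg {
	unsigned int start_ns;  /* value to restart from when polling is at 0 */
	unsigned int max_ns;    /* halt_poll_ns_max, e.g. 2 ms */
	unsigned int grow;      /* multiplicative grow factor */
	unsigned int shrink;    /* divisor; 0 means "reset to 0" */
};

static const struct halt_poll_cfg example_cfg = {
	.start_ns = 10000,
	.max_ns   = 2000000,
	.grow     = 2,
	.shrink   = 0,
};

/*
 * cur:            current per-vcpu halt_poll_ns
 * poll_succeeded: non-zero if the interrupt arrived while polling
 * block_ns:       how long the vcpu was blocked in total
 */
unsigned int adjust_halt_poll_ns(unsigned int cur, int poll_succeeded,
				 uint64_t block_ns,
				 const struct halt_poll_cfg *cfg)
{
	uint64_t next;

	if (poll_succeeded)
		return cur;                     /* stay the same */

	if (block_ns > cfg->max_ns)         /* long halt: shrink */
		return cfg->shrink ? cur / cfg->shrink : 0;

	if (block_ns < cfg->max_ns) {       /* short halt: grow */
		if (cur == 0)
			return cfg->start_ns;       /* lets a value of 0 grow again */
		next = (uint64_t)cur * cfg->grow;  /* 64-bit to avoid overflow */
		return next > cfg->max_ns ? cfg->max_ns : (unsigned int)next;
	}

	return cur;                         /* block_ns == max_ns: unchanged */
}
```

The cur == 0 branch is one way to address the "how can it grow back
up" concern; clamping to max_ns reflects the point that polling longer
than a few milliseconds brings little benefit.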


 Signed-off-by: Wanpeng Li wanpeng...@hotmail.com
 ---
  virt/kvm/kvm_main.c | 81 
 -
  1 file changed, 80 insertions(+), 1 deletion(-)

 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index a122b52..bcfbd35 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -66,9 +66,28 @@
  MODULE_AUTHOR("Qumranet");
  MODULE_LICENSE("GPL");

 -static unsigned int halt_poll_ns;
 +#define KVM_HALT_POLL_NS  500000
 +#define KVM_HALT_POLL_NS_GROW   2
 +#define KVM_HALT_POLL_NS_SHRINK 0
 +#define KVM_HALT_POLL_NS_MAX \
 +   INT_MAX / KVM_HALT_POLL_NS_GROW
 +
 +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
  module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);

 +/* Default doubles per-vcpu halt_poll_ns. */
 +static int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
 +module_param(halt_poll_ns_grow, int, S_IRUGO);
 +
 +/* Default resets per-vcpu halt_poll_ns . */
 +int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
 +module_param(halt_poll_ns_shrink, int, S_IRUGO);
 +
 +/* Default is to compute the maximum so we can never overflow. */
 +unsigned int halt_poll_ns_actual_max = KVM_HALT_POLL_NS_MAX;
 +unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
 +module_param(halt_poll_ns_max, int, S_IRUGO);
 +
  /*
   * Ordering of locks:
   *
 @@ -1907,6 +1926,62 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, 
 gfn_t gfn)
  }
  EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);

 +static unsigned int __grow_halt_poll_ns(unsigned int val)
 +{
 +   if (halt_poll_ns_grow < 1)
 +           return halt_poll_ns;
 +
 +   val = min(val, halt_poll_ns_actual_max);
 +
 +   if (val == 0)
 +           return halt_poll_ns;
 +
 +   if (halt_poll_ns_grow < halt_poll_ns)
 +           val *= halt_poll_ns_grow;
 +   else
 +           val += halt_poll_ns_grow;
 +
 +   return val;
 +}
 +
 +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum)
 +{
 +   if (modifier < 1)
 +           return 0;
 +
 +   if (modifier < halt_poll_ns)
 +           val /= modifier;
 +   else
 +           val -= modifier;
 +
 +   return val;
 +}
 +
 +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 +{
 +   vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
 +}
 +
 +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
 +{
 +   vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
 +                   halt_poll_ns_shrink, halt_poll_ns);
 +}
 +
 +/*
 + * halt_poll_ns_actual_max is computed to be one grow_halt_poll_ns() below
 + * halt_poll_ns_max. (See __grow_halt_poll_ns for the reason.)
 + * This prevents overflows, because ple_halt_poll_ns is int.
 + * halt_poll_ns_max effectively rounded down to a multiple of 
 halt_poll_ns_grow in
 + * this process.
 + */
 +static void update_halt_poll_ns_actual_max(void)
 +{
 +   halt_poll_ns_actual_max =
 +   __shrink_halt_poll_ns(max(halt_poll_ns_max, halt_poll_ns),
 +   halt_poll_ns_grow, INT_MIN);
 +}
 +
  static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
  {
 if (kvm_arch_vcpu_runnable(vcpu)) {
 @@ -1941,6 +2016,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
  */
     if (kvm_vcpu_check_block(vcpu) < 0) {
             ++vcpu->stat.halt_successful_poll;
 +   

[PATCH v7 6/8] ACPI: add GTDT table parse driver into ACPI driver

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This driver adds support for parsing SBSA Generic Watchdog
Structure in GTDT, and creating a platform device with that
information. This allows the operating system to obtain device
data from the resource of platform device.

The platform device named "sbsa-gwdt" can be used by the
ARM SBSA Generic Watchdog driver.

Signed-off-by: Fu Wei fu@linaro.org
Signed-off-by: Hanjun Guo hanjun@linaro.org
---
 drivers/acpi/Kconfig  |   9 
 drivers/acpi/Makefile |   1 +
 drivers/acpi/gtdt.c   | 135 ++
 3 files changed, 145 insertions(+)

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 114cf48..2e7e162 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -479,4 +479,13 @@ config XPOWER_PMIC_OPREGION
 
 endif
 
+config ACPI_GTDT
+   bool "ACPI GTDT Support"
+   depends on ARM64
+   help
+ GTDT (Generic Timer Description Table) provides information
+ for per-processor timers and Platform (memory-mapped) timers
+ for ARM platforms. Select this option to provide information
+ needed for the timers init.
+
 endif  # ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 8321430..9a7966e 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -93,5 +93,6 @@ obj-$(CONFIG_ACPI_EXTLOG) += acpi_extlog.o
 obj-$(CONFIG_PMIC_OPREGION)+= pmic/intel_pmic.o
 obj-$(CONFIG_CRC_PMIC_OPREGION) += pmic/intel_pmic_crc.o
 obj-$(CONFIG_XPOWER_PMIC_OPREGION) += pmic/intel_pmic_xpower.o
+obj-$(CONFIG_ACPI_GTDT)+= gtdt.o
 
 video-objs += acpi_video.o video_detect.o
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
new file mode 100644
index 000..bbe3a2e
--- /dev/null
+++ b/drivers/acpi/gtdt.c
@@ -0,0 +1,135 @@
+/*
+ * ARM Specific GTDT table Support
+ *
+ * Copyright (C) 2015, Linaro Ltd.
+ * Author: Fu Wei fu@linaro.org
+ * Hanjun Guo hanjun@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
+{
+   int trigger, polarity;
+
+   if (!interrupt)
+   return 0;
+
+   trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
+                   : ACPI_LEVEL_SENSITIVE;
+
+   polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
+                   : ACPI_ACTIVE_HIGH;
+
+   return acpi_register_gsi(NULL, interrupt, trigger, polarity);
+}
+
+/*
+ * Initialize a SBSA generic Watchdog platform device info from GTDT
+ * According to SBSA specification the size of refresh and control
+ * frames of SBSA Generic Watchdog is SZ_4K(Offset 0x000 – 0xFFF).
+ */
+static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
+   int index)
+{
+   struct platform_device *pdev;
+   int irq = map_generic_timer_interrupt(wd->timer_interrupt,
+                                         wd->timer_flags);
+   struct resource res[] = {
+           DEFINE_RES_IRQ(irq),
+           DEFINE_RES_MEM(wd->control_frame_address, SZ_4K),
+           DEFINE_RES_MEM(wd->refresh_frame_address, SZ_4K),
+   };
+
+   pr_debug("GTDT: a Watchdog GT(0x%llx/0x%llx gsi:%u flags:0x%x)\n",
+            wd->refresh_frame_address, wd->control_frame_address,
+            wd->timer_interrupt, wd->timer_flags);
+
+   if (!(wd->refresh_frame_address &&
+         wd->control_frame_address &&
+         wd->timer_interrupt)) {
+           pr_err("GTDT: failed geting the device info.\n");
+           return -EINVAL;
+   }
+
+   if (irq < 0) {
+           pr_err("GTDT: failed to register GSI of the Watchdog GT.\n");
+           return -EINVAL;
+   }
+
+   /*
+    * Add a platform device named "sbsa-gwdt" to match the platform driver.
+    * "sbsa-gwdt": SBSA(Server Base System Architecture) Generic Watchdog
+    * The platform driver (like drivers/watchdog/sbsa_gwdt.c)can get device
+    * info below by matching this name.
+    */
+   pdev = platform_device_register_simple("sbsa-gwdt", index, res,
+                                          ARRAY_SIZE(res));
+   if (IS_ERR(pdev)) {
+           acpi_unregister_gsi(wd->timer_interrupt);
+           return PTR_ERR(pdev);
+   }
+
+   return 0;
+}
+
+static int __init gtdt_platform_timer_parse(struct acpi_table_header *table)
+{
+   struct acpi_gtdt_header *header;
+   struct acpi_table_gtdt *gtdt;
+   void *gtdt_subtable;
+   int i, gwdt_index;
+   int ret = 0;
+
+   if (table-revision  2) {
+  

[PATCH v7 4/8] Watchdog: introduce pretimeout into framework

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

Also update Documentation/watchdog/watchdog-kernel-api.txt to
introduce:
(1) the new elements in the watchdog_device and watchdog_ops structs;
(2) the new API watchdog_init_timeouts.

Reasons:
(1) the kernel already has two watchdog drivers that use pretimeout:
    drivers/char/ipmi/ipmi_watchdog.c
    drivers/watchdog/kempld_wdt.c (but the definition is different);
(2) some other drivers are going to use this: ARM SBSA Generic Watchdog.

Signed-off-by: Fu Wei fu@linaro.org
---
 Documentation/watchdog/watchdog-kernel-api.txt | 47 ++--
 drivers/watchdog/watchdog_core.c   | 98 ++
 drivers/watchdog/watchdog_dev.c| 53 ++
 include/linux/watchdog.h   | 39 --
 4 files changed, 200 insertions(+), 37 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt 
b/Documentation/watchdog/watchdog-kernel-api.txt
index d8b0d33..1fadeb9 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -53,6 +53,9 @@ struct watchdog_device {
unsigned int timeout;
unsigned int min_timeout;
unsigned int max_timeout;
+   unsigned int pretimeout;
+   unsigned int min_pretimeout;
+   unsigned int max_pretimeout;
void *driver_data;
struct mutex lock;
unsigned long status;
@@ -75,6 +78,9 @@ It contains following fields:
 * timeout: the watchdog timer's timeout value (in seconds).
 * min_timeout: the watchdog timer's minimum timeout value (in seconds).
 * max_timeout: the watchdog timer's maximum timeout value (in seconds).
+* pretimeout: the watchdog timer's pretimeout value (in seconds).
+* min_pretimeout: the watchdog timer's minimum pretimeout value (in seconds).
+* max_pretimeout: the watchdog timer's maximum pretimeout value (in seconds).
 * bootstatus: status of the device after booting (reported with watchdog
   WDIOF_* status bits).
 * driver_data: a pointer to the drivers private data of a watchdog device.
@@ -99,6 +105,7 @@ struct watchdog_ops {
int (*ping)(struct watchdog_device *);
unsigned int (*status)(struct watchdog_device *);
int (*set_timeout)(struct watchdog_device *, unsigned int);
+   int (*set_pretimeout)(struct watchdog_device *, unsigned int);
unsigned int (*get_timeleft)(struct watchdog_device *);
void (*ref)(struct watchdog_device *);
void (*unref)(struct watchdog_device *);
@@ -160,9 +167,19 @@ they are supported. These optional routines/operations are:
   and -EIO for could not write value to the watchdog. On success this
   routine should set the timeout value of the watchdog_device to the
   achieved timeout value (which may be different from the requested one
-  because the watchdog does not necessarily has a 1 second resolution).
+  because the watchdog does not necessarily has a 1 second resolution;
+  If the driver supports pretimeout, then the timeout value must be greater
+  than that).
   (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
   watchdog's info structure).
+* set_pretimeout: this routine checks and changes the pretimeout of the
+  watchdog timer device. It returns 0 on success, -EINVAL for parameter out of
+  range and -EIO for could not write value to the watchdog. On success this
+  routine should set the pretimeout value of the watchdog_device to the
+  achieved pretimeout value (which may be different from the requested one
+  because the watchdog does not necessarily has a 1 second resolution).
+  (Note: the WDIOF_PRETIMEOUT needs to be set in the options field of the
+  watchdog's info structure).
 * get_timeleft: this routines returns the time that's left before a reset.
 * ref: the operation that calls kref_get on the kref of a dynamically
   allocated watchdog_device struct.
@@ -226,8 +243,28 @@ extern int watchdog_init_timeout(struct watchdog_device 
*wdd,
   unsigned int timeout_parm, struct device 
*dev);
 
 The watchdog_init_timeout function allows you to initialize the timeout field
-using the module timeout parameter or by retrieving the timeout-sec property 
from
-the device tree (if the module timeout parameter is invalid). Best practice is
-to set the default timeout value as timeout value in the watchdog_device and
-then use this function to set the user preferred timeout value.
+using the module timeout parameter or by retrieving the first element of
+the timeout-sec property from the device tree (if the module timeout parameter
+is invalid). Best practice is to set the default timeout value as timeout value
+in the watchdog_device and then use this function to set the user preferred
+timeout value.
+This routine returns zero on success and a negative errno code for failure.
+
+Some watchdog timers have two stage of timeouts (timeout and pretimeout),
+to initialize the timeout and pretimeout fields at the 

[PATCH v7 7/8] Watchdog: enable ACPI GTDT support for ARM SBSA watchdog driver

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This patch enables ACPI GTDT support for ARM SBSA
watchdog driver automatically, if ACPI support is enabled.

Signed-off-by: Fu Wei fu@linaro.org
---
 drivers/watchdog/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index b2734f0..2719093 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -178,6 +178,7 @@ config ARM_SBSA_WATCHDOG
depends on ARM64
depends on ARM_ARCH_TIMER
select WATCHDOG_CORE
+   select ACPI_GTDT if ACPI
help
  ARM SBSA Generic Watchdog. This watchdog has two Watchdog timeouts.
  The first timeout will trigger a panic; the second timeout will
-- 
2.4.3



[PATCH v7 3/8] ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This can serve as an example of adding an SBSA Generic Watchdog device node
to the dts files of an SoC that contains an SBSA Generic Watchdog.

Acked-by: Arnd Bergmann a...@arndb.de
Acked-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Tested-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Signed-off-by: Fu Wei fu@linaro.org
---
 arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi 
b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
index 2874d92..259430f 100644
--- a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
+++ b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
@@ -84,6 +84,14 @@
                        clock-names = "uartclk", "apb_pclk";
                };
 
+               watchdog0: watchdog@e0bb0000 {
+                       compatible = "arm,sbsa-gwdt";
+                       reg = <0x0 0xe0bc0000 0 0x1000>,
+                             <0x0 0xe0bb0000 0 0x1000>;
+                       interrupts = <0 337 4>;
+                       timeout-sec = <10 5>;
+               };
+
                spi0: ssp@e1020000 {
                        status = "disabled";
                        compatible = "arm,pl022", "arm,primecell";
-- 
2.4.3



[PATCH v7 8/8] clocksource: simplify ACPI code in arm_arch_timer.c

2015-08-24 Thread fu . wei
From: Fu Wei fu@linaro.org

This patch updates the arm_arch_timer driver to use the function
provided by the new ACPI GTDT driver.
This simplifies arm_arch_timer.c and separates all the ACPI GTDT
knowledge from the timer driver.

Signed-off-by: Fu Wei fu@linaro.org
Signed-off-by: Hanjun Guo hanjun@linaro.org
---
 arch/arm64/kernel/time.c |  4 +--
 drivers/acpi/gtdt.c  | 43 ++
 drivers/clocksource/Kconfig  |  1 +
 drivers/clocksource/arm_arch_timer.c | 60 +++-
 include/clocksource/arm_arch_timer.h |  8 +
 include/linux/acpi.h |  5 +++
 include/linux/clocksource.h  |  4 +--
 7 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index 42f9195..2cabea6 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -75,9 +75,9 @@ void __init time_init(void)
 
/*
 * Since ACPI or FDT will only one be available in the system,
-* we can use acpi_generic_timer_init() here safely
+* we can use arch_timer_acpi_init() here safely
 */
-   acpi_generic_timer_init();
+   arch_timer_acpi_init();
 
arch_timer_rate = arch_timer_get_rate();
if (!arch_timer_rate)
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
index bbe3a2e..3559babf 100644
--- a/drivers/acpi/gtdt.c
+++ b/drivers/acpi/gtdt.c
@@ -17,6 +17,8 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 
+#include <clocksource/arm_arch_timer.h>
+
 static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
 {
int trigger, polarity;
@@ -33,6 +35,47 @@ static int __init map_generic_timer_interrupt(u32 interrupt, 
u32 flags)
return acpi_register_gsi(NULL, interrupt, trigger, polarity);
 }
 
+static struct arch_timer_data __initdata *arch_timer_data_p;
+
+static int __init arch_timer_data_init(struct acpi_table_header *table)
+{
+   struct acpi_table_gtdt *gtdt;
+
+   gtdt = container_of(table, struct acpi_table_gtdt, header);
+
+   arch_timer_data_p->phys_secure_ppi =
+           map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
+                                       gtdt->secure_el1_flags);
+
+   arch_timer_data_p->phys_nonsecure_ppi =
+           map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
+                                       gtdt->non_secure_el1_flags);
+
+   arch_timer_data_p->virt_ppi =
+           map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
+                                       gtdt->virtual_timer_flags);
+
+   arch_timer_data_p->hyp_ppi =
+           map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
+                                       gtdt->non_secure_el2_flags);
+
+   arch_timer_data_p->c3stop = !(gtdt->non_secure_el1_flags &
+                                 ACPI_GTDT_ALWAYS_ON);
+
+   return 0;
+}
+
+/* Initialize the arch_timer_data struct for arm_arch_timer by GTDT info */
+int __init gtdt_arch_timer_data_init(struct arch_timer_data *data)
+{
+   if (acpi_disabled || !data)
+   return -EINVAL;
+
+   arch_timer_data_p = data;
+
+   return acpi_table_parse(ACPI_SIG_GTDT, arch_timer_data_init);
+}
+
 /*
  * Initialize a SBSA generic Watchdog platform device info from GTDT
  * According to SBSA specification the size of refresh and control
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 4e57730..e111025 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -119,6 +119,7 @@ config CLKSRC_STM32
 config ARM_ARCH_TIMER
bool
select CLKSRC_OF if OF
+   select ACPI_GTDT if ACPI
 
 config ARM_ARCH_TIMER_EVTSTREAM
    bool "Support for ARM architected timer event stream generation"
diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index 0aa135d..99505bb 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -817,68 +817,30 @@ CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
                       arch_timer_mem_init);
 
 #ifdef CONFIG_ACPI
-static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
-{
-   int trigger, polarity;
-
-   if (!interrupt)
-   return 0;
-
-   trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
-                   : ACPI_LEVEL_SENSITIVE;
-
-   polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
-                   : ACPI_ACTIVE_HIGH;
-
-   return acpi_register_gsi(NULL, interrupt, trigger, polarity);
-}
-
 /* Initialize per-processor generic timer */
-static int __init arch_timer_acpi_init(struct acpi_table_header *table)
+void __init arch_timer_acpi_init(void)
 {
-   struct acpi_table_gtdt *gtdt;
+   struct arch_timer_data data;
 

Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes

2015-08-24 Thread Tejun Heo
Hello,

On Mon, Aug 24, 2015 at 10:51:50AM -0400, Tejun Heo wrote:
  Bah, I see the problem and indeed it was introduced by commit e79729123f639
  "writeback: don't issue wb_writeback_work if clean". The problem is that
  we bail out of sync_inodes_sb() if there is no dirty IO. Which is wrong
  because we have to wait for any outstanding IO (i.e. call wait_sb_inodes())
  regardless of dirty state! And that also explains why Tejun's patch fixes
  the problem because it backs out the change to the exit condition in
  sync_inodes_sb().
 
 Dang, I'm an idiot sandwich.

A question tho, so this means that an inode may contain dirty or
writeback pages w/o the inode being on one of the dirty lists.
Looking at the generic filesystem and writeback code, this doesn't
seem true in general.  Is this something xfs specific?

Thanks.

-- 
tejun


Re: [PATCH 01/10] irqchip: irq-mips-gic: export gic_send_ipi

2015-08-24 Thread Marc Zyngier
[adding Mark Rutland, as this is heading straight into uncharted DT
territory]

On 24/08/15 17:39, Qais Yousef wrote:
 On 08/24/2015 04:07 PM, Thomas Gleixner wrote:
 On Mon, 24 Aug 2015, Qais Yousef wrote:
 On 08/24/2015 02:32 PM, Marc Zyngier wrote:
 I'd rather see something more architected than this blind export, or
 at least some level of filtering (the idea random drivers can access
 such a low-level function doesn't make me feel very good).
 I don't know how to architect this better or how to perform the filtering,
 but I'm happy to hear suggestions and try them out.
 Keep in mind that detecting GIC and writing your own gic_send_ipi() is very
 simple. I have done this when the driver was out of tree. So restricting it by
 not exporting it will not prevent someone from really accessing the
 functionality, it's just they have to do it their own way.
 Keep in mind that we are not talking about out of tree hackery. We
 talk about a kernel code submission and I doubt that you will get
 away with a GIC detection/fiddling buried in your driver code.

 Keep in mind that just slapping an export to some random function is
 not much better than doing a GIC hack in the driver.

 Marc's concerns about blindly exposing IPI functionality to drivers are
 well justified and that kind of coprocessor stuff is not unique to
 your particular SoC. We're going to see such things more frequently in
 the not so distant future, so we better think now about proper
 solutions to that problem.
 
 Sure I'm not trying to argue against that.
 

 There are a couple of issues to solve:

 1) How is the IPI which is received by the coprocessor reserved in the
 system?

 2) How is it associated to a particular driver?
 
 Shouldn't 'interrupts' property in DT take care of these 2 questions? 
 Maybe we can give it an alias name to make it more readable that this 
 interrupt is requested for external IPI.

The interrupts property has a rather different meaning, and isn't
designed to hardcode IPIs. Also, this property describes an interrupt
from a device to the CPU, not the other way around (I imagine you also
have an interrupt coming from the AXD to the CPU, possibly using an IPI
too).

We can deal with these issues, but that's not something we can improvise.

What I had in mind was something fairly generic:
- interrupt-source: something generating an interrupt
- interrupt-sink: something being targeted by an interrupt

You could then express things like:

intc: interrupt-controller@1000 {
	interrupt-controller;
};

mydevice@f000 {
	interrupt-source = <&intc INT_SPEC 2 &inttarg1 &inttarg1>;
};

inttarg1: mydevice@f100 {
	interrupt-sink = <&intc HWAFFINITY1>;
};

inttarg2: cpu@1 {
	interrupt-sink = <&intc HWAFFINITY2>;
};

You could also imagine having CPUs being both source and sink.


 3) How do we ensure that a driver cannot issue random IPIs and can
 only send the associated ones?
 
 If we get the irq number from DT then I'm not sure how feasible it is to 
 implement a generic_send_ipi() function that takes this number to 
 generate an IPI.
 
 Do you think this approach would work?

If you follow the above approach, it should be pretty easy to derive a
source identifier and a sink identifier from the DT, and have the core
code route one to the other and do the right thing.

The source identifier could also be used to describe an IPI in a fairly
safe way (the target being fixed by DT, but the actual number used
dynamically allocated by the kernel).
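
The routing idea above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct irq_sink`, `struct irq_source`, `irq_source_send()` and `MAX_SINKS` are hypothetical names invented here, not existing kernel or DT interfaces. The point it tries to show is that a driver holding a source can only reach the sinks that the DT description bound to it, while the interrupt number itself is chosen by the core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: every name below is illustrative, not a kernel API. */
#define MAX_SINKS 4

struct irq_sink {
	unsigned int hw_affinity;	/* e.g. HWAFFINITY1 from the sink node */
	unsigned int pending;		/* number of interrupts delivered so far */
};

struct irq_source {
	unsigned int hwirq;		/* picked by the core, not by the driver */
	size_t nr_sinks;		/* number of valid entries in sinks[] */
	struct irq_sink *sinks[MAX_SINKS];	/* fixed by the DT description */
};

/*
 * Deliver an interrupt from src to its idx-th sink.
 * Returns 0 on success and -1 if the source has no such sink, so a
 * driver cannot target anything that was not bound to its source.
 */
static int irq_source_send(struct irq_source *src, size_t idx)
{
	assert(src != NULL);

	if (idx >= src->nr_sinks || idx >= MAX_SINKS || src->sinks[idx] == NULL)
		return -1;

	src->sinks[idx]->pending++;
	return 0;
}
```

A driver in this model never sees a raw IPI number; it only holds a source and an index into the sinks that were described for it, which is one possible answer to question 3.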

This is just a 10 minutes braindump, so feel free to throw rocks at it
and to come up with a better solution! :-)

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH] pci: acpi: Generic function for setting up PCI device DMA coherency

2015-08-24 Thread Bjorn Helgaas
Here it is again.

On Thu, Aug 13, 2015 at 6:50 PM, Bjorn Helgaas <bhelg...@google.com> wrote:
 Hi Suravee,

 On Thu, Aug 13, 2015 at 04:58:45PM +0700, Suravee Suthikulpanit wrote:
 This patch refactors of_pci_dma_configure() into a more generic
 pci_dma_configure(), which can be reused by non-OF code.
 Then, it adds support for setting up PCI device DMA coherency from
 ACPI _CCA object that should normally be specified in the DSDT node
 of its PCI host bridge.

 Since this does two things:
   1) Rename of_pci_dma_configure() and move it to PCI
   2) Add _CCA support,
 maybe it should be split into two patches?

 There are a couple more comments below.

 While looking at this, I thought some of the existing code could be
 made simpler and easier to follow.  I appended a couple possible patches;
 you can incorporate them or ignore them, whatever seems best to you.

 Bjorn

 Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>
 CC: Bjorn Helgaas <bhelg...@google.com>
 CC: Catalin Marinas <catalin.mari...@arm.com>
 CC: Will Deacon <will.dea...@arm.com>
 CC: Rafael J. Wysocki <r...@rjwysocki.net>
 CC: Rob Herring <robh...@kernel.org>
 CC: Murali Karicheri <m-kariche...@ti.com>
 ---
 Note: According to the ACPI spec, the _CCA attribute is required
   for ARM64. Therefore, this patch is a pre-req for ACPI PCI
   support for ARM64 which is currently in development.

   Also, this should not affect other architectures since
   if CCA is not required, the default value is coherent.
   Please see include/acpi/acpi_bus.h: acpi_check_dma() and
   drivers/acpi/scan.c: acpi_init_coherency() for more information

  drivers/of/of_pci.c| 20 
  drivers/pci/probe.c| 35 +--
  include/linux/of_pci.h |  3 ---
  3 files changed, 33 insertions(+), 25 deletions(-)

 diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
 index 5751dc5..b66ee4e 100644
 --- a/drivers/of/of_pci.c
 +++ b/drivers/of/of_pci.c
 @@ -117,26 +117,6 @@ int of_get_pci_domain_nr(struct device_node *node)
  }
  EXPORT_SYMBOL_GPL(of_get_pci_domain_nr);

 -/**
 - * of_pci_dma_configure - Setup DMA configuration
 - * @dev: ptr to pci_dev struct of the PCI device
 - *
 - * Function to update PCI devices's DMA configuration using the same
 - * info from the OF node of host bridge's parent (if any).
 - */
 -void of_pci_dma_configure(struct pci_dev *pci_dev)
 -{
 -	struct device *dev = &pci_dev->dev;
 -	struct device *bridge = pci_get_host_bridge_device(pci_dev);
 -
 -	if (!bridge->parent)
 -		return;
 -
 -	of_dma_configure(dev, bridge->parent->of_node);
 -	pci_put_host_bridge_device(bridge);
 -}
 -EXPORT_SYMBOL_GPL(of_pci_dma_configure);
 -
  #if defined(CONFIG_OF_ADDRESS)
  /**
   * of_pci_get_host_bridge_resources - Parse PCI host bridge resources from DT
 diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
 index cefd636..e2fcd3b 100644
 --- a/drivers/pci/probe.c
 +++ b/drivers/pci/probe.c
 @@ -6,12 +6,14 @@
  #include <linux/delay.h>
  #include <linux/init.h>
  #include <linux/pci.h>
 -#include <linux/of_pci.h>
 +#include <linux/of_device.h>
  #include <linux/pci_hotplug.h>
  #include <linux/slab.h>
  #include <linux/module.h>
  #include <linux/cpumask.h>
  #include <linux/pci-aspm.h>
 +#include <linux/acpi.h>
 +#include <linux/property.h>
  #include <asm-generic/pci-bridge.h>
  #include "pci.h"

 @@ -1544,6 +1546,35 @@ static void pci_init_capabilities(struct pci_dev *dev)
   pci_enable_acs(dev);
  }

 +/**
 + * pci_dma_configure - Setup DMA configuration
 + * @pci_dev: ptr to pci_dev struct of the PCI device
 + *
 + * Function to update PCI devices's DMA configuration using the same
 + * info from the OF node or ACPI node of host bridge's parent (if any).
 + */
 +static void pci_dma_configure(struct pci_dev *pci_dev)

 Almost all pci_dev pointers in probe.c are named dev, so I would use
 that for this one, too.  I probably would just drop the struct device
 *dev below and use &dev->dev in the two places you need it.  That's a
 common idiom in PCI.
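
For readers unfamiliar with the idiom, here is a small user-space sketch of it. The two structs are stand-ins with made-up fields, and `configure_generic()` and `pci_dma_configure_sketch()` are hypothetical names; none of this is the kernel's actual code. It only shows a `struct pci_dev` pointer named `dev`, with `&dev->dev` passed wherever the embedded generic device is needed.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel structures; fields are hypothetical. */
struct device {
	int dma_configured;
};

struct pci_dev {
	struct device dev;	/* generic device embedded in the PCI device */
};

/* Hypothetical helper that, like of_dma_configure(), takes a struct device. */
static void configure_generic(struct device *d)
{
	d->dma_configured = 1;
}

/* The pci_dev pointer is named dev; &dev->dev reaches the embedded device. */
static void pci_dma_configure_sketch(struct pci_dev *dev)
{
	assert(dev != NULL);
	configure_generic(&dev->dev);
}
```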

 +{
 +	struct device *dev = &pci_dev->dev;
 +	struct device *bridge = pci_get_host_bridge_device(pci_dev);
 +	struct acpi_device *adev;
 +	bool coherent;
 +
 +	if (has_acpi_companion(bridge)) {
 +		adev = to_acpi_node(bridge->fwnode);
 +		if (acpi_check_dma(adev, &coherent))
 +			arch_setup_dma_ops(dev, 0, 0, NULL, coherent);
 +	} else {
 +		struct device *host = bridge->parent;
 +		if (!host)
 +			return;
 +
 +		of_dma_configure(dev, host->of_node);
 +	}

 Why is this check reversed with respect to device_dma_is_coherent()?
 In device_dma_is_coherent(), we first look for an OF property, then look
 for ACPI _CCA.  But here we check for _CCA, then for OF.
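
To make the ordering question concrete, here is a user-space sketch of an OF-first, then-ACPI lookup. All types and names (`struct fake_of_node`, `struct fake_acpi_node`, `struct fake_bridge`, `bridge_dma_coherency()`) are hypothetical stand-ins, and the fallback when neither description exists is a simplification made for this example; it is not a claim about what the patch or device_dma_is_coherent() actually does in every case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for firmware descriptions; not kernel types. */
struct fake_of_node {
	bool dma_coherent;	/* models an OF coherency property */
};

struct fake_acpi_node {
	bool has_cca;		/* models whether a _CCA object is present */
	bool cca;		/* models the value _CCA evaluates to */
};

struct fake_bridge {
	const struct fake_of_node *of_node;	/* NULL if no OF description */
	const struct fake_acpi_node *acpi_node;	/* NULL if no ACPI companion */
};

/*
 * Look at the OF description first, then at ACPI _CCA.
 * Returns true and fills *coherent if either source describes coherency,
 * and returns false (leaving *coherent untouched) otherwise.
 */
static bool bridge_dma_coherency(const struct fake_bridge *b, bool *coherent)
{
	assert(b != NULL && coherent != NULL);

	if (b->of_node) {
		*coherent = b->of_node->dma_coherent;
		return true;
	}

	if (b->acpi_node && b->acpi_node->has_cca) {
		*coherent = b->acpi_node->cca;
		return true;
	}

	return false;
}
```

Keeping both call sites in the same order means a bridge that somehow carries both descriptions is resolved the same way everywhere.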

 +
 + pci_put_host_bridge_device(bridge);
 +}
 +
  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
  {
   int ret;
 @@ -1557,7 +1588,7 @@ void 
