[PATCH 4/13] scsi: arcmsr: replace constant ARCMSR_MAX_FREECCB_NUM by variable acb->maxFreeCCB that was got from firmware

2017-11-08 Thread Ching Huang
From: Ching Huang 

Replace the constant ARCMSR_MAX_FREECCB_NUM with the variable acb->maxFreeCCB,
whose value is derived from the queue depth reported by the firmware.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-08-04 11:19:22.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-08-04 17:07:52.0 +0800
@@ -836,6 +836,7 @@ struct AdapterControlBlock
atomic_t ante_token_value;
uint32_t maxOutstanding;
int vector_count;
+   uint32_t maxFreeCCB;
uint32_t doneq_index;
uint32_t ccbsize;
uint32_t in_doorbell;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-08 18:48:46.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-08 18:50:28.0 +0800
@@ -688,7 +688,7 @@ static int arcmsr_alloc_ccb_pool(struct 
acb->host->max_sectors = max_xfer_len/512;
acb->host->sg_tablesize = max_sg_entrys;
roundup_ccbsize = roundup(sizeof(struct CommandControlBlock) + 
(max_sg_entrys - 1) * sizeof(struct SG64ENTRY), 32);
-   acb->uncache_size = roundup_ccbsize * ARCMSR_MAX_FREECCB_NUM;
+   acb->uncache_size = roundup_ccbsize * acb->maxFreeCCB;
dma_coherent = dma_alloc_coherent(&pdev->dev, acb->uncache_size, 
&dma_coherent_handle, GFP_KERNEL);
if(!dma_coherent){
printk(KERN_NOTICE "arcmsr%d: dma_alloc_coherent got error\n", 
acb->host->host_no);
@@ -700,7 +700,7 @@ static int arcmsr_alloc_ccb_pool(struct 
acb->ccbsize = roundup_ccbsize;
ccb_tmp = dma_coherent;
acb->vir2phy_offset = (unsigned long)dma_coherent - (unsigned 
long)dma_coherent_handle;
-   for(i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++){
+   for(i = 0; i < acb->maxFreeCCB; i++){
cdb_phyaddr = dma_coherent_handle + offsetof(struct 
CommandControlBlock, arcmsr_cdb);
switch (acb->adapter_type) {
case ACB_ADAPTER_TYPE_A:
@@ -1431,7 +1431,7 @@ static void arcmsr_remove(struct pci_dev
 
arcmsr_abort_allcmd(acb);
arcmsr_done4abort_postqueue(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
struct CommandControlBlock *ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START) {
ccb->startdone = ARCMSR_CCB_ABORTED;
@@ -3243,6 +3243,9 @@ static bool arcmsr_get_firmware_spec(str
else
acb->maxOutstanding = acb->firm_numbers_queue - 1;
acb->host->can_queue = acb->maxOutstanding;
+   acb->maxFreeCCB = acb->host->can_queue;
+   if (acb->maxFreeCCB < ARCMSR_MAX_FREECCB_NUM)
+   acb->maxFreeCCB += 64;
return rtn;
 }
 
@@ -4265,7 +4268,7 @@ static uint8_t arcmsr_iop_reset(struct A
rtnval = arcmsr_abort_allcmd(acb);
/* clear all outbound posted Q */
arcmsr_done4abort_postqueue(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START) {
scsi_dma_unmap(ccb->pcmd);
@@ -4373,7 +4376,7 @@ static int arcmsr_abort(struct scsi_cmnd
}
 
intmask_org = arcmsr_disable_outbound_ints(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
struct CommandControlBlock *ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START && ccb->pcmd == cmd) {
ccb->startdone = ARCMSR_CCB_ABORTED;
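
The sizing rule added to arcmsr_get_firmware_spec() can be read in isolation. The following is a minimal sketch, not driver code: the helper names are hypothetical, and the value used for the cap is a stand-in, since the real ARCMSR_MAX_FREECCB_NUM is defined in arcmsr.h and does not appear in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for ARCMSR_MAX_FREECCB_NUM. The real value lives in arcmsr.h and
 * is not shown in this patch, so 1024 is only an assumption. */
#define SKETCH_MAX_FREECCB_NUM 1024u

/* Hypothetical helper mirroring the three lines added to
 * arcmsr_get_firmware_spec(): start from the queue depth derived from the
 * firmware and add 64 spare CCBs while the result is below the old cap. */
static uint32_t sketch_calc_max_free_ccb(uint32_t can_queue)
{
    uint32_t max_free_ccb = can_queue;

    if (max_free_ccb < SKETCH_MAX_FREECCB_NUM)
        max_free_ccb += 64;
    return max_free_ccb;
}

/* Hypothetical helper: size of the coherent DMA area that
 * arcmsr_alloc_ccb_pool() requests after this patch. */
static uint64_t sketch_calc_uncache_size(uint32_t roundup_ccbsize,
                                         uint32_t max_free_ccb)
{
    /* The driver rounds each CCB up to a multiple of 32 bytes. */
    assert(roundup_ccbsize % 32 == 0);
    return (uint64_t)roundup_ccbsize * max_free_ccb;
}
```

Under the assumed cap, a queue depth of 128 gives a pool of 192 CCBs, while a depth of 1000 gives 1064, which is larger than the old constant. Whether a depth that close to the cap can occur depends on limits elsewhere in the driver that this patch does not show.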




Re: [PATCH x86/urgent] bpf: emulate push insns for uprobe on x86

2017-11-08 Thread Yonghong Song



On 11/8/17 10:53 PM, Thomas Gleixner wrote:

On Wed, 8 Nov 2017, Yonghong Song wrote:

On 11/8/17 4:06 PM, David Miller wrote:

From: Yonghong Song 
Date: Wed, 8 Nov 2017 13:37:12 -0800


Uprobe is a tracing mechanism for userspace programs.
Typical uprobe will incur overhead of two traps.
First trap is caused by replaced trap insn, and
the second trap is to execute the original displaced
insn in user space.

   ...

I don't understand how this is bpf related, and if it is you don't
explain it well in the commit message.


Right. This is not related to bpf. Will remove the "bpf" from the subject line
in the next revision.


The proper subject is something like:

 [PATCH] uprobes/x86: ...


Thanks, Thomas,

I will fix the subject etc. Previously, I added x86/urgent because it is the
branch I tested on top of (similar to net-next). I will add that
information in the comments and re-submit.




which you can figure out by looking at the subsystem prefixes via

 git log arch/x86/kernel/uprobes.c

Note that it says [PATCH] and nothing else. That patch is a nice
performance improvement, but certainly not x86/urgent material. x86/urgent
is for bug and regression fixes.

Thanks,

tglx



[PATCH v4 1/5] perf/core: add PERF_RECORD_SAMPLE_SKID_IP record type

2017-11-08 Thread Stephane Eranian
This patch adds a new sample record type. The goal
is to record the interrupted instruction pointer (IP)
as seen by the kernel and reflected in the machine state (pt_regs).

On some architectures, it is possible to avoid the IP skid using
hardware support. For instance, on Intel x86, the use of PEBS helps
eliminate the skid on Haswell and later processors.

Without this patch, on Haswell processors, if you set:
 - attr.precise = 0, then you get the skid IP
 - attr.precise > 0, then you get the PEBS ip corrected for skid

The IP is normally recorded when the event has PERF_SAMPLE_IP set.
However, there are certain measurements where you need to have BOTH
the corrected IP and the skid IP. For instance, when studying branches,
the skid IP usually points to the target of the branch while the corrected
IP points to the branch instruction itself. Today, it is not possible to retrieve
both at the same time. This patch makes this possible by specifying
PERF_SAMPLE_IP|PERF_SAMPLE_SKID_IP.

Signed-off-by: Stephane Eranian 
---
 include/linux/perf_event.h      |  2 ++
 include/uapi/linux/perf_event.h |  4 +++-
 kernel/events/core.c            | 14 ++++++++++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 874b71a70058..772530501025 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -917,6 +917,7 @@ struct perf_sample_data {
u64 stack_user_size;
 
u64 phys_addr;
+   u64 skid_ip;
 } ____cacheline_aligned;
 
 /* default value for data source */
@@ -937,6 +938,7 @@ static inline void perf_sample_data_init(struct 
perf_sample_data *data,
data->weight = 0;
data->data_src.val = PERF_MEM_NA;
data->txn = 0;
+   data->skid_ip = 0; /* mark as uninitialized */
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 362493a2f950..48a65a90fcab 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -141,8 +141,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_TRANSACTION = 1U << 17,
PERF_SAMPLE_REGS_INTR   = 1U << 18,
PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
+   PERF_SAMPLE_SKID_IP = 1U << 20,
 
-   PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
+   PERF_SAMPLE_MAX = 1U << 21, /* non-ABI */
 };
 
 /*
@@ -817,6 +818,7 @@ enum perf_event_type {
 *  { u64   abi; # enum perf_sample_regs_abi
 *u64   regs[weight(mask)]; } && 
PERF_SAMPLE_REGS_INTR
 *  { u64   phys_addr;} && PERF_SAMPLE_PHYS_ADDR
+*  { u64   skid_ip;  } && PERF_SAMPLE_SKID_IP
 * };
 */
PERF_RECORD_SAMPLE  = 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0649a84204e6..40f2839c8b94 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1565,6 +1565,9 @@ static void __perf_event_header_size(struct perf_event 
*event, u64 sample_type)
if (sample_type & PERF_SAMPLE_PHYS_ADDR)
size += sizeof(data->phys_addr);
 
+   if (sample_type & PERF_SAMPLE_SKID_IP)
+   size += sizeof(data->skid_ip);
+
event->header_size = size;
 }
 
@@ -5934,6 +5937,9 @@ void perf_output_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_PHYS_ADDR)
perf_output_put(handle, data->phys_addr);
 
+   if (sample_type & PERF_SAMPLE_SKID_IP)
+   perf_output_put(handle, data->skid_ip);
+
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
 
@@ -5999,6 +6005,14 @@ void perf_prepare_sample(struct perf_event_header 
*header,
if (sample_type & PERF_SAMPLE_IP)
data->ip = perf_instruction_pointer(regs);
 
+   /*
+* if skid_ip has not been set by arch specific code, then
+* we initialize it to IP as interrupt-based sampling has
+* skid
+*/
+   if (!data->skid_ip && sample_type & PERF_SAMPLE_SKID_IP)
+   data->skid_ip = perf_instruction_pointer(regs);
+
if (sample_type & PERF_SAMPLE_CALLCHAIN) {
int size = 1;
 
-- 
2.7.4
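
The core-side behaviour of this patch is small enough to model in user space. The sketch below is illustrative only: the struct and function names are hypothetical stand-ins for perf_sample_data, perf_prepare_sample() and __perf_event_header_size(), the bit value for SKID_IP is the one proposed by this patch and defined locally, and regs_ip stands in for the result of perf_instruction_pointer(regs).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local definitions for the sketch. SKID_IP uses the bit proposed by this
 * patch; it is not taken from an installed <linux/perf_event.h>. */
#define SKETCH_PERF_SAMPLE_IP      (1ULL << 0)
#define SKETCH_PERF_SAMPLE_SKID_IP (1ULL << 20)

/* Trimmed-down stand-in for struct perf_sample_data. */
struct sketch_sample_data {
    uint64_t ip;
    uint64_t skid_ip; /* 0 means "not set by arch-specific code" */
};

/* Mirrors the hunk in perf_prepare_sample(): skid_ip falls back to the
 * interrupted IP only when arch code left it at zero. */
static void sketch_prepare_sample(struct sketch_sample_data *data,
                                  uint64_t sample_type, uint64_t regs_ip)
{
    assert(data != NULL);
    if (sample_type & SKETCH_PERF_SAMPLE_IP)
        data->ip = regs_ip;
    if (!data->skid_ip && (sample_type & SKETCH_PERF_SAMPLE_SKID_IP))
        data->skid_ip = regs_ip;
}

/* Mirrors the hunk in __perf_event_header_size(): each requested field
 * contributes one u64 to the sample record. */
static size_t sketch_sample_size(uint64_t sample_type)
{
    size_t size = 0;

    if (sample_type & SKETCH_PERF_SAMPLE_IP)
        size += sizeof(uint64_t);
    if (sample_type & SKETCH_PERF_SAMPLE_SKID_IP)
        size += sizeof(uint64_t);
    return size;
}
```

One consequence of using zero as the "unset" marker is that a skid IP of exactly zero cannot be distinguished from an unset one; the sketch inherits that property from the patch.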




[PATCH v4 3/5] perf/tools: add support for PERF_SAMPLE_SKID_IP

2017-11-08 Thread Stephane Eranian
This patch adds the support code to handle the PERF_SAMPLE_SKID_IP
record type. This is done as an event term and as such can be enabled
per event: cpu/event=xxx,skid-ip=1/. This is a boolean term which is
false by default.

Signed-off-by: Stephane Eranian 
---
 tools/include/uapi/linux/perf_event.h |  4 +++-
 tools/perf/util/event.h               |  1 +
 tools/perf/util/evsel.c               | 11 +++++++++++
 tools/perf/util/evsel.h               |  2 ++
 tools/perf/util/parse-events.c        |  7 +++++++
 tools/perf/util/parse-events.h        |  1 +
 tools/perf/util/parse-events.l        |  1 +
 tools/perf/util/session.c             |  3 +++
 8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index 362493a2f950..79655228dd9b 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -141,8 +141,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_TRANSACTION = 1U << 17,
PERF_SAMPLE_REGS_INTR   = 1U << 18,
PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
+   PERF_SAMPLE_SKID_IP = 1U << 20,
 
-   PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
+   PERF_SAMPLE_MAX = 1U << 21, /* non-ABI */
 };
 
 /*
@@ -817,6 +818,7 @@ enum perf_event_type {
 *  { u64   abi; # enum perf_sample_regs_abi
 *u64   regs[weight(mask)]; } && 
PERF_SAMPLE_REGS_INTR
 *  { u64   phys_addr;} && PERF_SAMPLE_PHYS_ADDR
+*  { u64   skid_ip; } && PERF_SAMPLE_SKID_IP
 * };
 */
PERF_RECORD_SAMPLE  = 9,
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1ae95efbfb95..41622a7ed649 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -202,6 +202,7 @@ struct perf_sample {
u32 raw_size;
u64 data_src;
u64 phys_addr;
+   u64 skid_ip;
u32 flags;
u16 insn_len;
u8  cpumode;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f894893c203d..679954ed2201 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -775,6 +775,10 @@ static void apply_config_terms(struct perf_evsel *evsel,
case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
attr->write_backward = term->val.overwrite ? 1 : 0;
break;
+   case PERF_EVSEL__CONFIG_TERM_SKID_IP:
+   if (term->val.skid_ip)
+   perf_evsel__set_sample_bit(evsel, SKID_IP);
+   break;
default:
break;
}
@@ -1478,6 +1482,7 @@ static void __p_sample_type(char *buf, size_t size, u64 
value)
bit_name(BRANCH_STACK), bit_name(REGS_USER), 
bit_name(STACK_USER),
bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC),
bit_name(WEIGHT), bit_name(PHYS_ADDR),
+   bit_name(SKID_IP),
{ .name = NULL, }
};
 #undef bit_name
@@ -2225,6 +2230,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, 
union perf_event *event,
array++;
}
 
+   data->skid_ip = 0;
+   if (type & PERF_SAMPLE_SKID_IP) {
+   data->skid_ip = *array;
+   array++;
+   }
+
return 0;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 9277df96ffda..8555095f0d48 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -49,6 +49,7 @@ enum {
PERF_EVSEL__CONFIG_TERM_OVERWRITE,
PERF_EVSEL__CONFIG_TERM_DRV_CFG,
PERF_EVSEL__CONFIG_TERM_BRANCH,
+   PERF_EVSEL__CONFIG_TERM_SKID_IP,
PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -66,6 +67,7 @@ struct perf_evsel_config_term {
bool inherit;
bool overwrite;
char *branch;
+   bool skid_ip;
} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index a7fcd95961ef..1a1d9fc509bd 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -918,6 +918,7 @@ static const char 
*config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
[PARSE_EVENTS__TERM_TYPE_OVERWRITE] = "overwrite",
[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]   = "no-overwrite",
[PARSE_EVENTS__TERM_TYPE_DRV_CFG]   = "driver-config",
+   [PARSE_EVENTS__TERM_TYPE_SKID_IP]   = "skid-ip",
 };
 
 static bool config_term_shrinked;
@@ -1026,6 +1027,9 @@ do {  
   \
case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
CHECK_TYPE_VAL(NUM);
break;
+   case 
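
On the tool side, the hunk in perf_evsel__parse_sample() reads the skid IP as one more u64 at the end of the sample payload. A simplified sketch of that decoding follows. It models only three of the sample fields (IP, PHYS_ADDR, SKID_IP) and assumes no other bits are set; the names are hypothetical, and the SKID_IP bit is the value proposed by this series, defined locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local bit definitions for the sketch; SKID_IP is the value proposed by
 * this series, not one from an installed header. */
#define SKETCH_PERF_SAMPLE_IP        (1ULL << 0)
#define SKETCH_PERF_SAMPLE_PHYS_ADDR (1ULL << 19)
#define SKETCH_PERF_SAMPLE_SKID_IP   (1ULL << 20)

/* Trimmed-down stand-in for struct perf_sample. */
struct sketch_sample {
    uint64_t ip;
    uint64_t phys_addr;
    uint64_t skid_ip;
};

/* Walk the u64 payload in sample_type bit order. Returns the number of
 * words consumed, or -1 if the buffer is shorter than the type implies. */
static int sketch_parse_sample(const uint64_t *array, size_t nr_words,
                               uint64_t type, struct sketch_sample *data)
{
    size_t pos = 0;

    assert(array != NULL && data != NULL);
    data->ip = 0;
    data->phys_addr = 0;
    data->skid_ip = 0; /* same default as the patch: 0 when absent */

    if (type & SKETCH_PERF_SAMPLE_IP) {
        if (pos >= nr_words)
            return -1;
        data->ip = array[pos++];
    }
    if (type & SKETCH_PERF_SAMPLE_PHYS_ADDR) {
        if (pos >= nr_words)
            return -1;
        data->phys_addr = array[pos++];
    }
    if (type & SKETCH_PERF_SAMPLE_SKID_IP) {
        if (pos >= nr_words)
            return -1;
        data->skid_ip = array[pos++];
    }
    return (int)pos;
}
```

The bounds check is an addition for the sketch; the hunk shown above dereferences the array directly.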

[PATCH v4 4/5] perf/record: add documentation for using PERF_SAMPLE_SKID_IP

2017-11-08 Thread Stephane Eranian
This patch adds documentation to describe how to use the skid
ip support with perf record. The sample type can be provided
per event as follows: pmu_instance/...,skid-ip=1/

For instance on Intel X86:

$ perf record -e cpu/event=0xc5,skid-ip=1/pp

does record the precise address of retired branches and their target.

Signed-off-by: Stephane Eranian 
---
 tools/perf/Documentation/perf-record.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 5a626ef666c2..f0e3636dc4be 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -57,6 +57,14 @@ OPTIONS
 FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
 "no" for disable callgraph.
  - 'stack-size': user stack size for dwarf mode
+ - 'skid-ip' : boolean, captures the unmodified interrupt instruction pointer
+   (IP) in each sample. Usually with event-based sampling, the IP
+   has skid and rarely points to the instruction which caused the
+   event to overflow. On some architectures, the hardware can eliminate
+   the skid and perf_events returns it as the IP when precise sampling is
+   enabled. But for certain measurements, it may be useful to have both
+   the corrected and skid IP. This option enables capturing the skid IP in
+   addition to the corrected IP. Default is: false
 
   See the linkperf:perf-list[1] man page for more parameters.
 
-- 
2.7.4



[PATCH v4 2/5] perf/x86: add PERF_SAMPLE_SKID_IP support for X86 PEBS

2017-11-08 Thread Stephane Eranian
This patch adds support for SKID_IP to Intel x86 processors in PEBS
mode.

Signed-off-by: Stephane Eranian 
---
 arch/x86/events/intel/ds.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3674a4b6f8bd..dd248ceda452 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1190,6 +1190,13 @@ static void setup_pebs_sample_data(struct perf_event 
*event,
x86_pmu.intel_cap.pebs_format >= 1)
data->addr = pebs->dla;
 
+   /*
+* unmodified, skid IP which is guaranteed to be the next
+* dynamic instruction
+*/
+   if (sample_type & PERF_SAMPLE_SKID_IP)
+   data->skid_ip = pebs->ip;
+
if (x86_pmu.intel_cap.pebs_format >= 2) {
/* Only set the TSX weight when no memory weight. */
if ((sample_type & PERF_SAMPLE_WEIGHT) && !fll)
-- 
2.7.4
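
To show how the two addresses relate in PEBS mode, here is a small sketch of the selection logic. It is a model under stated assumptions, not the code in ds.c: the struct and function names are hypothetical, and the real_ip field (an address without skid, used when the precise level is above 1) is an assumption about surrounding code that this hunk does not show. Only the assignment of pebs->ip to skid_ip comes from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a PEBS record. */
struct sketch_pebs_record {
    uint64_t ip;      /* IP as recorded by hardware, i.e. with skid */
    uint64_t real_ip; /* assumed: skidless address of the sampled insn */
};

/* The two addresses a sample would carry. */
struct sketch_pebs_result {
    uint64_t ip;      /* what PERF_SAMPLE_IP would report */
    uint64_t skid_ip; /* what PERF_SAMPLE_SKID_IP would report, 0 if unset */
};

/* Pick the reported IP from the precise level, and copy the unmodified
 * PEBS IP into skid_ip when the caller asked for it. */
static struct sketch_pebs_result
sketch_setup_pebs_sample(const struct sketch_pebs_record *pebs,
                         int precise_ip, int want_skid_ip)
{
    struct sketch_pebs_result res;

    assert(pebs != NULL);
    res.ip = (precise_ip > 1) ? pebs->real_ip : pebs->ip;
    res.skid_ip = want_skid_ip ? pebs->ip : 0;
    return res;
}
```

With precise level 2 and the skid IP requested, the two fields differ; with precise level 1 both carry the same address, so the extra field adds no information in that configuration.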



[PATCH v4 0/5] perf: add support for capturing skid IP

2017-11-08 Thread Stephane Eranian
This patch series adds a new sample record type called
PERF_SAMPLE_SKID_IP. The goal is to record the
unmodified interrupted instruction pointer (IP) as
seen by the kernel and reflected in the machine state.

On some architectures, it is possible to avoid the IP skid using
hardware support. For instance, on Intel x86, the use of PEBS helps
eliminate the skid on Haswell and later processors. On older Intel
processors, software (i.e., the kernel) may succeed in eliminating
the skid.

Without this patch, on Haswell processors, if you set:
 - attr.precise = 0, then you get the skid IP
 - attr.precise = 1, then you get the skid PEBS ip (off-by-1)
 - attr.precise = 2, then you get the skidless PEBS ip

The IP is captured when the event has PERF_SAMPLE_IP set in sample_type.
However, there are certain measurements where you need to have BOTH
the skidless IP and the skid IP. For instance, when studying branches,
the skid IP usually points to the target of the branch while the skidless
IP points to the branch instruction itself. Today, it is not possible to retrieve
both at the same time. This patch makes this possible by specifying
PERF_SAMPLE_IP|PERF_SAMPLE_SKID_IP.

As an example, consider the following code snippet:
 37.51   42c2ed:   je     42c2f3
         42c2ef:   add    $0x1,%rdx
         42c2f3:   sub    $0x1,%rax

When using PEBS (precise=2) and sampling on BR_INST_RETIRED.CONDITIONAL,
the IP always points to 0x42c2ed. With precise=1, the IP would point to
0x42c2f3. It is interesting to collect both IPs in a single run to determine
how often the conditional branch is taken vs. non-taken.

Understanding the skid is also interesting for other precise events.

In V2, we rebased to 10d94ff4d558 (v4.14-rc7).

In V3, code is rebased to 4.14-rc8, LKML comments have been integrated.
The new way to specify skid ip is per event:
   $ perf record -e cpu/event=0xc5,skid-ip=1/ 

In V4, we fix the documentation of the skid-ip event option and move a session.c
change to the correct patch as per Jiri's remark.

Stephane Eranian (5):
  perf/core: add PERF_RECORD_SAMPLE_SKID_IP record type
  perf/x86: add PERF_SAMPLE_SKID_IP support for X86 PEBS
  perf/tools: add support for PERF_SAMPLE_SKID_IP
  perf/record: add documentation for using PERF_SAMPLE_SKID_IP
  perf/script: add support for PERF_SAMPLE_SKID_IP

 arch/x86/events/intel/ds.c   |  7 +++
 include/linux/perf_event.h   |  2 ++
 include/uapi/linux/perf_event.h  |  4 +++-
 kernel/events/core.c | 14 ++
 tools/include/uapi/linux/perf_event.h|  4 +++-
 tools/perf/Documentation/perf-record.txt |  8 
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c  | 10 --
 tools/perf/util/event.h  |  1 +
 tools/perf/util/evsel.c  | 11 +++
 tools/perf/util/evsel.h  |  2 ++
 tools/perf/util/parse-events.c   |  7 +++
 tools/perf/util/parse-events.h   |  1 +
 tools/perf/util/parse-events.l   |  1 +
 tools/perf/util/session.c|  3 +++
 15 files changed, 72 insertions(+), 5 deletions(-)

-- 
2.7.4



[PATCH v4 5/5] perf/script: add support for PERF_SAMPLE_SKID_IP

2017-11-08 Thread Stephane Eranian
This patch adds a skid_ip field to perf script
to dump the raw value of the PERF_SAMPLE_SKID_IP
field in each sample.

$ perf script -F +ip,+skid_ip ..

The field is not enabled by default.

Signed-off-by: Stephane Eranian 
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c  | 10 --
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2811fcf684cb..96871bd3a576 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
 Comma separated list of fields to print. Options are:
 comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
 srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-brstackoff, callindent, insn, insnlen, synth, phys_addr.
+brstackoff, callindent, insn, insnlen, synth, phys_addr, skid_ip.
 Field list can be prepended with the type, trace, sw or hw,
 to indicate to which event type the field list applies.
 e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 68f36dc0344f..f00fc8c50f68 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -90,6 +90,7 @@ enum perf_output_field {
PERF_OUTPUT_SYNTH   = 1U << 25,
PERF_OUTPUT_PHYS_ADDR   = 1U << 26,
PERF_OUTPUT_UREGS   = 1U << 27,
+   PERF_OUTPUT_SKID_IP = 1U << 28,
 };
 
 struct output_option {
@@ -124,6 +125,7 @@ struct output_option {
{.str = "brstackoff", .field = PERF_OUTPUT_BRSTACKOFF},
{.str = "synth", .field = PERF_OUTPUT_SYNTH},
{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
+   {.str = "skid_ip", .field = PERF_OUTPUT_SKID_IP},
 };
 
 enum {
@@ -1563,7 +1565,11 @@ static void process_event(struct perf_script *script,
 
if (PRINT_FIELD(PHYS_ADDR))
fprintf(fp, "%16" PRIx64, sample->phys_addr);
-   fprintf(fp, "\n");
+
+   if (PRINT_FIELD(SKID_IP))
+   printf(" %"PRIx64" ", sample->skid_ip);
+
+   printf("\n");
 }
 
 static struct scripting_ops*scripting_ops;
@@ -2915,7 +2921,7 @@ int cmd_script(int argc, const char **argv)
 "Valid types: hw,sw,trace,raw,synth. "
 "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 "addr,symoff,period,iregs,uregs,brstack,brstacksym,flags,"
-    "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr",
+    "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr,skid_ip",
 parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
-- 
2.7.4
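
As a rough illustration of how a name such as "skid_ip" ends up as one bit in
the output mask, here is a self-contained sketch of a name-to-bit table with
"+field" / "-field" handling. It is not the perf script parser: the OUT_*
names and the field_bit() and apply_token() helpers are made up for this
example, and only the bit positions are taken from the diff above.

```c
#include <stddef.h>
#include <string.h>

/* Bit positions mirror the enum in the diff; the names are hypothetical. */
enum out_field {
	OUT_PHYS_ADDR = 1U << 26,
	OUT_UREGS     = 1U << 27,
	OUT_SKID_IP   = 1U << 28,
};

struct out_option {
	const char *str;
	unsigned int field;
};

static const struct out_option options[] = {
	{ "phys_addr", OUT_PHYS_ADDR },
	{ "uregs",     OUT_UREGS },
	{ "skid_ip",   OUT_SKID_IP },
};

/* Return the bit for a field name, or 0 if the name is unknown. */
static unsigned int field_bit(const char *name)
{
	for (size_t i = 0; i < sizeof(options) / sizeof(options[0]); i++) {
		if (strcmp(options[i].str, name) == 0)
			return options[i].field;
	}
	return 0;
}

/*
 * Apply one field token to the mask: "+name" or "name" sets the bit,
 * "-name" clears it. Returns 0 on success, -1 for an unknown field.
 */
static int apply_token(unsigned int *mask, const char *tok)
{
	int drop = (tok[0] == '-');
	const char *name = (tok[0] == '+' || tok[0] == '-') ? tok + 1 : tok;
	unsigned int bit = field_bit(name);

	if (!bit)
		return -1;
	if (drop)
		*mask &= ~bit;
	else
		*mask |= bit;
	return 0;
}
```

Because the field is off by default, a printing routine modeled on this would
emit the skid IP only after a "+skid_ip" token has set the bit.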



Re: [PATCH] mm: page_ext: check if page_ext is not prepared

2017-11-08 Thread Michal Hocko
Andrew,

On Thu 09-11-17 13:35:53, Joonsoo Kim wrote:
> On Wed, Nov 08, 2017 at 03:21:06PM +0100, Michal Hocko wrote:
> > On Wed 08-11-17 16:59:56, Joonsoo Kim wrote:
> > > On Tue, Nov 07, 2017 at 10:47:30AM +0100, Michal Hocko wrote:
[...]
> > > > I suspec this goes all the way down to when page_ext has been
> > > > resurrected.  It is quite interesting that nobody has noticed this in 3
> > > > years but maybe the feature is not used all that much and the HW has to
> > > > be quite special to trigger. Anyway the following should be added
> > > > 
> > > >  Fixes: eefa864b701d ("mm/page_ext: resurrect struct page extending 
> > > > code for debugging")
> > > >  Cc: stable
> > > 
> > > IIRC, caller of lookup_page_ext() doesn't check 'NULL' until
> > > f86e427197 ("mm: check the return value of lookup_page_ext for all
> > > call sites"). So, this problem would happen old kernel even if this
> > > patch is applied to old kernel.
> > 
> > OK, then the changelog should mention dependency on that check so that
> > anybody who backports this patch to pre 4.7 kernels knows to pull that
> > one as well.
> > 
> > > IMO, proper fix is to check all the pfn in the section. It is sent
> > > from Jaewon in other mail.
> > 
> > I believe that this patch is valuable on its own and the other one
> > should build on top of it.
> 
> Okay, agreed.

Could you add a note that stable backporters need to consider
f86e427197? Something like:

 Cc: stable # depends on f86e427197

Thanks
-- 
Michal Hocko
SUSE Labs


Re: [PATCH V13 08/10] mmc: block: blk-mq: Separate card polling from recovery

2017-11-08 Thread Adrian Hunter
On 08/11/17 11:30, Linus Walleij wrote:
> On Fri, Nov 3, 2017 at 2:20 PM, Adrian Hunter  wrote:
> 
>> Recovery is simpler to understand if it is only used for errors. Create a
>> separate function for card polling.
>>
>> Signed-off-by: Adrian Hunter 
> 
> This looks good but I can't see why it's not folded into
> patch 3 already. This error handling is introduced there.

What are you on about?  If we're going to split up the patches (which I
argued against - the new code is all new, so it could be read independently
from the old mess) then this is a logically distinct step.  Polling and
error-recovery are conceptually different things and it is important to
separate them to make the code easier to understand.


Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT

2017-11-08 Thread Jiri Olsa
On Wed, Nov 08, 2017 at 08:59:22AM -0800, Milind Chabbi wrote:
> On Wed, Nov 8, 2017 at 7:57 AM, Jiri Olsa  wrote:
> > On Wed, Nov 08, 2017 at 07:51:10AM -0800, Milind Chabbi wrote:
> >> On Wed, Nov 8, 2017 at 7:12 AM, Jiri Olsa  wrote:
> >>
> >> > > I am not able to fully understand your concern.
> >> > > Can you point to a code file and line related to your observation?
> >> > > The patch is modeled after the existing modify_user_hw_breakpoint() 
> >> > > function
> >> > > present in events/hw_breakpoint.c; don't you see this problem in that 
> >> > > code?
> >> >
> >> > the reserve_bp_slot/release_bp_slot functions manage
> >> > counts for current breakpoints based on its type
> >> >
> >> > those counts are cumulated in here:
> >> >   static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]);
> >> >
> >> > you allow to change the breakpoint type, so I'd expect
> >> > to see some code that release slot count for old type
> >> > and take new one (if it's available)
> >> >
> >> > jirka
> >>
> >>
> >> Why is this not a concern for modify_user_hw_breakpoint() function?
> >
> > I don't know ;-)
> >
> > jirka
> 
> 
> Jirka,
> 
> I carefully looked at bp_cpuinfo[] and nr_slots[] data structures.
> nr_slots[] is an array of length two (one slot of TYPE_INST and
> another for TYPE_DATA).
> The accounting "thinks" that there is one limit on the number of
> instruction breakpoints and another limit on the number of data
> breakpoints.
> The assumption is clearly broken; for example, on x86 there exists a
> limit on the *total* number of all breakpoints disregarding their kind
> and the code has failed to capture this aspect.

There is CONFIG_HAVE_MIXED_BREAKPOINTS_REGS, which puts DATA and INST
under one count, but it seems to be enabled only for:

arch/sh/Kconfig:select HAVE_MIXED_BREAKPOINTS_REGS
arch/x86/Kconfig:   select HAVE_MIXED_BREAKPOINTS_REGS

> 
> As such, modify_user_hw_breakpoint() makes no attempt to keep the
> counts correct. Instead, it simply tries to change and install a new
> breakpoint and fails if the hardware disallows.
> This can lead to a situation where, say on x86, someone creates 4
> TYPE_DATA breakpoints, then changes one of them to TYPE_INS via
> modify_user_hw_breakpoint() and then releases the TYPE_INS breakpoint.
> Since the accounting still thinks that there are four TYPE_DATA
> breakpoints, it will disallow creating a new TYPE_DATA breakpoint,
> although there is place for one TYPE_DATA breakpoint.
> 
> This convinces me that the problem and the solution are outside of
> this current patch.
> Do you agree?

I'll leave this decision to the maintainer ;-) but it seems better to fix
the interface before we add any new dependent function calls.

jirka
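
The accounting concern in this thread can be reproduced with a toy model of
per-type slot counters. The sketch below is not the kernel code: the toy_*
names, the limit of four slots per type, and the release rule (decrement the
counter of whatever type the breakpoint currently has) are assumptions made
for illustration. It contrasts a type change that skips the counters with one
that moves the slot from the old type to the new one.

```c
#include <stdbool.h>

enum bp_type { TYPE_INST = 0, TYPE_DATA, TYPE_MAX };

#define SLOTS_PER_TYPE 4 /* assumed limit for the toy model */

struct toy_bp {
	enum bp_type type;
	bool reserved;
};

/* Per-type usage counters, loosely standing in for bp_cpuinfo[TYPE_MAX]. */
static int used[TYPE_MAX];

static void toy_reset(void)
{
	for (int t = 0; t < TYPE_MAX; t++)
		used[t] = 0;
}

/* Take a slot of the given type; fails when that type is exhausted. */
static bool toy_reserve(struct toy_bp *bp, enum bp_type type)
{
	if (used[type] >= SLOTS_PER_TYPE)
		return false;
	used[type]++;
	bp->type = type;
	bp->reserved = true;
	return true;
}

/* Give the slot back, charged against the breakpoint's current type. */
static void toy_release(struct toy_bp *bp)
{
	if (!bp->reserved)
		return;
	used[bp->type]--;
	bp->reserved = false;
}

/* Type change without touching the counters: the accounting goes stale. */
static void toy_modify_naive(struct toy_bp *bp, enum bp_type new_type)
{
	bp->type = new_type;
}

/* Type change that releases the old slot and takes one of the new type. */
static bool toy_modify_accounted(struct toy_bp *bp, enum bp_type new_type)
{
	if (new_type == bp->type)
		return true;
	if (used[new_type] >= SLOTS_PER_TYPE)
		return false;
	used[bp->type]--;
	used[new_type]++;
	bp->type = new_type;
	return true;
}
```

With four TYPE_DATA breakpoints reserved, a naive change of one of them to
TYPE_INST followed by its release leaves the TYPE_DATA counter at four, so a
fifth reservation of TYPE_DATA is refused even though a slot is free. The
accounted variant leaves room for it.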


[PATCH 3/13] scsi: arcmsr: add codes for ACB_ADAPTER_TYPE_E to support new adapter ARC-1884

2017-11-08 Thread Ching Huang
From: Ching Huang 

add codes for ACB_ADAPTER_TYPE_E to support new adapter ARC-1884

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-08-03 18:54:46.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-08-04 11:19:22.0 +0800
@@ -65,6 +65,7 @@ struct device_attribute;
 #define ARCMSR_MAX_HBB_POSTQUEUE   264
 #define ARCMSR_MAX_ARC1214_POSTQUEUE   256
 #define ARCMSR_MAX_ARC1214_DONEQUEUE   257
+#define ARCMSR_MAX_HBE_DONEQUEUE   512
 #define ARCMSR_MAX_XFER_LEN    0x26000 /* 152K */
 #define ARCMSR_CDB_SG_PAGE_LENGTH  256
 #define ARCMST_NUM_MSIX_VECTORS    4
@@ -77,6 +78,9 @@ struct device_attribute;
 #ifndef PCI_DEVICE_ID_ARECA_1203
    #define PCI_DEVICE_ID_ARECA_1203    0x1203
 #endif
+#ifndef PCI_DEVICE_ID_ARECA_1884
+   #define PCI_DEVICE_ID_ARECA_1884    0x1884
+#endif
 /*
 
**
 **
@@ -405,6 +409,31 @@ struct FIRMWARE_INFO
 /*ARCMSR_HBAMU_MESSAGE_FIRMWARE_OK*/
 #define ARCMSR_ARC1214_MESSAGE_FIRMWARE_OK 0x8000
 #define ARCMSR_ARC1214_OUTBOUND_LIST_INTERRUPT_CLEAR   0x0001
+/* 
+***
+**SPEC. for Areca Type E adapter
+***
+*/
+#define ARCMSR_SIGNATURE_1884  0x188417D3
+
+#define ARCMSR_HBEMU_DRV2IOP_DATA_WRITE_OK 0x0002
+#define ARCMSR_HBEMU_DRV2IOP_DATA_READ_OK  0x0004
+#define ARCMSR_HBEMU_DRV2IOP_MESSAGE_CMD_DONE  0x0008
+
+#define ARCMSR_HBEMU_IOP2DRV_DATA_WRITE_OK 0x0002
+#define ARCMSR_HBEMU_IOP2DRV_DATA_READ_OK  0x0004
+#define ARCMSR_HBEMU_IOP2DRV_MESSAGE_CMD_DONE  0x0008
+
+#define ARCMSR_HBEMU_MESSAGE_FIRMWARE_OK   0x8000
+
+#define ARCMSR_HBEMU_OUTBOUND_DOORBELL_ISR 0x0001
+#define ARCMSR_HBEMU_OUTBOUND_POSTQUEUE_ISR0x0008
+#define ARCMSR_HBEMU_ALL_INTMASKENABLE 0x0009
+
+/* ARC-1884 doorbell sync */
+#define ARCMSR_HBEMU_DOORBELL_SYNC 0x100
+#define ARCMSR_ARC188X_RESET_ADAPTER   0x0004
+#define ARCMSR_ARC1884_DiagWrite_ENABLE0x0080
 /*
 ***
 **ARECA SCSI COMMAND DESCRIPTOR BLOCK size 0x1F8 (504)
@@ -614,6 +643,88 @@ struct MessageUnit_D {
u32 __iomem *msgcode_rwbuffer;  /* 0x2200 */
 };
 /*
+*
+** Messaging Unit (MU) of Type E processor(LSI)
+*
+*/
+struct MessageUnit_E{
+   uint32_t iobound_doorbell;   /* 0003*/
+   uint32_t write_sequence_3xxx;/*0004 0007*/
+   uint32_t host_diagnostic_3xxx;   /*0008 000B*/
+   uint32_t posted_outbound_doorbell;   /*000C 000F*/
+   uint32_t master_error_attribute; /*0010 0013*/
+   uint32_t master_error_address_low;   /*0014 0017*/
+   uint32_t master_error_address_high;  /*0018 001B*/
+   uint32_t hcb_size;   /*001C 001F*/
+   uint32_t inbound_doorbell;   /*0020 0023*/
+   uint32_t diagnostic_rw_data; /*0024 0027*/
+   uint32_t diagnostic_rw_address_low;  /*0028 002B*/
+   uint32_t diagnostic_rw_address_high; /*002C 002F*/
+   uint32_t host_int_status;/*0030 0033*/
+   uint32_t host_int_mask;  /*0034 0037*/
+   uint32_t dcr_data;   /*0038 003B*/
+   uint32_t dcr_address;/*003C 003F*/
+   uint32_t inbound_queueport;  /*0040 0043*/
+   uint32_t outbound_queueport; /*0044 0047*/
+   uint32_t hcb_pci_address_low;/*0048 004B*/
+   uint32_t hcb_pci_address_high;   /*004C 004F*/
+   uint32_t iop_int_status; /*0050 0053*/
+   uint32_t iop_int_mask;   /*0054 0057*/
+   uint32_t iop_inbound_queue_port; /*0058 005B*/
+   uint32_t iop_outbound_queue_port;/*005C 005F*/
+   uint32_t inbound_free_list_index;/*0060 0063*/
+   uint32_t inbound_post_list_index;/*0064 0067*/
+   uint32_t 

[PATCH] qlge: remove duplicated assignment to mbcp

2017-11-08 Thread Colin King
From: Colin Ian King 

The assignment to mbcp is identical to the initialized value assigned
to mbcp at declaration time a few lines earlier, hence we can remove the
second redundant assignment.  Cleans up clang warning:

drivers/net/ethernet/qlogic/qlge/qlge_mpi.c:209:22: warning:
Value stored to 'mbcp' during its initialization is never read

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/qlogic/qlge/qlge_mpi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_mpi.c b/drivers/net/ethernet/qlogic/qlge/qlge_mpi.c
index 384c8bc874f3..4be65d6761b3 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_mpi.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_mpi.c
@@ -213,7 +213,6 @@ static int ql_idc_req_aen(struct ql_adapter *qdev)
/* Get the status data and start up a thread to
 * handle the request.
 */
-   mbcp = &qdev->idc_mbc;
mbcp->out_count = 4;
status = ql_get_mb_sts(qdev, mbcp);
if (status) {
-- 
2.14.1



Re: [PATCH] cpuidle: Add "cpuidle.use_deepest" to bypass governor and allow HW to go deep

2017-11-08 Thread Vincent Guittot
Hi Len

On 9 November 2017 at 08:38, Len Brown  wrote:
> From: Len Brown 
>
> While there are several mechanisms (cmdline, sysfs, PM_QOS) to limit
> cpuidle to shallow idle states, there is no simple mechanism
> to give the hardware permission to enter the deeptest state permitted by 
> PM_QOS.

and by the per-device resume latency QoS

>
> Here we create the "cpuidle.use_deepest" modparam to provide this capability.
>
> "cpuidle.use_deepest=Y" can be set at boot-time, and
> /sys/module/cpuidle/use_deepest can be modified (with Y/N) at run-time.
>
> n.b.
>
> Within the constraints of PM_QOS, this mechanism gives the hardware
> permission to choose the deeptest power savings and highest latency
> state available.  And so choice will depend on the particular hardware.
>
> Also, if PM_QOS is not informed of latency constraints, it can't help.
> In that case, using this mechanism may result in entering high-latency
> states that impact performance.
>
> Signed-off-by: Len Brown 
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  4 
>  drivers/cpuidle/cpuidle.c   | 19 
>  include/linux/cpuidle.h |  7 ++
>  kernel/sched/idle.c | 30 
> +++--
>  4 files changed, 53 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 05496622b4ef..20f70de688bf 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -659,6 +659,10 @@
> cpuidle.off=1   [CPU_IDLE]
> disable the cpuidle sub-system
>
> +   cpuidle.use_deepest=Y   [CPU_IDLE]
> +   Ignore cpuidle governor, always choose deepest
> +   PM_QOS-legal CPU idle power saving state.
> +
> cpufreq.off=1   [CPU_FREQ]
> disable the cpufreq sub-system
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 484cc8909d5c..afee5aab7719 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -34,6 +34,7 @@ LIST_HEAD(cpuidle_detected_devices);
>
>  static int enabled_devices;
>  static int off __read_mostly;
> +static bool use_deepest __read_mostly;
>  static int initialized __read_mostly;
>
>  int cpuidle_disabled(void)
> @@ -116,6 +117,10 @@ void cpuidle_use_deepest_state(bool enable)
> preempt_enable();
>  }
>
> +bool cpuidle_using_deepest_state(void)
> +{
> +   return use_deepest;
> +}
>  /**
>   * cpuidle_find_deepest_state - Find the deepest available idle state.
>   * @drv: cpuidle driver for the given CPU.
> @@ -127,6 +132,19 @@ int cpuidle_find_deepest_state(struct cpuidle_driver 
> *drv,
> return find_deepest_state(drv, dev, UINT_MAX, 0, false);
>  }
>
> +/**
> + * cpuidle_find_deepest_state_qos - Find the deepest available idle state.
> + * @drv: cpuidle driver for the given CPU.
> + * @dev: cpuidle device for the given CPU.
> + * Honors PM_QOS
> + */
> +int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
> +  struct cpuidle_device *dev)
> +{
> +   return find_deepest_state(drv, dev,
> +   pm_qos_request(PM_QOS_CPU_DMA_LATENCY), 0, false);

You should also take into account the per-device resume latency, as the menu governor does:
dev_pm_qos_raw_read_value(device);
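
Something along these lines, as a standalone sketch rather than a tested
kernel change. The helper name effective_latency_req() is hypothetical, the
two parameters stand in for the values returned by
pm_qos_request(PM_QOS_CPU_DMA_LATENCY) and dev_pm_qos_raw_read_value(device),
and it assumes a per-device value of 0 means "no constraint":

```c
/*
 * Standalone model, not kernel code.
 * global_latency_us: stands in for pm_qos_request(PM_QOS_CPU_DMA_LATENCY).
 * dev_resume_latency_us: stands in for dev_pm_qos_raw_read_value(device).
 * Assumption: a per-device value of 0 means no per-device constraint.
 */
int effective_latency_req(int global_latency_us, int dev_resume_latency_us)
{
	int latency_req = global_latency_us;

	/* The tighter (smaller) of the two limits wins. */
	if (dev_resume_latency_us > 0 && dev_resume_latency_us < latency_req)
		latency_req = dev_resume_latency_us;

	return latency_req;
}
```

The result would then be passed to find_deepest_state() in place of the
global value alone.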


> +}
> +
>  #ifdef CONFIG_SUSPEND
>  static void enter_s2idle_proper(struct cpuidle_driver *drv,
> struct cpuidle_device *dev, int index)
> @@ -681,4 +699,5 @@ static int __init cpuidle_init(void)
>  }
>
>  module_param(off, int, 0444);
> +module_param(use_deepest, bool, 0644);
>  core_initcall(cpuidle_init);
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index 8f7788d23b57..e3c2c9d1898f 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -198,19 +198,26 @@ static inline struct cpuidle_device 
> *cpuidle_get_device(void) {return NULL; }
>  #ifdef CONFIG_CPU_IDLE
>  extern int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
>   struct cpuidle_device *dev);
> +extern int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
> + struct cpuidle_device *dev);
>  extern int cpuidle_enter_s2idle(struct cpuidle_driver *drv,
> struct cpuidle_device *dev);
>  extern void cpuidle_use_deepest_state(bool enable);
> +extern bool cpuidle_using_deepest_state(void);
>  #else
>  static inline int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
>  struct cpuidle_device *dev)
>  {return -ENODEV; }
> +static inline int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
> +struct 

[PATCH 0/3] VFS: name lookup improvements.

2017-11-08 Thread NeilBrown
These three patches address two issues: d_weak_revalidate and
path_mountpoint lookups.

The former is poorly defined and doesn't actually do the one thing
that it would be useful for it to do.  So the nfs implementation
is improved, the 9p one discarded, and the documentation clarified.

Given this change and the recent change to follow_automount(), the
mountpoint path lookup functions are no longer needed.  The regular
path lookup functions are quite sufficient.
The second and third patches remove them, with a detailed explanation
of why it is OK.

Thanks,
NeilBrown


---

NeilBrown (3):
  VFS/nfs/9p: revise meaning of d_weak_invalidate.
  VFS: remove user_path_mountpoint_at()
  VFS / autofs4: remove kern_path_mountpoint()


 Documentation/filesystems/porting |5 +
 Documentation/filesystems/vfs.txt |   11 +--
 fs/9p/vfs_dentry.c|1 
 fs/autofs4/dev-ioctl.c|5 -
 fs/internal.h |1 
 fs/namei.c|  150 -
 fs/namespace.c|2 
 fs/nfs/dir.c  |   60 ++-
 include/linux/namei.h |1 
 9 files changed, 24 insertions(+), 212 deletions(-)

--
Signature



[PATCH 1/3] VFS/nfs/9p: revise meaning of d_weak_invalidate.

2017-11-08 Thread NeilBrown
d_weak_revalidate() is called when a path lookup ends with
something other than a simple name.
This happens when it:
  - ends with "." or "..",
  - ends at a mountpoint (including "/"), or
  - ends at a procfs symlink.

In these cases, revalidating the name of the dentry is inappropriate
as the name wasn't used.  The comments suggest it is necessary to
revalidate the inode, but this is not the case.  Whatever operation
is called on the final dentry has the opportunity to revalidate it,
and will do so - certainly both nfs_getattr and v9fs_vfs_getattr do.

The one case where d_weak_revalidate() *is* needed is for an open()
when d_revalidate() performs some handling of LOOKUP_OPEN.  The same
handling should be performed for d_weak_revalidate().  NFS is the only
filesystem which handles LOOKUP_OPEN, but it doesn't do the handling
in d_weak_revalidate().

A consequence of this is that we do not get proper close-to-open semantics
for paths that do not end with a simple name.
This can easily be confirmed by changing directory to a non-root NFS directory
and running "echo *" while watching network traffic.
CTO semantics require the file attributes to be revalidated on each open,
but that doesn't happen.

d_weak_revalidate can use the same implementation as d_revalidate by
ensuring that implementation does nothing when LOOKUP_JUMPED is set
(implying d_weak_revalidate was called) and LOOKUP_OPEN is clear (implying
the inode isn't being opened).
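
The flag test described above can be sketched in isolation as follows.
This is only an illustration of the condition, not the code in fs/nfs/dir.c:
the function name and the SKETCH_ flag values are made up for the example,
and the real flags are defined in include/linux/namei.h.

```c
#include <stdbool.h>

/* Illustrative values for this sketch only, not the kernel's definitions. */
#define SKETCH_LOOKUP_OPEN   0x0100u
#define SKETCH_LOOKUP_JUMPED 0x1000u

/*
 * Returns true when a shared revalidate implementation has nothing to do:
 * the lookup jumped to the final dentry (so the name was not used) and
 * the dentry is not being opened.
 */
bool sketch_revalidate_is_noop(unsigned int flags)
{
	return (flags & SKETCH_LOOKUP_JUMPED) && !(flags & SKETCH_LOOKUP_OPEN);
}
```

With both flags set the function returns false, so the LOOKUP_OPEN handling
still runs for an open() of ".", "..", or a mountpoint.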

This patch:
  - removes d_weak_revalidate() from 9p as 9p doesn't handle LOOKUP_OPEN,
  - discards nfs_weak_revalidate,
  - uses nfs_revalidate for d_weak_revalidate, taking care to avoid the
unnecessary lookup when LOOKUP_JUMPED is set (but still handling
LOOKUP_OPEN correctly),
  - defines d_weak_revalidate for nfsv4 as well (this omission first led
me to examine d_weak_revalidate more closely),
  - removes special revalidation of the root directory in nfs_opendir()
as this is no longer needed,
  - updates some documentation.

Signed-off-by: NeilBrown 
---
 Documentation/filesystems/porting |5 +++
 Documentation/filesystems/vfs.txt |   11 ---
 fs/9p/vfs_dentry.c|1 -
 fs/nfs/dir.c  |   60 ++---
 4 files changed, 21 insertions(+), 56 deletions(-)

diff --git a/Documentation/filesystems/porting 
b/Documentation/filesystems/porting
index 93e0a2404532..f455757ff1c6 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -606,3 +606,8 @@ in your dentry operations instead.
dentry separately, and it now has request_mask and query_flags arguments
to specify the fields and sync type requested by statx.  Filesystems not
supporting any statx-specific features may ignore the new arguments.
+--
+[recommended]
+   ->d_weak_revalidate should perform the same handling of LOOKUP_OPEN as
+   ->d_revalidate.  If LOOKUP_OPEN is not set, d_weak_revalidate need not
+   do anything.
diff --git a/Documentation/filesystems/vfs.txt 
b/Documentation/filesystems/vfs.txt
index 5fd325df59e2..c2025e226b29 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -1015,14 +1015,15 @@ struct dentry_operations {
doing a lookup in the parent directory. This includes "/", "." and "..",
as well as procfs-style symlinks and mountpoint traversal.
 
-   In this case, we are less concerned with whether the dentry is still
-   fully correct, but rather that the inode is still valid. As with
-   d_revalidate, most local filesystems will set this to NULL since their
-   dcache entries are always valid.
+   Filesystems only need this if they handle LOOKUP_OPEN in
+   d_revalidate, in which case the same handling should be applied
+   in d_weak_revalidate.  When LOOKUP_OPEN is not set,
+   d_weak_revalidate can safely be a no-op.
 
This function has the same return code semantics as d_revalidate.
 
-   d_weak_revalidate is only called after leaving rcu-walk mode.
+   d_weak_revalidate is only called after leaving rcu-walk mode,
+   so LOOKUP_RCU is never set.
 
   d_hash: called when the VFS adds a dentry to the hash table. The first
dentry passed to d_hash is the parent directory that the name is
diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index bd456c668d39..99eaf3c6d44c 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -111,7 +111,6 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, 
unsigned int flags)
 
 const struct dentry_operations v9fs_cached_dentry_operations = {
.d_revalidate = v9fs_lookup_revalidate,
-   .d_weak_revalidate = v9fs_lookup_revalidate,
.d_delete = v9fs_cached_dentry_delete,
.d_release = v9fs_dentry_release,
 };
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 5ceaeb1f6fb6..fc349577526f 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -118,13 +118,6 @@ 

[PATCH 3/3] VFS / autofs4: remove kern_path_mountpoint()

2017-11-08 Thread NeilBrown
kern_path_mountpoint() is only called from autofs4 to perform
lookups which need to identify autofs4 mount points.

Many of the differences between kern_path() and kern_path_mountpoint()
are related to the fact that we will never use O_CREAT with the
latter, and don't need to "open" the target.
The main differences that could be relevant to autofs4 are:
- kern_path_mountpoint() does not call complete_walk() in
  mountpoint_last(), contrasting with do_last() which does call it.
  This means ->d_weak_revalidate() is not called from autofs4.

- follow_managed() is not called from mountpoint_last().

- LOOKUP_NO_REVAL is used for lookup_slow on the last component,
  if it isn't in cache.

As ->d_weak_revalidate() is now a no-op when LOOKUP_OPEN isn't
present, the first difference is no longer important.

The use of LOOKUP_NO_REVAL shouldn't cause autofs4 any problems
as no autofs4 dentry has ->d_revalidate().

follow_managed() might:
 a/ call ->d_manage()
 b/ cross a mountpoint
 c/ call follow_automount()

'b' cannot be relevant as path_mountpoint calls follow_mount() after
mountpoint_last() is called.

'a' might only be interesting when ->d_manage is autofs4_d_manage(),
but autofs4 only calls kern_path_mountpoint from ioctls issued by the
automount daemon, and autofs4_d_manage() will exit quickly in that
case.  So there is no risk of autofs4_d_manage() waiting for the
automount daemon (which it would be blocking) and causing a deadlock.

'c' could have been a problem before commit 42f461482178 ("autofs: fix
AT_NO_AUTOMOUNT not being honored").  Prior to that commit a lookup
for a negative autofs4 dentry could trigger an automount, even though
'flags' is 0.  Since that commit an error is returned instead.

So follow_managed() is no longer a problem.

So there is no reason that autofs4 needs to use kern_path_mountpoint()
any more.  It cannot deadlock.
So the whole 'path mountpoint' infrastructure can be discarded.

Signed-off-by: NeilBrown 
---
 fs/autofs4/dev-ioctl.c |5 +-
 fs/namei.c |  129 
 include/linux/namei.h  |1 
 3 files changed, 2 insertions(+), 133 deletions(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index b7c816f39404..716c44593117 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -209,7 +209,7 @@ static int find_autofs_mount(const char *pathname,
struct path path;
int err;
 
-   err = kern_path_mountpoint(AT_FDCWD, pathname, &path, 0);
+   err = kern_path(pathname,0 , &path);
if (err)
return err;
err = -ENOENT;
@@ -547,8 +547,7 @@ static int autofs_dev_ioctl_ismountpoint(struct file *fp,
 
if (!fp || param->ioctlfd == -1) {
if (autofs_type_any(type))
-   err = kern_path_mountpoint(AT_FDCWD,
-  name, &path, LOOKUP_FOLLOW);
+   err = kern_path(name, LOOKUP_FOLLOW, &path);
else
err = find_autofs_mount(name, &path,
test_by_type, &type);
diff --git a/fs/namei.c b/fs/namei.c
index 6639203d7eba..e90680a3f6f1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2601,135 +2601,6 @@ int user_path_at_empty(int dfd, const char __user 
*name, unsigned flags,
 }
 EXPORT_SYMBOL(user_path_at_empty);
 
-/**
- * mountpoint_last - look up last component for umount
- * @nd:   pathwalk nameidata - currently pointing at parent directory of "last"
- *
- * This is a special lookup_last function just for umount. In this case, we
- * need to resolve the path without doing any revalidation.
- *
- * The nameidata should be the result of doing a LOOKUP_PARENT pathwalk. Since
- * mountpoints are always pinned in the dcache, their ancestors are too. Thus,
- * in almost all cases, this lookup will be served out of the dcache. The only
- * cases where it won't are if nd->last refers to a symlink or the path is
- * bogus and it doesn't exist.
- *
- * Returns:
- * -error: if there was an error during lookup. This includes -ENOENT if the
- * lookup found a negative dentry.
- *
- * 0:  if we successfully resolved nd->last and found it to not to be a
- * symlink that needs to be followed.
- *
- * 1:  if we successfully resolved nd->last and found it to be a symlink
- * that needs to be followed.
- */
-static int
-mountpoint_last(struct nameidata *nd)
-{
-   int error = 0;
-   struct dentry *dir = nd->path.dentry;
-   struct path path;
-
-   /* If we're in rcuwalk, drop out of it to handle last component */
-   if (nd->flags & LOOKUP_RCU) {
-   if (unlazy_walk(nd))
-   return -ECHILD;
-   }
-
-   nd->flags &= ~LOOKUP_PARENT;
-
-   if (unlikely(nd->last_type != LAST_NORM)) {
-   error = handle_dots(nd, nd->last_type);
-   if (error)
-   return error;
-  

[PATCH 2/3] VFS: remove user_path_mountpoint_at()

2017-11-08 Thread NeilBrown
Now that d_weak_revalidate doesn't revalidate the inode (unless
LOOKUP_OPEN is set), we don't need any extra care when umounting.
A simple user_path_at() will find the desired dentry without
performing any access on the mounted filesystems.
So we don't need user_path_mountpoint_at().

Switching to user_path_at() brings other changes besides the
d_weak_revalidate() change:
- We no longer use LOOKUP_NO_REVAL on the last component, in the
  unlikely case that d_lookup() failed.  It is hard to see why
  this is relevant.  Most likely if d_lookup() failed we will
  have called i_op->lookup, which is at least as much of a problem
  as d_revalidate() might be.
- We now call follow_managed() on the final dentry.  This cannot
  trigger an automount, due to the flags used, but might call
  ->d_manage().  There is no reason to expect that this might
  cause problems.

So we can safely switch to user_path_at() and discard
user_path_mountpoint_at().

kern_path_mountpoint() is still in use by autofs, and so cannot go
just yet.

Signed-off-by: NeilBrown 
---
 fs/internal.h  |1 -
 fs/namei.c |   21 -
 fs/namespace.c |2 +-
 3 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 48cee21b4f14..d1228af28761 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -52,7 +52,6 @@ extern void __init chrdev_init(void);
 /*
  * namei.c
  */
-extern int user_path_mountpoint_at(int, const char __user *, unsigned int, 
struct path *);
 extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
   const char *, unsigned int, struct path *);
 
diff --git a/fs/namei.c b/fs/namei.c
index ed8b9488a890..6639203d7eba 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2722,27 +2722,6 @@ filename_mountpoint(int dfd, struct filename *name, 
struct path *path,
return error;
 }
 
-/**
- * user_path_mountpoint_at - lookup a path from userland in order to umount it
- * @dfd:   directory file descriptor
- * @name:  pathname from userland
- * @flags: lookup flags
- * @path:  pointer to container to hold result
- *
- * A umount is a special case for path walking. We're not actually interested
- * in the inode in this situation, and ESTALE errors can be a problem. We
- * simply want track down the dentry and vfsmount attached at the mountpoint
- * and avoid revalidating the last component.
- *
- * Returns 0 and populates "path" on success.
- */
-int
-user_path_mountpoint_at(int dfd, const char __user *name, unsigned int flags,
-   struct path *path)
-{
-   return filename_mountpoint(dfd, getname(name), path, flags);
-}
-
 int
 kern_path_mountpoint(int dfd, const char *name, struct path *path,
unsigned int flags)
diff --git a/fs/namespace.c b/fs/namespace.c
index 23cdf6c62895..6de22f658359 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1696,7 +1696,7 @@ SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
if (!(flags & UMOUNT_NOFOLLOW))
lookup_flags |= LOOKUP_FOLLOW;
 
-   retval = user_path_mountpoint_at(AT_FDCWD, name, lookup_flags, &path);
+   retval = user_path_at(AT_FDCWD, name, lookup_flags, &path);
if (retval)
goto out;
mnt = real_mount(path.mnt);




Re: [PATCH] mm, vmstat: Make sure mutex is a global static

2017-11-08 Thread Michal Hocko
On Wed 08-11-17 07:21:20, Kees Cook wrote:
> On Tue, Nov 7, 2017 at 11:43 PM, Vlastimil Babka  wrote:
> > On 11/07/2017 10:38 PM, Kees Cook wrote:
[...]
> >> +static DEFINE_MUTEX(vm_numa_stat_lock);
> >> +
> >>  int sysctl_vm_numa_stat_handler(struct ctl_table *table, int write,
> >>   void __user *buffer, size_t *length, loff_t *ppos)
> >>  {
> >>   int ret, oldval;
> >> - DEFINE_MUTEX(vm_numa_stat_lock);
> >
> > Yeah it was Michal who suggested scoping the mutex here instead of
> > global scope, but I think he didn't mean to remove the 'static'
> > qualifier, and we both missed that in the review :(
> > So the scope under sysctl_vm_numa_stat_handler() should be okay, just
> > with the 'static' added.
> 
> That part is a matter of taste, I guess. :) But yes, static is important.

The primary reason I prefer a function-scope lock is that it is less
tempting to reuse the lock for something else that way. But this is
hardly something to insist on. So I am OK with the file scope as well.
-- 
Michal Hocko
SUSE Labs
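
To illustrate why the 'static' qualifier matters for a lock declared at
function scope, here is a minimal userspace sketch. It uses a POSIX mutex
as a stand-in for the kernel's DEFINE_MUTEX(), and the names
update_setting(), shared_setting and setting_lock are made up for this
example; it is not the vmstat code.

```c
#include <pthread.h>

/* Hypothetical shared state; stands in for the value the handler protects. */
static int shared_setting;

/* Store newval and return the previous value, serialized by one shared lock. */
int update_setting(int newval)
{
	/*
	 * 'static' gives the lock static storage duration, so every caller
	 * locks the same object. Without 'static', each call would create a
	 * fresh lock on its own stack and concurrent callers would never
	 * exclude each other.
	 */
	static pthread_mutex_t setting_lock = PTHREAD_MUTEX_INITIALIZER;
	int oldval;

	pthread_mutex_lock(&setting_lock);
	oldval = shared_setting;
	shared_setting = newval;
	pthread_mutex_unlock(&setting_lock);

	return oldval;
}
```

Keeping the lock inside the function, as preferred above, also means no
other function in the file can take it by accident.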


Re: [PATCH 01/11] Initialize the mapping of KASan shadow memory

2017-11-08 Thread Liuwenliang (Abbott Liu)
On 12/10/17 15:59, Marc Zyngier [mailto:marc.zyng...@arm.com] wrote:
> On 11/10/17 09:22, Abbott Liu wrote:
>> diff --git a/arch/arm/include/asm/proc-fns.h 
>> b/arch/arm/include/asm/proc-fns.h
>> index f2e1af4..6e26714 100644
>> --- a/arch/arm/include/asm/proc-fns.h
>> +++ b/arch/arm/include/asm/proc-fns.h
>> @@ -131,6 +131,15 @@ extern void cpu_resume(void);
>>  pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);  \
>>  (pgd_t *)phys_to_virt(pg);  \
>>  })
>> +
>> +#define cpu_set_ttbr0(val)  \
>> +do {\
>> +u64 ttbr = val; \
>> +__asm__("mcrr   p15, 0, %Q0, %R0, c2"   \
>> +: : "r" (ttbr));\
>> +} while (0)
>> +
>> +
>>  #else
>>  #define cpu_get_pgd()   \
>>  ({  \
>> @@ -140,6 +149,30 @@ extern void cpu_resume(void);
>>  pg &= ~0x3fff;  \
>>  (pgd_t *)phys_to_virt(pg);  \
>>  })
>> +
>> +#define cpu_set_ttbr(nr, val)   \
>> +do {\
>> +u64 ttbr = val; \
>> +__asm__("mcrp15, 0, %0, c2, c0, 0"  \
>> +: : "r" (ttbr));\
>> +} while (0)
>> +
>> +#define cpu_get_ttbr(nr)\
>> +({  \
>> +unsigned long ttbr; \
>> +__asm__("mrcp15, 0, %0, c2, c0, 0"  \
>> +: "=r" (ttbr)); \
>> +ttbr;   \
>> +})
>> +
>> +#define cpu_set_ttbr0(val)  \
>> +do {\
>> +u64 ttbr = val; \
>> +__asm__("mcrp15, 0, %0, c2, c0, 0"  \
>> +: : "r" (ttbr));\
>> +} while (0)
>> +
>> +
>
>You could instead lift and extend the definitions provided in kvm_hyp.h,
>and use the read_sysreg/write_sysreg helpers defined in cp15.h.

Thanks for your review.
I extended the definitions of TTBR0/TTBR1/PAR in kvm_hyp.h for the case
where CONFIG_ARM_LPAE is not defined.
Because the Cortex-A9 doesn't support virtualization, I use CONFIG_ARM_LPAE
to exclude the functions and macros that are only used for virtualization.

Here is the code which I tested on vexpress_a15 and vexpress_a9:

diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 14b5903..2592608 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -19,12 +19,14 @@
 #define __ARM_KVM_HYP_H__

 #include 
-#include 
 #include 
+
+#ifdef CONFIG_ARM_LPAE
+#include 
 #include 
 #include 
-
 #define __hyp_text __section(.hyp.text) notrace
+#endif

 #define __ACCESS_VFP(CRn)  \
"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
@@ -37,12 +39,18 @@
__val;  \
 })

+#ifdef CONFIG_ARM_LPAE
 #define TTBR0  __ACCESS_CP15_64(0, c2)
 #define TTBR1  __ACCESS_CP15_64(1, c2)
 #define VTTBR  __ACCESS_CP15_64(6, c2)
 #define PAR    __ACCESS_CP15_64(0, c7)
 #define CNTV_CVAL  __ACCESS_CP15_64(3, c14)
 #define CNTVOFF    __ACCESS_CP15_64(4, c14)
+#else
+#define TTBR0   __ACCESS_CP15(c2, 0, c0, 0)
+#define TTBR1   __ACCESS_CP15(c2, 0, c0, 1)
+#define PAR  __ACCESS_CP15(c7, 0, c4, 0)
+#endif

 #define MIDR   __ACCESS_CP15(c0, 0, c0, 0)
 #define CSSELR __ACCESS_CP15(c0, 2, c0, 0)
@@ -98,6 +106,7 @@
 #define cntvoff_el2 CNTVOFF
 #define cnthctl_el2 CNTHCTL

+#ifdef CONFIG_ARM_LPAE
 void __timer_save_state(struct kvm_vcpu *vcpu);
 void __timer_restore_state(struct kvm_vcpu *vcpu);

@@ -123,5 +132,6 @@ void __hyp_text __banked_restore_state(struct 
kvm_cpu_context *ctxt);
 asmlinkage int __guest_enter(struct kvm_vcpu *vcpu,
 struct kvm_cpu_context *host);
 asmlinkage int __hyp_do_panic(const char *, int, u32);
+#endif

 #endif /* __ARM_KVM_HYP_H__ */
diff --git a/arch/arm/mm/kasan_init.c b/arch/arm/mm/kasan_init.c
index 049ee0a..359a782 100644
--- a/arch/arm/mm/kasan_init.c
+++ b/arch/arm/mm/kasan_init.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "mm.h"
@@ -203,16 +204,16 @@ void __init kasan_init(void)
u64 orig_ttbr0;
int i;

-   orig_ttbr0 = cpu_get_ttbr(0);
+ orig_ttbr0 = read_sysreg(TTBR0);

 #ifdef CONFIG_ARM_LPAE
memcpy(tmp_pmd_table, 
pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_START)), 

Re: [PATCH 23/31] nds32: Device tree support

2017-11-08 Thread Greentime Hu
2017-11-08 17:53 GMT+08:00 Arnd Bergmann :
> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu  wrote:
>> From: Greentime Hu 
>>
>> Signed-off-by: Vincent Chen 
>> Signed-off-by: Greentime Hu 
>> ---
>>  arch/nds32/boot/dts/Makefile   |8 ++
>>  arch/nds32/boot/dts/ae3xx.dts  |   55 
>>  arch/nds32/boot/dts/ag101p.dts |   60 
>> 
>>  arch/nds32/kernel/devtree.c|   45 ++
>>  4 files changed, 168 insertions(+)
>>  create mode 100644 arch/nds32/boot/dts/Makefile
>>  create mode 100644 arch/nds32/boot/dts/ae3xx.dts
>>  create mode 100644 arch/nds32/boot/dts/ag101p.dts
>>  create mode 100644 arch/nds32/kernel/devtree.c
>>
>> diff --git a/arch/nds32/boot/dts/Makefile b/arch/nds32/boot/dts/Makefile
>> new file mode 100644
>> index 000..d31faa8
>> --- /dev/null
>> +++ b/arch/nds32/boot/dts/Makefile
>> @@ -0,0 +1,8 @@
>> +ifneq '$(CONFIG_NDS32_BUILTIN_DTB)' '""'
>> +BUILTIN_DTB := $(patsubst "%",%,$(CONFIG_NDS32_BUILTIN_DTB)).dtb.o
>> +else
>> +BUILTIN_DTB :=
>> +endif
>> +obj-$(CONFIG_OF) += $(BUILTIN_DTB)
>
> For new architectures, I think it's better to not support built-in dtb
> but instead require the
> boot loader to be aware of device trees.

Thanks.
I get your point, and our U-Boot supports DTB too.
But we sometimes use gdb to load vmlinux without U-Boot, and a built-in dtb
makes it easier for us to verify the kernel.
If the dtb pointer is set by U-Boot, we use it instead of the built-in dtb.
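
The fallback described above could be sketched roughly as follows. This is
only an illustration under assumptions: select_dtb() and builtin_dtb are
hypothetical names, and the four-byte array merely stands in for a blob
linked into the kernel image.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device tree blob built into the image. */
static const unsigned char builtin_dtb[] = { 0xd0, 0x0d, 0xfe, 0xed };

/*
 * Prefer the blob handed over by the boot loader; fall back to the
 * built-in copy only when no pointer was passed (for example when the
 * kernel was loaded through a debugger).
 */
const void *select_dtb(const void *bootloader_dtb)
{
	if (bootloader_dtb != NULL)
		return bootloader_dtb;

	return builtin_dtb;
}
```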

>> +clean-files := *.dtb *.dtb.S
>> diff --git a/arch/nds32/boot/dts/ae3xx.dts b/arch/nds32/boot/dts/ae3xx.dts
>> new file mode 100644
>> index 000..b6c85dc
>> --- /dev/null
>> +++ b/arch/nds32/boot/dts/ae3xx.dts
>> @@ -0,0 +1,55 @@
>> +/dts-v1/;
>> +/ {
>> +   compatible = "nds32 ae3xx";
>
> ae3xx looks like a wildcard name for multiple boards. Please always
> have compatible
> names without wildcards. You usually also want to list both the SoC
> and the board
> here.

Thanks.
It looks a little bit weird, but it's not a wildcard. :p
I will list all the names instead of using wildcards if I need to add more
platforms.

>> +   #address-cells = <1>;
>> +   #size-cells = <1>;
>> +   interrupt-parent = <>;
>> +
>> +   chosen {
>> +   bootargs = "console=ttyS0,38400n8 
>> earlyprintk=uart8250-32bit,0xf030 debug loglevel=7";
>> +   };
>
> Remove the earlyprintk option from the bootargs here, regular boards
> should never rely
> on earlyprintk. The "earlycon" support in the uart drivers works
> almost as well (it starts
> slightly later in the boot process), and it will pick up the uart from
> the chosen/stdout-path
> property.

Thanks.
I will remove it.

>> +   if (!params || !early_init_dt_scan(params)) {
>> +   pr_crit("\n"
>> +   "Error: invalid device tree blob at (virtual address 
>> 0x%p)\n"
>> +   "The dtb must be 8-byte aligned and must not exceed 
>> 8 KB in size\n"
>> +   "\nPlease check your bootloader.", params);
>
> What is the 8KB limit for the dtb for? This sounds really limiting
> once you get to
> more complex SoCs.

Thanks.
We allow U-Boot to set the dtb's start address, and we assumed the dtb would
not be too big, so we limited it to 8KB.
Maybe I should allow a bigger size for complex SoCs.
I will update it in the next version of the patch.
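
As a rough sketch of the kind of check the error message describes, with
the size limit passed in as a parameter instead of being fixed at 8 KB:
dtb_looks_valid() and be32_at() are hypothetical helpers written for this
example, not what early_init_dt_scan() does. The only facts relied on are
that a flattened device tree starts with the big-endian magic 0xd00dfeed,
followed by a big-endian 32-bit total size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu	/* stored big-endian at offset 0 */

/* Read a big-endian 32-bit value one byte at a time. */
static uint32_t be32_at(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical check: the blob must be non-NULL, 8-byte aligned, start
 * with the FDT magic, and declare a total size of at most max_size bytes.
 */
bool dtb_looks_valid(const void *blob, size_t max_size)
{
	const unsigned char *p = blob;

	if (p == NULL || ((uintptr_t)p & 7) != 0)
		return false;
	if (be32_at(p) != FDT_MAGIC)
		return false;

	/* totalsize is the second 32-bit field of the header */
	return be32_at(p + 4) <= max_size;
}
```

With the limit as a parameter, raising it for more complex SoCs would be a
change at the call site rather than in the check itself.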


Re: [PATCH 2/3] watchdog: jz4780: Allow selection of jz4740-wdt driver

2017-11-08 Thread James Hogan
Hi Wim,

On Fri, Sep 08, 2017 at 08:35:54PM +0200, Mathieu Malaterre wrote:
> This driver works for jz4740 & jz4780
> 
> Suggested-by: Maarten ter Huurne 
> Signed-off-by: Mathieu Malaterre 

I just noticed that though Ralf applied the other two patches in this
series (defconfig + dt), he hadn't applied this patch.

Please can we have an ack from a watchdog maintainer so this can get
into 4.15 via the MIPS tree? It could alternatively go via the watchdog
tree if you prefer.

Thanks
James

> ---
>  drivers/watchdog/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index c722cbfdc7e6..ca200d1f310a 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -1460,7 +1460,7 @@ config INDYDOG
>  
>  config JZ4740_WDT
>   tristate "Ingenic jz4740 SoC hardware watchdog"
> - depends on MACH_JZ4740
> + depends on MACH_JZ4740 || MACH_JZ4780
>   select WATCHDOG_CORE
>   help
> Hardware driver for the built-in watchdog timer on Ingenic jz4740 
> SoCs.
> -- 
> 2.11.0
> 
> 




[PATCH 2/13] scsi: arcmsr: simplify arcmsr_iop_init function

2017-11-08 Thread Ching Huang
From: Ching Huang 

simplify arcmsr_iop_init function

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-08 18:46:42.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-08 18:46:40.0 +0800
@@ -3675,6 +3675,39 @@ static void arcmsr_hardware_reset(struct
msleep(1000);
return;
 }
+
+static bool arcmsr_reset_in_progress(struct AdapterControlBlock *acb)
+{
+   bool rtn = true;
+
+   switch(acb->adapter_type) {
+   case ACB_ADAPTER_TYPE_A:{
+   struct MessageUnit_A __iomem *reg = acb->pmuA;
+   rtn = ((readl(&reg->outbound_msgaddr1) &
+   ARCMSR_OUTBOUND_MESG1_FIRMWARE_OK) == 0) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_B:{
+   struct MessageUnit_B *reg = acb->pmuB;
+   rtn = ((readl(reg->iop2drv_doorbell) &
+   ARCMSR_MESSAGE_FIRMWARE_OK) == 0) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_C:{
+   struct MessageUnit_C __iomem *reg = acb->pmuC;
+   rtn = (readl(&reg->host_diagnostic) & 0x04) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_D:{
+   struct MessageUnit_D *reg = acb->pmuD;
+   rtn = ((readl(reg->sample_at_reset) & 0x80) == 0) ?
+   true : false;
+   }
+   break;
+   }
+   return rtn;
+}
+
 static void arcmsr_iop_init(struct AdapterControlBlock *acb)
 {
uint32_t intmask_org;
@@ -3729,197 +3762,55 @@ static uint8_t arcmsr_iop_reset(struct A
 static int arcmsr_bus_reset(struct scsi_cmnd *cmd)
 {
struct AdapterControlBlock *acb;
-   uint32_t intmask_org, outbound_doorbell;
int retry_count = 0;
int rtn = FAILED;
acb = (struct AdapterControlBlock *) cmd->device->host->hostdata;
-   printk(KERN_ERR "arcmsr: executing bus reset eh.num_resets = %d, 
num_aborts = %d \n", acb->num_resets, acb->num_aborts);
+   pr_notice("arcmsr: executing bus reset eh.num_resets = %d,"
+   " num_aborts = %d \n", acb->num_resets, acb->num_aborts);
acb->num_resets++;
 
-   switch(acb->adapter_type){
-   case ACB_ADAPTER_TYPE_A:{
-   if (acb->acb_flags & ACB_F_BUS_RESET){
-   long timeout;
-   printk(KERN_ERR "arcmsr: there is an  bus reset 
eh proceeding...\n");
-   timeout = wait_event_timeout(wait_q, 
(acb->acb_flags & ACB_F_BUS_RESET) == 0, 220*HZ);
-   if (timeout) {
-   return SUCCESS;
-   }
-   }
-   acb->acb_flags |= ACB_F_BUS_RESET;
-   if (!arcmsr_iop_reset(acb)) {
-   struct MessageUnit_A __iomem *reg;
-   reg = acb->pmuA;
-   arcmsr_hardware_reset(acb);
-   acb->acb_flags &= ~ACB_F_IOP_INITED;
-sleep_again:
-   ssleep(ARCMSR_SLEEPTIME);
-   if ((readl(&reg->outbound_msgaddr1) & ARCMSR_OUTBOUND_MESG1_FIRMWARE_OK) == 0) {
-   printk(KERN_ERR "arcmsr%d: waiting for 
hw bus reset return, retry=%d\n", acb->host->host_no, retry_count);
-   if (retry_count > ARCMSR_RETRYCOUNT) {
-   acb->fw_flag = FW_DEADLOCK;
-   printk(KERN_ERR "arcmsr%d: 
waiting for hw bus reset return, RETRY TERMINATED!!\n", acb->host->host_no);
-   return FAILED;
-   }
-   retry_count++;
-   goto sleep_again;
-   }
-   acb->acb_flags |= ACB_F_IOP_INITED;
-   /* disable all outbound interrupt */
-   intmask_org = arcmsr_disable_outbound_ints(acb);
-   arcmsr_get_firmware_spec(acb);
-   arcmsr_start_adapter_bgrb(acb);
-   /* clear Qbuffer if door bell ringed */
-   outbound_doorbell = readl(&reg->outbound_doorbell);
-   writel(outbound_doorbell, &reg->outbound_doorbell); /*clear interrupt */
-   writel(ARCMSR_INBOUND_DRIVER_DATA_READ_OK, &reg->inbound_doorbell);
-   /* enable outbound Post Queue,outbound doorbell 
Interrupt */
-
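
The removed per-adapter code above repeats the same "sleep, check the
firmware status, give up after too many retries" loop. A generic version
of that pattern might look like the sketch below. It is an illustration
only: wait_for_reset_done() and countdown_in_progress() are hypothetical
names, the wait callback stands in for ssleep(), and the exact retry
accounting is not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Poll in_progress(ctx) until it reports false or the retry budget is
 * used up. One initial poll plus up to max_retries further polls are
 * made; wait() is called before each poll when it is not NULL.
 * Returns true if the reset finished, false if the retries ran out.
 */
bool wait_for_reset_done(bool (*in_progress)(void *ctx), void *ctx,
			 void (*wait)(void), int max_retries)
{
	int retry;

	for (retry = 0; retry <= max_retries; retry++) {
		if (wait != NULL)
			wait();
		if (!in_progress(ctx))
			return true;
	}

	return false;
}

/* Demo predicate: reports "in progress" until the counter reaches zero. */
static bool countdown_in_progress(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return false;
	(*remaining)--;
	return true;
}
```

Passing the status check in as a predicate is what lets one loop serve
every adapter type, which is the role arcmsr_reset_in_progress() plays in
the patch.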

Re: [PATCH V13 10/10] mmc: block: blk-mq: Stop using legacy recovery

2017-11-08 Thread Adrian Hunter
On 08/11/17 11:38, Linus Walleij wrote:
> On Fri, Nov 3, 2017 at 2:20 PM, Adrian Hunter  wrote:
> 
>> There are only a few things the recovery needs to do. Primarily, it just
>> needs to:
>> Determine the number of bytes transferred
>> Get the card back to transfer state
>> Determine whether to retry
>>
>> There are also a couple of additional features:
>> Reset the card before the last retry
>> Read one sector at a time
>>
>> The legacy code spent much effort analyzing command errors, but commands
>> fail fast, so it is simpler just to give all command errors the same number
>> of retries.
>>
>> Signed-off-by: Adrian Hunter 
> 
> I have nothing against the patch as such. In fact something
> like this makes a lot of sense (to me).
> 
> But this just makes mmc_blk_rw_recovery() look really nice.
> 
> And leaves a very ugly mmc_blk_issue_rw_rq() with the legacy
> error handling in-tree.
> 
> The former function isn't even named with some *mq* infix
> making it clear that the new recovery path only happens
> in the MQ case.
> 
> If newcomers read this code in the MMC stack they will
> just tear their hair, scream and run away. Even faster than
> before.
> 
> How are they supposed to know which functions are used on
> which path? Run ftrace?

You're kidding me, right?  You don't know how to find where a function is used?

> This illustrates firmly why we need to refactor and/or kill off
> the old block layer interface *first* then add MQ on top.

No it doesn't!  You are playing games!  One function could be named
differently, so that is evidence the whole patch set should be ignored.

The old code is rubbish.  There is nothing worth keeping.  Churning it
around is a waste of everybody's time.  Review and test the new code.
Delete the old code.  Much much simpler!
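
For readers following the thread, the retry policy in the quoted patch
description (the same number of retries for every command error, with a
card reset before the last retry) could be sketched as below. This is a
toy model under assumptions, not the mmc block driver: struct demo_card
and the demo_* functions are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical card model used only for this sketch. */
struct demo_card {
	int failures_left;	/* how many more attempts will fail */
	int resets;		/* how many times the card was reset */
};

/* One transfer attempt; fails while failures_left is positive. */
static bool demo_issue(struct demo_card *card)
{
	if (card->failures_left > 0) {
		card->failures_left--;
		return false;
	}
	return true;
}

static void demo_reset(struct demo_card *card)
{
	card->resets++;
}

/*
 * One initial attempt plus up to max_retries retries, with the same
 * budget whatever the error was. The card is reset once, just before
 * the final retry. Returns true as soon as an attempt succeeds.
 */
bool demo_issue_with_recovery(struct demo_card *card, int max_retries)
{
	int retry;

	if (demo_issue(card))
		return true;

	for (retry = 1; retry <= max_retries; retry++) {
		if (retry == max_retries)
			demo_reset(card);
		if (demo_issue(card))
			return true;
	}

	return false;
}
```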


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Thu, Nov 09, 2017 at 02:55:10PM +0800, Fengguang Wu wrote:

On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote:

On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu  wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.



Sure and thank you for the suggestion!

It looks like disabling DECNET still triggers the vlan_device_event BUG.
However, when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's a related one or an
independent, now-fixed issue.


Those decnet symbols are probably noise.


Yes it's not related to CONFIG_DECNET.


How do you reproduce it? And what is your setup? Vlan device on
top of your eth0 (e1000)?


It can basically be reproduced in one of our test machines --
lkp-wsx03, which is a Westmere EX server.


Anyway if you'd like to try, here are the steps. It'll auto download
the images and run QEMU.

   apt-get install lib32gcc-7-dev # or lib32gcc-6-dev
   git clone https://github.com/intel/lkp-tests.git
   cd lkp-tests
   bin/lkp qemu -k  job-script  # job-script is attached in this email

Note that even in our lkp-wsx03 machine, the chance of reproducing it
is only 3% (3 out of 100 boots).
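
To put the 3% figure in perspective, here is a minimal sketch of how many boots it takes before the bug is likely to show up at least once. It is not part of lkp-tests; boots_needed() is a hypothetical helper, and it assumes every boot is an independent trial with the same hit rate, which may not hold in practice.

```c
/*
 * Smallest number of boots n such that the chance of hitting the bug at
 * least once, 1 - (1 - p)^n, reaches the requested confidence.
 * Assumption: each boot is an independent trial with hit rate p.
 */
int boots_needed(double p, double confidence)
{
	double miss_all = 1.0;	/* probability that all n boots miss the bug */
	int n = 0;

	/* Reject inputs for which the loop below would never terminate. */
	if (p <= 0.0 || p > 1.0 || confidence <= 0.0 || confidence >= 1.0)
		return -1;

	while (1.0 - miss_all < confidence) {
		miss_all *= 1.0 - p;
		n++;
	}
	return n;
}
```

With p = 0.03, boots_needed(0.03, 0.95) works out to 99, so a run of roughly a hundred boots is needed before a clean result says much about whether a candidate fix works.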

Thanks,
Fengguang
#!/bin/sh

export_top_env()
{
export suite='trinity'
export testcase='trinity'
export runtime=300
export job_origin='/lkp/lkp/src/allot/rand/vm-lkp-wsx03-openwrt-i386/trinity.yaml'
export testbox='vm-lkp-wsx03-openwrt-i386-5'
export tbox_group='vm-lkp-wsx03-openwrt-i386'
export kconfig='i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS'
export compiler='gcc-6'
export queue='wfg'
export branch='linus/master'
export commit='c470abd4fde40ea6a0846a2beab642a578c0b8cd'
export submit_id='5a03a4550b9a93f7c99708b0'
export job_file='/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml'
export id='181954ca4367d475b88dc8de99b2d52ab533a5e1'
export model='qemu-system-i386 -enable-kvm'
export nr_vm=32
export nr_cpu=1
export memory='320M'
export rootfs='openwrt-i386-2016-03-16.cgz'
export hdd_partitions='/dev/vda'
export swap_partitions='/dev/vdb'
export need_kconfig='CONFIG_KVM_GUEST=y'
export enqueue_time='2017-11-09 08:41:58 +0800'
export _id='5a03a4560b9a93f7c99708bb'
export _rt='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd'
export user='lkp'
export result_root='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0'
export LKP_SERVER='inn'
export max_uptime=1500
export initrd='/osimage/openwrt/openwrt-i386-2016-03-16.cgz'
export bootloader_append='root=/dev/ram0
user=lkp
job=/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml
ARCH=i386
kconfig=i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS
branch=linus/master
commit=c470abd4fde40ea6a0846a2beab642a578c0b8cd
BOOT_IMAGE=/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0
max_uptime=1500
RESULT_ROOT=/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0
LKP_SERVER=inn
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
printk.devkmsg=on
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
drbd.minor_count=8
systemd.log_level=err
ignore_loglevel
console=tty0
earlyprintk=ttyS0,115200
console=ttyS0,115200
vga=normal
rw'
export lkp_initrd='/lkp/lkp/lkp-i386.cgz'
export bm_initrd='/osimage/pkg/static/trinity-i386.cgz'
export site='inn'
export LKP_CGI_PORT=80
export LKP_CIFS_PORT=139
export vmlinux_file='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinux'
export kernel='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0'
export dequeue_time='2017-11-09 09:06:15 +0800'
export 

Re: linux-next: build warnings after merge of the gpio tree

2017-11-08 Thread Linus Walleij
On Thu, Nov 9, 2017 at 4:51 AM, Stephen Rothwell  wrote:
> On Fri, 3 Nov 2017 16:37:24 +1100 Stephen Rothwell  wrote:
>>
>> After merging the gpio tree, yesterday's linux-next build (arm
>> multi_v7_defconfig) produced these warnings:
>>
>> arch/arm/boot/dts/bcm2835-rpi-b.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-b-rev2.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-a.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-b-plus.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-a-plus.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2836-rpi-2-b.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2837-rpi-3-b.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-zero.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/bcm2835-rpi-zero-w.dtb: Warning (phys_property): Missing property '#phy-cells' in node /phy or bad phandle (referred from /soc/usb@7e98:phys[0])
>> arch/arm/boot/dts/exynos5410-odroidxu.dtb: Warning (interrupts_property): Missing interrupt-controller or interrupt-map property in /soc/system-controller@1004
>>
>> and many, many more.
>>
>> I have no idea what caused this.
>
> I am still getting lots of these ...

I have absolutely no clue either.

What I know is that there is a device tree compiler warning that can
be turned on, and it generates these warnings a lot. The actual problems
have been in the DTS files forever. They just recently started to look
into them.

It has nothing to do with the GPIO tree whatsoever, so I wonder if it
is a side effect of something else?

Yours,
Linus Walleij


Re: [PATCH v2] locking/lockdep: Revise Documentation/locking/crossrelease.txt

2017-11-08 Thread Byungchul Park
On Thu, Nov 09, 2017 at 04:20:36PM +0900, Byungchul Park wrote:
> Changes from v1
> - Run several tools checking English spelling and grammar over the text.
> - Simplify the document more.

Checker tools also reported other words e.g. crosslock, crossrelease,
lockdep, mutex, lockless, and so on, but I left them unchanged since
I thought it's better to leave them. Please let me know if I was wrong.

Thanks,
Byungchul
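
For readers new to the document being revised, the "circular dependencies" idea in the quoted patch can be sketched in a few lines of C. This is only an illustration under stated assumptions: it is not lockdep's implementation, the names dep_graph, dep_graph_add() and MAX_EVENTS are hypothetical, and for simplicity it checks for a circle before adding the edge rather than after.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_EVENTS 8	/* illustrative limit, not a lockdep constant */

struct dep_graph {
	/* depends_on[a][b] is true when event a depends on event b */
	bool depends_on[MAX_EVENTS][MAX_EVENTS];
};

void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'target' reachable from 'from'? */
static bool reaches(const struct dep_graph *g, int from, int target,
		    bool *visited)
{
	int next;

	if (from == target)
		return true;
	visited[from] = true;
	for (next = 0; next < MAX_EVENTS; next++) {
		if (g->depends_on[from][next] && !visited[next] &&
		    reaches(g, next, target, visited))
			return true;
	}
	return false;
}

/*
 * Record "a depends on b". Returns false and leaves the graph unchanged
 * if an index is out of range or if b already depends on a, in which
 * case the new edge would close a circle (a potential deadlock).
 */
bool dep_graph_add(struct dep_graph *g, int a, int b)
{
	bool visited[MAX_EVENTS] = { false };

	if (a < 0 || a >= MAX_EVENTS || b < 0 || b >= MAX_EVENTS)
		return false;
	if (reaches(g, b, a, visited))
		return false;
	g->depends_on[a][b] = true;
	return true;
}
```

Feeding it the example from the document (C depends on A, A depends on B, B depends on C) accepts the first two dependencies and rejects the third, because B is already reachable from C through A.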

> -8<-
> >From 412bc9eb0d22791f70f7364bda189feb41899ff9 Mon Sep 17 00:00:00 2001
> From: Byungchul Park 
> Date: Thu, 9 Nov 2017 16:12:23 +0900
> Subject: [PATCH v2] locking/lockdep: Revise Documentation/locking/crossrelease.txt
> 
> Revise Documentation/locking/crossrelease.txt to enhance its readability.
> 
> Signed-off-by: Byungchul Park 
> ---
>  Documentation/locking/crossrelease.txt | 492 
> +++--
>  1 file changed, 227 insertions(+), 265 deletions(-)
> 
> diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt
> index bdf1423..11e3e3b 100644
> --- a/Documentation/locking/crossrelease.txt
> +++ b/Documentation/locking/crossrelease.txt
> @@ -12,10 +12,10 @@ Contents:
>  
>   (*) Limitation
>  
> - - Limit lockdep
> + - Limiting lockdep
>   - Pros from the limitation
>   - Cons from the limitation
> - - Relax the limitation
> + - Relaxing the limitation
>  
>   (*) Crossrelease
>  
> @@ -30,9 +30,9 @@ Contents:
>   (*) Optimizations
>  
>   - Avoid duplication
> - - Lockless for hot paths
> + - Make hot paths lockless
>  
> - (*) APPENDIX A: What lockdep does to work aggresively
> + (*) APPENDIX A: How to add dependencies aggressively
>  
>   (*) APPENDIX B: How to avoid adding false dependencies
>  
> @@ -51,36 +51,30 @@ also impossible due to the same reason.
>  
>  For example:
>  
> -   A context going to trigger event C is waiting for event A to happen.
> -   A context going to trigger event A is waiting for event B to happen.
> -   A context going to trigger event B is waiting for event C to happen.
> +   A context going to trigger event C is waiting for event A.
> +   A context going to trigger event A is waiting for event B.
> +   A context going to trigger event B is waiting for event C.
>  
> -A deadlock occurs when these three wait operations run at the same time,
> -because event C cannot be triggered if event A does not happen, which in
> -turn cannot be triggered if event B does not happen, which in turn
> -cannot be triggered if event C does not happen. After all, no event can
> -be triggered since any of them never meets its condition to wake up.
> +A deadlock occurs when the three waiters run at the same time, because
> +event C cannot be triggered if event A does not happen, which in turn
> +cannot be triggered if event B does not happen, which in turn cannot be
> +triggered if event C does not happen. After all, no event can be
> +triggered since any of them never meets its condition to wake up.
>  
> -A dependency might exist between two waiters and a deadlock might happen
> -due to an incorrect releationship between dependencies. Thus, we must
> -define what a dependency is first. A dependency exists between them if:
> +A dependency exists between two waiters, and a deadlock happens due to
> +an incorrect relationship between dependencies. Thus, we must define
> +what a dependency is first. A dependency exists between waiters if:
>  
> 1. There are two waiters waiting for each event at a given time.
> 2. The only way to wake up each waiter is to trigger its event.
> 3. Whether one can be woken up depends on whether the other can.
>  
> -Each wait in the example creates its dependency like:
> +Each waiter in the example creates its dependency like:
>  
> Event C depends on event A.
> Event A depends on event B.
> Event B depends on event C.
>  
> -   NOTE: Precisely speaking, a dependency is one between whether a
> -   waiter for an event can be woken up and whether another waiter for
> -   another event can be woken up. However from now on, we will describe
> -   a dependency as if it's one between an event and another event for
> -   simplicity.
> -
>  And they form circular dependencies like:
>  
>  -> C -> A -> B -
> @@ -101,19 +95,18 @@ Circular dependencies cause a deadlock.
>  How lockdep works
>  -
>  
> -Lockdep tries to detect a deadlock by checking dependencies created by
> -lock operations, acquire and release. Waiting for a lock corresponds to
> -waiting for an event, and releasing a lock corresponds to triggering an
> -event in the previous section.
> +Lockdep tries to detect a deadlock by checking circular dependencies
> +created by lock operations, acquire and release, which are wait and
> +event respectively.
>  
>  In short, lockdep does:
>  
> 1. Detect a new dependency.
> -   2. Add the dependency into a global graph.
> +   2. Add the dependency to a global graph.
> 3. Check if that makes 

[PATCH] cpuidle: Add "cpuidle.use_deepest" to bypass governor and allow HW to go deep

2017-11-08 Thread Len Brown
From: Len Brown 

While there are several mechanisms (cmdline, sysfs, PM_QOS) to limit
cpuidle to shallow idle states, there is no simple mechanism
to give the hardware permission to enter the deepest state permitted by PM_QOS.

Here we create the "cpuidle.use_deepest" modparam to provide this capability.

"cpuidle.use_deepest=Y" can be set at boot-time, and
/sys/module/cpuidle/use_deepest can be modified (with Y/N) at run-time.

n.b.

Within the constraints of PM_QOS, this mechanism gives the hardware
permission to choose the deepest power savings and highest latency
state available.  The choice will therefore depend on the particular hardware.

Also, if PM_QOS is not informed of latency constraints, it can't help.
In that case, using this mechanism may result in entering high-latency
states that impact performance.

Signed-off-by: Len Brown 
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 drivers/cpuidle/cpuidle.c   | 19 
 include/linux/cpuidle.h |  7 ++
 kernel/sched/idle.c | 30 +++--
 4 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 05496622b4ef..20f70de688bf 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -659,6 +659,10 @@
cpuidle.off=1   [CPU_IDLE]
disable the cpuidle sub-system
 
+   cpuidle.use_deepest=Y   [CPU_IDLE]
+   Ignore cpuidle governor, always choose deepest
+   PM_QOS-legal CPU idle power saving state.
+
cpufreq.off=1   [CPU_FREQ]
disable the cpufreq sub-system
 
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 484cc8909d5c..afee5aab7719 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -34,6 +34,7 @@ LIST_HEAD(cpuidle_detected_devices);
 
 static int enabled_devices;
 static int off __read_mostly;
+static bool use_deepest __read_mostly;
 static int initialized __read_mostly;
 
 int cpuidle_disabled(void)
@@ -116,6 +117,10 @@ void cpuidle_use_deepest_state(bool enable)
preempt_enable();
 }
 
+bool cpuidle_using_deepest_state(void)
+{
+   return use_deepest;
+}
 /**
  * cpuidle_find_deepest_state - Find the deepest available idle state.
  * @drv: cpuidle driver for the given CPU.
@@ -127,6 +132,19 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
return find_deepest_state(drv, dev, UINT_MAX, 0, false);
 }
 
+/**
+ * cpuidle_find_deepest_state_qos - Find the deepest available idle state.
+ * @drv: cpuidle driver for the given CPU.
+ * @dev: cpuidle device for the given CPU.
+ * Honors PM_QOS
+ */
+int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
+  struct cpuidle_device *dev)
+{
+   return find_deepest_state(drv, dev,
+   pm_qos_request(PM_QOS_CPU_DMA_LATENCY), 0, false);
+}
+
 #ifdef CONFIG_SUSPEND
 static void enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
@@ -681,4 +699,5 @@ static int __init cpuidle_init(void)
 }
 
 module_param(off, int, 0444);
+module_param(use_deepest, bool, 0644);
 core_initcall(cpuidle_init);
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 8f7788d23b57..e3c2c9d1898f 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -198,19 +198,26 @@ static inline struct cpuidle_device *cpuidle_get_device(void) {return NULL; }
 #ifdef CONFIG_CPU_IDLE
 extern int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
  struct cpuidle_device *dev);
+extern int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
+ struct cpuidle_device *dev);
 extern int cpuidle_enter_s2idle(struct cpuidle_driver *drv,
struct cpuidle_device *dev);
 extern void cpuidle_use_deepest_state(bool enable);
+extern bool cpuidle_using_deepest_state(void);
 #else
 static inline int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 struct cpuidle_device *dev)
 {return -ENODEV; }
+static inline int cpuidle_find_deepest_state_qos(struct cpuidle_driver *drv,
+struct cpuidle_device *dev)
+{return -ENODEV; }
 static inline int cpuidle_enter_s2idle(struct cpuidle_driver *drv,
   struct cpuidle_device *dev)
 {return -ENODEV; }
 static inline void cpuidle_use_deepest_state(bool enable)
 {
 }
+static inline bool cpuidle_using_deepest_state(void) {return false; }
 #endif
 
 /* kernel/sched/idle.c */
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 257f4f0b4532..6c7348ae28ec 100644
--- 
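
For readers following the patch above: cpuidle_find_deepest_state_qos() passes the PM_QOS_CPU_DMA_LATENCY value as the latency limit to find_deepest_state(). A minimal user-space sketch of that kind of selection follows, assuming "deepest" means the usable state with the largest exit latency that still fits under the limit. The struct and the function name pick_deepest_state are hypothetical stand-ins, not the kernel's own definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct cpuidle_state. */
struct idle_state {
    unsigned int exit_latency_us; /* worst-case wake-up latency */
    int disabled;                 /* non-zero if the state must not be used */
};

/*
 * Return the index of the deepest usable state whose exit latency does not
 * exceed max_latency_us. Index 0 is the fallback when nothing else qualifies.
 */
static int pick_deepest_state(const struct idle_state *states, size_t count,
                              unsigned int max_latency_us)
{
    unsigned int best_latency = 0;
    int best = 0;
    size_t i;

    for (i = 1; i < count; i++) {
        const struct idle_state *s = &states[i];

        /* Skip disabled states, shallower states, and states over the limit. */
        if (s->disabled || s->exit_latency_us <= best_latency ||
            s->exit_latency_us > max_latency_us)
            continue;

        best_latency = s->exit_latency_us;
        best = (int)i;
    }
    return best;
}
```

With a very large limit this degenerates to "pick the deepest enabled state", which is what the existing cpuidle_find_deepest_state() asks for by passing UINT_MAX.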

[PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-08 Thread Ching Huang
From: Ching Huang 

Hi all,

The following patches apply to Martin's 4.15/scsi-queue.

Patch 1: redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and make the subsequent changes.

Patch 2: simplify the arcmsr_iop_init function.

Patch 3: add code for ACB_ADAPTER_TYPE_E to support the new adapter ARC-1884.

Patch 4: replace the constant ARCMSR_MAX_FREECCB_NUM with the variable
acb->maxFreeCCB, whose value is obtained from firmware.

Patch 5: add the driver option host_can_queue so the user can set the
host->can_queue value. Its value can be raised up to 1024.

Patch 6: replace the constant ARCMSR_MAX_OUTSTANDING_CMD with the variable
acb->maxOutstanding, which is determined by the user.

Patch 7: add the driver option cmd_per_lun so the user can set the
host->cmd_per_lun value.

Patch 8: add ACB_F_MSG_GET_CONFIG to acb->acb_flags so the message interrupt
can be checked before scheduling the work that gets the device map.

Patch 9: add a function arcmsr_set_iop_datetime and the driver option
set_date_time to set the date and time in firmware.

Patch 10: fix clearing of the doorbell queue on the ACB_ADAPTER_TYPE_B controller.

Patch 11: factor out the duplicated timer-init code for the message ISR BH in
arcmsr_probe and arcmsr_resume into a function, arcmsr_init_get_devmap_timer.

Patch 12: adjust some tabs and white space to align the text.

Patch 13: update driver version to v1.40.00.02-20171011

Please review. Thanks.

---




Re: [PATCH 2/3] perf tools: Fix build for hardened environments

2017-11-08 Thread Jiri Olsa
On Wed, Nov 08, 2017 at 01:03:21PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Nov 08, 2017 at 11:27:38AM +0100, Jiri Olsa escreveu:
> > From: Jiri Olsa 
> > 
> > On Fedora systems the perl and python CFLAGS/LDFLAGS include the
> > hardened specs from redhat-rpm-config package. We apply them only
> > for perl/python objects, which makes them not compatible with the
> > rest of the objects and the build fails with:
> > 
> >   /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
> >   /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
> >   /usr/bin/ld: final link failed: Nonrepresentable section on output
> >   collect2: error: ld returned 1 exit status
> >   make[2]: *** [Makefile.perf:507: perf] Error 1
> >   make[1]: *** [Makefile.perf:210: sub-make] Error 2
> >   make: *** [Makefile:69: all] Error 2
> > 
> > Mainly it's caused by perl/python objects being compiled with:
> > 
> >   -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
> > 
> > which makes the final link impossible, because it will check
> > for 'proper' objects with the following option:
> > 
> >   -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
> > 
> > Fixing this by using the perl/python CFLAGS/LDFLAGS options
> > for all the objects.
> 
> Humm, so we're basically using the hardened config only when we build with
> PERL or PYTHON, should we use that always, i.e. ask the distro what set
> of flags we should use?

right, I think this needs to be detected like we do for features,
since there may be some supported gcc versions to detect

> What other impacts this may have on using this for all of the tools?
> I.e. we could conceivably just remove that part from the perl/python
> builds and make them use what has been used for the rest of the tools
> instead?

hum, so those are the flags the perl/python extensions are built with

we have both perl/python extensions built in the perf for the script cmd,
which creates dependencies:

[jolsa@krava perf]$ ldd ./perf  |grep perl
libperl.so.5.24 => /lib64/libperl.so.5.24 (0x7f72b33b3000)
[jolsa@krava perf]$ ldd ./perf  |grep python
libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x7f927cfe7000)

not sure we could be affected here if we remove that hardened spec option

and then we have the python module extension which is used separately of
perf binary, which should be fine

jirka


Re: [RFC -mm] mm, userfaultfd, THP: Avoid waiting when PMD under THP migration

2017-11-08 Thread Huang, Ying
Andrea Arcangeli  writes:

> Hello,
>
> On Sun, Nov 05, 2017 at 11:01:05AM +0800, huang ying wrote:
>> On Fri, Nov 3, 2017 at 11:00 PM, Zi Yan  wrote:
>> > On 3 Nov 2017, at 3:52, Huang, Ying wrote:
>> >
>> >> From: Huang Ying 
>> >>
>> >> If THP migration is enabled, the following situation is possible,
>> >>
>> >> - A THP is mapped at source address
>> >> - Migration is started to move the THP to another node
>> >> - Page fault occurs
>> >> - The PMD (migration entry) is copied to the destination address in mremap
>> >>
>> >
>> > You mean the page fault path follows the source address and sees pmd_none() now
>> > because mremap() clears it and remaps the page with dest address.
>> > Otherwise, it seems not possible to get into handle_userfault(), since it is called in
>> > pmd_none() branch inside do_huge_pmd_anonymous_page().
>> >
>> >
>> >> That is, it is possible for handle_userfault() to encounter a PMD entry
>> >> which has been handled but !pmd_present().  In the current
>> >> implementation, we will wait for such PMD entries, which may cause
>> >> unnecessary waiting, and potential soft lockup.
>> >
>> > handle_userfault() should only see pmd_none() in the situation you describe,
>> > whereas !pmd_present() (migration entry case) should lead to
>> > pmd_migration_entry_wait().
>> 
>> Yes.  This is my understanding of the source code too.  And I
>> described it in the original patch description too.  I just want to
>> make sure whether it is possible that !pmd_none() and !pmd_present()
>> for a PMD in userfaultfd_must_wait().  And, whether it is possible for
>
> I don't see how mremap is relevant above. mremap runs with mmap_sem
> for writing, so it can't race against userfaultfd_must_wait.
>
> However the concern of set_pmd_migration_entry() being called with
> only the mmap_sem for reading through TTU_MIGRATION in
> __unmap_and_move and being interpreted as a "missing" THP page by
> userfaultfd_must_wait seems valid.
>
> Compaction won't normally compact pages that are already THP sized so
> you cannot see this normally because VM don't normally get migrated
> over SHM/hugetlbfs with hard bindings while userfaults are in
> progress.
>
> Overall your patch looks more correct than current code so it's good
> idea to apply and it should avoid surprises with the above corner
> case if CONFIG_ARCH_ENABLE_THP_MIGRATION is set.
>
> Worst case the process would hang in handle_userfault(), but it will
> still respond fine to sigkill, so it's not concerning, but it should
> be fixed nevertheless.
>
> Reviewed-by: Andrea Arcangeli 

Thanks!  I will revise the patch description and send the new version!

Best Regards,
Huang, Ying

[snip]
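
The distinction discussed in this thread (pmd_none() versus !pmd_present() for a PMD holding a migration entry) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the enum and both function names are hypothetical, the model ignores the PTE level entirely, and it is not the code of userfaultfd_must_wait().

```c
#include <stdbool.h>

/* Hypothetical three-way model of what a PMD slot can hold. */
enum pmd_kind {
    PMD_KIND_NONE,      /* pmd_none(): nothing mapped yet */
    PMD_KIND_MIGRATION, /* !pmd_none() && !pmd_present(): THP under migration */
    PMD_KIND_PRESENT    /* pmd_present(): the page is mapped */
};

/* Behaviour described as current: every non-present PMD makes the fault wait. */
static bool must_wait_current(enum pmd_kind kind)
{
    return kind != PMD_KIND_PRESENT;
}

/* Behaviour the patch aims for: only a truly empty PMD makes the fault wait. */
static bool must_wait_proposed(enum pmd_kind kind)
{
    return kind == PMD_KIND_NONE;
}
```

The two predicates differ only for PMD_KIND_MIGRATION, which is the corner case the review above calls valid when CONFIG_ARCH_ENABLE_THP_MIGRATION is set.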


Re: [PATCH V13 07/10] mmc: block: blk-mq: Add support for direct completion

2017-11-08 Thread Adrian Hunter
On 08/11/17 11:28, Linus Walleij wrote:
> On Fri, Nov 3, 2017 at 2:20 PM, Adrian Hunter  wrote:
> 
>> For blk-mq, add support for completing requests directly in the ->done
>> callback. That means that error handling and urgent background operations
>> must be handled by recovery_work in that case.
>>
>> Signed-off-by: Adrian Hunter 
> 
> I tried enabling this on my MMC host (mmci) but I got weird
> DMA error messages when I did.
> 
> I guess this has not been tested on a non-DMA-coherent
> system?

I don't see what DMA-coherence has to do with anything.

Possibilities:
- DMA unmapping doesn't work in an atomic context
- requests' DMA operations have to be synchronized with each other

> I think I might be seeing this because the .pre and .post
> callbacks need to be strictly sequenced, and this is
> maybe not taken into account here?

I looked at mmci but that did not seem to be the case.

> Isn't there a risk
> that the .post callback of the next request is called before
> the .post callback of the previous request has returned
> for example?

Of course, the requests are treated as independent.  If the separate DMA
operations require synchronization, that is for the host driver to fix.


Re: [PATCH 3/3] perf tools: Removing FLAGS_PYTHON_EMBED/FLAGS_PERL_EMBED variables

2017-11-08 Thread Jiri Olsa
On Wed, Nov 08, 2017 at 01:06:40PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Nov 08, 2017 at 11:27:39AM +0100, Jiri Olsa escreveu:
> > There's no user of those.
> 
> [acme@jouet linux]$ find tools/ -type f | xargs grep FLAGS_PYTHON_EMBED
> tools/perf/Makefile.config:  FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
> tools/build/feature/Makefile: $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -laudit -I/usr/include/slang -lslang $(shell $(PKG_CONFIG) --libs --cflags gtk+-2.0 2>/dev/null) $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma
> tools/build/feature/Makefile: $(BUILD) $(FLAGS_PYTHON_EMBED)
> [acme@jouet linux]$ 
> 
> Should we remove these?

oops, missed that directory.. those need to stay then ;-) sorry

thanks,
jirka

> 
> - Arnaldo
>  
> > Link: http://lkml.kernel.org/n/tip-84jeuwojm21wcjfzvtis6...@git.kernel.org
> > Signed-off-by: Jiri Olsa 
> > ---
> >  tools/perf/Makefile.config | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> > index 5e3734e4c1e4..caa7fe26efa9 100644
> > --- a/tools/perf/Makefile.config
> > +++ b/tools/perf/Makefile.config
> > @@ -187,7 +187,6 @@ ifdef PYTHON_CONFIG
> >ifeq ($(CC_NO_CLANG), 1)
> >  PYTHON_EMBED_CCOPTS := $(filter-out -specs=%,$(PYTHON_EMBED_CCOPTS))
> >endif
> > -  FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
> >  endif
> >  
> >  FEATURE_CHECK_CFLAGS-libpython := $(PYTHON_EMBED_CCOPTS)
> > @@ -580,7 +579,6 @@ else
> >PERL_EMBED_LDFLAGS = $(call strip-libs,$(PERL_EMBED_LDOPTS))
> >PERL_EMBED_LIBADD = $(call grep-libs,$(PERL_EMBED_LDOPTS))
> >PERL_EMBED_CCOPTS = $(shell perl -MExtUtils::Embed -e ccopts 2>/dev/null)
> > -  FLAGS_PERL_EMBED=$(PERL_EMBED_CCOPTS) $(PERL_EMBED_LDOPTS)
> >  
> >ifneq ($(feature-libperl), 1)
> >  CFLAGS += -DNO_LIBPERL
> > -- 
> > 2.13.6


Re: [PATCH v5 0/9] mtd: sharpslpart partition parser

2017-11-08 Thread Robert Jarzmik
Boris Brezillon  writes:

>> Hi Boris,
>> 
>> So what's the status about the sync, should I pick the patches, and have the
>> others make it to your for-next branch ?
>
> It's been merged in l2-mtd/master (our -next branch) which is
> targeting 4.15. Unfortunately we didn't create a topic branch, which
> means you'll have to wait for 4.15-rc1 before pushing patches 6 to 9 if you
> want to avoid regressions. Anyway, I guess it's already too late to send
> PRs to arm-soc for 4.15.

Right, my next pull request is targeting v4.16.

Andrea, would you be so kind as to resend the series (patches 6 - 9) to the
mailing list and to me, so that I can review and apply the correct version?

Thanks.

-- 
Robert


[PATCH 1/13] scsi: arcmsr: redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

2017-11-08 Thread Ching Huang
From: Ching Huang 

redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-07-31 11:50:44.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-08-03 18:54:46.0 +0800
@@ -621,10 +621,10 @@ struct MessageUnit_D {
 struct AdapterControlBlock
 {
uint32_t  adapter_type;/* adapter A,B. */
-   #define ACB_ADAPTER_TYPE_A0x0001/* hba I IOP */
-   #define ACB_ADAPTER_TYPE_B0x0002/* hbb M IOP */
-   #define ACB_ADAPTER_TYPE_C0x0004/* hbc P IOP */
-   #define ACB_ADAPTER_TYPE_D0x0008/* hbd A IOP */
+   #define ACB_ADAPTER_TYPE_A  0x0000  /* hba I IOP */
+   #define ACB_ADAPTER_TYPE_B  0x0001  /* hbb M IOP */
+   #define ACB_ADAPTER_TYPE_C  0x0002  /* hbc L IOP */
+   #define ACB_ADAPTER_TYPE_D  0x0003  /* hbd M IOP */
u32 roundup_ccbsize;
struct pci_dev *pdev;
struct Scsi_Host *  host;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-07-31 11:50:16.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-11-08 18:46:42.0 +0800
@@ -1789,7 +1789,7 @@ arcmsr_Read_iop_rqbuffer_data(struct Ada
uint8_t __iomem *iop_data;
uint32_t iop_len;
 
-   if (acb->adapter_type & (ACB_ADAPTER_TYPE_C | ACB_ADAPTER_TYPE_D))
+   if (acb->adapter_type > ACB_ADAPTER_TYPE_B)
return arcmsr_Read_iop_rqbuffer_in_DWORD(acb, prbuffer);
iop_data = (uint8_t __iomem *)prbuffer->data;
iop_len = readl(&prbuffer->data_len);
@@ -1875,7 +1875,7 @@ arcmsr_write_ioctldata2iop(struct Adapte
uint8_t __iomem *iop_data;
int32_t allxfer_len = 0;
 
-   if (acb->adapter_type & (ACB_ADAPTER_TYPE_C | ACB_ADAPTER_TYPE_D)) {
+   if (acb->adapter_type > ACB_ADAPTER_TYPE_B) {
arcmsr_write_ioctldata2iop_in_DWORD(acb);
return;
}
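
For reviewers: the patch above changes the adapter type from a one-bit-per-type mask to consecutive ordinals, so the old test `type & (TYPE_C | TYPE_D)` becomes `type > TYPE_B`. A small stand-alone sketch of that equivalence follows. The OLD_/NEW_ names and both helper functions are hypothetical and exist only for illustration; the values mirror the ones shown in the diff.

```c
/* Old encoding from the removed lines: one bit per adapter type. */
enum { OLD_TYPE_A = 0x1, OLD_TYPE_B = 0x2, OLD_TYPE_C = 0x4, OLD_TYPE_D = 0x8 };

/* New encoding from the added lines: consecutive ordinals starting at 0. */
enum { NEW_TYPE_A = 0, NEW_TYPE_B = 1, NEW_TYPE_C = 2, NEW_TYPE_D = 3 };

/* Old check: true for types C and D only. */
static int old_uses_dword_io(unsigned int type)
{
    return (type & (OLD_TYPE_C | OLD_TYPE_D)) != 0;
}

/* New check: true for every type ordered after B. */
static int new_uses_dword_io(unsigned int type)
{
    return type > NEW_TYPE_B;
}
```

For types A through D the two checks agree. One consequence of the ordinal form is that any type numbered after D would also take the DWORD path without the condition being touched again; whether that is intended for later adapter types is for the rest of the series to confirm.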




Re: [PATCH 20/31] nds32: L2 cache support

2017-11-08 Thread Greentime Hu
2017-11-08 17:48 GMT+08:00 Arnd Bergmann :
> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu  wrote:
>> From: Greentime Hu 
>>
>> Signed-off-by: Vincent Chen 
>> Signed-off-by: Greentime Hu 
>
>> +
>> +/* This is defined for head.S to use due to device tree is not yet built. */
>> +#define L2CC_PA_BASE   0x90F0
>
> This looks problematic, since it prevents you from using the same head.S for
> multiple SoCs that have different L2 controllers or that have them at different
> addresses.
>
> What does head.S actually do to the L2CC? Could the boot protocol require
> that to be done by the boot loader before entering the kernel instead?
>

Thanks.
It will disable and invalidate L2 cache. I think we can do these
things in bootloader.
I will refine it in the next version patch.


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
0, IRQ 07, APIC ID 0, APIC INT 
07
[0.00] Int: type 0, pol 0, trig 0, bus 00, IRQ 08, APIC ID 0, APIC INT 08
[0.00] ACPI: IRQ9 used by override.
[0.00] ACPI: IRQ10 used by override.
[0.00] ACPI: IRQ11 used by override.
[0.00] Int: type 0, pol 0, trig 0, bus 00, IRQ 0c, APIC ID 0, APIC INT 0c
[0.00] Int: type 0, pol 0, trig 0, bus 00, IRQ 0d, APIC ID 0, APIC INT 0d
[0.00] Int: type 0, pol 0, trig 0, bus 00, IRQ 0e, APIC ID 0, APIC INT 0e
[0.00] Int: type 0, pol 0, trig 0, bus 00, IRQ 0f, APIC ID 0, APIC INT 0f
[0.00] Using ACPI (MADT) for SMP configuration information
[0.00] ACPI: HPET id: 0x8086a201 base: 0xfed0
[0.00] mapped IOAPIC to b000 (fec0)
[0.00] KVM setup async PF for cpu 0
[0.00] kvm-stealtime: cpu 0, msr 3fc97c0
[0.00] e820: [mem 0x1400-0xfeffbfff] available for PCI devices
[0.00] Booting paravirtualized kernel on KVM
[0.00] clocksource: refined-jiffies: mask: 0x max_cycles: 
0x, max_idle_ns: 7645519600211568 ns
[0.00] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[0.00] pcpu-alloc: [0] 0 
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 81146
[0.00] Kernel command line: ip=vm-lkp-wsx03-openwrt-i386-18::dhcp 
root=/dev/ram0 user=lkp 
job=/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-18/trinity-300s-openwrt-i386-2016-03-16.cgz-8d5e72dfdf0fa29a21143fd72746c6f43295ce9f-20171108-5253-ystw19-0.yaml
 ARCH=i386 kconfig=i386-randconfig-b0-11061302 branch=linus/master 
commit=8d5e72dfdf0fa29a21143fd72746c6f43295ce9f 
BOOT_IMAGE=/pkg/linux/i386-randconfig-b0-11061302/gcc-5/8d5e72dfdf0fa29a21143fd72746c6f43295ce9f/vmlinuz-4.11.0-08060-g8d5e72d
 max_uptime=1500 
RESULT_ROOT=/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302/gcc-5/8d5e72dfdf0fa29a21143fd72746c6f43295ce9f/0
 LKP_SERVER=inn debug apic=debug sysrq_always_enabled 
rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel 
console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,1152
[0.00] sysrq: sysrq always enabled.
[0.00] PID hash table entries: 2048 (order: 1, 8192 bytes)
[0.00] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[0.00] Initializing CPU#0
[0.00] Memory: 240480K/327144K available (33000K kernel code, 10606K rwdata, 15792K rodata, 1100K init, 19528K bss, 86664K reserved, 0K cma-reserved)
[0.00] virtual kernel memory layout:
[0.00] fixmap  : 0xfffa2000 - 0xf000   ( 372 kB)
[0.00] vmalloc : 0xd47dc000 - 0xfffa   ( 695 MB)
[0.00] lowmem  : 0xc000 - 0xd3fdc000   ( 319 MB)
[0.00]   .init : 0xc4a2d000 - 0xc4b4   (1100 kB)
[0.00]   .data : 0xc303a333 - 0xc4a03900   (26405 kB)
[0.00]   .text : 0xc100 - 0xc303a333   (33000 kB)
[0.00] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[0.00] 
[0.00] **********************************************************
[0.00] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[0.00] **                                                      **
[0.00] ** trace_printk() being used. Allocating extra memory.  **
[0.00] **                                                      **
[0.00] ** This means that this is a DEBUG kernel and it is     **
[0.00] ** unsafe for production use.                           **
[0.00] **                                                      **
[0.00] ** If you see this message and you are not debugging    **
[0.00] ** the kernel, report this immediately to your vendor!  **
[0.00] **                                                      **
[0.00] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[0.00] **********************************************************
[0.004000] Preemptible hierarchical RCU implementation.
[0.004000]  RCU kthread priority: 1.
[0.004000] NR_IRQS:2304 nr_irqs:256 16
[0.004000] CPU 0 irqstacks, hard=d35d4000 soft=d35d6000
[0.004000] Console: colour VGA+ 80x25
[0.004000] console [tty0] enabled
[0.004000] console [ttyS0] enabled
[0.004000] bootconsole [earlyser0] disabled
[0.004000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., 
Ingo Molnar
[0.004000] ... MAX_LOCKDEP_SUBCLASSES:  8
[0.004000] ... MAX_LOCK_DEPTH:  48
[0.004000] ... MAX_LOCKDEP_KEYS:8191
[0.004000] ... CLASSHASH_SIZE:  4096
[0.004000] ... MAX_LOCKDEP_ENTRIES: 32768
[0.004000] ... MAX_LOCKDE

[PATCH v2] locking/lockdep: Revise Documentation/locking/crossrelease.txt

2017-11-08 Thread Byungchul Park
Changes from v1
- Ran several tools to check English spelling and grammar over the text.
- Simplified the document further.

-8<-
From 412bc9eb0d22791f70f7364bda189feb41899ff9 Mon Sep 17 00:00:00 2001
From: Byungchul Park 
Date: Thu, 9 Nov 2017 16:12:23 +0900
Subject: [PATCH v2] locking/lockdep: Revise 
Documentation/locking/crossrelease.txt

Revise Documentation/locking/crossrelease.txt to enhance its readability.

Signed-off-by: Byungchul Park 
---
 Documentation/locking/crossrelease.txt | 492 +++--
 1 file changed, 227 insertions(+), 265 deletions(-)

diff --git a/Documentation/locking/crossrelease.txt 
b/Documentation/locking/crossrelease.txt
index bdf1423..11e3e3b 100644
--- a/Documentation/locking/crossrelease.txt
+++ b/Documentation/locking/crossrelease.txt
@@ -12,10 +12,10 @@ Contents:
 
  (*) Limitation
 
- - Limit lockdep
+ - Limiting lockdep
  - Pros from the limitation
  - Cons from the limitation
- - Relax the limitation
+ - Relaxing the limitation
 
  (*) Crossrelease
 
@@ -30,9 +30,9 @@ Contents:
  (*) Optimizations
 
  - Avoid duplication
- - Lockless for hot paths
+ - Make hot paths lockless
 
- (*) APPENDIX A: What lockdep does to work aggresively
+ (*) APPENDIX A: How to add dependencies aggressively
 
  (*) APPENDIX B: How to avoid adding false dependencies
 
@@ -51,36 +51,30 @@ also impossible due to the same reason.
 
 For example:
 
-   A context going to trigger event C is waiting for event A to happen.
-   A context going to trigger event A is waiting for event B to happen.
-   A context going to trigger event B is waiting for event C to happen.
+   A context going to trigger event C is waiting for event A.
+   A context going to trigger event A is waiting for event B.
+   A context going to trigger event B is waiting for event C.
 
-A deadlock occurs when these three wait operations run at the same time,
-because event C cannot be triggered if event A does not happen, which in
-turn cannot be triggered if event B does not happen, which in turn
-cannot be triggered if event C does not happen. After all, no event can
-be triggered since any of them never meets its condition to wake up.
+A deadlock occurs when the three waiters run at the same time, because
+event C cannot be triggered if event A does not happen, which in turn
+cannot be triggered if event B does not happen, which in turn cannot be
+triggered if event C does not happen. After all, no event can be
+triggered since any of them never meets its condition to wake up.
 
-A dependency might exist between two waiters and a deadlock might happen
-due to an incorrect releationship between dependencies. Thus, we must
-define what a dependency is first. A dependency exists between them if:
+A dependency exists between two waiters, and a deadlock happens due to
+an incorrect relationship between dependencies. Thus, we must define
+what a dependency is first. A dependency exists between waiters if:
 
1. There are two waiters waiting for each event at a given time.
2. The only way to wake up each waiter is to trigger its event.
3. Whether one can be woken up depends on whether the other can.
 
-Each wait in the example creates its dependency like:
+Each waiter in the example creates its dependency like:
 
Event C depends on event A.
Event A depends on event B.
Event B depends on event C.
 
-   NOTE: Precisely speaking, a dependency is one between whether a
-   waiter for an event can be woken up and whether another waiter for
-   another event can be woken up. However from now on, we will describe
-   a dependency as if it's one between an event and another event for
-   simplicity.
-
 And they form circular dependencies like:
 
 -> C -> A -> B -
@@ -101,19 +95,18 @@ Circular dependencies cause a deadlock.
 How lockdep works
 -
 
-Lockdep tries to detect a deadlock by checking dependencies created by
-lock operations, acquire and release. Waiting for a lock corresponds to
-waiting for an event, and releasing a lock corresponds to triggering an
-event in the previous section.
+Lockdep tries to detect a deadlock by checking circular dependencies
+created by lock operations, acquire and release, which are wait and
+event respectively.
 
 In short, lockdep does:
 
1. Detect a new dependency.
-   2. Add the dependency into a global graph.
+   2. Add the dependency to a global graph.
3. Check if that makes dependencies circular.
-   4. Report a deadlock or its possibility if so.
+   4. Report the deadlock if so.
 
-For example, consider a graph built by lockdep that looks like:
+For example, the graph has been built like:
 
A -> B -
\
@@ -123,7 +116,7 @@ For example, consider a graph built by lockdep that looks like:
 
where A, B,..., E are different lock classes.
 
-Lockdep will add a dependency into the graph on detection of a new
+Lockdep will add a dependency to the graph on detection of a new
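
[Archive note] The four lockdep steps quoted in the patch above (detect a dependency, add it to a global graph, check for circularity, report) can be sketched in user-space C. This is only an illustration under stated assumptions: the names dep_graph, dep_graph_add and MAX_CLASSES are hypothetical and are not lockdep's real data structures, and the sketch checks for a cycle just before recording the new edge, which gives the same answer as checking just after.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8	/* hypothetical small bound for this sketch */

/* edge[a][b] is true once a dependency a -> b has been recorded. */
struct dep_graph {
	bool edge[MAX_CLASSES][MAX_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to,
		      bool visited[MAX_CLASSES])
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (g->edge[from][next] && !visited[next] &&
		    reachable(g, next, to, visited))
			return true;
	}
	return false;
}

/*
 * Record a newly detected dependency a -> b.
 * Returns true when b already reaches a, i.e. when the new edge
 * closes a cycle and a deadlock should be reported.
 */
static bool dep_graph_add(struct dep_graph *g, int a, int b)
{
	bool visited[MAX_CLASSES] = { false };
	bool circular;

	assert(a >= 0 && a < MAX_CLASSES && b >= 0 && b < MAX_CLASSES);
	circular = reachable(g, b, a, visited);
	g->edge[a][b] = true;
	return circular;
}
```

Feeding it the three dependencies from the document's example (C on A, A on B, B on C) reports nothing for the first two and flags the third, which is the edge that makes the dependencies circular.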
 

Re: [PATCH 03/31] nds32: Support early_printk

2017-11-08 Thread Greentime Hu
2017-11-08 17:47 GMT+08:00 Tobias Klauser :
> On 2017-11-08 at 06:54:51 +0100, Greentime Hu  wrote:
>> From: Greentime Hu 
>>
>> Signed-off-by: Rick Chen 
>> Signed-off-by: Greentime Hu 
>> ---
>>  arch/nds32/kernel/early_printk.c |  124 
>> ++
>>  1 file changed, 124 insertions(+)
>>  create mode 100644 arch/nds32/kernel/early_printk.c
>>
>> diff --git a/arch/nds32/kernel/early_printk.c 
>> b/arch/nds32/kernel/early_printk.c
>> new file mode 100644
>> index 000..269c3cd
>> --- /dev/null
>> +++ b/arch/nds32/kernel/early_printk.c
>
> Could this be implemented using earlycon (the 8250 driver already supports
> it) instead of duplicating functionality in arch/nds32?  See e.g. the
> nios2 port for how this could be done, specifically commit e118c3fec9c0
> ("nios2: remove custom early console implementation").

Thanks.
I will try to use earlycon in the next version of the patch set.
I will drop this patch if earlycon can be used on nds32.


[PATCH 1/1] intel_idle: Graceful probe failure when MWAIT is disabled

2017-11-08 Thread Len Brown
From: Len Brown 

When MWAIT is disabled, intel_idle refuses to probe.
But it may mislead the user by blaming this on the model number:

intel_idle: does not run on family 6 model 79

So defer the check for MWAIT until after the model# white-list check succeeds,
and if the MWAIT check fails, tell the user how to fix it:

intel_idle: Please enable MWAIT in BIOS SETUP

Signed-off-by: Len Brown 
---
 drivers/idle/intel_idle.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index f0b06b14e782..16249b0953ff 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1061,7 +1061,7 @@ static const struct idle_cpu idle_cpu_dnv = {
 };
 
 #define ICPU(model, cpu) \
-	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_MWAIT, (unsigned long)&cpu }
+	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&cpu }
 
 static const struct x86_cpu_id intel_idle_ids[] __initconst = {
 	ICPU(INTEL_FAM6_NEHALEM_EP, idle_cpu_nehalem),
@@ -1125,6 +1125,11 @@ static int __init intel_idle_probe(void)
 		return -ENODEV;
 	}
 
+	if (!boot_cpu_has(X86_FEATURE_MWAIT)) {
+		pr_debug("Please enable MWAIT in BIOS SETUP\n");
+		return -ENODEV;
+	}
+
 	if (boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF)
 		return -ENODEV;
 
-- 
2.14.0-rc0
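
[Archive note] The ordering change described in the commit message (match the model first, then check MWAIT, so that each failure gets its own message) can be sketched outside the kernel as below. This is a simplified illustration, not the driver's code: struct cpu_info, listed_models, probe and the enum names are hypothetical stand-ins, and the two model numbers are example values only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Distinct outcomes so the caller can print the right hint. */
enum probe_result {
	PROBE_OK,
	PROBE_UNSUPPORTED_MODEL,	/* "does not run on family F model M" */
	PROBE_MWAIT_DISABLED,		/* "Please enable MWAIT in BIOS SETUP" */
};

/* Hypothetical, simplified stand-in for the kernel's CPU data. */
struct cpu_info {
	int family;
	int model;
	bool has_mwait;
};

/* Example whitelist; the values are illustrative only. */
static const int listed_models[] = { 0x1a, 0x4f };

static bool model_is_listed(int model)
{
	for (size_t i = 0; i < sizeof(listed_models) / sizeof(listed_models[0]); i++) {
		if (listed_models[i] == model)
			return true;
	}
	return false;
}

/* Stage 1: model whitelist. Stage 2: feature check, only for listed models. */
static enum probe_result probe(const struct cpu_info *cpu)
{
	assert(cpu != NULL);
	if (cpu->family != 6 || !model_is_listed(cpu->model))
		return PROBE_UNSUPPORTED_MODEL;
	if (!cpu->has_mwait)
		return PROBE_MWAIT_DISABLED;
	return PROBE_OK;
}
```

With this ordering, a listed model (79 is 0x4f) with MWAIT turned off yields PROBE_MWAIT_DISABLED rather than the misleading "unsupported model" result.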



Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces

2017-11-08 Thread महेश बंडेवार
[resend response as earlier one failed because of formatting issues]

On Thu, Nov 9, 2017 at 12:21 PM, Serge E. Hallyn  wrote:
>
> On Thu, Nov 09, 2017 at 09:55:41AM +0900, Mahesh Bandewar (महेश बंडेवार) 
> wrote:
> > On Thu, Nov 9, 2017 at 4:02 AM, Christian Brauner
> >  wrote:
> > > On Wed, Nov 08, 2017 at 03:09:59AM -0800, Mahesh Bandewar (महेश बंडेवार) 
> > > wrote:
> > >> Sorry folks, I was traveling and it seems like a lot happened on this
> > >> thread. :p
> > >>
> > >> I will try to respond to a few of these comments selectively -
> > >>
> > >> > The thing that makes me hesitate with this set is that it is a
> > >> > permanent new feature to address what (I hope) is a temporary
> > >> > problem.
> > >> I agree this is a permanent new feature, but it's not solving a
> > >> temporary problem. It's impossible to predict which new vulnerability
> > >> could show up, or when. I think Daniel summed it up appropriately in
> > >> his response.
> > >>
> > >> > Seems like there are two naive ways to do it, the first being to just
> > >> > look at all code under ns_capable() plus code called from there.  It
> > >> > seems like looking at the result of that could be fruitful.
> > >> This is really hard. The main issue is that there were features
> > >> designed and developed before the user-ns days with the assumption
> > >> that unprivileged users would never get certain capabilities which
> > >> only the root user gets. That is no longer true now that any process
> > >> can create a user-ns with root mapped. At the same time, blocking
> > >> user-ns creation for everyone is a big hammer which is not needed
> > >> either. So it's not that easy to just perform a code walk-through and
> > >> correct those decisions now.
> > >>
> > >> > It seems to me that the existing control in
> > >> > /proc/sys/kernel/unprivileged_userns_clone might be the better duct
> > >> > tape in that case.
> > >> This solution essentially blocks unprivileged users from using
> > >> user namespaces entirely. This is not really a solution that can
> > >> work. The solution that this patch-set adds allows unprivileged users
> > >> to create user namespaces. In fact, the proposed solution is a more
> > >> fine-grained approach than the unprivileged_userns_clone solution,
> > >> since you can selectively block capabilities rather than completely
> > >> blocking the functionality.
> > >
> > > I've been talking to Stéphane today about this and we should also keep in
> > > mind that we have:
> > >
> > > chb@conventiont|~
> > >> ls -al /proc/sys/user/
> > > total 0
> > > dr-xr-xr-x 1 root root 0 Nov  6 23:32 .
> > > dr-xr-xr-x 1 root root 0 Nov  2 22:13 ..
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_cgroup_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_inotify_instances
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_inotify_watches
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_ipc_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_mnt_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_net_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_pid_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_user_namespaces
> > > -rw-r--r-- 1 root root 0 Nov  8 19:48 max_uts_namespaces
> > >
> > > These files allow you to limit the number of namespaces that can be
> > > created *per namespace* type. So let's say your system runs a bunch of
> > > user namespaces; you can do:
> > >
> > > chb@conventiont|~
> > >> echo 0 > /proc/sys/user/max_user_namespaces
> > >
> > > So that the next time you try to create a user namespace you'd see:
> > >
> > > chb@conventiont|~
> > >> unshare -U
> > > unshare: unshare failed: No space left on device
> > >
> > > So there's not even a need to upstream a new sysctl since we have ways of
> > > blocking this.
> > >
> > I'm not sure how it's solving the problem that my patch-set is addressing?
> > I agree though that the need for unprivileged_userns_clone sysctl goes
> > away as this is equivalent to setting that sysctl to 0 as you have
> > described above.
>
> oh right that was the reasoning iirc for not needing the other sysctl.
>
> > However as I mentioned earlier, blocking processes from creating
> > user-namespaces is not the solution. Processes should be able to
> > create namespaces as they are designed but at the same time we need to
> > have controls to 'contain' them if the need arises. Setting max_no to 0
> > is not the solution that I'm looking for since it doesn't solve the
> > problem.
>
> well yesterday we were told that was explicitly not the goal, but that was
> not by you ... i just mention it to explain why we seem to be walking in
> circles a bit.
>
> anyway the bounding set doesn't actually make sense so forget that.   the
> question then is just whether it makes sense to allow things to continue
> at all in this situation.  would you mind indulging me by giving one or two
> concrete examples in the previous known cves of what capabilities you would
> 
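
[Archive note] The /proc/sys/user/max_user_namespaces limit discussed in the quoted text can also be inspected from a program. The sketch below is only an illustration: parse_limit, read_userns_limit and userns_failure_hint are hypothetical helper names, and the only facts taken from the thread are the sysctl path and the "No space left on device" (ENOSPC) error shown for unshare -U once the limit is 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define USERNS_LIMIT_PATH "/proc/sys/user/max_user_namespaces"

/* Parse the decimal limit from the sysctl text; 0 on success, -1 on bad input. */
static int parse_limit(const char *text, long *limit)
{
	char *end;
	long value;

	assert(text != NULL && limit != NULL);
	errno = 0;
	value = strtol(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;
	*limit = value;
	return 0;
}

/* Read the current limit; returns -1 if the file is missing or unreadable. */
static int read_userns_limit(long *limit)
{
	char buf[32];
	FILE *f = fopen(USERNS_LIMIT_PATH, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_limit(buf, limit);
}

/* Turn the errno left by a failed unshare(CLONE_NEWUSER) into a short hint. */
static const char *userns_failure_hint(int err)
{
	if (err == ENOSPC)
		return "namespace limit reached, check " USERNS_LIMIT_PATH;
	return strerror(err);
}
```

A caller could read the limit before attempting to create a namespace, or pass errno to userns_failure_hint() after a failed unshare call, so that the otherwise puzzling "No space left on device" is explained.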

[PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-08 Thread Ching Huang
From: Ching Huang 

Hi all,

The following patches apply to Martin's 4.15/scsi-queue.

Patch 1: redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

Patch 2: simplify arcmsr_iop_init function.

Patch 3: add code for ACB_ADAPTER_TYPE_E to support the new adapter ARC-1884.

Patch 4: replace constant ARCMSR_MAX_FREECCB_NUM with variable acb->maxFreeCCB,
which is obtained from firmware.

Patch 5: add driver option host_can_queue so the user can set host->can_queue.
Its value can go up to 1024.

Patch 6: replace constant ARCMSR_MAX_OUTSTANDING_CMD with variable
acb->maxOutstanding, which is determined by the user.

Patch 7: add driver option cmd_per_lun so the user can set host->cmd_per_lun.

Patch 8: add ACB_F_MSG_GET_CONFIG to acb->acb_flags for message interrupt
checking before scheduling work to get the device map.

Patch 9: add a function arcmsr_set_iop_datetime and driver option set_date_time
to set the date and time in firmware.

Patch 10: fix clearing of the doorbell queue on ACB_ADAPTER_TYPE_B controllers.

Patch 11: factor the duplicated timer-init code for the message ISR BH in
arcmsr_probe and arcmsr_resume out into a function,
arcmsr_init_get_devmap_timer.

Patch 12: adjust tabs and white space to align text.

Patch 13: update driver version to v1.40.00.02-20171011

Please review. Thanks.

---
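
[Archive note] Patch 4 makes the CCB pool size depend on a count reported by firmware instead of a compile-time constant. A rough user-space sketch of that sizing arithmetic follows, modeled on the roundup(..., 32) expression in patch 4/13 of this series. The function names and parameters are hypothetical, and the structure sizes are passed in as plain numbers rather than taken from the driver's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t x, size_t align)
{
	assert(align != 0);
	return ((x + align - 1) / align) * align;
}

/*
 * Bytes needed for `count` command control blocks.
 * Each block holds a base structure plus (max_sg - 1) extra
 * scatter-gather entries and is padded to a 32-byte multiple.
 * `count` plays the role of the firmware-provided acb->maxFreeCCB.
 */
static size_t ccb_pool_bytes(size_t ccb_base, size_t sg_entry,
			     size_t max_sg, uint32_t count)
{
	size_t per_ccb;

	assert(max_sg >= 1);
	per_ccb = round_up(ccb_base + (max_sg - 1) * sg_entry, 32);
	return per_ccb * count;
}
```

Because the count is a run-time value, the same code sizes the pool correctly for adapters whose firmware reports different numbers of free CCBs.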




Re: [PATCH V13 06/10] mmc: sdhci-pci: Add CQHCI support for Intel GLK

2017-11-08 Thread Adrian Hunter
On 08/11/17 11:24, Linus Walleij wrote:
> On Fri, Nov 3, 2017 at 2:20 PM, Adrian Hunter  wrote:
> 
>> Add CQHCI initialization and implement CQHCI operations for Intel GLK.
>>
>> Signed-off-by: Adrian Hunter 
> 
> This patch seems OK in context, but it merely illustrates the
> weirdness of .[runtime]_suspend/resume calling into CQE-specific
> APIs rather than using generic host callbacks.

Your comment makes no sense at all.  The host driver has
[runtime]_suspend/resume callbacks and it is up to the host driver to decide
what to do.  CQHCI provides helpers since that is the whole point of having
a CQHCI library.


Re: [PATCH 13/31] nds32: DMA mapping API

2017-11-08 Thread Greentime Hu
2017-11-08 17:09 GMT+08:00 Arnd Bergmann :
> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu  wrote:
>
>> +static void consistent_sync(void *vaddr, size_t size, int direction)
>> +{
>> +   unsigned long start = (unsigned long)vaddr;
>> +   unsigned long end = start + size;
>> +
>> +   switch (direction) {
>> +   case DMA_FROM_DEVICE:   /* invalidate only */
>> +   cpu_dma_inval_range(start, end);
>> +   break;
>> +   case DMA_TO_DEVICE: /* writeback only */
>> +   cpu_dma_wb_range(start, end);
>> +   break;
>> +   case DMA_BIDIRECTIONAL: /* writeback and invalidate */
>> +   cpu_dma_wbinval_range(start, end);
>> +   break;
>> +   default:
>> +   BUG();
>> +   }
>> +}
>
>> +
>> +static void
>> +nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
>> + size_t size, enum dma_data_direction dir)
>> +{
>> +   consistent_sync((void *)dma_to_virt(dev, handle), size, dir);
>> +}
>> +
>> +static void
>> +nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
>> +size_t size, enum dma_data_direction dir)
>> +{
>> +   consistent_sync((void *)dma_to_virt(dev, handle), size, dir);
>> +}
>
> You do the same cache operations for _to_cpu and _to_device, which
> usually works,
> but is more expensive than you need. It's better to take the ownership into
> account and only do what you need.
>

Thanks.

Like this?

static void
nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
                              size_t size, enum dma_data_direction dir)
{
        consistent_sync((void *)dma_to_virt(dev, handle), size,
                        DMA_FROM_DEVICE);
}

static void
nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
                                 size_t size, enum dma_data_direction dir)
{
        consistent_sync((void *)dma_to_virt(dev, handle), size,
                        DMA_TO_DEVICE);
}
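
For illustration only, the idea of taking buffer ownership into account can be sketched as a small decision function that picks a cache operation from both the sync direction and the DMA direction. This is not the nds32 code from the thread: the enum names and the function sketch_sync_op() are hypothetical, and the policy shown (write back before the device reads, invalidate before the CPU reads, nothing otherwise) is one possible assumption. Whether a FROM_DEVICE buffer also needs an invalidate before it is handed to the device depends on the cache architecture.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's DMA directions. */
enum sketch_dma_dir {
	SKETCH_DMA_BIDIRECTIONAL,
	SKETCH_DMA_TO_DEVICE,
	SKETCH_DMA_FROM_DEVICE,
};

/* Cache maintenance operation chosen for a sync call. */
enum sketch_cache_op {
	SKETCH_OP_NONE,
	SKETCH_OP_WRITEBACK,
	SKETCH_OP_INVALIDATE,
};

/*
 * Assumed policy, not taken from the patch:
 *  - handing the buffer to the device: write back dirty lines unless
 *    the device only writes to the buffer;
 *  - handing the buffer to the CPU: invalidate stale lines unless
 *    the device only read from the buffer.
 */
static enum sketch_cache_op
sketch_sync_op(enum sketch_dma_dir dir, bool for_device)
{
	if (for_device)
		return dir == SKETCH_DMA_FROM_DEVICE ?
			SKETCH_OP_NONE : SKETCH_OP_WRITEBACK;

	return dir == SKETCH_DMA_TO_DEVICE ?
		SKETCH_OP_NONE : SKETCH_OP_INVALIDATE;
}
```

With such a helper, each sync callback would pass its own ownership flag together with the caller's dir instead of a fixed direction, so a DMA_TO_DEVICE buffer costs nothing when it is synced back for the CPU.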


[PATCH v3 2/3] drivers: hwmon: Add W83773G driver

2017-11-08 Thread Lei YU
Nuvoton W83773G is a hardware monitor IC providing one local
temperature and two remote temperature sensors.

Signed-off-by: Lei YU 
---
v2:
 - Rewrite the driver using regmap
 - Add offset and update_interval
v3:
 - Use devm_hwmon_device_register_with_info() with is_visible/read/write
   functions.
---
 drivers/hwmon/Kconfig   |  10 ++
 drivers/hwmon/Makefile  |   1 +
 drivers/hwmon/w83773g.c | 348 
 3 files changed, 359 insertions(+)
 create mode 100644 drivers/hwmon/w83773g.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index d654314..11c6248 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1710,6 +1710,16 @@ config SENSORS_VT8231
  This driver can also be built as a module.  If so, the module
  will be called vt8231.
 
+config SENSORS_W83773G
+   tristate "Nuvoton W83773G"
+   depends on I2C
+   help
+ If you say yes here you get support for the Nuvoton W83773G hardware
+ monitoring chip.
+
+ This driver can also be built as a module.  If so, the module
+ will be called w83773g.
+
 config SENSORS_W83781D
tristate "Winbond W83781D, W83782D, W83783S, Asus AS99127F"
depends on I2C
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index c84d978..0649ad8 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_SENSORS_ATK0110) += asus_atk0110.o
 # asb100, then w83781d go first, as they can override other drivers' addresses.
 obj-$(CONFIG_SENSORS_ASB100)   += asb100.o
 obj-$(CONFIG_SENSORS_W83627HF) += w83627hf.o
+obj-$(CONFIG_SENSORS_W83773G)  += w83773g.o
 obj-$(CONFIG_SENSORS_W83792D)  += w83792d.o
 obj-$(CONFIG_SENSORS_W83793)   += w83793.o
 obj-$(CONFIG_SENSORS_W83795)   += w83795.o
diff --git a/drivers/hwmon/w83773g.c b/drivers/hwmon/w83773g.c
new file mode 100644
index 000..58ac45b
--- /dev/null
+++ b/drivers/hwmon/w83773g.c
@@ -0,0 +1,348 @@
+/*
+ * Copyright (C) 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Driver for the Nuvoton W83773G SMBus temperature sensor IC.
+ * Supported models: W83773G
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Addresses to scan */
+static const unsigned short normal_i2c[] = { 0x4c, 0x4d, I2C_CLIENT_END };
+
+/* W83773 has 3 channels */
+#define W83773_CHANNELS                    3
+
+/* The W83773 registers */
+#define W83773_CONVERSION_RATE_REG_READ    0x04
+#define W83773_CONVERSION_RATE_REG_WRITE   0x0A
+#define W83773_MANUFACTURER_ID_REG         0xFE
+#define W83773_LOCAL_TEMP                  0x00
+
+static const u8 W83773_STATUS[2] = { 0x02, 0x17 };
+
+static const u8 W83773_TEMP_LSB[2] = { 0x10, 0x25 };
+static const u8 W83773_TEMP_MSB[2] = { 0x01, 0x24 };
+
+static const u8 W83773_OFFSET_LSB[2] = { 0x12, 0x16 };
+static const u8 W83773_OFFSET_MSB[2] = { 0x11, 0x15 };
+
+/* this is the number of sensors in the device */
+static const struct i2c_device_id w83773_id[] = {
+   { "w83773g" },
+   { }
+};
+
+MODULE_DEVICE_TABLE(i2c, w83773_id);
+
+static const struct of_device_id w83773_of_match[] = {
+   {
+   .compatible = "nuvoton,w83773g"
+   },
+   { },
+};
+MODULE_DEVICE_TABLE(of, w83773_of_match);
+
+static inline long temp_of_local(s8 reg)
+{
+   return reg * 1000;
+}
+
+static inline long temp_of_remote(s8 hb, u8 lb)
+{
+   return (hb << 3 | lb >> 5) * 125;
+}
+
+static int get_local_temp(struct regmap *regmap, long *val)
+{
+   unsigned int regval;
+   int ret;
+
+   ret = regmap_read(regmap, W83773_LOCAL_TEMP, &regval);
+   if (ret < 0)
+   return ret;
+
+   *val = temp_of_local(regval);
+   return 0;
+}
+
+static int get_remote_temp(struct regmap *regmap, int index, long *val)
+{
+   unsigned int regval_high;
+   unsigned int regval_low;
+   int ret;
+
+   ret = regmap_read(regmap, W83773_TEMP_MSB[index], &regval_high);
+   if (ret < 0)
+   return ret;
+
+   ret = regmap_read(regmap, W83773_TEMP_LSB[index], &regval_low);
+   if (ret < 0)
+   return ret;
+
+   *val = temp_of_remote(regval_high, regval_low);
+   return 0;
+}
+
+static int get_fault(struct regmap *regmap, int index, long *val)
+{
+   unsigned int regval;
+   int ret;
+
+   ret = regmap_read(regmap, W83773_STATUS[index], &regval);
+
+   if (ret < 0)
+   return ret;
+
+   *val = ((u8)regval & 0x04) >> 2;
+   return 0;
+}
+
+static int get_offset(struct regmap *regmap, int index, long *val)
+{
+   unsigned int regval_high;
+   unsigned int regval_low;
+   int ret;
+
+   ret = 
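
As a side note on get_fault() above, the fault flag is bit 2 of the status register, and in C the shift operator binds more tightly than the bitwise AND, so the mask has to be parenthesized before shifting. A minimal standalone sketch follows; the function and macro names are hypothetical and only the 0x04 mask and the shift by 2 come from the patch.

```c
#include <stdint.h>

#define SKETCH_FAULT_MASK 0x04u	/* bit 2, as used in get_fault() */

/* Mask first, then shift: the result is 0 or 1. */
static long sketch_fault_bit(unsigned int status)
{
	return ((uint8_t)status & SKETCH_FAULT_MASK) >> 2;
}

/*
 * What "(u8)status & 0x04 >> 2" evaluates to without parentheses:
 * the shift happens first, so the mask becomes 0x01 and bit 0 is
 * tested instead of bit 2.
 */
static long sketch_fault_bit_shift_first(unsigned int status)
{
	return (uint8_t)status & (0x04 >> 2);
}
```

For a status value of 0x04 the first helper reports a fault, while the shift-first form reports none.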

[PATCH] rtc: Use time64_t variables to set time/alarm from sysfs

2017-11-08 Thread Baolin Wang
Use time64_t variables and related APIs for sysfs interfaces to
support setting time or alarm after the year 2038 on 32-bit systems.

Signed-off-by: Baolin Wang 
---
 drivers/rtc/rtc-sysfs.c |   25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/rtc/rtc-sysfs.c b/drivers/rtc/rtc-sysfs.c
index e364550..92ff2ed 100644
--- a/drivers/rtc/rtc-sysfs.c
+++ b/drivers/rtc/rtc-sysfs.c
@@ -72,9 +72,10 @@
 
retval = rtc_read_time(to_rtc_device(dev), &tm);
if (retval == 0) {
-   unsigned long time;
-   rtc_tm_to_time(&tm, &time);
-   retval = sprintf(buf, "%lu\n", time);
+   time64_t time;
+
+   time = rtc_tm_to_time64(&tm);
+   retval = sprintf(buf, "%lld\n", time);
}
 
return retval;
@@ -132,7 +133,7 @@
 wakealarm_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
ssize_t retval;
-   unsigned long alarm;
+   time64_t alarm;
struct rtc_wkalrm alm;
 
/* Don't show disabled alarms.  For uniformity, RTC alarms are
@@ -145,8 +146,8 @@
 */
retval = rtc_read_alarm(to_rtc_device(dev), &alm);
if (retval == 0 && alm.enabled) {
-   rtc_tm_to_time(&alm.time, &alarm);
-   retval = sprintf(buf, "%lu\n", alarm);
+   alarm = rtc_tm_to_time64(&alm.time);
+   retval = sprintf(buf, "%lld\n", alarm);
}
 
return retval;
@@ -157,8 +158,8 @@
const char *buf, size_t n)
 {
ssize_t retval;
-   unsigned long now, alarm;
-   unsigned long push = 0;
+   time64_t now, alarm;
+   time64_t push = 0;
struct rtc_wkalrm alm;
struct rtc_device *rtc = to_rtc_device(dev);
const char *buf_ptr;
@@ -170,7 +171,7 @@
retval = rtc_read_time(rtc, &alm.time);
if (retval < 0)
return retval;
-   rtc_tm_to_time(&alm.time, &now);
+   now = rtc_tm_to_time64(&alm.time);
 
buf_ptr = buf;
if (*buf_ptr == '+') {
@@ -181,7 +182,7 @@
} else
adjust = 1;
}
-   retval = kstrtoul(buf_ptr, 0, &alarm);
+   retval = kstrtos64(buf_ptr, 0, &alarm);
if (retval)
return retval;
if (adjust) {
@@ -197,7 +198,7 @@
return retval;
if (alm.enabled) {
if (push) {
-   rtc_tm_to_time(&alm.time, &push);
+   push = rtc_tm_to_time64(&alm.time);
alarm += push;
} else
return -EBUSY;
@@ -212,7 +213,7 @@
 */
alarm = now + 300;
}
-   rtc_time_to_tm(alarm, &alm.time);
+   rtc_time64_to_tm(alarm, &alm.time);
 
retval = rtc_set_alarm(rtc, &alm);
return (retval < 0) ? retval : n;
-- 
1.7.9.5
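
To illustrate why the 64-bit type and the "%lld" format matter here, the following userspace sketch prints a second count the way the sysfs show functions do and checks whether a value still fits in a signed 32-bit variable. It is not part of the patch: int64_t stands in for the kernel's time64_t, and the helper names are hypothetical. The signed 32-bit limit of 2147483647 seconds after the epoch falls in January 2038.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True if the second count can be stored in a signed 32-bit variable. */
static bool sketch_fits_signed32(int64_t secs)
{
	return secs >= INT32_MIN && secs <= INT32_MAX;
}

/*
 * Format a 64-bit second count followed by a newline, mirroring the
 * sprintf(buf, "%lld\n", ...) calls in the patch. Returns the number
 * of characters snprintf() would write.
 */
static int sketch_show_seconds(char *buf, size_t len, int64_t secs)
{
	return snprintf(buf, len, "%lld\n", (long long)secs);
}
```

A value of 2147483648, one second past the signed 32-bit limit, is rejected by sketch_fits_signed32() but is formatted without truncation by sketch_show_seconds().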



[PATCH v3 0/3] Add W83773G hwmon sensor driver and doc

2017-11-08 Thread Lei YU
Nuvoton W83773G is a hardware monitoring chip, which integrates two remote
and one local temperature sensors.
---
v2:
 - The driver is rewritten following the comments on v1, so the author is changed to me.
 - Added the device to trivial-devices.txt
v3:
 - Update the driver to use new API devm_hwmon_device_register_with_info()

Lei YU (3):
  DT: i2c: W83773G is a trivial device
  drivers: hwmon: Add W83773G driver
  hwmon: (w83773g) Add documentation

 .../devicetree/bindings/trivial-devices.txt|   1 +
 Documentation/hwmon/w83773g|  33 ++
 drivers/hwmon/Kconfig  |  10 +
 drivers/hwmon/Makefile |   1 +
 drivers/hwmon/w83773g.c| 348 +
 5 files changed, 393 insertions(+)
 create mode 100644 Documentation/hwmon/w83773g
 create mode 100644 drivers/hwmon/w83773g.c

-- 
1.9.1



[PATCH v3 3/3] hwmon: (w83773g) Add documentation

2017-11-08 Thread Lei YU
Add documentation for the w83773g driver.

Signed-off-by: Lei YU 
---
v2:
 - Add notes for offset and update_interval
---
 Documentation/hwmon/w83773g | 33 +
 1 file changed, 33 insertions(+)
 create mode 100644 Documentation/hwmon/w83773g

diff --git a/Documentation/hwmon/w83773g b/Documentation/hwmon/w83773g
new file mode 100644
index 000..4cc6c0b
--- /dev/null
+++ b/Documentation/hwmon/w83773g
@@ -0,0 +1,33 @@
+Kernel driver w83773g
+
+
+Supported chips:
+  * Nuvoton W83773G
+Prefix: 'w83773g'
+Addresses scanned: I2C 0x4c and 0x4d
+Datasheet: 
https://www.nuvoton.com/resource-files/W83773G_SG_DatasheetV1_2.pdf
+
+Authors:
+   Lei YU 
+
+Description
+---
+
+This driver implements support for Nuvoton W83773G temperature sensor
+chip. This chip implements one local and two remote sensors.
+The chip also features offsets for the two remote sensors which get added to
+the input readings. The chip does all the scaling by itself and the driver
+therefore reports true temperatures that don't need any user-space adjustments.
+Temperature is measured in degrees Celsius.
+The chip is wired over I2C/SMBus and specified over a temperature
+range of -40 to +125 degrees Celsius (for local sensor) and -40 to +127
+degrees Celsius (for remote sensors).
+Resolution for both the local and remote channels is 0.125 degree C.
+
+The chip supports only temperature measurement. The driver exports
+the temperature values via the following sysfs files:
+
+temp[1-3]_input
+temp[2-3]_fault
+temp[2-3]_offset
+update_interval
-- 
1.9.1
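
As a worked example of the 0.125 degree resolution described above, the sketch below converts raw register bytes into millidegrees Celsius, the unit used by hwmon temp*_input files. The register layout (a signed 8-bit high byte plus the top three bits of the low byte as eighths of a degree) follows the temp_of_remote() helper in patch 2/3; the function names here are hypothetical, and a multiplication is used in place of a left shift so that negative values stay well defined in plain C.

```c
#include <stdint.h>

/* Local sensor: one signed byte, 1 degree C per LSB. */
static long sketch_local_millideg(int8_t reg)
{
	return reg * 1000L;
}

/*
 * Remote sensor: msb holds whole degrees (signed), the top three
 * bits of lsb hold eighths of a degree, so one step is 125 mC.
 */
static long sketch_remote_millideg(int8_t msb, uint8_t lsb)
{
	long eighths = (long)msb * 8 + (lsb >> 5);

	return eighths * 125L;
}
```

For instance, a high byte of 25 with a low byte of 0xA0 (binary 101 in the top three bits) gives 25.625 degrees C, reported as 25625.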



[PATCH v3 1/3] DT: i2c: W83773G is a trivial device

2017-11-08 Thread Lei YU
Signed-off-by: Lei YU 
---
 Documentation/devicetree/bindings/trivial-devices.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/trivial-devices.txt 
b/Documentation/devicetree/bindings/trivial-devices.txt
index af284fb..63ad2f1 100644
--- a/Documentation/devicetree/bindings/trivial-devices.txt
+++ b/Documentation/devicetree/bindings/trivial-devices.txt
@@ -188,3 +188,4 @@ ti,tmp103   Low Power Digital Temperature Sensor 
with SMBUS/Two Wire Serial Inter
 ti,tmp275  Digital Temperature Sensor
 winbond,w83793 Winbond/Nuvoton H/W Monitor
 winbond,wpct301    i2c trusted platform module (TPM)
+nuvoton,w83773g    Nuvoton Temperature Sensor
-- 
1.9.1



Re: [PATCH 12/31] nds32: Device specific operations

2017-11-08 Thread Greentime Hu
2017-11-08 17:04 GMT+08:00 Arnd Bergmann :
> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu  wrote:
>
>> +
>> +#define ioremap(cookie,size)           __ioremap(cookie,size,0,1)
>> +#define ioremap_nocache(cookie,size)   __ioremap(cookie,size,0,1)
>> +#define iounmap(cookie)                __iounmap(cookie)
>
>> +#include 
>
> asm-generic/io.h now provides an ioremap_nocache() helper along with
> ioremap_uc/ioremap_wc/ioremap_wt, so I think you can remove the
> ioremap_nocache definition here. You might also be able to remove
> __ioremap and __iounmap, and only provide ioremap/iounmap, plus
> the identity macro 'define ioremap ioremap'

Thanks. I will try to use generic ioremap_nocache() helper in the next
version patch.

>> +void __iomem *__ioremap(unsigned long phys_addr, size_t size,
>> +   unsigned long flags, unsigned long align)
>
> The 'align' argument is unused here, and not used on other architectures
> either.
>

Thanks. I will remove this argument in the next version patch.

>> +{
>> +   struct vm_struct *area;
>> +   unsigned long addr, offset, last_addr;
>> +   pgprot_t prot;
>> +
>> +   /* Don't allow wraparound or zero size */
>> +   last_addr = phys_addr + size - 1;
>> +   if (!size || last_addr < phys_addr)
>> +   return NULL;
>> +
>> +   /*
>> +* Mappings have to be page-aligned
>> +*/
>> +   offset = phys_addr & ~PAGE_MASK;
>> +   phys_addr &= PAGE_MASK;
>> +   size = PAGE_ALIGN(last_addr + 1) - phys_addr;
>> +
>> +   /*
>> +* Ok, go for it..
>> +*/
>> +   area = get_vm_area(size, VM_IOREMAP);
>
> Better use get_vm_area_caller here to have the ioremap areas show up
> in a more useful form in /proc/vmallocinfo

Thanks.
I will use get_vm_area_caller() in the next version patch.

> Please also have a look at what you can do for memremap().
>
> Since you have no cacheable version of ioremap_wb/wt, it will
> return an uncached mapping all the time, which is not ideal.

Thanks.
I will study kernel/memremap.c
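
For reference, the wraparound check and page alignment arithmetic in the quoted __ioremap() can be tried out in userspace with a sketch like the one below. The 4 KiB page size, the struct, and all sketch_ names are assumptions made for illustration; only the arithmetic mirrors the quoted code.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL			/* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Result of rounding a physical range out to page boundaries. */
struct sketch_map_request {
	unsigned long base;	/* page-aligned start of the mapping */
	unsigned long offset;	/* offset of phys_addr within its page */
	unsigned long size;	/* mapping size, a multiple of the page size */
};

static bool sketch_page_align(unsigned long phys_addr, unsigned long size,
			      struct sketch_map_request *out)
{
	unsigned long last_addr = phys_addr + size - 1;

	/* Reject zero-sized ranges and ranges that wrap around. */
	if (!size || last_addr < phys_addr)
		return false;

	out->offset = phys_addr & ~SKETCH_PAGE_MASK;
	out->base = phys_addr & SKETCH_PAGE_MASK;
	/* Round the end of the range up to the next page boundary. */
	out->size = ((last_addr + 1 + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)
		    - out->base;
	return true;
}
```

A 0x20-byte range starting at 0x10000ff0 straddles a page boundary, so it yields a base of 0x10000000, an offset of 0xff0 and a two-page mapping of 0x2000 bytes.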

