Re: [PATCH] lpfc: Avoid to disable pci_dev twice

2014-09-10 Thread Mike Qiu

On 08/28/2014 02:34 AM, James Smart wrote:

Mike,

Can you confirm - the "nulls" this patch corrects are because the 
probe_one and error_detect threads are running concurrently, thus 
battling?


If so - this fix looks insufficient and we should rework it.


Yes, it is. My patch is just a workaround for this bug.



Q: why are they allowed to run concurrently?  I could see this solved 
at the platform level by letting probe_one finish before error_detect is 
called (and therefore stating that error_detect only makes sense to call if 
probe_one was successful). It's also a much more driver-friendly solution. 
I could see other drivers having much the same issue with concurrency 
and data structure teardown - and if locks aren't allowed in the 
error-detect path... it's not good.




I agree with you on this point; a platform solution is much better. So 
maybe use a lock or a flag to show that the driver is in such a state. This 
may also happen when the driver is in the remove state.


Thanks,
Mike

-- james s



On 7/31/2014 10:16 PM, Mike Qiu wrote:

On 07/17/2014 02:32 PM, Mike Qiu wrote:


Hi, all

How about this patch ?

Any idea ?


On IBM Power servers, when a hardware error occurs during the probe
state, the EEH subsystem calls the driver's error_detected interface,
which calls pci_disable_device(). But the driver's probe function also
calls pci_disable_device() in this situation.

So the pci_dev is disabled twice:

Device lpfc disabling already-disabled device
[ cut here ]
WARNING: at drivers/pci/pci.c:1407
CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW 3.10.42-2002.pkvm2_1_1.6.ppc64 #1
Workqueue: events .work_for_cpu_fn
task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
REGS: c027d395b650 TRAP: 0700   Tainted: GW (3.10.42-2002.pkvm2_1_1.6.ppc64)
MSR: 900100029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28b52b44  XER: 2000
CFAR: c0879ab8 SOFTE: 1
...
NIP .pci_disable_device+0xcc/0xe0
LR  .pci_disable_device+0xc8/0xe0
Call Trace:
.pci_disable_device+0xc8/0xe0 (unreliable)
.lpfc_disable_pci_dev+0x50/0x80 [lpfc]
.lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
.local_pci_probe+0x68/0xb0
.work_for_cpu_fn+0x38/0x60
.process_one_work+0x1a4/0x4d0
.worker_thread+0x37c/0x490
.kthread+0xf0/0x100
.ret_from_kernel_thread+0x5c/0x80

Signed-off-by: Mike Qiu 
---
  drivers/scsi/lpfc/lpfc.h  |  1 +
  drivers/scsi/lpfc/lpfc_init.c | 59 +++
  2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 434e903..0c7bad9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -813,6 +813,7 @@ struct lpfc_hba {
  #define VPD_MASK0xf /* mask for any vpd data */

  uint8_t soft_wwn_enable;
+uint8_t probe_done;

  struct timer_list fcp_poll_timer;
  struct timer_list eratt_poll;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 06f9a5b..c2e67ae 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const struct pci_device_id *pid)
  }
  }

+/* Set the probe flag */
+phba->probe_done = 1;
+
  /* Perform post initialization setup */
  lpfc_post_init_setup(phba);

@@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+if (!phba)
+return;
+
  lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
  "2710 PCI channel disable preparing for reset\n");

@@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)

  /* Disable interrupt and pci device */
  lpfc_sli_disable_intr(phba);
-pci_disable_device(phba->pcidev);
+if (phba->probe_done && phba->pcidev)
+pci_disable_device(phba->pcidev);
  }

  /**
@@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const struct pci_device_id *pid)
  goto out_disable_intr;
  }

+/* Set probe_done flag */
+phba->probe_done = 1;
+
  /* Log the current active interrupt mode */
  phba->intr_mode = intr_mode;
  lpfc_log_intr_mode(phba, intr_mode);
@@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+if (!phba)
+return;
+
  lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
  "2826 PCI channel disable preparing for reset\n");

@@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
  /* Disable interrupt and pci device */
  lpfc_sli4_disable_intr(phba);
  lpfc_sli4_queue_destroy(phba);
-pci_disable_device(phba->pcidev);
+
+if (phba->probe_done && phba->pcidev)
+pci_disable_device(phba->pcidev);

Re: [PATCH] lpfc: Avoid to disable pci_dev twice

2014-07-31 Thread Mike Qiu

On 07/17/2014 02:32 PM, Mike Qiu wrote:


Hi, all

How about this patch ?

Any idea ?


On IBM Power servers, when a hardware error occurs during the probe
state, the EEH subsystem calls the driver's error_detected interface,
which calls pci_disable_device(). But the driver's probe function also
calls pci_disable_device() in this situation.

So the pci_dev is disabled twice:

Device lpfc disabling already-disabled device
[ cut here ]
WARNING: at drivers/pci/pci.c:1407
CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW 3.10.42-2002.pkvm2_1_1.6.ppc64 #1
Workqueue: events .work_for_cpu_fn
task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
REGS: c027d395b650 TRAP: 0700   Tainted: GW (3.10.42-2002.pkvm2_1_1.6.ppc64)
MSR: 900100029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28b52b44  XER: 2000
CFAR: c0879ab8 SOFTE: 1
...
NIP .pci_disable_device+0xcc/0xe0
LR  .pci_disable_device+0xc8/0xe0
Call Trace:
.pci_disable_device+0xc8/0xe0 (unreliable)
.lpfc_disable_pci_dev+0x50/0x80 [lpfc]
.lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
.local_pci_probe+0x68/0xb0
.work_for_cpu_fn+0x38/0x60
.process_one_work+0x1a4/0x4d0
.worker_thread+0x37c/0x490
.kthread+0xf0/0x100
.ret_from_kernel_thread+0x5c/0x80

Signed-off-by: Mike Qiu 
---
  drivers/scsi/lpfc/lpfc.h  |  1 +
  drivers/scsi/lpfc/lpfc_init.c | 59 +++
  2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 434e903..0c7bad9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -813,6 +813,7 @@ struct lpfc_hba {
  #define VPD_MASK0xf /* mask for any vpd data */

uint8_t soft_wwn_enable;
+   uint8_t probe_done;

struct timer_list fcp_poll_timer;
struct timer_list eratt_poll;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 06f9a5b..c2e67ae 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const struct pci_device_id *pid)
}
}

+   /* Set the probe flag */
+   phba->probe_done = 1;
+
/* Perform post initialization setup */
lpfc_post_init_setup(phba);

@@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (!phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2710 PCI channel disable preparing for reset\n");

@@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)

/* Disable interrupt and pci device */
lpfc_sli_disable_intr(phba);
-   pci_disable_device(phba->pcidev);
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }

  /**
@@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const struct pci_device_id *pid)
goto out_disable_intr;
}

+   /* Set probe_done flag */
+   phba->probe_done = 1;
+
/* Log the current active interrupt mode */
phba->intr_mode = intr_mode;
lpfc_log_intr_mode(phba, intr_mode);
@@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (!phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2826 PCI channel disable preparing for reset\n");

@@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
/* Disable interrupt and pci device */
lpfc_sli4_disable_intr(phba);
lpfc_sli4_queue_destroy(phba);
-   pci_disable_device(phba->pcidev);
+
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }

  /**
@@ -10893,9 +10908,21 @@ static pci_ers_result_t
  lpfc_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
  {
struct Scsi_Host *shost = pci_get_drvdata(pdev);
-   struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+   struct lpfc_hba *phba;
pci_ers_result_t rc = PCI_ERS_RESULT_DISCONNECT;

+   if (!shost)
+   /* Reaching here means we may still be in the probe state:
+* the Scsi_Host has not been created and we can do nothing
+* in this state, so call for hotplug */
+   return PCI_ERS_RESULT_NONE;
+
+   phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+
+   if (!phba || !phba->probe_done)
+   /* Reaching here means we may still be in the probe state */
+   return PCI_ERS_RESULT_NONE;
+
switch (phba->pci_dev_grp) {
case LPFC_PCI_DEV_LP:
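
For readers following the thread, here is a minimal userspace sketch of the guard this patch adds to the error handler. The types `mock_hba` and `mock_dev`, the `disable_count` counter, and the `ERS_RESULT_*` values are hypothetical stand-ins and not the real lpfc or PCI structures. It only illustrates the idea of backing off until probe has completed; as discussed in the thread, a plain flag like this is not synchronization by itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real lpfc structures. */
enum ers_result { ERS_RESULT_NONE, ERS_RESULT_DISCONNECT };

struct mock_hba {
	bool probe_done;    /* set at the end of a successful probe */
	int  disable_count; /* counts calls that would reach pci_disable_device() */
};

struct mock_dev {
	struct mock_hba *drvdata; /* NULL until probe attaches driver data */
};

/* Error handler: do nothing unless probe has fully completed. */
static enum ers_result mock_error_detected(struct mock_dev *dev)
{
	struct mock_hba *hba = dev->drvdata;

	if (hba == NULL || !hba->probe_done)
		return ERS_RESULT_NONE; /* the probe path still owns teardown */

	hba->disable_count++; /* stands in for pci_disable_device() */
	return ERS_RESULT_DISCONNECT;
}
```

In this model the device is disabled by the error handler only after `probe_done` is set, so a failing probe and the error handler do not both disable it.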
 

Re: WARNING: at kernel/cpuset.c:1139

2014-07-23 Thread Mike Qiu

On 07/24/2014 08:27 AM, Li Zefan wrote:

On 2014/7/23 23:12, Tejun Heo wrote:

On Wed, Jul 23, 2014 at 10:50:29AM +0800, Mike Qiu wrote:

commit 734d45130cb ("cpuset: update cs->effective_{cpus, mems} when config
changes") introduced the warning below on my server.

[   35.652137] [ cut here ]
[   35.652141] WARNING: at kernel/cpuset.c:1139

Hah, can you reproduce it?  If so, can you detail how?


It's a typo.

WARN_ON(!cgroup_on_dfl(cp->css.cgroup) &&
nodes_equal(cp->mems_allowed, cp->effective_mems));

should be

WARN_ON(!cgroup_on_dfl(cp->css.cgroup) &&
!nodes_equal(cp->mems_allowed, cp->effective_mems));


Yes, it is. This warning disappeared after this patch.

Reported-and-Tested-by: Mike Qiu 
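
The effect of the one-character fix can be sketched in plain C. The function `should_warn` and its integer masks are hypothetical simplifications (the kernel uses nodemask_t and nodes_equal()); the sketch only models the corrected condition, which fires when the cgroup is not on the default hierarchy and the two masks differ.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the corrected WARN_ON condition.
 * on_dfl stands in for cgroup_on_dfl(); the masks are plain integers here.
 */
static bool should_warn(bool on_dfl, unsigned long mems_allowed,
			unsigned long effective_mems)
{
	bool equal = (mems_allowed == effective_mems);

	/* Corrected form: warn only when the masks are NOT equal. */
	return !on_dfl && !equal;
}
```

With the original typo (no negation on the equality test), the warning would fire in exactly the case where the two masks agree, which matches the report above.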




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


WARNING: at kernel/cpuset.c:1139

2014-07-22 Thread Mike Qiu
commit 734d45130cb ("cpuset: update cs->effective_{cpus, mems} when 
config changes") introduced the warning below on my server.


[   35.652137] [ cut here ]
[   35.652141] WARNING: at kernel/cpuset.c:1139
[   35.652142] Modules linked in: ebtable_nat xt_CHECKSUM bridge stp llc 
be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio 
libcxgbi ib_iser iptable_mangle nf_conntrack_ipv4 rdma_cm nf_defrag_ipv4 
xt_conntrack iw_cm nf_conntrack ib_cm ib_sa ib_mad ebtable_filter 
ib_core ebtables ip6_tables ib_addr iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi e1000e ses ptp enclosure pps_core be2net shpchp 
vhost_net tun macvtap macvlan vhost kvm binfmt_misc uinput lpfc 
scsi_transport_fc ipr
[   35.652185] CPU: 36 PID: 1363 Comm: libvirtd Not tainted 
3.16.0-rc5-next-20140721+ #93
[   35.652187] task: c003b3443a00 ti: c003bb008000 task.ti: 
c003bb008000
[   35.652189] NIP: c015ff38 LR: c015ff2c CTR: 

[   35.652190] REGS: c003bb00b850 TRAP: 0700   Not tainted 
(3.16.0-rc5-next-20140721+)
[   35.652191] MSR: 90029032  CR: 
24004824  XER: 

[   35.652196] CFAR: c045f6cc SOFTE: 1
GPR00: c015ff04 c003bb00bad0 c145acf8 0001
GPR04: c003b3dae5d0 0100  
GPR08: c003b3dae548 0004  0004
GPR12: 0001 cfeea200 008066727bd8 008066727a30
GPR16: 0080667dfa08 008066727a68 0080667279f8 0080667279d0
GPR20: c166acf8 c003b3dae530 c1311990 c003b3dae5d0
GPR24: c003b3dae530 c003b3dadc00 c003b3dae400 0001
GPR28:  c1311968 c003b1873100 c003b3dae400
[   35.652219] NIP [c015ff38] .cpuset_write_resmask+0x438/0x8c0
[   35.652221] LR [c015ff2c] .cpuset_write_resmask+0x42c/0x8c0
[   35.65] Call Trace:
[   35.652224] [c003bb00bad0] [c015ff04] 
.cpuset_write_resmask+0x404/0x8c0 (unreliable)
[   35.652227] [c003bb00bba0] [c0156f08] 
.cgroup_file_write+0x78/0x190
[   35.652230] [c003bb00bc50] [c030c490] 
.kernfs_fop_write+0x150/0x1e0

[   35.652233] [c003bb00bcf0] [c026b6d0] .vfs_write+0xe0/0x270
[   35.652235] [c003bb00bd90] [c026be24] .SyS_write+0x64/0x110
[   35.652238] [c003bb00be30] [c000a158] syscall_exit+0x0/0x98
[   35.652239] Instruction dump:
[   35.652240] e93a 39549528 e9290118 7fa95000 419e0024 7ea3ab78 
7ee4bb78 38a00100
[   35.652243] 482ff719 6000 2fa3 419e0008 <0fe0> 7f43d378 
4bfffa71 813a006c

[   35.652247] ---[ end trace f91b0c3aadfe71a6 ]---

Thanks,
Mike


Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

On 07/22/2014 10:51 PM, Mike Qiu wrote:

ata_sas_port_alloc() does not initialize the scsi_host field in
ata_port, although scsi_host is in the parameter list and unused in this
function.

With commit 1871ee134b73 ("libata: support the ata host which implements a
queue depth less than 32"), ata_qc_new() tries to use scsi_host, which is
a NULL pointer for the ipr IOA, and the error message below is shown:

...


Since scsi_host is unused in ata_sas_port_alloc(), it is better to set it
in ata_sas_port_alloc() instead of in the driver.

Signed-off-by: Mike Qiu 
---
  drivers/ata/libata-scsi.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0586f66..a472b6f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4070,6 +4070,7 @@ struct ata_port *ata_sas_port_alloc(struct ata_host *host,
ap->flags |= port_info->flags;
ap->ops = port_info->port_ops;
ap->cbl = ATA_CBL_SATA;
+   ap->scsi_host = shost;


What about my patch itself? ata_sas_port_alloc() has "shost" in its 
parameter list, but it is unused.

Maybe it is better to set ap->scsi_host here: it is very convenient, and 
drivers like ipr may forget to set this field. Otherwise, I think "shost" 
needs to be removed from the parameter list.


Thanks,
Mike

return ap;
  }



Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

I have tested with the ipr IOA, passed.

Reviewed-and-Tested-by: Mike Qiu 

On 07/23/2014 04:11 AM, Tejun Heo wrote:

Hello,

Can you please test the following patch?

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index d19c37a7..773f4e6 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4798,9 +4798,8 @@ void swap_buf_le16(u16 *buf, unsigned int buf_words)
  static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
  {
struct ata_queued_cmd *qc = NULL;
-   unsigned int i, tag, max_queue;
-
-   max_queue = ap->scsi_host->can_queue;
+   unsigned int max_queue = ap->host->n_tags;
+   unsigned int i, tag;

/* no command while frozen */
if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
@@ -6094,6 +6093,7 @@ void ata_host_init(struct ata_host *host, struct device *dev,
  {
spin_lock_init(&host->lock);
mutex_init(&host->eh_mutex);
+   host->n_tags = ATA_MAX_QUEUE;
host->dev = dev;
host->ops = ops;
  }
@@ -6179,11 +6179,7 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht)
 * The max queue supported by hardware must not be greater than
 * ATA_MAX_QUEUE.
 */
-   if (sht->can_queue > ATA_MAX_QUEUE) {
-   dev_err(host->dev, "BUG: the hardware max queue is too large\n");
-   WARN_ON(1);
-   return -EINVAL;
-   }
+   host->n_tags = clamp(sht->can_queue, 1, ATA_MAX_QUEUE);

/* host must have been started */
if (!(host->flags & ATA_HOST_STARTED)) {
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 5ab4e3a..92abb49 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -593,6 +593,7 @@ struct ata_host {
struct device *dev;
void __iomem * const *iomap;
unsigned int n_ports;
+   unsigned int n_tags; /* nr of NCQ tags */
void *private_data;
struct ata_port_operations *ops;
unsigned long flags;
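
The clamping step in the patch above can be illustrated in userspace C. This is only a sketch: `clamp_int` is a hypothetical stand-in for the kernel's clamp() macro, `pick_n_tags` is an illustrative name, and ATA_MAX_QUEUE is taken as 32 as stated elsewhere in this thread.

```c
#define ATA_MAX_QUEUE 32 /* value quoted in the thread */

/* Userspace stand-in for the kernel's clamp() macro (hypothetical helper). */
static int clamp_int(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Mirrors: host->n_tags = clamp(sht->can_queue, 1, ATA_MAX_QUEUE); */
static unsigned int pick_n_tags(int can_queue)
{
	return (unsigned int)clamp_int(can_queue, 1, ATA_MAX_QUEUE);
}
```

The point of storing the result once at registration time is that the command allocation path can read a plain field instead of re-checking the template's can_queue on every call.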




Re: [PATCH 1/2] libata: Fix scsi_host can_queue issue in ata_qc_new()

2014-07-22 Thread Mike Qiu

On 07/22/2014 11:42 PM, Tejun Heo wrote:

Hello,

(cc'ing Dan)

On Tue, Jul 22, 2014 at 10:50:19AM -0400, Mike Qiu wrote:

The can_queue in scsi_host can be more than ATA_MAX_QUEUE (32);
for example, in ipr it can be 100 or more.

Also, some drivers, like the ipr driver, do not fill in the scsi_host
field in ata_port, which leads to a call trace, so add a
check for that.

Signed-off-by: Mike Qiu 
---
  drivers/ata/libata-core.c | 15 ++++-----------
  1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 259d879..a5b9c70 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4734,7 +4734,10 @@ static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
struct ata_queued_cmd *qc = NULL;
unsigned int i, tag, max_queue;

-   max_queue = ap->scsi_host->can_queue;
+   if (ap->scsi_host && ap->scsi_host->can_queue <= ATA_MAX_QUEUE)
+   max_queue = ap->scsi_host->can_queue;
+   else
+   max_queue = ATA_MAX_QUEUE;

/* no command while frozen */
if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
@@ -6109,16 +6112,6 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht)
  {
int i, rc;

-   /*
-* The max queue supported by hardware must not be greater than
-* ATA_MAX_QUEUE.
-*/
-   if (sht->can_queue > ATA_MAX_QUEUE) {
-   dev_err(host->dev, "BUG: the hardware max queue is too large\n");
-   WARN_ON(1);
-   return -EINVAL;
-   }
-

So, ummm, I really don't like that we're adding the conditionals to
the hot path (yeah, its implementation is slow but still).  Maybe we


Yes, agree ..., not a good idea to do this...

Thanks
Mike

need to store the chosen queue depth after all?  Dan?

Thanks.




Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

[+cc Wendy, Brian King, Stephen]


On 07/22/2014 10:51 PM, Mike Qiu wrote:

ata_sas_port_alloc() does not initialize the scsi_host field in
ata_port, although scsi_host is in the parameter list and unused in this
function.

With commit 1871ee134b73 ("libata: support the ata host which implements a
queue depth less than 32"), ata_qc_new() tries to use scsi_host, which is
a NULL pointer for the ipr IOA, and the error message below is shown:

Unable to handle kernel paging request for data at address 0x0114
Faulting instruction address: 0xc05c2580
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c05c2580] .ata_qc_new_init+0x30/0x1f0
LR [c05c9384] .ata_scsi_translate+0x44/0x230
Call Trace:
0xc003ad332280 (unreliable)
.ata_scsi_translate+0x44/0x230
.ipr_queuecommand+0x2e0/0x780 [ipr]
.scsi_dispatch_cmd+0xec/0x400
.scsi_request_fn+0x52c/0x670
.__blk_run_queue+0x5c/0x80
.blk_execute_rq_nowait+0xf8/0x1c0
.blk_execute_rq+0x88/0x150
.scsi_execute+0xf0/0x1f0
.scsi_execute_req_flags+0xc4/0x170
.scsi_probe_and_add_lun+0x2d4/0xe00
.__scsi_scan_target+0x1a4/0x790
.scsi_scan_channel.part.3+0x80/0xc0
.scsi_scan_host_selected+0x1a0/0x240
.do_scan_async+0x30/0x210
.async_run_entry_fn+0x78/0x1c0
.process_one_work+0x1c4/0x4a0
.worker_thread+0x184/0x600
.kthread+0x10c/0x130
.ret_from_kernel_thread+0x58/0x7c

Since scsi_host is unused in ata_sas_port_alloc(), it is better to set it
in ata_sas_port_alloc() instead of in the driver.

Signed-off-by: Mike Qiu 
---
  drivers/ata/libata-scsi.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0586f66..a472b6f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4070,6 +4070,7 @@ struct ata_port *ata_sas_port_alloc(struct ata_host *host,
ap->flags |= port_info->flags;
ap->ops = port_info->port_ops;
ap->cbl = ATA_CBL_SATA;
+   ap->scsi_host = shost;

return ap;
  }



[PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu
ata_sas_port_alloc() does not initialize the scsi_host field in
ata_port, although scsi_host is in the parameter list and unused in this
function.

With commit 1871ee134b73 ("libata: support the ata host which implements a
queue depth less than 32"), ata_qc_new() tries to use scsi_host, which is
a NULL pointer for the ipr IOA, and the error message below is shown:

Unable to handle kernel paging request for data at address 0x0114
Faulting instruction address: 0xc05c2580
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c05c2580] .ata_qc_new_init+0x30/0x1f0
LR [c05c9384] .ata_scsi_translate+0x44/0x230
Call Trace:
0xc003ad332280 (unreliable)
.ata_scsi_translate+0x44/0x230
.ipr_queuecommand+0x2e0/0x780 [ipr]
.scsi_dispatch_cmd+0xec/0x400
.scsi_request_fn+0x52c/0x670
.__blk_run_queue+0x5c/0x80
.blk_execute_rq_nowait+0xf8/0x1c0
.blk_execute_rq+0x88/0x150
.scsi_execute+0xf0/0x1f0
.scsi_execute_req_flags+0xc4/0x170
.scsi_probe_and_add_lun+0x2d4/0xe00
.__scsi_scan_target+0x1a4/0x790
.scsi_scan_channel.part.3+0x80/0xc0
.scsi_scan_host_selected+0x1a0/0x240
.do_scan_async+0x30/0x210
.async_run_entry_fn+0x78/0x1c0
.process_one_work+0x1c4/0x4a0
.worker_thread+0x184/0x600
.kthread+0x10c/0x130
.ret_from_kernel_thread+0x58/0x7c

Since shost is otherwise unused in ata_sas_port_alloc(), it is better to
set scsi_host there than in each driver.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 drivers/ata/libata-scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0586f66..a472b6f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4070,6 +4070,7 @@ struct ata_port *ata_sas_port_alloc(struct ata_host *host,
ap->flags |= port_info->flags;
ap->ops = port_info->port_ops;
ap->cbl = ATA_CBL_SATA;
+   ap->scsi_host = shost;
 
return ap;
 }
-- 
1.8.1.4
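
The one-line fix above can be illustrated outside the kernel. The following is only a minimal user-space sketch: struct shost_sketch, struct port_sketch and port_alloc_sketch() are hypothetical stand-ins for Scsi_Host, ata_port and ata_sas_port_alloc(), not the libata definitions. It shows that once the allocator records the host pointer it was handed, later code can read can_queue through the port without depending on each driver to fill the field in.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct shost_sketch {
	int can_queue;
};

struct port_sketch {
	unsigned long flags;
	struct shost_sketch *scsi_host;
};

/*
 * Allocate a port and record the host pointer passed by the caller.
 * Without the scsi_host assignment the field would stay NULL (calloc
 * zeroes the structure), which is the situation described in the
 * patch description.
 */
static struct port_sketch *port_alloc_sketch(unsigned long flags,
					     struct shost_sketch *shost)
{
	struct port_sketch *ap = calloc(1, sizeof(*ap));

	if (!ap)
		return NULL;

	ap->flags |= flags;
	ap->scsi_host = shost;	/* the assignment the patch adds */

	return ap;
}
```

A caller would pass its host once at allocation time and release the port with free() when done.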



[PATCH 1/2] libata: Fix scsi_host can_queue issue in ata_qc_new()

2014-07-22 Thread Mike Qiu
The can_queue value in scsi_host can be larger than ATA_MAX_QUEUE (32);
in ipr, for example, it can be 100 or more.

Also, some drivers, such as ipr, do not fill in the scsi_host field of
ata_port, which leads to a NULL pointer dereference and a call trace,
so add a check for that.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 drivers/ata/libata-core.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 259d879..a5b9c70 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4734,7 +4734,10 @@ static struct ata_queued_cmd *ata_qc_new(struct ata_port 
*ap)
struct ata_queued_cmd *qc = NULL;
unsigned int i, tag, max_queue;
 
-   max_queue = ap->scsi_host->can_queue;
+   if (ap->scsi_host && ap->scsi_host->can_queue <= ATA_MAX_QUEUE)
+   max_queue = ap->scsi_host->can_queue;
+   else
+   max_queue = ATA_MAX_QUEUE;
 
/* no command while frozen */
if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
@@ -6109,16 +6112,6 @@ int ata_host_register(struct ata_host *host, struct 
scsi_host_template *sht)
 {
int i, rc;
 
-   /*
-* The max queue supported by hardware must not be greater than
-* ATA_MAX_QUEUE.
-*/
-   if (sht->can_queue > ATA_MAX_QUEUE) {
-   dev_err(host->dev, "BUG: the hardware max queue is too 
large\n");
-   WARN_ON(1);
-   return -EINVAL;
-   }
-
/* host must have been started */
if (!(host->flags & ATA_HOST_STARTED)) {
dev_err(host->dev, "BUG: trying to register unstarted host\n");
-- 
1.8.1.4
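
The selection logic in the first hunk can be tried in isolation. This is a user-space sketch only: struct scsi_host_sketch, struct ata_port_sketch and pick_max_queue() are hypothetical names, and ATA_MAX_QUEUE is set to 32 as stated in the description. It mirrors the patch by using can_queue only when a host is attached and its depth fits, and falling back to ATA_MAX_QUEUE otherwise.

```c
#include <assert.h>
#include <stddef.h>

#define ATA_MAX_QUEUE 32	/* limit quoted in the patch description */

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct scsi_host_sketch {
	int can_queue;
};

struct ata_port_sketch {
	struct scsi_host_sketch *scsi_host;
};

/*
 * Choose how many tags to scan. A missing host (NULL scsi_host) or a
 * host that advertises more than ATA_MAX_QUEUE both fall back to
 * ATA_MAX_QUEUE, as in the patched ata_qc_new().
 */
static unsigned int pick_max_queue(const struct ata_port_sketch *ap)
{
	if (ap->scsi_host && ap->scsi_host->can_queue <= ATA_MAX_QUEUE)
		return (unsigned int)ap->scsi_host->can_queue;

	return ATA_MAX_QUEUE;
}
```

Like the patch, the sketch evaluates the condition on every call, so the check sits on the command allocation path.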



Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

[+cc Wendy, Brian King, Stephen]


On 07/22/2014 10:51 PM, Mike Qiu wrote:

In ata_sas_port_alloc(), it haven't initialized scsi_host field in
ata_port, although scsi_host is in parameters list and unused in this
function.

With commit 1871ee134b73 ("libata: support the ata host which implements a queue 
depth less than 32")
ata_qc_new() try to use scsi_host, while it
is a NULL pointer for ipr IOA and error message shows below:

Unable to handle kernel paging request for data at address 0x0114
Faulting instruction address: 0xc05c2580
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c05c2580] .ata_qc_new_init+0x30/0x1f0
LR [c05c9384] .ata_scsi_translate+0x44/0x230
Call Trace:
0xc003ad332280 (unreliable)
.ata_scsi_translate+0x44/0x230
.ipr_queuecommand+0x2e0/0x780 [ipr]
.scsi_dispatch_cmd+0xec/0x400
.scsi_request_fn+0x52c/0x670
.__blk_run_queue+0x5c/0x80
.blk_execute_rq_nowait+0xf8/0x1c0
.blk_execute_rq+0x88/0x150
.scsi_execute+0xf0/0x1f0
.scsi_execute_req_flags+0xc4/0x170
.scsi_probe_and_add_lun+0x2d4/0xe00
.__scsi_scan_target+0x1a4/0x790
.scsi_scan_channel.part.3+0x80/0xc0
.scsi_scan_host_selected+0x1a0/0x240
.do_scan_async+0x30/0x210
.async_run_entry_fn+0x78/0x1c0
.process_one_work+0x1c4/0x4a0
.worker_thread+0x184/0x600
.kthread+0x10c/0x130
.ret_from_kernel_thread+0x58/0x7c

While scsi_host is unused in ata_sas_port_alloc(), better to set it
in ata_sas_port_alloc() instead of in driver.

Signed-off-by: Mike Qiu qiud...@linux.vnet.ibm.com
---
  drivers/ata/libata-scsi.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0586f66..a472b6f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4070,6 +4070,7 @@ struct ata_port *ata_sas_port_alloc(struct ata_host *host,
ap->flags |= port_info->flags;
ap->ops = port_info->port_ops;
ap->cbl = ATA_CBL_SATA;
+   ap->scsi_host = shost;

return ap;
  }




Re: [PATCH 1/2] libata: Fix scsi_host can_queue issue in ata_qc_new()

2014-07-22 Thread Mike Qiu

On 07/22/2014 11:42 PM, Tejun Heo wrote:

Hello,

(cc'ing Dan)

On Tue, Jul 22, 2014 at 10:50:19AM -0400, Mike Qiu wrote:

The can_queue in scsi_host can be more than ATA_MAX_QUEUE (32),
for example, in ipr, it can be 100 or more.

Also, some drivers, like ipr driver, haven't filled the field
scsi_host in ata_port, and will lead a call trace, so add
check for that.

Signed-off-by: Mike Qiu qiud...@linux.vnet.ibm.com
---
  drivers/ata/libata-core.c | 15 ---
  1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 259d879..a5b9c70 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4734,7 +4734,10 @@ static struct ata_queued_cmd *ata_qc_new(struct ata_port 
*ap)
struct ata_queued_cmd *qc = NULL;
unsigned int i, tag, max_queue;
  
-   max_queue = ap->scsi_host->can_queue;
+   if (ap->scsi_host && ap->scsi_host->can_queue <= ATA_MAX_QUEUE)
+   max_queue = ap->scsi_host->can_queue;
+   else
+   max_queue = ATA_MAX_QUEUE;
 
/* no command while frozen */
if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
@@ -6109,16 +6112,6 @@ int ata_host_register(struct ata_host *host, struct 
scsi_host_template *sht)
  {
int i, rc;
  
-   /*
-* The max queue supported by hardware must not be greater than
-* ATA_MAX_QUEUE.
-*/
-   if (sht->can_queue > ATA_MAX_QUEUE) {
-   dev_err(host->dev, "BUG: the hardware max queue is too large\n");
-   WARN_ON(1);
-   return -EINVAL;
-   }
-

So, ummm, I really don't like that we're adding the conditionals to
the hot path (yeah, its implementation is slow but still).  Maybe we
need to store the chosen queue depth after all?  Dan?

Thanks.


Yes, agreed, adding this to the hot path is not a good idea.

Thanks
Mike





Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

I have tested with the ipr IOA, passed.

Reviewed-and Tested-by: Mike Qiu qiud...@linux.vnet.ibm.com

On 07/23/2014 04:11 AM, Tejun Heo wrote:

Hello,

Can you please test the following patch?

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index d19c37a7..773f4e6 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4798,9 +4798,8 @@ void swap_buf_le16(u16 *buf, unsigned int buf_words)
  static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
  {
struct ata_queued_cmd *qc = NULL;
-   unsigned int i, tag, max_queue;
-
-   max_queue = ap->scsi_host->can_queue;
+   unsigned int max_queue = ap->host->n_tags;
+   unsigned int i, tag;

/* no command while frozen */
if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
@@ -6094,6 +6093,7 @@ void ata_host_init(struct ata_host *host, struct device 
*dev,
  {
spin_lock_init(&host->lock);
mutex_init(&host->eh_mutex);
+   host->n_tags = ATA_MAX_QUEUE;
host->dev = dev;
host->ops = ops;
  }
@@ -6179,11 +6179,7 @@ int ata_host_register(struct ata_host *host, struct 
scsi_host_template *sht)
 * The max queue supported by hardware must not be greater than
 * ATA_MAX_QUEUE.
 */
-   if (sht->can_queue > ATA_MAX_QUEUE) {
-   dev_err(host->dev, "BUG: the hardware max queue is too large\n");
-   WARN_ON(1);
-   return -EINVAL;
-   }
+   host->n_tags = clamp(sht->can_queue, 1, ATA_MAX_QUEUE);

/* host must have been started */
if (!(host->flags & ATA_HOST_STARTED)) {
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 5ab4e3a..92abb49 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -593,6 +593,7 @@ struct ata_host {
struct device   *dev;
void __iomem * const *iomap;
unsigned int    n_ports;
+   unsigned int    n_tags; /* nr of NCQ tags */
void            *private_data;
struct ata_port_operations *ops;
unsigned long   flags;
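
The approach in the patch above, computing the tag count once and only reading it when a command is allocated, can be sketched in user space. The names below (struct ata_host_sketch, clamp_int(), host_init_sketch(), host_register_sketch(), qc_new_limit_sketch()) are hypothetical, and clamp_int() is a plain helper standing in for the kernel's clamp() macro; this is an illustration under those assumptions, not the libata code.

```c
#include <assert.h>

#define ATA_MAX_QUEUE 32

/* Hypothetical stand-in for the part of ata_host that matters here. */
struct ata_host_sketch {
	unsigned int n_tags;	/* number of usable tags */
};

/* Plain helper standing in for the kernel's clamp() macro. */
static int clamp_int(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Default set at init time, so hosts that never register still work. */
static void host_init_sketch(struct ata_host_sketch *host)
{
	host->n_tags = ATA_MAX_QUEUE;
}

/* Registration clamps the template's can_queue into [1, ATA_MAX_QUEUE]. */
static void host_register_sketch(struct ata_host_sketch *host, int can_queue)
{
	host->n_tags = (unsigned int)clamp_int(can_queue, 1, ATA_MAX_QUEUE);
}

/* The per-command path only reads the stored value; no branches needed. */
static unsigned int qc_new_limit_sketch(const struct ata_host_sketch *host)
{
	return host->n_tags;
}
```

Compared with checking can_queue on every allocation, this keeps the conditional out of the per-command path, which is the concern raised earlier in the thread.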





Re: [PATCH 2/2] libata: Fix NULL pointer of scsi_host in ata_port

2014-07-22 Thread Mike Qiu

On 07/22/2014 10:51 PM, Mike Qiu wrote:

In ata_sas_port_alloc(), it haven't initialized scsi_host field in
ata_port, although scsi_host is in parameters list and unused in this
function.

With commit 1871ee134b73 ("libata: support the ata host which implements a queue 
depth less than 32")
ata_qc_new() try to use scsi_host, while it
is a NULL pointer for ipr IOA and error message shows below:

...


While scsi_host is unused in ata_sas_port_alloc(), better to set it
in ata_sas_port_alloc() instead of in driver.

Signed-off-by: Mike Qiu qiud...@linux.vnet.ibm.com
---
  drivers/ata/libata-scsi.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0586f66..a472b6f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4070,6 +4070,7 @@ struct ata_port *ata_sas_port_alloc(struct ata_host *host,
ap->flags |= port_info->flags;
ap->ops = port_info->port_ops;
ap->cbl = ATA_CBL_SATA;
+   ap->scsi_host = shost;


What about my patch itself? ata_sas_port_alloc() has shost in its
parameter list, but never uses it.


It may be better to set ap->scsi_host here: it is convenient, and
drivers such as ipr may forget to set this field. Otherwise, I think
shost should be removed from the parameter list.


Thanks,
Mike

return ap;
  }




WARNING: at kernel/cpuset.c:1139

2014-07-22 Thread Mike Qiu
Commit 734d45130cb ("cpuset: update cs->effective_{cpus, mems} when 
config changes") introduces the warning below on my server.


[   35.652137] [ cut here ]
[   35.652141] WARNING: at kernel/cpuset.c:1139
[   35.652142] Modules linked in: ebtable_nat xt_CHECKSUM bridge stp llc 
be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio 
libcxgbi ib_iser iptable_mangle nf_conntrack_ipv4 rdma_cm nf_defrag_ipv4 
xt_conntrack iw_cm nf_conntrack ib_cm ib_sa ib_mad ebtable_filter 
ib_core ebtables ip6_tables ib_addr iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi e1000e ses ptp enclosure pps_core be2net shpchp 
vhost_net tun macvtap macvlan vhost kvm binfmt_misc uinput lpfc 
scsi_transport_fc ipr
[   35.652185] CPU: 36 PID: 1363 Comm: libvirtd Not tainted 
3.16.0-rc5-next-20140721+ #93
[   35.652187] task: c003b3443a00 ti: c003bb008000 task.ti: 
c003bb008000
[   35.652189] NIP: c015ff38 LR: c015ff2c CTR: 

[   35.652190] REGS: c003bb00b850 TRAP: 0700   Not tainted 
(3.16.0-rc5-next-20140721+)
[   35.652191] MSR: 90029032 SF,HV,EE,ME,IR,DR,RI CR: 
24004824  XER: 

[   35.652196] CFAR: c045f6cc SOFTE: 1
GPR00: c015ff04 c003bb00bad0 c145acf8 0001
GPR04: c003b3dae5d0 0100  
GPR08: c003b3dae548 0004  0004
GPR12: 0001 cfeea200 008066727bd8 008066727a30
GPR16: 0080667dfa08 008066727a68 0080667279f8 0080667279d0
GPR20: c166acf8 c003b3dae530 c1311990 c003b3dae5d0
GPR24: c003b3dae530 c003b3dadc00 c003b3dae400 0001
GPR28:  c1311968 c003b1873100 c003b3dae400
[   35.652219] NIP [c015ff38] .cpuset_write_resmask+0x438/0x8c0
[   35.652221] LR [c015ff2c] .cpuset_write_resmask+0x42c/0x8c0
[   35.65] Call Trace:
[   35.652224] [c003bb00bad0] [c015ff04] 
.cpuset_write_resmask+0x404/0x8c0 (unreliable)
[   35.652227] [c003bb00bba0] [c0156f08] 
.cgroup_file_write+0x78/0x190
[   35.652230] [c003bb00bc50] [c030c490] 
.kernfs_fop_write+0x150/0x1e0

[   35.652233] [c003bb00bcf0] [c026b6d0] .vfs_write+0xe0/0x270
[   35.652235] [c003bb00bd90] [c026be24] .SyS_write+0x64/0x110
[   35.652238] [c003bb00be30] [c000a158] syscall_exit+0x0/0x98
[   35.652239] Instruction dump:
[   35.652240] e93a 39549528 e9290118 7fa95000 419e0024 7ea3ab78 
7ee4bb78 38a00100
[   35.652243] 482ff719 6000 2fa3 419e0008 0fe0 7f43d378 
4bfffa71 813a006c

[   35.652247] ---[ end trace f91b0c3aadfe71a6 ]---

Thanks,
Mike



Re: [PATCH] lpfc: Avoid to disable pci_dev twice

2014-07-17 Thread Mike Qiu

On 07/17/2014 10:15 PM, Joe Lawrence wrote:

[ +cc linux-pci and Bjorn, comments inline/below ... ]

On Thu, 17 Jul 2014 02:32:31 -0400
Mike Qiu  wrote:


In IBM Power servers, when hardware error occurs during probe
state, EEH subsystem will call driver's error_detected interface,
which will call pci_disable_device(). But driver's probe function also
call pci_disable_device() in this situation.

So pci_dev will be disabled twice:

Device lpfc disabling already-disabled device
[ cut here ]
WARNING: at drivers/pci/pci.c:1407
CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW
3.10.42-2002.pkvm2_1_1.6.ppc64 #1
Workqueue: events .work_for_cpu_fn
task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
REGS: c027d395b650 TRAP: 0700   Tainted: GW 
(3.10.42-2002.pkvm2_1_1.6.ppc64)
MSR: 900100029032   CR: 28b52b44  XER: 2000
CFAR: c0879ab8 SOFTE: 1
...
NIP .pci_disable_device+0xcc/0xe0
LR  .pci_disable_device+0xc8/0xe0
Call Trace:
.pci_disable_device+0xc8/0xe0 (unreliable)
.lpfc_disable_pci_dev+0x50/0x80 [lpfc]
.lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
.local_pci_probe+0x68/0xb0
.work_for_cpu_fn+0x38/0x60
.process_one_work+0x1a4/0x4d0
.worker_thread+0x37c/0x490
.kthread+0xf0/0x100
.ret_from_kernel_thread+0x5c/0x80

Signed-off-by: Mike Qiu 
---
  drivers/scsi/lpfc/lpfc.h  |  1 +
  drivers/scsi/lpfc/lpfc_init.c | 59 +++
  2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 434e903..0c7bad9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -813,6 +813,7 @@ struct lpfc_hba {
  #define VPD_MASK0xf /* mask for any vpd data */
  
  	uint8_t soft_wwn_enable;

+   uint8_t probe_done;
  
  	struct timer_list fcp_poll_timer;

struct timer_list eratt_poll;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 06f9a5b..c2e67ae 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const struct 
pci_device_id *pid)
}
}
  
+	/* Set the probe flag */

+   phba->probe_done = 1;
+
/* Perform post initialization setup */
lpfc_post_init_setup(phba);
  
@@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)

  static void
  lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (phba)
+   return;
+

Should that be "if *not* phba" like the others below?


Yes, should be ...

if (!phba)




lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2710 PCI channel disable preparing for reset\n");
  
@@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  
  	/* Disable interrupt and pci device */

lpfc_sli_disable_intr(phba);
-   pci_disable_device(phba->pcidev);
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }
  
  /**

@@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const 
struct pci_device_id *pid)
goto out_disable_intr;
}
  
+	/* Set probe_done flag */

+   phba->probe_done = 1;
+
/* Log the current active interrupt mode */
phba->intr_mode = intr_mode;
lpfc_log_intr_mode(phba, intr_mode);
@@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (!phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2826 PCI channel disable preparing for reset\n");
  
@@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)

/* Disable interrupt and pci device */
lpfc_sli4_disable_intr(phba);
lpfc_sli4_queue_destroy(phba);
-   pci_disable_device(phba->pcidev);
+
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }
  
  /**

@@ -10893,9 +10908,21 @@ static pci_ers_result_t
  lpfc_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
  {
struct Scsi_Host *shost = pci_get_drvdata(pdev);
-   struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+   struct lpfc_hba *phba;
pci_ers_result_t rc = PCI_ERS_RESULT_DISCONNECT;
  
+	if (!shost)

+   /* Run here means it may during probe state and
+* Scsi_Host has not been created and We can do nothing
+* in this state so call for hotplug*/
+   return PCI_ERS_RESULT_NONE;

Is it possible to get here during device removal, ie
lpfc_pci_remove_one?  If so, we may have shost in hand now, but can
these rou

[PATCH] lpfc: Avoid to disable pci_dev twice

2014-07-17 Thread Mike Qiu
On IBM Power servers, when a hardware error occurs while the driver is
still probing, the EEH subsystem calls the driver's error_detected
interface, which calls pci_disable_device(). The driver's probe function
also calls pci_disable_device() on this error path.

So the pci_dev is disabled twice:

Device lpfc disabling already-disabled device
[ cut here ]
WARNING: at drivers/pci/pci.c:1407
CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW
3.10.42-2002.pkvm2_1_1.6.ppc64 #1
Workqueue: events .work_for_cpu_fn
task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
REGS: c027d395b650 TRAP: 0700   Tainted: GW 
(3.10.42-2002.pkvm2_1_1.6.ppc64)
MSR: 900100029032   CR: 28b52b44  XER: 2000
CFAR: c0879ab8 SOFTE: 1
...
NIP .pci_disable_device+0xcc/0xe0
LR  .pci_disable_device+0xc8/0xe0
Call Trace:
.pci_disable_device+0xc8/0xe0 (unreliable)
.lpfc_disable_pci_dev+0x50/0x80 [lpfc]
.lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
.local_pci_probe+0x68/0xb0
.work_for_cpu_fn+0x38/0x60
.process_one_work+0x1a4/0x4d0
.worker_thread+0x37c/0x490
.kthread+0xf0/0x100
.ret_from_kernel_thread+0x5c/0x80

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 drivers/scsi/lpfc/lpfc.h  |  1 +
 drivers/scsi/lpfc/lpfc_init.c | 59 +++
 2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 434e903..0c7bad9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -813,6 +813,7 @@ struct lpfc_hba {
 #define VPD_MASK0xf /* mask for any vpd data */
 
uint8_t soft_wwn_enable;
+   uint8_t probe_done;
 
struct timer_list fcp_poll_timer;
struct timer_list eratt_poll;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 06f9a5b..c2e67ae 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const struct 
pci_device_id *pid)
}
}
 
+   /* Set the probe flag */
+   phba->probe_done = 1;
+
/* Perform post initialization setup */
lpfc_post_init_setup(phba);
 
@@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)
 static void
 lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
 {
+   if (phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2710 PCI channel disable preparing for reset\n");
 
@@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
 
/* Disable interrupt and pci device */
lpfc_sli_disable_intr(phba);
-   pci_disable_device(phba->pcidev);
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
 }
 
 /**
@@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const 
struct pci_device_id *pid)
goto out_disable_intr;
}
 
+   /* Set probe_done flag */
+   phba->probe_done = 1;
+
/* Log the current active interrupt mode */
phba->intr_mode = intr_mode;
lpfc_log_intr_mode(phba, intr_mode);
@@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
 static void
 lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
 {
+   if (!phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2826 PCI channel disable preparing for reset\n");
 
@@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
/* Disable interrupt and pci device */
lpfc_sli4_disable_intr(phba);
lpfc_sli4_queue_destroy(phba);
-   pci_disable_device(phba->pcidev);
+
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
 }
 
 /**
@@ -10893,9 +10908,21 @@ static pci_ers_result_t
 lpfc_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
 {
struct Scsi_Host *shost = pci_get_drvdata(pdev);
-   struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+   struct lpfc_hba *phba;
pci_ers_result_t rc = PCI_ERS_RESULT_DISCONNECT;
 
+   if (!shost)
+   /* Run here means it may during probe state and
+* Scsi_Host has not been created and We can do nothing
+* in this state so call for hotplug*/
+   return PCI_ERS_RESULT_NONE;
+
+   phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+
+   if (!phba || !phba->probe_done)
+   /* Run here means it may during probe state */
+   return PCI_ERS_RESULT_NONE;
+
switch (phba->pci_dev_grp) {
case LPFC_PCI_DEV_LP:
rc = lpfc_io_error_detected_s3(pdev, state);
@@ -10930,9 +10957,20 @@ static pci_er
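
The guard this patch adds around pci_disable_device() can be modelled in user space. The sketch below uses hypothetical names throughout (struct pci_dev_sketch, struct hba_sketch, pci_disable_sketch(), prep_dev_for_reset_sketch()) and counts disable calls instead of touching hardware. It uses the "!phba" form of the NULL check, which is what the review in this thread says was intended. A flag like this only skips the second disable; it does not serialize probe against the error handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; disable_calls counts how often we "disable". */
struct pci_dev_sketch {
	int disable_calls;
};

struct hba_sketch {
	struct pci_dev_sketch *pcidev;
	bool probe_done;	/* set once probe has fully succeeded */
};

static void pci_disable_sketch(struct pci_dev_sketch *pdev)
{
	pdev->disable_calls++;
}

/*
 * Error-path preparation: bail out when there is no HBA, and only
 * disable the device when probe completed, since a failing probe
 * disables the device itself on its own error path.
 */
static void prep_dev_for_reset_sketch(struct hba_sketch *phba)
{
	if (!phba)
		return;

	if (phba->probe_done && phba->pcidev)
		pci_disable_sketch(phba->pcidev);
}
```

With probe_done still false the function leaves the device alone, so the probe error path remains the only caller that disables it.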

Re: [PATCH] lpfc: Avoid to disable pci_dev twice

2014-07-17 Thread Mike Qiu

On 07/17/2014 10:15 PM, Joe Lawrence wrote:

[ +cc linux-pci and Bjorn, comments inline/below ... ]

On Thu, 17 Jul 2014 02:32:31 -0400
Mike Qiu qiud...@linux.vnet.ibm.com wrote:


In IBM Power servers, when hardware error occurs during probe
state, EEH subsystem will call driver's error_detected interface,
which will call pci_disable_device(). But driver's probe function also
call pci_disable_device() in this situation.

So pci_dev will be disabled twice:

Device lpfc disabling already-disabled device
[ cut here ]
WARNING: at drivers/pci/pci.c:1407
CPU: 0 PID: 8744 Comm: kworker/0:0 Tainted: GW
3.10.42-2002.pkvm2_1_1.6.ppc64 #1
Workqueue: events .work_for_cpu_fn
task: c0274e3f5400 ti: c027d3958000 task.ti: c027d3958000
NIP: c0471b8c LR: c0471b88 CTR: c043ebe0
REGS: c027d395b650 TRAP: 0700   Tainted: GW 
(3.10.42-2002.pkvm2_1_1.6.ppc64)
MSR: 900100029032 SF,HV,EE,ME,IR,DR,RI  CR: 28b52b44  XER: 2000
CFAR: c0879ab8 SOFTE: 1
...
NIP .pci_disable_device+0xcc/0xe0
LR  .pci_disable_device+0xc8/0xe0
Call Trace:
.pci_disable_device+0xc8/0xe0 (unreliable)
.lpfc_disable_pci_dev+0x50/0x80 [lpfc]
.lpfc_pci_probe_one+0x870/0x21a0 [lpfc]
.local_pci_probe+0x68/0xb0
.work_for_cpu_fn+0x38/0x60
.process_one_work+0x1a4/0x4d0
.worker_thread+0x37c/0x490
.kthread+0xf0/0x100
.ret_from_kernel_thread+0x5c/0x80
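
The double disable in the trace above can be reproduced in a small userspace model. This is only a sketch under stated assumptions: `fake_pci_dev`, `fake_hba` and the helper functions are hypothetical stand-ins, not lpfc or PCI core code, and the calls run sequentially, so any concurrency between probe and the error handler is not modeled. It shows how a `probe_done`-style flag keeps the error handler from disabling a device that probe's own error path is about to disable.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct pci_dev; only the enable count is modeled. */
struct fake_pci_dev {
    int enable_cnt;  /* plays the role of the PCI core's enable counter */
    int warnings;    /* number of "already-disabled" warnings raised */
};

/* Hypothetical stand-in for struct lpfc_hba. */
struct fake_hba {
    struct fake_pci_dev *pcidev;
    bool probe_done; /* set only once probe has finished successfully */
};

static void fake_pci_enable(struct fake_pci_dev *dev)
{
    dev->enable_cnt++;
}

/* Simplified: warn and bail out when the device is already disabled. */
static void fake_pci_disable(struct fake_pci_dev *dev)
{
    if (dev->enable_cnt <= 0) {
        dev->warnings++;
        fprintf(stderr, "disabling already-disabled device\n");
        return;
    }
    dev->enable_cnt--;
}

/* Error handler; with `guarded` set it does nothing until probe is done. */
static void fake_error_detected(struct fake_hba *hba, bool guarded)
{
    if (guarded && !hba->probe_done)
        return;
    fake_pci_disable(hba->pcidev);
}

/* Probe that hits a hardware error midway; returns the warning count. */
static int probe_with_error(bool guarded)
{
    struct fake_pci_dev dev = { 0, 0 };
    struct fake_hba hba = { &dev, false };

    fake_pci_enable(&dev);
    fake_error_detected(&hba, guarded); /* error reported during probe */
    fake_pci_disable(&dev);             /* probe's own error path */
    return dev.warnings;
}
```

Calling `probe_with_error(false)` disables the device twice and raises one warning, while `probe_with_error(true)` leaves the single disable to the probe error path.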

Signed-off-by: Mike Qiu qiud...@linux.vnet.ibm.com
---
  drivers/scsi/lpfc/lpfc.h  |  1 +
  drivers/scsi/lpfc/lpfc_init.c | 59 +++
  2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 434e903..0c7bad9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -813,6 +813,7 @@ struct lpfc_hba {
  #define VPD_MASK0xf /* mask for any vpd data */
  
  	uint8_t soft_wwn_enable;

+   uint8_t probe_done;
  
  	struct timer_list fcp_poll_timer;

struct timer_list eratt_poll;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 06f9a5b..c2e67ae 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9519,6 +9519,9 @@ lpfc_pci_probe_one_s3(struct pci_dev *pdev, const struct 
pci_device_id *pid)
}
}
  
+	/* Set the probe flag */

+   phba->probe_done = 1;
+
/* Perform post initialization setup */
lpfc_post_init_setup(phba);
  
@@ -9795,6 +9798,9 @@ lpfc_sli_prep_dev_for_recover(struct lpfc_hba *phba)

  static void
  lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (phba)
+   return;
+

Should that be if *not* phba like the others below?


Yes, should be ...

if (!phba)




lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2710 PCI channel disable preparing for reset\n");
  
@@ -9812,7 +9818,8 @@ lpfc_sli_prep_dev_for_reset(struct lpfc_hba *phba)
  
  	/* Disable interrupt and pci device */

lpfc_sli_disable_intr(phba);
-   pci_disable_device(phba->pcidev);
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }
  
  /**

@@ -10282,6 +10289,9 @@ lpfc_pci_probe_one_s4(struct pci_dev *pdev, const 
struct pci_device_id *pid)
goto out_disable_intr;
}
  
+	/* Set probe_done flag */

+   phba->probe_done = 1;
+
/* Log the current active interrupt mode */
phba->intr_mode = intr_mode;
lpfc_log_intr_mode(phba, intr_mode);
@@ -10544,6 +10554,9 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba)
  static void
  lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)
  {
+   if (!phba)
+   return;
+
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"2826 PCI channel disable preparing for reset\n");
  
@@ -10562,7 +10575,9 @@ lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba)

/* Disable interrupt and pci device */
lpfc_sli4_disable_intr(phba);
lpfc_sli4_queue_destroy(phba);
-   pci_disable_device(phba->pcidev);
+
+   if (phba->probe_done && phba->pcidev)
+   pci_disable_device(phba->pcidev);
  }
  
  /**

@@ -10893,9 +10908,21 @@ static pci_ers_result_t
  lpfc_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
  {
struct Scsi_Host *shost = pci_get_drvdata(pdev);
-   struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
+   struct lpfc_hba *phba;
pci_ers_result_t rc = PCI_ERS_RESULT_DISCONNECT;
  
+	if (!shost)

+   /* Run here means it may during probe state and
+* Scsi_Host has not been created and We can do nothing
+* in this state so call for hotplug*/
+   return PCI_ERS_RESULT_NONE;

Is it possible to get here during device removal, ie
lpfc_pci_remove_one?  If so, we may have shost in hand now, but can
these routines race?  Same for similar instances

Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

2014-07-15 Thread Mike Qiu

On 07/15/2014 04:41 PM, Jens Axboe wrote:

On 15/07/2014, at 10.14, Mike Qiu  wrote:

My Power7 box fails to boot with commit:

254c4407cb84a6dec90336054615b0f0e996bb7c
bio: modify __bio_add_page() to accept pages that don't start a new segment

Simply reverting it works for me.

I reverted it yesterday in my tree.



OK, that's fine :)

Thanks
Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

2014-07-15 Thread Mike Qiu

My Power7 box fails to boot with commit:

254c4407cb84a6dec90336054615b0f0e996bb7c
bio: modify __bio_add_page() to accept pages that don't start a new segment

Simply reverting it works for me.

See below:

[   22.659431] [ cut here ]
[   22.659437] kernel BUG at fs/direct-io.c:747!
[   22.659501] Oops: Exception in kernel mode, sig: 5 [#1]
[   22.659528] SMP NR_CPUS=1024 NUMA PowerNV
[   22.659533] Modules linked in: e1000e vhost_net tun ses(+) macvtap 
macvlan enclosure ptp pps_core vhost be2net(+) shpchp kvm binfmt_misc 
uinput lpfc scsi_transport_fc ipr
[   22.659688] CPU: 8 PID: 772 Comm: lvm Not tainted 
3.16.0-rc5-next-20140714+ #76
[   22.659755] task: c003b0a7dc20 ti: c003b0afc000 task.ti: 
c003b0afc000
[   22.659823] NIP: c02ba854 LR: c02bad80 CTR: 
0010
[   22.659890] REGS: c003b0aff450 TRAP: 0700   Not tainted 
(3.16.0-rc5-next-20140714+)
[   22.659957] MSR: 90029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24222844  XER: 2000

[   22.660114] CFAR: c02bad90 SOFTE: 1
GPR00: c02bad80 c003b0aff6d0 c145c148 
GPR04:   c0b6e7c8 0001
GPR08:  0001 0010 f000
GPR12: 24222844 cfee2400 0010 c003b914
GPR16: 0001 c003b914 00047bff 0001
GPR20:  f0cb0fdc 0001 0001
GPR24:  0001  c003b0afc000
GPR28:  023dff80 c003fcb10380 c003b9140028
[   22.660980] NIP [c02ba854] .__blockdev_direct_IO+0x1584/0x3960
[   22.661036] LR [c02bad80] .__blockdev_direct_IO+0x1ab0/0x3960
[   22.661092] Call Trace:
[   22.661116] [c003b0aff6d0] [c02bad80] 
.__blockdev_direct_IO+0x1ab0/0x3960 (unreliable)
[   22.661208] [c003b0aff980] [c02b6114] 
.blkdev_direct_IO+0x64/0x80
[   22.661276] [c003b0affa20] [c01dd430] 
.generic_file_read_iter+0x5b0/0x690
[   22.661355] [c003b0affb50] [c02b5a40] 
.blkdev_read_iter+0x60/0x90
[   22.661423] [c003b0affbd0] [c0269d28] 
.new_sync_read+0xa8/0x120

[   22.661491] [c003b0affcf0] [c026b280] .vfs_read+0xc0/0x1f0
[   22.661559] [c003b0affd90] [c026b674] .SyS_read+0x64/0x110
[   22.661628] [c003b0affe30] [c000a158] syscall_exit+0x0/0x98
[   22.661695] Instruction dump:
[   22.661729] e88100d8 80a100e4 80c100e0 f92100c0 3920 912100a8 
4814fe15 6000
[   22.661841] 812100e4 78630020 7f891800 419ef880 <0fe0> 6000 
6042 e9410118

[   22.661955] ---[ end trace 6248a5bb36020fd2 ]---

Thanks,
Mike






Re: [PATCH v2] powerpc: Avoid circular dependency with zImage.%

2014-06-11 Thread Mike Qiu

This v2 patch is good,

Tested-by: Mike Qiu 

On 06/11/2014 11:40 PM, Michal Marek wrote:

The rule to create the final images uses a zImage.% pattern.
Unfortunately, this also matches the names of the zImage.*.lds linker
scripts, which appear as a dependency of the final images. This somehow
worked when $(srctree) used to be an absolute path, but now the pattern
matches too much. List only the images from $(image-y) as the target of
the rule, to avoid the circular dependency.
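
To see why the `zImage.%` pattern "matches too much", here is a minimal sketch of make-style `%` matching, where `%` stands for any non-empty stem. The helper names `pattern_matches` and `in_list`, and the file names used with them, are illustrative assumptions; real make also handles directory prefixes, which this ignores. The second helper shows the alternative the patch takes: an explicit target list matches only the names it contains.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `target` matches a make-style pattern with at most one
 * '%'. The '%' must match a non-empty stem; without '%', the match is exact.
 */
static bool pattern_matches(const char *pattern, const char *target)
{
    const char *pct = strchr(pattern, '%');
    if (pct == NULL)
        return strcmp(pattern, target) == 0;

    size_t prefix_len = (size_t)(pct - pattern);
    size_t suffix_len = strlen(pct + 1);
    size_t target_len = strlen(target);

    /* The stem needs at least one character. */
    if (target_len < prefix_len + suffix_len + 1)
        return false;

    return strncmp(pattern, target, prefix_len) == 0 &&
           strcmp(target + target_len - suffix_len, pct + 1) == 0;
}

/* Return true only if `target` is literally one of the listed names. */
static bool in_list(const char *const *list, size_t n, const char *target)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], target) == 0)
            return true;
    return false;
}
```

With the pattern `zImage.%`, both an image name such as `zImage.pseries` and a linker-script name such as `zImage.coff.lds` match, which is how a dependency can end up matching the rule that needs it. An explicit list containing only image names rejects the linker script.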

Signed-off-by: Michal Marek 
---
v2:
   - Filter out duplicates in the target list
   - fix the platform argument to cmd_wrap

  arch/powerpc/boot/Makefile | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 426dce7..ccc25ed 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -333,8 +333,8 @@ $(addprefix $(obj)/, $(initrd-y)): $(obj)/ramdisk.image.gz
  $(obj)/zImage.initrd.%: vmlinux $(wrapperbits)
$(call if_changed,wrap,$*,,,$(obj)/ramdisk.image.gz)

-$(obj)/zImage.%: vmlinux $(wrapperbits)
-   $(call if_changed,wrap,$*)
+$(addprefix $(obj)/, $(sort $(filter zImage.%, $(image-y)))): vmlinux $(wrapperbits)
+   $(call if_changed,wrap,$(subst $(obj)/zImage.,,$@))

  # dtbImage% - a dtbImage is a zImage with an embedded device tree blob
  $(obj)/dtbImage.initrd.%: vmlinux $(wrapperbits) $(obj)/%.dtb


Re: [RFC 08/10] irqdomain: Refactor irq_domain_associate_many()

2013-06-17 Thread Mike Qiu

On 2013/6/10 8:49, Grant Likely wrote:

Originally, irq_domain_associate_many() was designed to unwind the
mapped irqs on a failure of any individual association. However, that
proved to be a problem with certain IRQ controllers. Some of them only
support a subset of irqs, and will fail when attempting to map a
reserved IRQ. In those cases we want to map as many IRQs as possible, so
instead it is better for irq_domain_associate_many() to make a
best-effort attempt to map irqs, but not fail if any or all of them
don't succeed. If a caller really cares about how many irqs got
associated, then it should instead go back and check that all of the
irqs it cares about were mapped.

The original design open-coded the individual association code into the
body of irq_domain_associate_many(), but with no longer needing to
unwind associations, the code becomes simpler to split out
irq_domain_associate() to contain the bulk of the logic, and
irq_domain_associate_many() to be a simple loop wrapper.

This patch also adds a new error check to the associate path to make
sure it isn't called for an irq larger than the controller can handle,
and adds locking so that the irq_domain_mutex is held while setting up a
new association.
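
The best-effort behaviour described above can be sketched in userspace C. Everything here is a toy model under stated assumptions: `toy_domain`, `toy_associate()`, `toy_associate_many()` and `toy_count_mapped()` are hypothetical names, the reserved-IRQ mask stands in for a controller refusing a mapping, and the locking mentioned in the description is not modeled. The single-association function carries the checks, including the new upper bound, and the "many" variant is a loop that ignores individual failures.

```c
#include <errno.h>

#define TOY_HWIRQ_SLOTS 8

/* Hypothetical toy domain: a linear reverse map plus a few limits. */
struct toy_domain {
    unsigned int hwirq_max;               /* first hwirq that is out of range */
    unsigned int reserved_mask;           /* bit n set: hwirq n is reserved */
    unsigned int revmap[TOY_HWIRQ_SLOTS]; /* hwirq -> virq, 0 means unmapped */
};

/* Map one virq to one hwirq; all of the checking lives here. */
static int toy_associate(struct toy_domain *d, unsigned int virq,
                         unsigned int hwirq)
{
    if (hwirq >= d->hwirq_max || hwirq >= TOY_HWIRQ_SLOTS)
        return -EINVAL;  /* larger than the controller can handle */
    if (d->reserved_mask & (1u << hwirq))
        return -EPERM;   /* controller refuses this hwirq */
    if (d->revmap[hwirq] != 0)
        return -EBUSY;   /* already associated */
    d->revmap[hwirq] = virq;
    return 0;
}

/* Best effort: try every pair and do not unwind on failure. */
static void toy_associate_many(struct toy_domain *d, unsigned int virq_base,
                               unsigned int hwirq_base, int count)
{
    for (int i = 0; i < count; i++)
        (void)toy_associate(d, virq_base + i, hwirq_base + i);
}

/* A caller that cares can go back and check what was actually mapped. */
static int toy_count_mapped(const struct toy_domain *d)
{
    int n = 0;
    for (unsigned int h = 0; h < TOY_HWIRQ_SLOTS; h++)
        if (d->revmap[h] != 0)
            n++;
    return n;
}
```

With `hwirq_max` set to 6 and hwirq 2 reserved, asking for eight associations starting at hwirq 0 maps the five that are allowed and skips the rest, and the caller can inspect the reverse map afterwards.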

Signed-off-by: Grant Likely 
---
  include/linux/irqdomain.h |  22 +++---
  kernel/irq/irqdomain.c| 185 +++---
  2 files changed, 101 insertions(+), 106 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index fd4b26f..f9e8e06 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -103,6 +103,7 @@ struct irq_domain {
struct irq_domain_chip_generic *gc;

/* reverse map data. The linear map gets appended to the irq_domain */
+   irq_hw_number_t hwirq_max;
unsigned int revmap_direct_max_irq;
unsigned int revmap_size;
struct radix_tree_root revmap_tree;
@@ -110,8 +111,8 @@ struct irq_domain {
  };

  #ifdef CONFIG_IRQ_DOMAIN
-struct irq_domain *__irq_domain_add(struct device_node *of_node,
-   int size, int direct_max,
+struct irq_domain *__irq_domain_add(struct device_node *of_node, int size,
+   irq_hw_number_t hwirq_max, int direct_max,
const struct irq_domain_ops *ops,
void *host_data);
  struct irq_domain *irq_domain_add_simple(struct device_node *of_node,
@@ -140,14 +141,14 @@ static inline struct irq_domain 
*irq_domain_add_linear(struct device_node *of_no
 const struct irq_domain_ops *ops,
 void *host_data)
  {
-   return __irq_domain_add(of_node, size, 0, ops, host_data);
+   return __irq_domain_add(of_node, size, size, 0, ops, host_data);
  }
  static inline struct irq_domain *irq_domain_add_nomap(struct device_node 
*of_node,
 unsigned int max_irq,
 const struct irq_domain_ops *ops,
 void *host_data)
  {
-   return __irq_domain_add(of_node, 0, max_irq, ops, host_data);
+   return __irq_domain_add(of_node, 0, max_irq, max_irq, ops, host_data);
  }
  static inline struct irq_domain *irq_domain_add_legacy_isa(
struct device_node *of_node,
@@ -166,14 +167,11 @@ static inline struct irq_domain 
*irq_domain_add_tree(struct device_node *of_node

  extern void irq_domain_remove(struct irq_domain *host);

-extern int irq_domain_associate_many(struct irq_domain *domain,
-unsigned int irq_base,
-irq_hw_number_t hwirq_base, int count);
-static inline int irq_domain_associate(struct irq_domain *domain, unsigned int 
irq,
-   irq_hw_number_t hwirq)
-{
-   return irq_domain_associate_many(domain, irq, hwirq, 1);
-}
+extern int irq_domain_associate(struct irq_domain *domain, unsigned int irq,
+   irq_hw_number_t hwirq);
+extern void irq_domain_associate_many(struct irq_domain *domain,
+ unsigned int irq_base,
+ irq_hw_number_t hwirq_base, int count);

  extern unsigned int irq_create_mapping(struct irq_domain *host,
   irq_hw_number_t hwirq);
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 280b804..80e9249 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -35,8 +35,8 @@ static struct irq_domain *irq_default_domain;
   * register allocated irq_domain with irq_domain_register().  Returns pointer
   * to IRQ domain, or NULL on failure.
   */
-struct irq_domain *__irq_domain_add(struct device_node *of_node,
-   int size, int direct_max,
+struct irq_domain *__irq_domain_add(struct 

Re: [PATCH 0/3] Enable multiple MSI feature in pSeries

2013-05-22 Thread Mike Qiu

On 2013/5/22 8:15, Benjamin Herrenschmidt wrote:

On Tue, 2013-05-21 at 16:45 +0200, Alexander Gordeev wrote:

On Tue, Jan 15, 2013 at 03:38:53PM +0800, Mike Qiu wrote:

The test results are shown by 'cat /proc/interrupts':
   CPU0   CPU1   CPU2   CPU3
16: 240458 261601 226310 200425  XICS Level IPI
17:  0  0  0  0  XICS Level RAS_EPOW
18: 10  0  3  2  XICS Level hvc_console
19: 122182  28481  28527  28864  XICS Level ibmvscsi
20:5067388226108118  XICS Level eth0
21:  6  5  5  5  XICS Level host1-0
22:817814816813  XICS Level host1-1

Hi Mike,

I am curious if pSeries firmware allows changing affinity masks independently
for multiple MSIs? I.e. in your example, would it be possible to assign IRQ21
and IRQ22 to different CPUs?

Yes. Each interrupt has its own affinity, whether it's an MSI or not,
the affinity is not driven by the address.

Cheers,
Ben.
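
Since each interrupt has its own affinity, IRQ 21 and IRQ 22 from the example could be steered to different CPUs through the per-IRQ `/proc/irq/<N>/smp_affinity` files, which take a hexadecimal CPU bitmask. The helpers below are a hypothetical sketch that only builds the path and the mask string for a single CPU; writing the mask to the file (which needs root) is left out, and only CPUs 0 to 31 are handled.

```c
#include <stddef.h>
#include <stdio.h>

/* Build the path of the affinity file for one IRQ number. */
static int irq_affinity_path(char *buf, size_t len, unsigned int irq)
{
    return snprintf(buf, len, "/proc/irq/%u/smp_affinity", irq);
}

/*
 * Build the hex bitmask string that selects a single CPU.
 * Returns -1 for CPUs this simple sketch does not cover.
 */
static int cpu_to_affinity_mask(char *buf, size_t len, unsigned int cpu)
{
    if (cpu >= 32)
        return -1;
    return snprintf(buf, len, "%x", 1u << cpu);
}
```

For example, IRQ 21 on CPU 0 would use mask `1`, and IRQ 22 on CPU 3 would use mask `8`.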

Hi Ben,

Can this patch be accepted? If so, I will send out the 3.9 version.

Michael Ellerman said he wants to see performance data,
but that depends on the driver.

It is something like MSI, and the driver can use more than one MSI.

That is to say, the driver has more interrupt resources to use,
but whether the driver makes full use of them is outside
this patch's control.

I tested this patch with the ipr driver, to which others have added
multiple-MSI support, and it works.

Thanks
Mike

Thanks!


LOC: 398077 316725 231882 203049   Local timer interrupts
SPU:   1659919961903   Spurious interrupts
CNT:  0  0  0  0   Performance monitoring interrupts
MCE:  0  0  0  0   Machine check exceptions





Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-27 Thread Mike Qiu

On 2013/4/27 17:28, Chen Gang F T wrote:

On 2013-04-26 11:54, Mike Qiu wrote:

On 2013/4/26 11:42, Chen Gang wrote:

On 2013-04-26 11:25, Chen Gang wrote:

On 2013-04-26 11:08, Mike Qiu wrote:

On 2013/4/26 10:06, Chen Gang wrote:

On 2013-04-26 10:03, Mike Qiu wrote:

On 2013/4/26 9:36, Chen Gang wrote:

On 2013-04-26 09:18, Chen Gang wrote:

On 2013-04-26 09:06, Chen Gang wrote:

CFAR is the Come-From Address Register.  It saves the location of the
last branch and is hence overwritten by any branch.

Do we handle it just as the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
  . = 0x900
  .globl decrementer_pSeries
decrementer_pSeries:
HMT_MEDIUM_PPR_DISCARD
  SET_SCRATCH0(r13)
  b decrementer_pSeries_0

  ...



Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to CFAR,
so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.

I will try your diff V2 to see if the machine can boot up.

OK, thanks. (hope it can work)

It seems that the machine can boot up in powernv mode, but I'm not
sure whether my machine uses that module.

At least my machine can boot up.

Please refer to commit 1707dd161349e6c54170c88d94fed012e3d224e3
(1707dd1 powerpc: Save CFAR before branching in interrupt entry paths).

What our diff v2 does is just a fix for our patch v2 (just like
what commit 1707dd1 did).

Please check, thanks.

:-)

I will check this evening or tomorrow; I have something else to do this
afternoon.

I think the diff v2 is correct, but it is not the best fix for this issue.

I prefer Paul's patch for this issue, which has better performance.

:-)

Yes, I used your patch and it works; Paul's patch works too.


Thanks.



Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu

On 2013/4/26 11:42, Chen Gang wrote:

On 2013-04-26 11:25, Chen Gang wrote:

On 2013-04-26 11:08, Mike Qiu wrote:

On 2013/4/26 10:06, Chen Gang wrote:

On 2013-04-26 10:03, Mike Qiu wrote:

On 2013/4/26 9:36, Chen Gang wrote:

On 2013-04-26 09:18, Chen Gang wrote:

On 2013-04-26 09:06, Chen Gang wrote:

CFAR is the Come-From Address Register.  It saves the location of the
last branch and is hence overwritten by any branch.

Do we handle it just as the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
 . = 0x900
 .globl decrementer_pSeries
decrementer_pSeries:
   HMT_MEDIUM_PPR_DISCARD
 SET_SCRATCH0(r13)
 b decrementer_pSeries_0

 ...



Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to CFAR,
so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.

I will try your diff V2 to see if the machine can boot up.

OK, thanks. (hope it can work)

It seems that the machine can boot up in powernv mode, but I'm not
sure whether my machine uses that module.

At least my machine can boot up.

Please refer to commit 1707dd161349e6c54170c88d94fed012e3d224e3
(1707dd1 powerpc: Save CFAR before branching in interrupt entry paths).

What our diff v2 does is just a fix for our patch v2 (just like
what commit 1707dd1 did).

Please check, thanks.

:-)
I will check this evening or tomorrow; I have something else to do this
afternoon.

Thank you for your information!

I have checked the disassembly with powerpc64-linux-gnu-objdump; it seems
that what we have done for 0x900 is almost the same as what was originally
done for 0x200.

I am just learning about the CFAR (I googled it), and I plan to wait for a
day. If all goes smoothly, I will send patch v3.


:-)







Re: "attempt to move .org backwards" still show up

2013-04-25 Thread Mike Qiu

On 2013/4/25 14:25, Paul Mackerras wrote:

On Thu, Apr 25, 2013 at 12:05:54PM +0800, Mike Qiu wrote:

This is blocking my work now,
so I hope you can take a look ASAP.
Thanks
:)

Mike

As a quick fix, turn on CONFIG_KVM_BOOK3S_64_HV.  That will eliminate
the immediate problem.

Thanks, got it. I will give it a try.

Paul.





Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu

On 2013/4/26 10:06, Chen Gang wrote:

On 2013-04-26 10:03, Mike Qiu wrote:

On 2013/4/26 9:36, Chen Gang wrote:

On 2013-04-26 09:18, Chen Gang wrote:

On 2013-04-26 09:06, Chen Gang wrote:

CFAR is the Come From Register.  It saves the location of the last

branch and is hence overwritten by any branch.


Do we handle it the same way the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
. = 0x900
.globl decrementer_pSeries
decrementer_pSeries:
HMT_MEDIUM_PPR_DISCARD
SET_SCRATCH0(r13)
b decrementer_pSeries_0

...



Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to
CFAR, so I think we need to move EXCEPTION_PROLOG_1 close to 0x900.

I will try your diff V2, to see if the machine can boot up

OK, thanks. (hope it can work)
It seems that the machine can boot in powernv mode, but I'm not
sure whether my machine calls that module.


At least my machine can boot up.

Thanks
Mike


:-)





Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu
On 2013/4/26 9:36, Chen Gang wrote:
> On 2013-04-26 09:18, Chen Gang wrote:
>> On 2013-04-26 09:06, Chen Gang wrote:
>>>> CFAR is the Come From Register.  It saves the location of the last
>>>> branch and is hence overwritten by any branch.
>>>>
>>> Do we handle it the same way the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
>>> . = 0x900
>>> .globl decrementer_pSeries
>>> decrementer_pSeries:
>>> HMT_MEDIUM_PPR_DISCARD
>>> SET_SCRATCH0(r13)
>>> b decrementer_pSeries_0
>>>
>>> ...
>>>
>>>
> Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to
> CFAR, so I think we need to move EXCEPTION_PROLOG_1 close to 0x900.

I will try your diff V2, to see if the machine can boot up
> -diff v2 begin-
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index e789ee7..f0489c4 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -254,7 +254,15 @@ hardware_interrupt_hv:
>   STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>   KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>
> - MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
> + . = 0x900
> + .globl decrementer_pSeries
> +decrementer_pSeries:
> + HMT_MEDIUM_PPR_DISCARD
> + SET_SCRATCH0(r13)   /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXGEN)
> + EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
> + b   decrementer_pSeries_0
> +
>   STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>
>   MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
> @@ -536,6 +544,11 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>  #endif
>
>   .align  7
> + /* moved from 0x900 */
> +decrementer_pSeries_0:
> + EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
> +
> + .align  7
>   /* moved from 0xe00 */
>   STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>   KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
>
>
> -diff v2 end---
>
>
>> Would something like the fix below be OK (the same as is done for 0x300 or 0x200)?
>>
>> Please check, thanks.
>>
>> ---diff begin-
>>
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index e789ee7..a0a5ff2 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -254,7 +254,14 @@ hardware_interrupt_hv:
>>  STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>>  KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>>  
>> -MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
>> +. = 0x900
>> +.globl decrementer_pSeries
>> +decrementer_pSeries:
>> +HMT_MEDIUM_PPR_DISCARD
>> +SET_SCRATCH0(r13)   /* save r13 */
>> +EXCEPTION_PROLOG_0(PACA_EXGEN)
>> +b   decrementer_pSeries_0
>> +
>>  STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>>  
>>  MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
>> @@ -536,6 +543,12 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>>  #endif
>>  
>>  .align  7
>> +/* moved from 0x900 */
>> +decrementer_pSeries_0:
>> +EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
>> +EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
>> +
>> +.align  7
>>  /* moved from 0xe00 */
>>  STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>>  KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
>>
>> ---diff end---
>>
>
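
The ordering concern in this message (save CFAR-related state before branching out of the vector slot, as diff v2 does, rather than after, as the first diff does) can be illustrated with a toy model. This is only a sketch, not kernel code: `struct toy_cpu`, `toy_branch()` and both `toy_save_*` helpers are hypothetical names, and the model assumes only what the thread states, namely that any taken branch overwrites CFAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only (hypothetical): a program counter plus a CFAR-like register. */
struct toy_cpu {
	uint64_t pc;	/* address of the instruction currently executing */
	uint64_t cfar;	/* location recorded by the most recent taken branch */
};

/* A taken branch records where it came from, overwriting the old value. */
void toy_branch(struct toy_cpu *cpu, uint64_t target)
{
	assert(cpu != NULL);
	cpu->cfar = cpu->pc;
	cpu->pc = target;
}

/* Ordering of the first diff: branch out of line, then read CFAR. */
uint64_t toy_save_after_branch(struct toy_cpu cpu, uint64_t ool_target)
{
	toy_branch(&cpu, ool_target);
	return cpu.cfar;	/* the pre-interrupt value is already lost */
}

/* Ordering of diff v2: read CFAR inside the vector slot, then branch. */
uint64_t toy_save_before_branch(struct toy_cpu cpu, uint64_t ool_target)
{
	uint64_t saved = cpu.cfar;

	toy_branch(&cpu, ool_target);
	return saved;		/* the pre-interrupt value survives */
}
```

With a CPU entering the 0x900 slot while CFAR holds some earlier address, the first helper returns the slot address and the second returns the earlier address, which is why the save has to sit before the `b decrementer_pSeries_0`.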



Re: "attempt to move .org backwards" still show up

2013-04-25 Thread Mike Qiu

On 2013/4/25 19:16, Chen Gang wrote:

On 2013-04-25 14:25, Paul Mackerras wrote:

On Thu, Apr 25, 2013 at 12:05:54PM +0800, Mike Qiu wrote:

This is blocking my work now,
so I hope you can take a look ASAP.
Thanks
:)

Mike

As a quick fix, turn on CONFIG_KVM_BOOK3S_64_HV.  That will eliminate
the immediate problem.

Yes, that is what my original reply to Mike suggested as a bypass, but I got
no reply, so I guess he has to deal with CONFIG_KVM_BOOK3S_64_PR.

I am fixing it now. When I finish a patch, please help check it.
Actually, the build passed with your patch, but then I saw Michael Neuling's
reply,

so I stopped and waited for your new patch :)

Now I will use your V2 patch to build.

Thanks

Mike

Thanks.





Re: [PATCH] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu

On 2013/4/25 16:21, Chen Gang wrote:

Hello Mike:

Please try this patch. At least it compiles with the config
file you provided in my cross-compiling environment.

I have not run it yet, so it would be better to try running the new kernel
with this patch.

OK, I will use your patch and send out the result later.

Thanks

Mike

Thanks.

On 2013-04-25 16:18, Chen Gang wrote:

When CONFIG_KVM_BOOK3S_64_PR is enabled,
MASKABLE_EXCEPTION_PSERIES(0x900 ...) includes __KVMTEST, so its code
runs past 0x980, the address that STD_EXCEPTION_HV(0x980 ...) uses, and
the build fails.

The related errors:
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1


Signed-off-by: Chen Gang 
---
 arch/powerpc/include/asm/kvm_asm.h   |    2 +-
 arch/powerpc/kernel/exceptions-64s.S |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index b9dd382..2c65bae 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -86,7 +86,7 @@
 #define BOOK3S_INTERRUPT_PROGRAM          0x700
 #define BOOK3S_INTERRUPT_FP_UNAVAIL       0x800
 #define BOOK3S_INTERRUPT_DECREMENTER      0x900
-#define BOOK3S_INTERRUPT_HV_DECREMENTER   0x980
+#define BOOK3S_INTERRUPT_HV_DECREMENTER   0x988
 #define BOOK3S_INTERRUPT_SYSCALL          0xc00
 #define BOOK3S_INTERRUPT_TRACE            0xd00
 #define BOOK3S_INTERRUPT_H_DATA_STORAGE   0xe00
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index e789ee7..bb0e677 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -255,7 +255,7 @@ hardware_interrupt_hv:
 	KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
 
 	MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
-	STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
+	STD_EXCEPTION_HV(0x988, 0x982, hdecrementer)
 
 	MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
 	KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0xa00)
@@ -698,7 +698,7 @@ machine_check_common:
 
 	STD_EXCEPTION_COMMON_ASYNC(0x500, hardware_interrupt, do_IRQ)
 	STD_EXCEPTION_COMMON_ASYNC(0x900, decrementer, .timer_interrupt)
-	STD_EXCEPTION_COMMON(0x980, hdecrementer, .hdec_interrupt)
+	STD_EXCEPTION_COMMON(0x988, hdecrementer, .hdec_interrupt)
 #ifdef CONFIG_PPC_DOORBELL
 	STD_EXCEPTION_COMMON_ASYNC(0xa00, doorbell_super, .doorbell_exception)
 #else
@@ -802,7 +802,7 @@ hardware_interrupt_relon_hv:
 	STD_RELON_EXCEPTION_PSERIES(0x4700, 0x700, program_check)
 	STD_RELON_EXCEPTION_PSERIES(0x4800, 0x800, fp_unavailable)
 	MASKABLE_RELON_EXCEPTION_PSERIES(0x4900, 0x900, decrementer)
-	STD_RELON_EXCEPTION_HV(0x4980, 0x982, hdecrementer)
+	STD_RELON_EXCEPTION_HV(0x4988, 0x982, hdecrementer)
 	MASKABLE_RELON_EXCEPTION_PSERIES(0x4a00, 0xa00, doorbell_super)
 	STD_RELON_EXCEPTION_PSERIES(0x4b00, 0xb00, trap_0b)
 





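
The "attempt to move .org backwards" error described in the patch above comes down to simple arithmetic: the 0x900 handler has to fit in the bytes before the 0x980 entry. The sketch below works through that arithmetic. It assumes the fixed 4-byte instruction width of the PowerPC code discussed here; `slot_capacity()` and `slot_overflows()` are hypothetical helper names, and the instruction counts used with them are illustrative, not measured from the real macros.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_BYTES 4u	/* assumed fixed instruction width for this code */

/* Number of instructions that fit between two vector addresses. */
unsigned int slot_capacity(uint32_t slot_start, uint32_t next_slot)
{
	assert(next_slot > slot_start);
	return (next_slot - slot_start) / INSN_BYTES;
}

/*
 * Nonzero when a handler of insn_count instructions would run past
 * next_slot. In that case the location counter is already beyond the
 * address that the following ". = next_slot" asks for, which the
 * assembler reports as moving .org backwards.
 */
int slot_overflows(uint32_t slot_start, uint32_t next_slot,
		   unsigned int insn_count)
{
	return insn_count > slot_capacity(slot_start, next_slot);
}
```

Between 0x900 and 0x980 there are 0x80 bytes, so `slot_capacity(0x900, 0x980)` is 32 instructions. Any macro expansion longer than that, for example once extra test code is included, no longer fits, which is why the later diffs in this thread move part of the handler out of line instead.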


Re: "attempt to move .org backwards" still show up

2013-04-24 Thread Mike Qiu

On 2013/4/25 9:05, Chen Gang wrote:

On 2013-04-24 20:47, Mike wrote:

On Wed, 2013-04-24 at 20:37 +1000, Michael Neuling wrote:

Mike Qiu wrote:


On 2013/4/24 16:31, Michael Ellerman wrote:

On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:

Hi all

I get an error message when I compile the newest upstream kernel source
on a Power7 platform.

Hi Mike,

It depends on what your .config is. What defconfig are you building?

I just copied the config file from /boot/config.* to .config, ran make
menuconfig,
changed nothing manually, and saved.

Can you post the resulting config here?

Do you have this commit in your tree?
   commit 087aa036eb79f24b856893190359ba812b460f45
   Author: Chen Gang 
   powerpc: make additional room in exception vector area


Sure, that commit is certainly in my git tree. I also tried removing the
code and re-cloning the source from upstream, and the problem still
happens.
I will post the config file as an attachment.
:)

Thanks

I will try, and plan to get a result within this week (2013-04-28)

Thanks.

Hi
This is blocking my work now,
so I hope you can take a look ASAP.
Thanks
:)

Mike



Re: "attempt to move .org backwards" still show up

2013-04-24 Thread Mike Qiu

On 2013/4/24 16:31, Michael Ellerman wrote:

On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:

Hi all

I get an error message when I compile the newest upstream kernel source
on a Power7 platform.

Hi Mike,

It depends on what your .config is. What defconfig are you building?

cheers


And I do know how to build the source code in this machine . . .

Thanks



Re: "attempt to move .org backwards" still show up

2013-04-24 Thread Mike Qiu

On 2013/4/24 16:31, Michael Ellerman wrote:

On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:

Hi all

I get an error message when I compile the newest upstream kernel source
on a Power7 platform.

Hi Mike,

It depends on what your .config is. What defconfig are you building?

I just copied the config file from /boot/config.* to .config, ran make
menuconfig, changed nothing manually, and saved.

cheers





"attempt to move .org backwards" still show up

2013-04-24 Thread Mike Qiu
Hi all

I get an error message when I compile the newest upstream kernel source
on a Power7 platform.

[root@feng linux]# make -j60
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CC scripts/mod/devicetable-offsets.s
GEN scripts/mod/devicetable-offsets.h
HOSTCC scripts/mod/file2alias.o
CALL scripts/checksyscalls.sh
HOSTLD scripts/mod/modpost
CHK include/generated/compile.h
CALL arch/powerpc/kernel/systbl_chk.sh
CALL arch/powerpc/kernel/prom_init_check.sh
AS arch/powerpc/kernel/head_64.o
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1
make: *** [arch/powerpc/kernel] Error 2
make: *** Waiting for unfinished jobs

I see that this should have been fixed by commit
087aa036eb79f24b856893190359ba812b460f45,

but the build still fails on my P7 machine.

the kernel source code info:
git tree : git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[root@feng linux]# git log
commit 824282ca7d250bd7c301f221c3cd902ce906d731
Merge: f83b293 3b5e50e
Author: Linus Torvalds 
Date: Mon Apr 22 15:00:59 2013 -0700

Merge branch 'upstream' of
git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fix from Ralf Baechle:
"Revert the change of the definition of PAGE_MASK which was prettier
but broke a few relativly rare platforms"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
Revert "MIPS: page.h: Provide more readable definition for PAGE_MASK."

commit 3b5e50edaf500f392f4a372296afc0b99ffa7e70
Author: Ralf Baechle 
Date: Mon Apr 22 17:57:54 2013 +0200

[root@feng linux]# git branch
* master
[root@feng linux]# git diff
[root@feng linux]#

That means I have not modified the kernel source.

Thanks
Mike



[PATCH] PowerNV/PCI: Fix NULL PCI controller

2013-04-17 Thread Mike Qiu
In pnv_pci_read_config() and pnv_pci_write_config(), we never check whether
the PCI controller is valid before converting it into the platform-dependent
one, which is very dangerous.

To avoid this potential risk, the patch checks the PCI controller first,
before using it.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index b8b8e0b..e7b7f1a 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -286,11 +286,11 @@ static int pnv_pci_read_config(struct pci_bus *bus,
                               int where, int size, u32 *val)
 {
    struct pci_controller *hose = pci_bus_to_host(bus);
-   struct pnv_phb *phb = hose->private_data;
+   struct pnv_phb *phb = hose ? hose->private_data : NULL;
    u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
    s64 rc;
 
-   if (hose == NULL)
+   if (!phb)
        return PCIBIOS_DEVICE_NOT_FOUND;
 
    switch (size) {
@@ -330,10 +330,10 @@ static int pnv_pci_write_config(struct pci_bus *bus,
                                int where, int size, u32 val)
 {
    struct pci_controller *hose = pci_bus_to_host(bus);
-   struct pnv_phb *phb = hose->private_data;
+   struct pnv_phb *phb = hose ? hose->private_data : NULL;
    u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
 
-   if (hose == NULL)
+   if (!phb)
        return PCIBIOS_DEVICE_NOT_FOUND;
 
    cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
-- 
1.7.10.1
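
The ordering problem the patch addresses can be shown outside the kernel.
The following is only a minimal user-space sketch: struct phb, struct
controller, DEVICE_NOT_FOUND and read_config() are hypothetical stand-ins
invented for illustration, not the real kernel definitions. It shows the
pointer being validated before it is dereferenced, and the single check on
the derived pointer covering both the "no controller" and "no platform
data" cases.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct phb { int id; };
struct controller { struct phb *private_data; };

#define DEVICE_NOT_FOUND (-1)   /* stand-in for PCIBIOS_DEVICE_NOT_FOUND */

/*
 * Derive the platform data only when the controller pointer is valid,
 * then test the derived pointer once. The pre-patch ordering dereferenced
 * 'hose' first and compared it with NULL afterwards, which is too late.
 */
static int read_config(const struct controller *hose, int *val)
{
	const struct phb *phb = hose ? hose->private_data : NULL;

	if (!phb)
		return DEVICE_NOT_FOUND;

	*val = phb->id;
	return 0;
}
```

Calling read_config(NULL, &val) in this sketch returns DEVICE_NOT_FOUND
instead of faulting, and so does a controller whose private_data is NULL.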



Re: [PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support

2013-03-05 Thread Mike Qiu

On 2013/3/6 13:42, Michael Ellerman wrote:

On Wed, Mar 06, 2013 at 01:34:58PM +0800, Mike Qiu wrote:

On 2013/3/6 11:54, Michael Ellerman wrote:

On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:

On 2013/3/5 10:23, Michael Ellerman wrote:

On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 96f3a1d..38648e6 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
  }
  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
+/**
+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
+ * @domain: domain owning the interrupt range
+ * @hwirq_base: beginning of continuous hardware IRQ range
+ * @count: Number of interrupts to map

For multiple-MSI the allocated interrupt numbers must be a power-of-2,
and must be naturally aligned. I don't /think/ that's a requirement for
the virtual numbers, but it's probably best that we do it anyway.

So this API needs to specify that it will give you back a power-of-2
block that is naturally aligned - otherwise you can't use it for MSI.

rtas_call will return the number of hardware interrupts, and it
should be a power of 2, so I think we do not need to specify this.

You're confusing hardware interrupt numbers and virtual interrupt
numbers. My comment is about irq_create_mapping_many(), which returns
virtual interrupt numbers.

As I said I don't think there is a requirement that the virtual
interrupt numbers are also a power-of-2 naturally aligned block, but we
should allocate them as one anyway, to avoid any issues in future.

But the virtual interrupt numbers should also be a naturally aligned
power-of-2 block, because they must be contiguous, as MSI-HOWTO.txt says:

 4.2.2 pci_enable_msi_block
 int pci_enable_msi_block(struct pci_dev *dev, int count)
 This variation on the above call allows a device driver to request
 multiple MSIs.  The MSI specification only allows interrupts to be
 allocated in powers of two, up to a maximum of 2^5 (32).
 If this function returns 0, it has succeeded in allocating at least
 as many interrupts as the driver requested
 (it may have allocated more in order to satisfy the power-of-two
 requirement). In this case, the function enables MSI on this device
 and updates dev->irq to be the lowest of the new interrupts
 assigned to it. The other interrupts assigned to the device are in
 the range dev->irq to dev->irq + count - 1.

See the last line: it means the virtual interrupts must be a
contiguous block.

In practice I think things could work if we didn't, because we are not
using the mask routines that assume that layout.

But you're right, we must implement the API as it's specified, so the
virtual interrupt numbers must be a naturally aligned power-of-2.

Yes, your point is also right: because the API requires naturally
aligned power-of-2 interrupt numbers, we need to implement it
like this.
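
The "naturally aligned power-of-2 block" requirement discussed above can be
sketched in plain C. This is only an illustration under assumptions: the
helper names round_up_pow2(), naturally_aligned() and first_aligned_base()
are hypothetical and are not kernel APIs, and the count is assumed to stay
in the MSI range of 1 to 32.

```c
#include <stdbool.h>

/* Round count up to the next power of two; assumes 1 <= count <= 32. */
static unsigned int round_up_pow2(unsigned int count)
{
	unsigned int n = 1;

	while (n < count)
		n <<= 1;
	return n;
}

/*
 * A block is naturally aligned when its base is a multiple of its size.
 * The mask test is only valid when size is a power of two.
 */
static bool naturally_aligned(unsigned int base, unsigned int size)
{
	return (base & (size - 1)) == 0;
}

/*
 * Lowest naturally aligned base at or above 'from' for a block that can
 * hold 'count' interrupts once the size is rounded up to a power of two.
 */
static unsigned int first_aligned_base(unsigned int from, unsigned int count)
{
	unsigned int size = round_up_pow2(count);

	return (from + size - 1) & ~(size - 1);
}
```

For example, a request for 3 interrupts starting the search at 17 rounds up
to a block of 4 and lands on base 20, so the block covers 20 to 23 and the
low two bits of the number select the vector within the block.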

cheers


cheers





Re: [PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support

2013-03-04 Thread Mike Qiu

On 2013/3/5 10:41, Paul Mundt wrote:

On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:

Adding a function irq_create_mapping_many() which can associate
multiple MSIs to a continuous irq mapping.

This is needed to enable multiple MSI support for pSeries.

+int irq_create_mapping_many(struct irq_domain *domain,
+   irq_hw_number_t hwirq_base, int count)
+{

Other than the other review comments already made, I think you can
simplify this considerably by simply doing what irq_create_strict_mappings()
does, and relaxing the irq_base requirements.

In any event, as you are creating a new interface, I don't think you want
to carry around half of the legacy crap that irq_create_mapping() has to
deal with. We made the decision to avoid this with irq_create_strict_mappings()
intentionally, too.

Oh, yes, you are right. I will send out V2 of my patch to make it
cleaner, and I hope you can review it again.


Thanks

Mike



Re: [PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support

2013-03-04 Thread Mike Qiu

On 2013/3/5 10:23, Michael Ellerman wrote:

On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:

Adding a function irq_create_mapping_many() which can associate
multiple MSIs to a continuous irq mapping.

This is needed to enable multiple MSI support for pSeries.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
  include/linux/irq.h   |2 +
  include/linux/irqdomain.h |3 ++
  kernel/irq/irqdomain.c|   61 +
  3 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 60ef45b..e00a7ec 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
  #define irq_alloc_desc_from(from, node)   \
irq_alloc_descs(-1, from, 1, node)
  
+#define irq_alloc_desc_n(nevc, node)		\
+   irq_alloc_descs(-1, 0, nevc, node)

This has been superseded by irq_alloc_descs_from(), which is the right
way to do it.

Yes, but irq_alloc_descs_from() is just for 1 irq, and if I change the API,
a lot of places that call this function may be affected.



diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 0d5b17b..831dded 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -168,6 +168,9 @@ extern int irq_create_strict_mappings(struct irq_domain *domain,
  unsigned int irq_base,
  irq_hw_number_t hwirq_base, int count);
  
+extern int irq_create_mapping_many(struct irq_domain *domain,
+   irq_hw_number_t hwirq_base, int count);
+
  static inline int irq_create_identity_mapping(struct irq_domain *host,
  irq_hw_number_t hwirq)
  {
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 96f3a1d..38648e6 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
  }
  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
  
+/**
+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
+ * @domain: domain owning the interrupt range
+ * @hwirq_base: beginning of continuous hardware IRQ range
+ * @count: Number of interrupts to map

For multiple-MSI the allocated interrupt numbers must be a power-of-2,
and must be naturally aligned. I don't /think/ that's a requirement for
the virtual numbers, but it's probably best that we do it anyway.

So this API needs to specify that it will give you back a power-of-2
block that is naturally aligned - otherwise you can't use it for MSI.

rtas_call will return the number of hardware interrupts, and it should
be a power of 2, so I think we do not need to specify this.

+ * This routine is used for allocating and mapping a range of hardware
+ * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
+ * locations.

This comment doesn't make sense to me.


+ *
+ * Greater than 0 is returned upon success, while any failure to establish a
+ * static mapping is treated as an error.
+ */
+int irq_create_mapping_many(struct irq_domain *domain,
+   irq_hw_number_t hwirq_base, int count)
+{
+   int ret, irq_base;
+   int virq, i;
+
+   pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);


I'd like to see this whole function rewritten to reduce the duplication
vs irq_create_mapping(). I don't see any reason why this can't be the
core routine, and irq_create_mapping() becomes a caller of it, passing a
count of 1 ?

It's a good suggestion.
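
The refactoring suggested above, where the range version is the core routine
and the single-interrupt version just passes a count of 1, could look
roughly like this toy model. Everything here is an assumption made for
illustration: the table hw_to_virq, the constant NR_HWIRQS, the counter
next_virq and both function names are hypothetical, and the model ignores
irq domains, locking and descriptor allocation. It also treats an already
mapped hwirq as a failure, which is a simplification chosen for the sketch.

```c
#include <stddef.h>

#define NR_HWIRQS 64                   /* hypothetical table size */

/* hwirq -> virq table; 0 means "no mapping". */
static unsigned int hw_to_virq[NR_HWIRQS];
static unsigned int next_virq = 16;    /* hypothetical first free virq */

/*
 * Core routine: map 'count' consecutive hwirqs starting at hwirq_base to
 * consecutive virqs. Returns the first virq, or 0 on failure.
 */
static unsigned int create_mapping_many(unsigned int hwirq_base,
					unsigned int count)
{
	unsigned int i, virq_base;

	if (count == 0 || hwirq_base >= NR_HWIRQS ||
	    count > NR_HWIRQS - hwirq_base)
		return 0;

	/* Refuse the request if any hwirq in the range is already mapped. */
	for (i = 0; i < count; i++)
		if (hw_to_virq[hwirq_base + i])
			return 0;

	virq_base = next_virq;
	for (i = 0; i < count; i++)
		hw_to_virq[hwirq_base + i] = virq_base + i;
	next_virq += count;
	return virq_base;
}

/* Single-interrupt variant: a thin wrapper passing a count of 1. */
static unsigned int create_mapping(unsigned int hwirq)
{
	return create_mapping_many(hwirq, 1);
}
```

With this shape there is only one place that validates the range and fills
the table, so the single and multiple cases cannot drift apart.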

+   /* Look for default domain if nececssary */
+   if (!domain)
+   domain = irq_default_domain;
+   if (!domain) {
+   pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
+   , hwirq_base);
+   WARN_ON(1);
+   return 0;
+   }
+   pr_debug("-> using domain @%p\n", domain);
+
+   /* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
+   if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
+   return irq_domain_legacy_revmap(domain, hwirq_base);

The above doesn't work.

Why doesn't it work?

+   /* Check if mapping already exists */
+   for (i = 0; i < count; i++) {
+   virq = irq_find_mapping(domain, hwirq_base+i);
+   if (virq) {
+   pr_debug("existing mapping on virq %d,"
+   " now dispose it first\n", virq);
+   irq_dispose_mapping(virq);

You might have just disposed of someone else's mapping, we shouldn't do
that. It should be an error to the caller.

It's a good question. If the interrupt is in use by someone else, why can I
get it from the system?
So maybe someone else forgot to dispose of the mapping, and it is never used
by others as I


Re: [PATCH 0/3] Enable multiple MSI feature in pSeries

2013-03-03 Thread Mike Qiu

On 2013/3/1 11:54, Michael Ellerman wrote:

On Fri, Mar 01, 2013 at 11:08:45AM +0800, Mike wrote:

Hi all

Any comments? or any questions about my patchset?

You were going to get some performance numbers that show a definite
benefit for using more than one MSI.

Yes, but my patch just enables the kernel to support this feature; whether
to use it depends on the device driver.

This feature was merged into the kernel for x86 a long time ago.
See commits: 5ca72c4f7c412c2002363218901eba5516c476b1
51906e779f2b13b38f8153774c4c7163d412ffd9

Actually, I'm trying to run the test, but it is difficult because it mostly
depends on how the device driver uses this feature, and the ipr driver patch
was written by another person. I have not had any reply from her either.



cheers





Re: [PATCH 0/3] Enable multiple MSI feature in pSeries

2013-02-03 Thread Mike Qiu

On 2013/2/4 13:56, Michael Ellerman wrote:

On Mon, 2013-02-04 at 11:49 +0800, Mike Qiu wrote:

On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:

Currently, the multiple MSI feature hasn't been enabled in pSeries.
These patches try to enable this feature.

Hi Mike,


These patches have been tested by using the ipr driver, and the driver patch
has been made by Wen Xiong <wenxi...@linux.vnet.ibm.com>:

So who wrote these patches? Normally we would expect the original author
to post the patches if at all possible.

Hi Michael

I wrote these multiple MSI patches myself. As you know, this feature
has not been enabled yet, and it needs a device driver to test whether
it works properly. So I tested my patches using Wen Xiong's ipr patches,
which have been sent out to the mailing list.

I'm the original author :)

Ah OK, sorry, that was more or less clear from your mail but I just
misunderstood.


[PATCH 0/7] Add support for new IBM SAS controllers

I would like to see the full series, including the driver enablement.

Yep, but the driver patches were written by Wen Xiong and have already
been sent out.

OK, you mean this series?

http://thread.gmane.org/gmane.linux.scsi/79639

Yes, exactly.




I just used her patches to test mine. Any device that supports multiple
MSI can use my feature, not only IBM SAS controllers. I also tested my
patches with the Broadcom wireless card tg3, and that works OK too.

You mean drivers/net/ethernet/broadcom/tg3.c ? I don't see where it
calls pci_enable_msi_block() ?

Yes, I just modified the driver to support multiple MSI.


All devices /can/ use it, but the driver needs to be updated. Currently
we have two drivers that do so (in Linus' tree), plus the updated IPR.

Not all devices: only devices that support multiple MSI in hardware
can use it.



Test platform: One partition of pSeries with one cpu core(4 SMTs) and
RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel

IRQ 21 and 22 are assigned to the ipr device, which supports 2 MSIs.

The test results are shown by 'cat /proc/interrupts':
           CPU0       CPU1       CPU2       CPU3
 21:          6          5          5          5   XICS Level     host1-0
 22:        817        814        816        813   XICS Level     host1-1

This shows that you are correctly configuring two MSIs.

But the key advantage of using multiple interrupts is to distribute load
across CPUs and improve performance. So I would like to see some
performance numbers that show that there is a real benefit for all the
extra complexity in the code.

Yes, the system supports just two MSIs. Anyway, I will try to run some
performance tests to show the real benefit.
But that really depends on the driver. As the data above show, there
seems to be a problem in how the interrupts are used: irq 21 is used
rarely and irq 22 handles most of the load. I will discuss this with the
driver author to see why, and once she has fixed it I will post the
performance results.

Yeah that would be good.

I really dislike that we have a separate API for multi-MSI vs MSI-X, and
pci_enable_msi_block() also pushes the contiguous power-of-2 allocation
into the irq domain layer, which is unpleasant. So if we really must do
multi-MSI I would like to do it differently.

Yes, but multi-MSI needs hardware support; it is an extension of MSI.

A device may support MSI and multiple MSI but not MSI-X.
For such devices, we'd better use multiple MSI, which is more efficient
than single MSI.

Multi-MSI can use no more than 32 interrupts.
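
The limit of 32 and the power-of-two rule both come from how the vector
count is stored in the MSI Message Control register: the Multiple Message
Capable field (bits 3:1) and the Multiple Message Enable field (bits 6:4)
each hold log2 of the count, with 5 as the largest defined value. The
sketch below illustrates that encoding only. The macro and function names
are hypothetical, chosen for this example, and reserved field values are
not checked.

```c
/* Fields of the MSI Message Control register, named for this sketch. */
#define MSI_FLAGS_QMASK 0x000e   /* Multiple Message Capable, bits 3:1 */
#define MSI_FLAGS_QSIZE 0x0070   /* Multiple Message Enable, bits 6:4 */

/* Return log2(count) for count in {1, 2, 4, 8, 16, 32}, or -1 otherwise. */
static int msi_count_to_order(unsigned int count)
{
	int order = 0;

	if (count == 0 || count > 32 || (count & (count - 1)))
		return -1;
	while ((1u << order) < count)
		order++;
	return order;
}

/* Number of vectors the device advertises as supported. */
static unsigned int msi_capable_count(unsigned int flags)
{
	return 1u << ((flags & MSI_FLAGS_QMASK) >> 1);
}

/*
 * Return the register value with 'count' vectors enabled, or -1 when the
 * count is not a power of two or exceeds what the device advertises.
 */
static int msi_set_enabled_count(unsigned int flags, unsigned int count)
{
	int order = msi_count_to_order(count);

	if (order < 0 || count > msi_capable_count(flags))
		return -1;
	return (int)((flags & ~MSI_FLAGS_QSIZE) | ((unsigned int)order << 4));
}
```

A device advertising 32 vectors (capable field 5) can be asked for 2, which
sets the enable field to 1; a request for 3 is rejected because it cannot be
expressed in the log2 encoding.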

Thanks


cheers






[PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support

2013-01-14 Thread Mike Qiu
Adding a function irq_create_mapping_many() which can associate
multiple MSIs to a continuous irq mapping.

This is needed to enable multiple MSI support for pSeries.

Signed-off-by: Mike Qiu 
---
 include/linux/irq.h   |2 +
 include/linux/irqdomain.h |3 ++
 kernel/irq/irqdomain.c|   61 +
 3 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 60ef45b..e00a7ec 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 #define irq_alloc_desc_from(from, node)		\
 	irq_alloc_descs(-1, from, 1, node)
 
+#define irq_alloc_desc_n(nevc, node)		\
+	irq_alloc_descs(-1, 0, nevc, node)
 void irq_free_descs(unsigned int irq, unsigned int cnt);
 int irq_reserve_irqs(unsigned int from, unsigned int cnt);
 
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 0d5b17b..831dded 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -168,6 +168,9 @@ extern int irq_create_strict_mappings(struct irq_domain *domain,
 				      unsigned int irq_base,
 				      irq_hw_number_t hwirq_base, int count);
 
+extern int irq_create_mapping_many(struct irq_domain *domain,
+				   irq_hw_number_t hwirq_base, int count);
+
 static inline int irq_create_identity_mapping(struct irq_domain *host,
 					      irq_hw_number_t hwirq)
 {
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 96f3a1d..38648e6 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
 }
 EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
 
+/**
+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
+ * @domain: domain owning the interrupt range
+ * @hwirq_base: beginning of continuous hardware IRQ range
+ * @count: Number of interrupts to map
+ *
+ * This routine is used for allocating and mapping a range of hardware
+ * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
+ * locations.
+ *
+ * Greater than 0 is returned upon success, while any failure to establish a
+ * static mapping is treated as an error.
+ */
+int irq_create_mapping_many(struct irq_domain *domain,
+			    irq_hw_number_t hwirq_base, int count)
+{
+	int ret, irq_base;
+	int virq, i;
+
+	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);
+
+	/* Look for default domain if necessary */
+	if (!domain)
+		domain = irq_default_domain;
+	if (!domain) {
+		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
+			, hwirq_base);
+		WARN_ON(1);
+		return 0;
+	}
+	pr_debug("-> using domain @%p\n", domain);
+
+	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
+	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
+		return irq_domain_legacy_revmap(domain, hwirq_base);
+
+	/* Check if mapping already exists */
+	for (i = 0; i < count; i++) {
+		virq = irq_find_mapping(domain, hwirq_base+i);
+		if (virq) {
+			pr_debug("existing mapping on virq %d,"
+				 " now dispose it first\n", virq);
+			irq_dispose_mapping(virq);
+		}
+	}
+
+	/* Allocate the continuous virtual interrupt numbers */
+	irq_base = irq_alloc_desc_n(count, of_node_to_nid(domain->of_node));
+	if (unlikely(irq_base < 0))
+		return irq_base;
+
+	ret = irq_domain_associate_many(domain, irq_base, hwirq_base, count);
+	if (unlikely(ret < 0)) {
+		irq_free_descs(irq_base, count);
+		return ret;
+	}
+
+	return irq_base;
+}
+EXPORT_SYMBOL_GPL(irq_create_mapping_many);
+
 unsigned int irq_create_of_mapping(struct device_node *controller,
 				   const u32 *intspec, unsigned int intsize)
 {
-- 
1.7.7.6
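
The allocation step in this patch (find a run of free virtual IRQ numbers,
then associate each with consecutive hardware IRQs) can be modelled in
userspace as a minimal sketch. It is not the kernel implementation: the
fixed-size tables and the names NR_VIRQS, virq_used, virq_hwirq,
find_free_run() and create_mapping_many() are assumptions made for
illustration only.

```c
#include <assert.h>

#define NR_VIRQS 32	/* illustrative table size, not a kernel value */

int virq_used[NR_VIRQS];		/* 1 when the virq is allocated */
unsigned long virq_hwirq[NR_VIRQS];	/* hwirq behind each used virq */

/* Find the first run of 'count' free virqs; virq 0 means "no IRQ". */
static int find_free_run(int count)
{
	int base, i;

	assert(count >= 1);
	for (base = 1; base + count <= NR_VIRQS; base++) {
		for (i = 0; i < count; i++)
			if (virq_used[base + i])
				break;
		if (i == count)
			return base;
	}
	return -1;
}

/*
 * Map hwirq_base .. hwirq_base + count - 1 onto one contiguous virq range.
 * Returns the first virq (greater than 0) or -1 when no free run exists.
 */
int create_mapping_many(unsigned long hwirq_base, int count)
{
	int base, i;

	if (count < 1)
		return -1;
	base = find_free_run(count);
	if (base < 0)
		return -1;
	for (i = 0; i < count; i++) {
		virq_used[base + i] = 1;
		virq_hwirq[base + i] = hwirq_base + i;
	}
	return base;
}
```

Because the whole run is located before any slot is marked used, a failed
request leaves the tables untouched, which plays the role of the
irq_free_descs() cleanup in the patch.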



[PATCH 0/3] Enable multiple MSI feature in pSeries

2013-01-14 Thread Mike Qiu
Currently, the multiple MSI feature is not enabled on pSeries.
These patches try to enable it.

These patches have been tested with the ipr driver; the driver patch
was made by Wen Xiong <wenxi...@linux.vnet.ibm.com>:

[PATCH 0/7] Add support for new IBM SAS controllers

Test platform: One partition of pSeries with one cpu core(4 SMTs) and 
   RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel 

IRQs 21 and 22 are assigned to the ipr device, which supports 2 multiple MSIs.

The test results are shown by 'cat /proc/interrupts':
           CPU0       CPU1       CPU2       CPU3
 16:     240458     261601     226310     200425   XICS Level     IPI
 17:          0          0          0          0   XICS Level     RAS_EPOW
 18:         10          0          3          2   XICS Level     hvc_console
 19:     122182      28481      28527      28864   XICS Level     ibmvscsi
 20:5067388226108118  XICS Level eth0
 21:          6          5          5          5   XICS Level     host1-0
 22:        817        814        816        813   XICS Level     host1-1
LOC:     398077     316725     231882     203049   Local timer interrupts
SPU:       1659        919        961        903   Spurious interrupts
CNT:          0          0          0          0   Performance monitoring interrupts
MCE:          0          0          0          0   Machine check exceptions

Mike Qiu (3):
  irq: Set multiple MSI descriptor data for multiple IRQs
  irq: Add hw continuous IRQs map to virtual continuous IRQs support
  powerpc/pci: Enable pSeries multiple MSI feature

 arch/powerpc/kernel/msi.c|4 --
 arch/powerpc/platforms/pseries/msi.c |   62 -
 include/linux/irq.h  |4 ++
 include/linux/irqdomain.h|3 ++
 kernel/irq/chip.c|   40 -
 kernel/irq/irqdomain.c   |   61 +
 6 files changed, 158 insertions(+), 16 deletions(-)

-- 
1.7.7.6



[PATCH 1/3] irq: Set multiple MSI descriptor data for multiple IRQs

2013-01-14 Thread Mike Qiu
Multiple MSI only requires the IRQ in msi_desc entry to be set as
the value of irq_base.

This patch implements the above mentioned technique.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 include/linux/irq.h |2 ++
 kernel/irq/chip.c   |   40 ++--
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index fdf2c4a..60ef45b 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -528,6 +528,8 @@ extern int irq_set_handler_data(unsigned int irq, void *data);
 extern int irq_set_chip_data(unsigned int irq, void *data);
 extern int irq_set_irq_type(unsigned int irq, unsigned int type);
 extern int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry);
+extern int irq_set_multiple_msi_desc(unsigned int irq_base, unsigned int nvec,
+				     struct msi_desc *entry);
 extern struct irq_data *irq_get_irq_data(unsigned int irq);
 
 static inline struct irq_chip *irq_get_chip(unsigned int irq)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 3aca9f2..c4c39d3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -90,6 +90,35 @@ int irq_set_handler_data(unsigned int irq, void *data)
 EXPORT_SYMBOL(irq_set_handler_data);
 
 /**
+ * irq_set_multiple_msi_desc - set Multiple MSI descriptor data
+ * for multiple IRQs
+ * @irq_base:  Interrupt number base
+ * @nvec:  The number of interrupts
+ * @entry: Pointer to MSI descriptor data
+ *
+ * Set IRQ descriptors for multiple MSIs
+ */
+int irq_set_multiple_msi_desc(unsigned int irq_base, unsigned int nvec,
+			      struct msi_desc *entry)
+{
+	unsigned long flags, i;
+	struct irq_desc *desc;
+
+	for (i = 0; i < nvec; i++) {
+		desc = irq_get_desc_lock(irq_base + i, &flags,
+					 IRQ_GET_DESC_CHECK_GLOBAL);
+		if (!desc)
+			return -EINVAL;
+		desc->irq_data.msi_desc = entry;
+		if (i == 0 && entry)
+			entry->irq = irq_base;
+		irq_put_desc_unlock(desc, flags);
+	}
+
+	return 0;
+}
+
+/**
  * irq_set_msi_desc - set MSI descriptor data for an irq
  * @irq:   Interrupt number
  * @entry: Pointer to MSI descriptor data
@@ -98,16 +127,7 @@ EXPORT_SYMBOL(irq_set_handler_data);
  */
 int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry)
 {
-	unsigned long flags;
-	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
-
-	if (!desc)
-		return -EINVAL;
-	desc->irq_data.msi_desc = entry;
-	if (entry)
-		entry->irq = irq;
-	irq_put_desc_unlock(desc, flags);
-	return 0;
+	return irq_set_multiple_msi_desc(irq, 1, entry);
 }
 
 /**
-- 
1.7.7.6
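
The idea behind this patch, one MSI descriptor shared by several
consecutive IRQs with only the base IRQ recorded in the descriptor, can be
sketched in userspace as follows. The types msi_entry and irq_slot, the
irq_table array and the function names are hypothetical stand-ins, not the
kernel's struct msi_desc or struct irq_desc. Unlike the patch, this sketch
validates the whole range before touching any slot.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS 64	/* illustrative table size, not a kernel value */

struct msi_entry {		/* hypothetical stand-in for struct msi_desc */
	unsigned int irq;	/* base virtual IRQ of the block */
};

struct irq_slot {		/* hypothetical stand-in for struct irq_desc */
	struct msi_entry *msi;
};

struct irq_slot irq_table[NR_IRQS];

/* Look up the MSI entry attached to one IRQ (NULL when none). */
struct msi_entry *msi_entry_for(unsigned int irq)
{
	assert(irq < NR_IRQS);
	return irq_table[irq].msi;
}

/*
 * Point 'nvec' consecutive slots at one shared entry and record the base
 * IRQ in that entry. Passing entry == NULL detaches the slots.
 * Returns 0 on success, -1 if the range does not fit in the table.
 */
int set_multiple_msi_entry(unsigned int irq_base, unsigned int nvec,
			   struct msi_entry *entry)
{
	unsigned int i;

	if (nvec == 0 || irq_base >= NR_IRQS || nvec > NR_IRQS - irq_base)
		return -1;

	for (i = 0; i < nvec; i++)
		irq_table[irq_base + i].msi = entry;
	if (entry)
		entry->irq = irq_base;
	return 0;
}
```

With nvec set to 1 this reduces to the single-IRQ case, which mirrors how
irq_set_msi_desc() becomes a thin wrapper in the patch.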



[PATCH 3/3] powerpc/pci: Enable pSeries multiple MSI feature

2013-01-14 Thread Mike Qiu
PCI devices support MSI, MSI-X, and multiple MSI,
but pSeries does not support multiple MSI yet.

This patch enables the multiple MSI feature on pSeries.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/msi.c|4 --
 arch/powerpc/platforms/pseries/msi.c |   62 -
 2 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..46b1470 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -20,10 +20,6 @@ int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 		return -ENOSYS;
 	}
 
-	/* PowerPC doesn't support multiple MSI yet */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
-
 	if (ppc_md.msi_check_device) {
 		pr_debug("msi: Using platform check routine.\n");
 		return ppc_md.msi_check_device(dev, nvec, type);
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index e5b0847..6633b18 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -132,13 +132,17 @@ static int rtas_query_irq_number(struct pci_dn *pdn, int offset)
 static void rtas_teardown_msi_irqs(struct pci_dev *pdev)
 {
 	struct msi_desc *entry;
+	int nvec, i;
 
 	list_for_each_entry(entry, &pdev->msi_list, list) {
 		if (entry->irq == NO_IRQ)
 			continue;
 
 		irq_set_msi_desc(entry->irq, NULL);
-		irq_dispose_mapping(entry->irq);
+		nvec = entry->msi_attrib.is_msix ? 1 : 1 <<
+			entry->msi_attrib.multiple;
+		for (i = 0; i < nvec; i++)
+			irq_dispose_mapping(entry->irq + i);
 	}
 
 	rtas_disable_msi(pdev);
@@ -392,6 +396,55 @@ static int check_msix_entries(struct pci_dev *pdev)
 	return 0;
 }
 
+static int setup_multiple_msi_irqs(struct pci_dev *pdev, int nvec)
+{
+	struct pci_dn *pdn;
+	int hwirq, virq_base, i, hwirq_base = 0;
+	struct msi_desc *entry;
+	struct msi_msg msg;
+
+	pdn = get_pdn(pdev);
+	entry = list_entry(pdev->msi_list.next, typeof(*entry), list);
+
+	/*
+	 * Get the hardware IRQ base and ensure the retrieved
+	 * hardware IRQs are continuous
+	 */
+	for (i = 0; i < nvec; i++) {
+		hwirq = rtas_query_irq_number(pdn, i);
+		if (i == 0)
+			hwirq_base = hwirq;
+
+		if (hwirq < 0 || hwirq != (hwirq_base + i)) {
+			pr_debug("rtas_msi: Failure to get %d IRQs on"
+				"PCI device %04x:%02x:%02x.%01x\n", nvec,
+				pci_domain_nr(pdev->bus), pdev->bus->number,
+				PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+			return hwirq;
+		}
+	}
+
+	virq_base = irq_create_mapping_many(NULL, hwirq_base, nvec);
+	if (virq_base <= 0) {
+		pr_debug("rtas_msi: Failure to map IRQs (%d, %d) "
+			"for PCI device %04x:%02x:%02x.%01x\n",
+			hwirq_base, nvec, pci_domain_nr(pdev->bus),
+			pdev->bus->number, PCI_SLOT(pdev->devfn),
+			PCI_FUNC(pdev->devfn));
+		return -ENOSPC;
+	}
+
+	entry->msi_attrib.multiple = ilog2(nvec & 0x3f);
+	irq_set_multiple_msi_desc(virq_base, nvec, entry);
+	for (i = 0; i < nvec; i++) {
+		/* Read config space back so we can restore after reset */
+		read_msi_msg(virq_base + i, &msg);
+		entry->msg = msg;
+	}
+
+	return 0;
+}
+
 static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 {
 	struct pci_dn *pdn;
@@ -444,11 +497,16 @@ again:
 		return rc;
 	}
 
+	if (type == PCI_CAP_ID_MSI && nvec > 1) {
+		rc = setup_multiple_msi_irqs(pdev, nvec);
+		return rc;
+	}
+
 	i = 0;
 	list_for_each_entry(entry, &pdev->msi_list, list) {
 		hwirq = rtas_query_irq_number(pdn, i++);
 		if (hwirq < 0) {
-			pr_debug("rtas_msi: error (%d) getting hwirq\n", rc);
+			pr_debug("rtas_msi: error (%d) getting hwirq\n", nvec);
 			return hwirq;
 		}
 
 
-- 
1.7.7.6
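
Two pieces of logic in this patch lend themselves to a small userspace
sketch: the check that the hardware IRQs returned for a device form one
contiguous run, and the number of virtual IRQs to dispose per descriptor
at teardown. The function names contiguous_hwirq_base() and
vectors_for_entry() are invented for this example. One deliberate
difference, stated as an assumption about what a caller would want: the
sketch returns -1 for every failure, including a gap in the run, whereas
the patch returns the offending hwirq, which is positive in the gap case.

```c
#include <assert.h>

/*
 * Check that n hardware IRQ numbers form one contiguous ascending run.
 * Returns the base number on success, or -1 when the input is empty,
 * contains a negative value (a failed query) or has a gap.
 */
int contiguous_hwirq_base(const int *hwirq, int n)
{
	int i;

	if (n < 1)
		return -1;
	for (i = 0; i < n; i++)
		if (hwirq[i] < 0 || hwirq[i] != hwirq[0] + i)
			return -1;
	return hwirq[0];
}

/*
 * Number of virtual IRQs covered by one descriptor:
 * 1 for MSI-X, 2^multiple for (multi-)MSI.
 */
int vectors_for_entry(int is_msix, int multiple)
{
	/* log2 of the vector count; at most 5 because 2^5 = 32. */
	assert(multiple >= 0 && multiple <= 5);
	return is_msix ? 1 : 1 << multiple;
}
```

For the two-vector ipr setup described in the cover letter, the inputs
would be the pair {21, 22} and a multiple value of 1.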



[PATCH] No need to call irq_domain_legacy_revmap() for twice

2012-09-24 Thread Mike Qiu
Function irq_create_mapping() calls irq_find_mapping(). The latter
function already checks whether the indicated IRQ domain maps hw IRQs to
virtual IRQs in legacy mode and, if so, returns the legacy IRQ number by
calling irq_domain_legacy_revmap(). There is no need to call
irq_domain_legacy_revmap() again in irq_create_mapping() to do the same
check.

The patch removes the duplicate call.

Signed-off-by: Mike Qiu <qiud...@linux.vnet.ibm.com>
---
 kernel/irq/irqdomain.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 49a7772..286d672 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -547,9 +547,12 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 		return virq;
 	}
 
-	/* Get a virtual interrupt number */
+	/*
+	 * For IRQ domain with type of IRQ_DOMAIN_MAP_LEGACY, we needn't
+	 * create the IRQ mapping for non-existing one, so just return 0.
+	 */
 	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
-		return irq_domain_legacy_revmap(domain, hwirq);
+		return 0;
 
 	/* Allocate a virtual interrupt number */
 	hint = hwirq % nr_irqs;
-- 
1.7.7.6
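
The reasoning in this patch can be shown with a simplified userspace
model: if the lookup function already answers for legacy domains, the
create function has nothing left to do for them once the lookup has
failed. The struct domain layout, the map_type enum and both function
names are assumptions for illustration and do not match the kernel's
struct irq_domain.

```c
#include <assert.h>
#include <stddef.h>

#define LINEAR_SIZE 16	/* illustrative size of the linear map */

enum map_type { MAP_LEGACY, MAP_LINEAR };	/* hypothetical */

struct domain {					/* hypothetical */
	enum map_type type;
	unsigned int first_irq;		/* legacy: virq of first_hwirq */
	unsigned int first_hwirq;	/* legacy: first hwirq in range */
	unsigned int size;		/* legacy: number of hwirqs */
	unsigned int linear[LINEAR_SIZE]; /* linear: virq, 0 = unmapped */
	unsigned int next_virq;		/* linear: next virq to hand out */
};

/* Look up an existing mapping; returns 0 when there is none. */
unsigned int find_mapping(const struct domain *d, unsigned int hwirq)
{
	assert(d != NULL);
	if (d->type == MAP_LEGACY) {
		/* Legacy mappings are fixed: compute instead of search. */
		if (hwirq < d->first_hwirq ||
		    hwirq >= d->first_hwirq + d->size)
			return 0;
		return hwirq - d->first_hwirq + d->first_irq;
	}
	return hwirq < LINEAR_SIZE ? d->linear[hwirq] : 0;
}

/* Return the existing mapping or create one; 0 means failure. */
unsigned int create_mapping(struct domain *d, unsigned int hwirq)
{
	unsigned int virq = find_mapping(d, hwirq);

	if (virq)
		return virq;
	/* The legacy lookup above already failed: nothing to create. */
	if (d->type == MAP_LEGACY)
		return 0;
	if (hwirq >= LINEAR_SIZE)
		return 0;
	d->linear[hwirq] = d->next_virq++;
	return d->linear[hwirq];
}
```

In this model a second legacy computation inside create_mapping() could
only repeat the result that find_mapping() has just returned.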
