[PATCH 3.4 085/125] sh_eth: fix TX buffer byte-swapping

2016-10-12 Thread lizf
From: Sergei Shtylyov 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 3e2309937f1e5d538ff13da5fb8de41196927c61 upstream.

For the little-endian SH771x kernels the driver has to byte-swap the RX/TX
buffers; however, the as yet unset physical address from the TX descriptor is
used to call sh_eth_soft_swap(). Use 'skb->data' instead.

Fixes: 31fcb99d9958 ("net: sh_eth: remove __flush_purge_region")
Signed-off-by: Sergei Shtylyov 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 drivers/net/ethernet/renesas/sh_eth.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 16caeba..53f5a96 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1513,8 +1513,7 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
txdesc = &mdp->tx_ring[entry];
/* soft swap. */
if (!mdp->cd->hw_swap)
-   sh_eth_soft_swap(phys_to_virt(ALIGN(txdesc->addr, 4)),
-skb->len + 2);
+   sh_eth_soft_swap(PTR_ALIGN(skb->data, 4), skb->len + 2);
txdesc->addr = dma_map_single(&ndev->dev, skb->data, skb->len,
  DMA_TO_DEVICE);
if (skb->len < ETHERSMALL)
-- 
1.9.1
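
For readers following the fix above: the swap has to operate on the CPU-visible buffer, rounded up to a 4-byte boundary, before that buffer is mapped for DMA. The user-space sketch below illustrates only the pointer alignment and the word-wise swap. The names ptr_align_up(), swap32() and soft_swap() are hypothetical stand-ins and not the driver's own helpers, and the sketch assumes the caller owns every 32-bit word that the rounded-up length covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Round a pointer up to the next multiple of 'align' (a power of two),
 * similar in spirit to the kernel's PTR_ALIGN(). Hypothetical helper.
 */
static void *ptr_align_up(void *p, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (void *)(((uintptr_t)p + align - 1) & ~(align - 1));
}

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Byte-swap every 32-bit word needed to cover 'len' bytes starting at
 * 'buf'. 'buf' must already be 4-byte aligned; a trailing partial word
 * is swapped as a whole word.
 */
static void soft_swap(void *buf, size_t len)
{
	uint32_t *p = buf;
	uint32_t *end = p + (len + sizeof(uint32_t) - 1) / sizeof(uint32_t);

	assert(((uintptr_t)buf & 3u) == 0);
	for (; p < end; p++)
		*p = swap32(*p);
}
```

A caller would compute the aligned start with ptr_align_up(data, 4) and pass it to soft_swap() before handing the buffer to the device, which mirrors the order of operations in the patched code.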



[PATCH 3.4 067/125] sata_sil: disable trim

2016-10-12 Thread lizf
From: Mikulas Patocka 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit d98f1cd0a3b70ea91f1dfda3ac36c3b2e1a4d5e2 upstream.

When I connect an Intel SSD to SATA SIL controller (PCI ID 1095:3114), any
TRIM command results in I/O errors being reported in the log. There is
another similar error reported with TRIM and the SIL controller:
https://bugs.centos.org/view.php?id=5880

Apparently the controller doesn't support TRIM commands. This patch
disables TRIM support on the SATA SIL controller.

ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: BMDMA2 stat 0x50001
ata7.00: failed command: DATA SET MANAGEMENT
ata7.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 0 dma 512 out
 res 51/04:01:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ABRT }
ata7.00: device reported invalid CHS sector 0
sd 8:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 8:0:0:0: [sdb] tag#0 Sense Key : Illegal Request [current] [descriptor]
sd 8:0:0:0: [sdb] tag#0 Add. Sense: Unaligned write command
sd 8:0:0:0: [sdb] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 21 95 88 00 20 00 00 00 00
blk_update_request: I/O error, dev sdb, sector 2200968

Signed-off-by: Mikulas Patocka 
Signed-off-by: Tejun Heo 
Signed-off-by: Zefan Li 
---
 drivers/ata/sata_sil.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ata/sata_sil.c b/drivers/ata/sata_sil.c
index 0c4ed89..7f0c7f0 100644
--- a/drivers/ata/sata_sil.c
+++ b/drivers/ata/sata_sil.c
@@ -631,6 +631,9 @@ static void sil_dev_config(struct ata_device *dev)
unsigned int n, quirks = 0;
unsigned char model_num[ATA_ID_PROD_LEN + 1];
 
+   /* This controller doesn't support trim */
+   dev->horkage |= ATA_HORKAGE_NOTRIM;
+
ata_id_c_string(dev->id, model_num, ATA_ID_PROD, sizeof(model_num));
 
for (n = 0; sil_blacklist[n].product; n++)
-- 
1.9.1



[PATCH 3.4 006/125] MIPS: atomic: Fix comment describing atomic64_add_unless's return value.

2016-10-12 Thread lizf
From: Ralf Baechle 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit f25319d2cb439249a6859f53ad42ffa332b0acba upstream.

Signed-off-by: Ralf Baechle 
Fixes: f24219b4e90cf70ec4a211b17fbabc725a0ddf3c
(cherry picked from commit f0a232cde7be18a207fd057dd79bbac8a0a45dec)
Signed-off-by: Zefan Li 
---
 arch/mips/include/asm/atomic.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 3f4c5cb..939a6b7 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -679,7 +679,7 @@ static __inline__ long atomic64_sub_if_positive(long i, atomic64_t * v)
  * @u: ...unless v is equal to u.
  *
  * Atomically adds @a to @v, so long as it was not @u.
- * Returns the old value of @v.
+ * Returns true iff @v was not @u.
  */
 static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
 {
-- 
1.9.1
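
The corrected comment states that the function reports whether the add took place, not the previous value of @v. A minimal user-space sketch of that contract, written with C11 atomics instead of the MIPS ll/sc sequence, could look like the following. add_unless64() is a hypothetical name chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Add 'a' to *v unless *v equals 'u'.
 * Returns true iff *v was not 'u', i.e. iff the add was performed.
 */
static bool add_unless64(_Atomic int64_t *v, int64_t a, int64_t u)
{
	int64_t old = atomic_load(v);

	while (old != u) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}
```

Callers that previously treated the result as the old counter value would misbehave under this contract, which is why the comment fix matters even though no code changed.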



[PATCH 3.4 068/125] USB: whci-hcd: add check for dma mapping error

2016-10-12 Thread lizf
From: Alexey Khoroshilov 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit f9fa1887dcf26bd346665a6ae3d3f53dec54cba1 upstream.

qset_fill_page_list() does not check for DMA mapping errors.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 drivers/usb/host/whci/qset.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/host/whci/qset.c b/drivers/usb/host/whci/qset.c
index 76083ae..412b4fe 100644
--- a/drivers/usb/host/whci/qset.c
+++ b/drivers/usb/host/whci/qset.c
@@ -377,6 +377,10 @@ static int qset_fill_page_list(struct whc *whc, struct whc_std *std, gfp_t mem_f
if (std->pl_virt == NULL)
return -ENOMEM;
std->dma_addr = dma_map_single(whc->wusbhc.dev, std->pl_virt, pl_len, DMA_TO_DEVICE);
+   if (dma_mapping_error(whc->wusbhc.dev, std->dma_addr)) {
+   kfree(std->pl_virt);
+   return -EFAULT;
+   }
 
for (p = 0; p < std->num_pointers; p++) {
std->pl_virt[p].buf_ptr = cpu_to_le64(dma_addr);
-- 
1.9.1



[PATCH 3.4 007/125] recordmcount: Fix endianness handling bug for nop_mcount

2016-10-12 Thread lizf
From: libin 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit c84da8b9ad3761eef43811181c7e896e9834b26b upstream.

In nop_mcount, shdr->sh_offset and relp->r_offset should handle
endianness properly, otherwise it will trigger a segmentation fault
if the recordmcount main and file.o have different endianness.

Link: http://lkml.kernel.org/r/563806c7.7070...@huawei.com

Signed-off-by: Li Bin 
Signed-off-by: Steven Rostedt 
Signed-off-by: Zefan Li 
---
 scripts/recordmcount.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index 5e29610..799d734 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -375,7 +375,7 @@ static void nop_mcount(Elf_Shdr const *const relhdr,
 
if (mcountsym == Elf_r_sym(relp) && !is_fake_mcount(relp)) {
if (make_nop)
-   ret = make_nop((void *)ehdr, shdr->sh_offset + relp->r_offset);
+   ret = make_nop((void *)ehdr, _w(shdr->sh_offset) + _w(relp->r_offset));
if (warn_on_notrace_sect && !once) {
printf("Section %s has mcount callers being ignored\n",
   txtname);
-- 
1.9.1
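
To illustrate why the raw header fields cannot be added directly: when the object file and the host running recordmcount differ in byte order, each field has to be converted before it is used as an offset. The sketch below is a simplified user-space model under that assumption. select_conv(), conv32, load_raw32() and patch_offset() are hypothetical names, and the sketch handles only 32-bit fields.

```c
#include <stdint.h>
#include <string.h>

static uint32_t rev32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

static uint32_t nat32(uint32_t x)
{
	return x;
}

/* Conversion chosen once per object file. */
static uint32_t (*conv32)(uint32_t) = nat32;

static int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 0;
}

/* Pick the converter by comparing file and host byte order. */
static void select_conv(int file_is_big_endian)
{
	conv32 = (file_is_big_endian != host_is_big_endian()) ? rev32 : nat32;
}

/* Read a 4-byte field exactly as it sits in the mapped file. */
static uint32_t load_raw32(const unsigned char bytes[4])
{
	uint32_t v;

	memcpy(&v, bytes, sizeof(v));
	return v;
}

/* Convert both fields first, then add; adding raw values gives a bogus offset. */
static uint32_t patch_offset(uint32_t sh_offset_raw, uint32_t r_offset_raw)
{
	return conv32(sh_offset_raw) + conv32(r_offset_raw);
}
```

With unconverted fields the sum can point far outside the mapped file, which would be consistent with the segmentation fault described in the commit message.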



[PATCH 3.4 069/125] dm btree: fix leak of bufio-backed block in btree_split_sibling error path

2016-10-12 Thread lizf
From: Mike Snitzer 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 30ce6e1cc5a0f781d60227e9096c86e188d2c2bd upstream.

The block allocated at the start of btree_split_sibling() is never
released if later insert_at() fails.

Fix this by releasing the previously allocated bufio block using
unlock_block().

Reported-by: Mikulas Patocka 
Signed-off-by: Mike Snitzer 
Signed-off-by: Zefan Li 
---
 drivers/md/persistent-data/dm-btree.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index be86d59..77c615e 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -450,8 +450,10 @@ static int btree_split_sibling(struct shadow_spine *s, dm_block_t root,
 
r = insert_at(sizeof(__le64), pn, parent_index + 1,
  le64_to_cpu(rn->keys[0]), &location);
-   if (r)
+   if (r) {
+   unlock_block(s->info, right);
return r;
+   }
 
if (key < le64_to_cpu(rn->keys[0])) {
unlock_block(s->info, right);
-- 
1.9.1



[PATCH 3.4 005/125] ARM: pxa: remove incorrect __init annotation on pxa27x_set_pwrmode

2016-10-12 Thread lizf
From: Arnd Bergmann 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 54c09889bff6d99c8733eed4a26c9391b177c88b upstream.

The z2 machine calls pxa27x_set_pwrmode() in order to power off
the machine, but this function gets discarded early at boot because
it is marked __init, as pointed out by kbuild:

WARNING: vmlinux.o(.text+0x145c4): Section mismatch in reference from the function z2_power_off() to the function .init.text:pxa27x_set_pwrmode()
The function z2_power_off() references
the function __init pxa27x_set_pwrmode().
This is often because z2_power_off lacks a __init
annotation or the annotation of pxa27x_set_pwrmode is wrong.

This removes the __init section modifier to fix rebooting and the
build error.

Signed-off-by: Arnd Bergmann 
Fixes: ba4a90a6d86a ("ARM: pxa/z2: fix building error of pxa27x_cpu_suspend() no longer available")
Signed-off-by: Robert Jarzmik 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 arch/arm/mach-pxa/include/mach/pxa27x.h | 2 +-
 arch/arm/mach-pxa/pxa27x.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-pxa/include/mach/pxa27x.h b/arch/arm/mach-pxa/include/mach/pxa27x.h
index 7cff640..66c4cbf 100644
--- a/arch/arm/mach-pxa/include/mach/pxa27x.h
+++ b/arch/arm/mach-pxa/include/mach/pxa27x.h
@@ -21,7 +21,7 @@
 
 extern void __init pxa27x_map_io(void);
 extern void __init pxa27x_init_irq(void);
-extern int __init pxa27x_set_pwrmode(unsigned int mode);
+extern int pxa27x_set_pwrmode(unsigned int mode);
 extern void pxa27x_cpu_pm_enter(suspend_state_t state);
 
 #define pxa27x_handle_irq  ichp_handle_irq
diff --git a/arch/arm/mach-pxa/pxa27x.c b/arch/arm/mach-pxa/pxa27x.c
index a2fe795..f7c9978 100644
--- a/arch/arm/mach-pxa/pxa27x.c
+++ b/arch/arm/mach-pxa/pxa27x.c
@@ -242,7 +242,7 @@ static struct clk_lookup pxa27x_clkregs[] = {
  */
 static unsigned int pwrmode = PWRMODE_SLEEP;
 
-int __init pxa27x_set_pwrmode(unsigned int mode)
+int pxa27x_set_pwrmode(unsigned int mode)
 {
switch (mode) {
case PWRMODE_SLEEP:
-- 
1.9.1



[PATCH 3.4 002/125] wm831x_power: Use IRQF_ONESHOT to request threaded IRQs

2016-10-12 Thread lizf
From: Valentin Rothberg 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 90adf98d9530054b8e665ba5a928de4307231d84 upstream.

Since commit 1c6c69525b40 ("genirq: Reject bogus threaded irq requests")
threaded IRQs without a primary handler need to be requested with
IRQF_ONESHOT, otherwise the request will fail.

scripts/coccinelle/misc/irqf_oneshot.cocci detected this issue.

Fixes: b5874f33bbaf ("wm831x_power: Use genirq")
Signed-off-by: Valentin Rothberg 
Signed-off-by: Sebastian Reichel 
Signed-off-by: Zefan Li 
---
 drivers/power/wm831x_power.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/power/wm831x_power.c b/drivers/power/wm831x_power.c
index 987332b..036ee0b 100644
--- a/drivers/power/wm831x_power.c
+++ b/drivers/power/wm831x_power.c
@@ -567,7 +567,7 @@ static __devinit int wm831x_power_probe(struct platform_device *pdev)
 
irq = platform_get_irq_byname(pdev, "SYSLO");
ret = request_threaded_irq(irq, NULL, wm831x_syslo_irq,
-  IRQF_TRIGGER_RISING, "System power low",
+  IRQF_TRIGGER_RISING | IRQF_ONESHOT, "System power low",
   power);
if (ret != 0) {
dev_err(&pdev->dev, "Failed to request SYSLO IRQ %d: %d\n",
@@ -577,7 +577,7 @@ static __devinit int wm831x_power_probe(struct platform_device *pdev)
 
irq = platform_get_irq_byname(pdev, "PWR SRC");
ret = request_threaded_irq(irq, NULL, wm831x_pwr_src_irq,
-  IRQF_TRIGGER_RISING, "Power source",
+  IRQF_TRIGGER_RISING | IRQF_ONESHOT, "Power source",
   power);
if (ret != 0) {
dev_err(&pdev->dev, "Failed to request PWR SRC IRQ %d: %d\n",
@@ -588,7 +588,7 @@ static __devinit int wm831x_power_probe(struct platform_device *pdev)
for (i = 0; i < ARRAY_SIZE(wm831x_bat_irqs); i++) {
irq = platform_get_irq_byname(pdev, wm831x_bat_irqs[i]);
ret = request_threaded_irq(irq, NULL, wm831x_bat_irq,
-  IRQF_TRIGGER_RISING,
+  IRQF_TRIGGER_RISING | IRQF_ONESHOT,
   wm831x_bat_irqs[i],
   power);
if (ret != 0) {
-- 
1.9.1
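
The rule quoted in the commit message can be modelled in a few lines: a request that supplies only a thread function, with no primary handler, is rejected unless the one-shot flag is set. The sketch below is a simplified user-space model of that check, not the genirq code itself, which has further conditions. All sketch_* and SKETCH_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Flag bits for this sketch only. */
#define SKETCH_TRIGGER_RISING	0x0001ul
#define SKETCH_ONESHOT		0x2000ul

typedef int (*sketch_handler_t)(int irq, void *data);

/* Dummy thread function so the check has something to point at. */
static int sketch_thread(int irq, void *data)
{
	(void)irq;
	(void)data;
	return 0;
}

/*
 * Model of the validation step: returns 0 if the request would be
 * accepted, -EINVAL otherwise.
 */
static int sketch_check_request(sketch_handler_t primary,
				sketch_handler_t thread_fn,
				unsigned long flags)
{
	if (!primary && !thread_fn)
		return -EINVAL;
	/* Threaded-only request: the line must stay masked until the thread ran. */
	if (!primary && !(flags & SKETCH_ONESHOT))
		return -EINVAL;
	return 0;
}
```

In this model the three calls changed by the patch correspond to the threaded-only case, so adding the one-shot flag is what turns the rejected request into an accepted one.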



[PATCH 3.4 008/125] ipv6: fix tunnel error handling

2016-10-12 Thread lizf
From: Michal Kubeček 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit ebac62fe3d24c0ce22dd83afa7b07d1a2aaef44d upstream.

Both tunnel6_protocol and tunnel46_protocol share the same error
handler, tunnel6_err(), which traverses through tunnel6_handlers list.
For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g.
in tunnel46_rcv(). Current code can generate an ICMPv6 error message
with an IPv4 packet embedded in it.

Fixes: 73d605d1abbd ("[IPSEC]: changing API of xfrm6_tunnel_register")
Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/tunnel6.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/tunnel6.c b/net/ipv6/tunnel6.c
index 4f3cec1..aa109da 100644
--- a/net/ipv6/tunnel6.c
+++ b/net/ipv6/tunnel6.c
@@ -145,6 +145,16 @@ static void tunnel6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
break;
 }
 
+static void tunnel46_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
+u8 type, u8 code, int offset, __be32 info)
+{
+   struct xfrm6_tunnel *handler;
+
+   for_each_tunnel_rcu(tunnel46_handlers, handler)
+   if (!handler->err_handler(skb, opt, type, code, offset, info))
+   break;
+}
+
 static const struct inet6_protocol tunnel6_protocol = {
.handler= tunnel6_rcv,
.err_handler= tunnel6_err,
@@ -153,7 +163,7 @@ static const struct inet6_protocol tunnel6_protocol = {
 
 static const struct inet6_protocol tunnel46_protocol = {
.handler= tunnel46_rcv,
-   .err_handler= tunnel6_err,
+   .err_handler= tunnel46_err,
.flags  = INET6_PROTO_NOPOLICY|INET6_PROTO_FINAL,
 };
 
-- 
1.9.1



[PATCH 3.4 014/125] HID: core: Avoid uninitialized buffer access

2016-10-12 Thread lizf
From: Richard Purdie 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 79b568b9d0c7c5d81932f4486d50b38efdd6da6d upstream.

hid_connect adds various strings to the buffer but they're all
conditional. You can find circumstances where nothing would be written
to it but the kernel will still print the supposedly empty buffer with
printk. This leads to corruption on the console/in the logs.

Ensure buf is initialized to an empty string.

Signed-off-by: Richard Purdie 
[dvhart: Initialize string to "" rather than assign buf[0] = NULL;]
Cc: Jiri Kosina 
Cc: linux-in...@vger.kernel.org
Signed-off-by: Darren Hart 
Signed-off-by: Jiri Kosina 
Signed-off-by: Zefan Li 
---
 drivers/hid/hid-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 75fa2e7..b8ad132 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1301,7 +1301,7 @@ int hid_connect(struct hid_device *hdev, unsigned int connect_mask)
"Multi-Axis Controller"
};
const char *type, *bus;
-   char buf[64];
+   char buf[64] = "";
unsigned int i;
int len;
int ret;
-- 
1.9.1
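
The failure mode described above is easy to reproduce outside the kernel: when every append to a stack buffer is conditional, the buffer needs a defined initial value for the case where no branch runs. The following user-space sketch shows the pattern. describe() and its parameters are hypothetical and only loosely follow the shape of hid_connect().

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a comma-separated description into 'out'. Every append is
 * conditional, so 'buf' may legitimately stay empty.
 * Returns the length of the resulting string.
 */
static size_t describe(char *out, size_t outlen, int has_input, int has_hidraw)
{
	char buf[64] = "";	/* without the initializer this is stack garbage */
	size_t len = 0;

	if (has_input)
		len += snprintf(buf + len, sizeof(buf) - len, "input");
	if (has_hidraw)
		len += snprintf(buf + len, sizeof(buf) - len, "%shidraw",
				len ? "," : "");

	snprintf(out, outlen, "%s", buf);
	return strlen(out);
}
```

With the initializer in place, calling describe() with both conditions false yields an empty string rather than whatever happened to be on the stack.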



[PATCH 3.4 020/125] ACPI: Use correct IRQ when uninstalling ACPI interrupt handler

2016-10-12 Thread lizf
From: Chen Yu 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 49e4b84333f338d4f183f28f1f3c1131b9fb2b5a upstream.

Currently when the system is trying to uninstall the ACPI interrupt
handler, it uses acpi_gbl_FADT.sci_interrupt as the IRQ number.
However, the IRQ number that the ACPI interrupt handler is installed
for comes from acpi_gsi_to_irq() and that is the number that should
be used for the handler removal.

Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
as appropriate.

Acked-by: Lv Zheng 
Signed-off-by: Chen Yu 
Signed-off-by: Rafael J. Wysocki 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 drivers/acpi/osl.c   | 9 ++++++---
 include/linux/acpi.h | 6 ++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index f48720c..2788c15 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -85,6 +85,7 @@ static void *acpi_irq_context;
 static struct workqueue_struct *kacpid_wq;
 static struct workqueue_struct *kacpi_notify_wq;
 struct workqueue_struct *kacpi_hotplug_wq;
+unsigned int acpi_sci_irq = INVALID_ACPI_IRQ;
 EXPORT_SYMBOL(kacpi_hotplug_wq);
 
 /*
@@ -612,17 +613,19 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
acpi_irq_handler = NULL;
return AE_NOT_ACQUIRED;
}
+   acpi_sci_irq = irq;
 
return AE_OK;
 }
 
-acpi_status acpi_os_remove_interrupt_handler(u32 irq, acpi_osd_handler handler)
+acpi_status acpi_os_remove_interrupt_handler(u32 gsi, acpi_osd_handler handler)
 {
-   if (irq != acpi_gbl_FADT.sci_interrupt)
+   if (gsi != acpi_gbl_FADT.sci_interrupt || !acpi_sci_irq_valid())
return AE_BAD_PARAMETER;
 
-   free_irq(irq, acpi_irq);
+   free_irq(acpi_sci_irq, acpi_irq);
acpi_irq_handler = NULL;
+   acpi_sci_irq = INVALID_ACPI_IRQ;
 
return AE_OK;
 }
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index f421dd8..668351a 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -110,6 +110,12 @@ int acpi_unregister_ioapic(acpi_handle handle, u32 gsi_base);
 void acpi_irq_stats_init(void);
 extern u32 acpi_irq_handled;
 extern u32 acpi_irq_not_handled;
+extern unsigned int acpi_sci_irq;
+#define INVALID_ACPI_IRQ   ((unsigned)-1)
+static inline bool acpi_sci_irq_valid(void)
+{
+   return acpi_sci_irq != INVALID_ACPI_IRQ;
+}
 
 extern int sbf_port;
 extern unsigned long acpi_realmode_flags;
-- 
1.9.1
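
The idea behind the fix is to remember the mapped IRQ at install time and to free exactly that number at removal time, with a sentinel marking "nothing installed". The user-space sketch below models this bookkeeping only. All sketch_* names are hypothetical, and sketch_gsi_to_irq() uses an arbitrary fixed offset purely to make the GSI and the IRQ differ.

```c
#include <stdbool.h>

#define SKETCH_INVALID_IRQ	((unsigned int)-1)

/* Mapped IRQ remembered from install time; sentinel when none. */
static unsigned int sketch_sci_irq = SKETCH_INVALID_IRQ;
/* Records which number the removal path handed to "free". */
static unsigned int sketch_freed_irq = SKETCH_INVALID_IRQ;

/* Stand-in for the GSI-to-IRQ mapping; the +16 offset is an assumption. */
static unsigned int sketch_gsi_to_irq(unsigned int gsi)
{
	return gsi + 16;
}

static void sketch_install(unsigned int gsi)
{
	sketch_sci_irq = sketch_gsi_to_irq(gsi);
}

/*
 * Remove the handler for 'gsi'. Fails if the GSI is not the expected
 * one or if nothing is installed; otherwise frees the stored IRQ.
 */
static bool sketch_remove(unsigned int gsi, unsigned int expected_gsi)
{
	if (gsi != expected_gsi || sketch_sci_irq == SKETCH_INVALID_IRQ)
		return false;

	sketch_freed_irq = sketch_sci_irq;	/* not 'gsi' */
	sketch_sci_irq = SKETCH_INVALID_IRQ;
	return true;
}
```

Resetting the stored value to the sentinel after removal also makes a second removal attempt fail cleanly instead of freeing the same IRQ twice.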



[PATCH 3.4 010/125] net: fix a race in dst_release()

2016-10-12 Thread lizf
From: Eric Dumazet 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit d69bbf88c8d0b367cf3e3a052f6daadf630ee566 upstream.

Only the CPU that sees the dst refcount drop to 0 can safely
dereference dst->flags.

Otherwise another CPU might already have freed the dst.

Fixes: 27b75c95f10d ("net: avoid RCU for NOCACHE dst")
Reported-by: Greg Thelen 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 net/core/dst.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dst.c b/net/core/dst.c
index 43d94ce..54ba1eb 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -272,7 +272,7 @@ void dst_release(struct dst_entry *dst)
 
newrefcnt = atomic_dec_return(&dst->__refcnt);
WARN_ON(newrefcnt < 0);
-   if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) {
+   if (!newrefcnt && unlikely(dst->flags & DST_NOCACHE)) {
dst = dst_destroy(dst);
if (dst)
__dst_free(dst);
-- 
1.9.1
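
The ordering fix above can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming C11 atomics; struct obj, OBJ_NOCACHE, obj_alloc() and obj_release() are hypothetical names that only stand in for the dst code and are not part of the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define OBJ_NOCACHE 0x1u /* hypothetical flag, stands in for DST_NOCACHE */

struct obj {
	atomic_int refcnt;
	unsigned int flags;
};

/* Allocate an object holding 'refs' references (illustration only). */
static struct obj *obj_alloc(int refs, unsigned int flags)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->refcnt, refs);
	o->flags = flags;
	return o;
}

/* Drop one reference; returns true if this caller freed the object. */
static bool obj_release(struct obj *o)
{
	/* atomic_fetch_sub() returns the old value, so subtract one more. */
	int newrefcnt = atomic_fetch_sub(&o->refcnt, 1) - 1;

	/*
	 * Test the count first: && short-circuits, so o->flags is read
	 * only by the caller that saw the count reach zero. Any other
	 * caller must not touch *o, since it may already be freed.
	 */
	if (!newrefcnt && (o->flags & OBJ_NOCACHE)) {
		free(o);
		return true;
	}
	return false;
}
```

With two references held, the first obj_release() returns false without reading the flags, and the second one frees the object.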



[PATCH 3.4 022/125] megaraid_sas: Do not use PAGE_SIZE for max_sectors

2016-10-12 Thread lizf
From: "sumit.sax...@avagotech.com" 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 357ae967ad66e357f78b5cfb5ab6ca07fb4a7758 upstream.

Do not use the PAGE_SIZE macro to calculate max_sectors per I/O
request. The driver code assumes PAGE_SIZE is always 4096, which can
lead to a wrongly calculated value if PAGE_SIZE is not 4096. This issue
was reported in Ubuntu Bugzilla Bug #1475166.

Signed-off-by: Sumit Saxena 
Signed-off-by: Kashyap Desai 
Reviewed-by: Tomas Henzl 
Reviewed-by: Martin K. Petersen 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Zefan Li 
---
 drivers/scsi/megaraid/megaraid_sas.h  | 2 ++
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index 1a7955a..0eaf196 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -300,6 +300,8 @@ enum MR_EVT_ARGS {
MR_EVT_ARGS_GENERIC,
 };
 
+
+#define SGE_BUFFER_SIZE 4096
 /*
  * define constants for device list query options
  */
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index bacd344..a74dc74 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3582,7 +3582,7 @@ static int megasas_init_fw(struct megasas_instance *instance)
}
 
instance->max_sectors_per_req = instance->max_num_sge *
-   PAGE_SIZE / 512;
+   SGE_BUFFER_SIZE / 512;
if (tmp_sectors && (instance->max_sectors_per_req > tmp_sectors))
instance->max_sectors_per_req = tmp_sectors;
 
-- 
1.9.1
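
To see why the page size matters here, the calculation can be worked through in isolation. This is only a sketch: max_sectors_per_req() below is a hypothetical helper mirroring the expression in the diff, and the SGE count of 70 in the note afterwards is an assumed value, not one taken from the driver.

```c
#include <stdint.h>

#define SGE_BUFFER_SIZE 4096u /* fixed buffer size per SGE, as in the patch */
#define SECTOR_SIZE     512u

/*
 * Mirrors "max_num_sge * <bytes per SGE> / 512" from the diff.
 * bytes_per_sge is a parameter here so both variants can be compared.
 */
static uint32_t max_sectors_per_req(uint32_t max_num_sge, uint32_t bytes_per_sge)
{
	return max_num_sge * bytes_per_sge / SECTOR_SIZE;
}
```

With an assumed 70 SGEs, the fixed 4096-byte size gives 560 sectors, while a 64 KiB PAGE_SIZE would give 8960, sixteen times more than intended.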



[PATCH 3.4 013/125] FS-Cache: Handle a write to the page immediately beyond the EOF marker

2016-10-12 Thread lizf
From: David Howells 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 102f4d900c9c8f5ed89ae4746d493fe3ebd7ba64 upstream.

Handle a write being requested to the page immediately beyond the EOF
marker on a cache object.  Currently this gets an assertion failure in
CacheFiles because the EOF marker is used there to encode information about
a partial page at the EOF - which could lead to an unknown blank spot in
the file if we extend the file over it.

The problem is actually in fscache where we check the index of the page
being written against store_limit.  store_limit is set to the number of
pages that we're allowed to store by fscache_set_store_limit() - which
means it's one more than the index of the last page we're allowed to store.
The problem is that we permit writing to a page with an index _equal_ to
the store limit - when we should reject that case.

Whilst we're at it, change the triggered assertion in CacheFiles to just
return -ENOBUFS instead.

The assertion failure looks something like this:

CacheFiles: Assertion failed
1000 < 7b1 is false
------------[ cut here ]------------
kernel BUG at fs/cachefiles/rdwr.c:962!
...
RIP: 0010:[]  [] cachefiles_write_page+0x273/0x2d0 [cachefiles]

Signed-off-by: David Howells 
Signed-off-by: Al Viro 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 fs/cachefiles/rdwr.c | 78 +---
 fs/fscache/page.c|  2 +-
 2 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index b4d2438..00d9425 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -914,6 +914,15 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page)
cache = container_of(object->fscache.cache,
 struct cachefiles_cache, cache);
 
+   pos = (loff_t)page->index << PAGE_SHIFT;
+
+   /* We mustn't write more data than we have, so we have to beware of a
+* partial page at EOF.
+*/
+   eof = object->fscache.store_limit_l;
+   if (pos >= eof)
+   goto error;
+
/* write the page to the backing filesystem and let it store it in its
 * own time */
dget(object->backer);
@@ -922,47 +931,46 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page)
   cache->cache_cred);
if (IS_ERR(file)) {
ret = PTR_ERR(file);
-   } else {
+   goto error_2;
+   }
+   if (!file->f_op->write) {
ret = -EIO;
-   if (file->f_op->write) {
-   pos = (loff_t) page->index << PAGE_SHIFT;
-
-   /* we mustn't write more data than we have, so we have
-* to beware of a partial page at EOF */
-   eof = object->fscache.store_limit_l;
-   len = PAGE_SIZE;
-   if (eof & ~PAGE_MASK) {
-   ASSERTCMP(pos, <, eof);
-   if (eof - pos < PAGE_SIZE) {
-   _debug("cut short %llx to %llx",
-  pos, eof);
-   len = eof - pos;
-   ASSERTCMP(pos + len, ==, eof);
-   }
-   }
+   goto error_2;
+   }
 
-   data = kmap(page);
-   old_fs = get_fs();
-   set_fs(KERNEL_DS);
-   ret = file->f_op->write(
-   file, (const void __user *) data, len, );
-   set_fs(old_fs);
-   kunmap(page);
-   if (ret != len)
-   ret = -EIO;
+   len = PAGE_SIZE;
+   if (eof & ~PAGE_MASK) {
+   if (eof - pos < PAGE_SIZE) {
+   _debug("cut short %llx to %llx",
+  pos, eof);
+   len = eof - pos;
+   ASSERTCMP(pos + len, ==, eof);
}
-   fput(file);
}
 
-   if (ret < 0) {
-   if (ret == -EIO)
-   cachefiles_io_error_obj(
-   object, "Write page to backing file failed");
-   ret = -ENOBUFS;
-   }
+   data = kmap(page);
+   old_fs = get_fs();
+   set_fs(KERNEL_DS);
+   ret = file->f_op->write(
+   file, (const void __user *) data, len, &pos);
+   set_fs(old_fs);
+   kunmap(page);
+   fput(file);
+   if (ret != len)
+   goto error_eio;
+
+   _leave(" = 0");
+   return 0;
 
-   _leave(" = %d", ret);
-   return ret;
+error_eio:
+   ret = -EIO;
+error_2:
+   if (ret == -EIO)
+   
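
The off-by-one described in the FS-Cache commit message can be sketched with plain integers. This is an illustration under assumptions: a 4 KiB page (shift of 12), and hypothetical helpers store_limit_pages() and may_store_page() that only mimic the described behaviour of fscache_set_store_limit() and the index check.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ull << SKETCH_PAGE_SHIFT)

/* Number of pages that may be stored for an object of i_size bytes. */
static uint64_t store_limit_pages(uint64_t i_size)
{
	/* Round up so that a partial page at EOF still counts. */
	return (i_size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * The limit is a page count, i.e. one more than the last valid index,
 * so an index equal to the limit must be rejected.
 */
static bool may_store_page(uint64_t index, uint64_t store_limit)
{
	return index < store_limit;
}
```

For the values in the assertion message, an EOF of 0x7b1 gives a limit of one page, so page index 1 (position 0x1000) has to be refused rather than passed on to CacheFiles.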

[PATCH 3.4 014/125] HID: core: Avoid uninitialized buffer access

2016-10-12 Thread lizf
From: Richard Purdie 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 79b568b9d0c7c5d81932f4486d50b38efdd6da6d upstream.

hid_connect adds various strings to the buffer but they're all
conditional. You can find circumstances where nothing would be written
to it but the kernel will still print the supposedly empty buffer with
printk. This leads to corruption on the console/in the logs.

Ensure buf is initialized to an empty string.

Signed-off-by: Richard Purdie 
[dvhart: Initialize string to "" rather than assign buf[0] = NULL;]
Cc: Jiri Kosina 
Cc: linux-in...@vger.kernel.org
Signed-off-by: Darren Hart 
Signed-off-by: Jiri Kosina 
Signed-off-by: Zefan Li 
---
 drivers/hid/hid-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 75fa2e7..b8ad132 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1301,7 +1301,7 @@ int hid_connect(struct hid_device *hdev, unsigned int connect_mask)
"Multi-Axis Controller"
};
const char *type, *bus;
-   char buf[64];
+   char buf[64] = "";
unsigned int i;
int len;
int ret;
-- 
1.9.1
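
The effect of the initializer can be shown with a small userspace sketch. describe() and the CLAIMED_* flags below are hypothetical stand-ins for the conditional string building in hid_connect(), not the driver's actual code.

```c
#include <stdio.h>
#include <string.h>

#define CLAIMED_INPUT  0x1u /* hypothetical flags for illustration */
#define CLAIMED_HIDDEV 0x2u

/* Build a comma-separated description; returns the resulting length. */
static size_t describe(unsigned int claimed, char *out, size_t outlen)
{
	char buf[64] = ""; /* empty string even if no branch below runs */
	int len = 0;

	if (claimed & CLAIMED_INPUT)
		len += snprintf(buf + len, sizeof(buf) - (size_t)len, "input");
	if (claimed & CLAIMED_HIDDEV)
		len += snprintf(buf + len, sizeof(buf) - (size_t)len,
				"%shiddev", len ? "," : "");

	/* Without the initializer this would read indeterminate bytes. */
	snprintf(out, outlen, "%s", buf);
	return strlen(out);
}
```

When no flag is set, every branch is skipped and the output is a well-defined empty string instead of whatever happened to be on the stack.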



[PATCH 3.4 020/125] ACPI: Use correct IRQ when uninstalling ACPI interrupt handler

2016-10-12 Thread lizf
From: Chen Yu 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 49e4b84333f338d4f183f28f1f3c1131b9fb2b5a upstream.

Currently, when the system uninstalls the ACPI interrupt handler, it
uses acpi_gbl_FADT.sci_interrupt as the IRQ number.  However, the IRQ
number that the ACPI interrupt handler is installed for comes from
acpi_gsi_to_irq(), and that is the number that should be used for the
handler removal.

Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
as appropriate.

Acked-by: Lv Zheng 
Signed-off-by: Chen Yu 
Signed-off-by: Rafael J. Wysocki 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 drivers/acpi/osl.c   | 9 ++---
 include/linux/acpi.h | 6 ++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index f48720c..2788c15 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -85,6 +85,7 @@ static void *acpi_irq_context;
 static struct workqueue_struct *kacpid_wq;
 static struct workqueue_struct *kacpi_notify_wq;
 struct workqueue_struct *kacpi_hotplug_wq;
+unsigned int acpi_sci_irq = INVALID_ACPI_IRQ;
 EXPORT_SYMBOL(kacpi_hotplug_wq);
 
 /*
@@ -612,17 +613,19 @@ acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
acpi_irq_handler = NULL;
return AE_NOT_ACQUIRED;
}
+   acpi_sci_irq = irq;
 
return AE_OK;
 }
 
-acpi_status acpi_os_remove_interrupt_handler(u32 irq, acpi_osd_handler handler)
+acpi_status acpi_os_remove_interrupt_handler(u32 gsi, acpi_osd_handler handler)
 {
-   if (irq != acpi_gbl_FADT.sci_interrupt)
+   if (gsi != acpi_gbl_FADT.sci_interrupt || !acpi_sci_irq_valid())
return AE_BAD_PARAMETER;
 
-   free_irq(irq, acpi_irq);
+   free_irq(acpi_sci_irq, acpi_irq);
acpi_irq_handler = NULL;
+   acpi_sci_irq = INVALID_ACPI_IRQ;
 
return AE_OK;
 }
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index f421dd8..668351a 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -110,6 +110,12 @@ int acpi_unregister_ioapic(acpi_handle handle, u32 gsi_base);
 void acpi_irq_stats_init(void);
 extern u32 acpi_irq_handled;
 extern u32 acpi_irq_not_handled;
+extern unsigned int acpi_sci_irq;
+#define INVALID_ACPI_IRQ   ((unsigned)-1)
+static inline bool acpi_sci_irq_valid(void)
+{
+   return acpi_sci_irq != INVALID_ACPI_IRQ;
+}
 
 extern int sbf_port;
 extern unsigned long acpi_realmode_flags;
-- 
1.9.1
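
The install/remove pairing in this patch can be sketched in userspace. Everything below is hypothetical: gsi_to_irq() uses an arbitrary "+16" mapping purely so that the GSI and the IRQ differ, and last_freed_irq records what a real free_irq() call would have received.

```c
#include <stdbool.h>

#define INVALID_IRQ ((unsigned int)-1)

static unsigned int sci_irq = INVALID_IRQ;        /* mapped IRQ, saved at install */
static unsigned int last_freed_irq = INVALID_IRQ; /* what "free_irq" was given */

/* Hypothetical mapping standing in for acpi_gsi_to_irq(). */
static unsigned int gsi_to_irq(unsigned int gsi)
{
	return gsi + 16;
}

static bool install_handler(unsigned int gsi)
{
	/* Remember the mapped IRQ so removal can use the same number. */
	sci_irq = gsi_to_irq(gsi);
	return true;
}

static bool remove_handler(unsigned int gsi, unsigned int sci_gsi)
{
	/* Reject the wrong GSI, or a removal with nothing installed. */
	if (gsi != sci_gsi || sci_irq == INVALID_IRQ)
		return false;

	last_freed_irq = sci_irq; /* free the mapped IRQ, not the GSI */
	sci_irq = INVALID_IRQ;
	return true;
}
```

Installing for GSI 9 and then removing it frees IRQ 25 under this made-up mapping, and a second removal is refused because the saved IRQ has been invalidated.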



[PATCH 3.4 021/125] ALSA: hda - Disable 64bit address for Creative HDA controllers

2016-10-12 Thread lizf
From: Takashi Iwai 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit cadd16ea33a938d49aee99edd4758cc76048b399 upstream.

We've had many reports that some Creative sound cards with CA0132
don't work well.  Some reported that the card starts working after
reloading the module, while others reported that it starts working when
a 32bit kernel is used.  All these facts seem to imply that the chip
fails to communicate when the buffer is located at a 64bit address.

This patch addresses these issues by just adding the AZX_DCAPS_NO_64BIT
flag to the corresponding PCI entries.  I casually had a chance to
test an SB Recon3D board, and indeed this seems to help.

Although this hasn't been tested on all Creative devices, it's safer
to assume that this restriction applies to the rest of them, too.  So
the flag is applied to all Creative entries.

Signed-off-by: Takashi Iwai 
[lizf: Backported to 3.4: drop the change to macro AZX_DCAPS_PRESET_CTHDA]
Signed-off-by: Zefan Li 
---
 sound/pci/hda/hda_intel.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index f461737..833d835 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -3144,11 +3144,13 @@ static DEFINE_PCI_DEVICE_TABLE(azx_ids) = {
  .class = PCI_CLASS_MULTIMEDIA_HD_AUDIO << 8,
  .class_mask = 0xff,
  .driver_data = AZX_DRIVER_CTX | AZX_DCAPS_CTX_WORKAROUND |
+ AZX_DCAPS_NO_64BIT |
  AZX_DCAPS_RIRB_PRE_DELAY | AZX_DCAPS_POSFIX_LPIB },
 #else
/* this entry seems still valid -- i.e. without emu20kx chip */
{ PCI_DEVICE(0x1102, 0x0009),
  .driver_data = AZX_DRIVER_CTX | AZX_DCAPS_CTX_WORKAROUND |
+ AZX_DCAPS_NO_64BIT |
  AZX_DCAPS_RIRB_PRE_DELAY | AZX_DCAPS_POSFIX_LPIB },
 #endif
/* Vortex86MX */
-- 
1.9.1



[PATCH 3.4 023/125] Revert "dm mpath: fix stalls when handling invalid ioctls"

2016-10-12 Thread lizf
From: Mauricio Faria de Oliveira 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 47796938c46b943d157ac8a6f9ed4e3b98b83cf4 upstream.

This reverts commit a1989b330093578ea5470bea0a00f940c444c466.

That commit introduced a regression at least for the case of the SG_IO ioctl()
running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
than blocking due to queue_if_no_path until a path becomes active, for example.

That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "" [2])
from multipath devices; which leads to SCSI/filesystem errors in such a guest.

More general scenarios can hit that regression too. The following demonstration
employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
(some output & user changes omitted for brevity and comments added for clarity).

Reverting that commit restores normal operation (queueing) in failing scenarios;
tested on linux-next (next-20151022).

1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)

$ cat sg_simple0.c
... see [3] ...
$ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
$ gcc sgio_inquiry.c -o sgio_inquiry

2) The ioctl() works fine with active paths present.

# multipath -l 85ag56
85ag56 (...) dm-19 IBM ,2145
size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 8:0:11:0  sdz  65:144  active undef running
| `- 9:0:9:0   sdbf 67:144  active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 8:0:12:0  sdae 65:224  active undef running
  `- 9:0:12:0  sdbo 68:32   active undef running

$ ./sgio_inquiry /dev/mapper/85ag56
Some of the INQUIRY command's response:
IBM   2145  
INQUIRY duration=0 millisecs, resid=0

3) The ioctl() fails with ENOTTY errno with _no_ active paths present,
   for unprivileged users (rather than blocking due to queue_if_no_path).

# for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \
  do multipathd -k"fail path $path"; done

# multipath -l 85ag56
85ag56 (...) dm-19 IBM ,2145
size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 8:0:11:0  sdz  65:144  failed undef running
| `- 9:0:9:0   sdbf 67:144  failed undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 8:0:12:0  sdae 65:224  failed undef running
  `- 9:0:12:0  sdbo 68:32   failed undef running

$ ./sgio_inquiry /dev/mapper/85ag56
sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device

4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
   it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().

$ dmesg
<...>
[] device-mapper: multipath: Failing path 65:144.
[] device-mapper: multipath: Failing path 67:144.
[] device-mapper: multipath: Failing path 65:224.
[] device-mapper: multipath: Failing path 68:32.
[] sgio_inquiry: sending ioctl 2285 to a partition!

5) The ioctl() only works if the CAP_SYS_RAWIO capability is present
   (then queueing happens -- in this example, queue_if_no_path is set);
   this is due to a conditional check in scsi_verify_blk_ioctl().

# capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56'
sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device

# ./sgio_inquiry /dev/mapper/85ag56 &
[1] 72830

# cat /proc/72830/stack
[] 0xc0171c0df700
[] __switch_to+0x204/0x350
[] msleep+0x5c/0x80
[] dm_blk_ioctl+0x70/0x170
[] blkdev_ioctl+0x2b0/0x9b0
[] block_ioctl+0x64/0xd0
[] do_vfs_ioctl+0x490/0x780
[] SyS_ioctl+0xd4/0xf0
[] system_call+0x38/0xd0

6) This is the function call chain exercised in this analysis:

SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c
-> do_vfs_ioctl()
-> vfs_ioctl()
...
error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
...
-> dm_blk_ioctl() @ drivers/md/dm.c
-> multipath_ioctl() @ drivers/md/dm-mpath.c
...
(bdev = NULL, due to no active paths)
...
if (!bdev || <...>) {
int err = scsi_verify_blk_ioctl(NULL, cmd);
if (err)
r = err;
}
...
-> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c
...
if (bd && bd == bd->bd_contains) // not taken 
(bd = NULL)
return 

[PATCH 3.4 025/125] megaraid_sas : SMAP restriction--do not access user memory from IOCTL code

2016-10-12 Thread lizf
From: "sumit.sax...@avagotech.com" 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 323c4a02c631d00851d8edc4213c4d184ef83647 upstream.

This is an issue on SMAP-enabled CPUs when 32-bit applications run on a
64-bit OS. Do not access user memory directly from kernel code; SMAP
restricts access to user memory from kernel code.

Signed-off-by: Sumit Saxena 
Signed-off-by: Kashyap Desai 
Reviewed-by: Tomas Henzl 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Zefan Li 
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index a74dc74..2e46060 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5014,6 +5014,9 @@ static int megasas_mgmt_compat_ioctl_fw(struct file *file, unsigned long arg)
int i;
int error = 0;
compat_uptr_t ptr;
+   unsigned long local_raw_ptr;
+   u32 local_sense_off;
+   u32 local_sense_len;
 
if (clear_user(ioc, sizeof(*ioc)))
return -EFAULT;
@@ -5031,9 +5034,15 @@ static int megasas_mgmt_compat_ioctl_fw(struct file *file, unsigned long arg)
 * sense_len is not null, so prepare the 64bit value under
 * the same condition.
 */
-   if (ioc->sense_len) {
+   if (get_user(local_raw_ptr, ioc->frame.raw) ||
+   get_user(local_sense_off, &ioc->sense_off) ||
+   get_user(local_sense_len, &ioc->sense_len))
+   return -EFAULT;
+
+
+   if (local_sense_len) {
void __user **sense_ioc_ptr =
-   (void __user **)(ioc->frame.raw + ioc->sense_off);
+   (void __user **)((u8*)local_raw_ptr + local_sense_off);
compat_uptr_t *sense_cioc_ptr =
(compat_uptr_t *)(cioc->frame.raw + cioc->sense_off);
if (get_user(ptr, sense_cioc_ptr) ||
-- 
1.9.1
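
The pattern in this patch, fetching each user-space field into a local variable through an accessor before using it, can be sketched in userspace. This is only an analogy: fetch_u32() is a hypothetical stand-in for get_user(), and struct ioc_sketch is a made-up two-field structure, not the driver's megasas_iocpacket.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the user-space ioctl packet. */
struct ioc_sketch {
	uint32_t sense_off;
	uint32_t sense_len;
};

/* Stand-in for get_user(): 0 on success, nonzero if the source is bad. */
static int fetch_u32(uint32_t *dst, const uint32_t *user_src)
{
	if (!user_src)
		return -1;
	memcpy(dst, user_src, sizeof(*dst));
	return 0;
}

/* Copy both fields into locals first; only the locals are used afterwards. */
static int read_sense(const struct ioc_sketch *user_ioc,
		      uint32_t *off, uint32_t *len)
{
	uint32_t local_sense_off;
	uint32_t local_sense_len;

	if (!user_ioc)
		return -EFAULT;
	if (fetch_u32(&local_sense_off, &user_ioc->sense_off) ||
	    fetch_u32(&local_sense_len, &user_ioc->sense_len))
		return -EFAULT;

	*off = local_sense_off;
	*len = local_sense_len;
	return 0;
}
```

The design point is that every read of the user structure goes through the accessor, so the rest of the function never dereferences the user pointer directly.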



[PATCH 3.4 009/125] scsi: restart list search after unlock in scsi_remove_target

2016-10-12 Thread lizf
From: Christoph Hellwig 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 40998193560dab6c3ce8d25f4fa58a23e252ef38 upstream.

When dropping a lock while iterating a list we must restart the search
as other threads could have manipulated the list under us.  Without this
we can get stuck in an endless loop.  This bug was introduced by

commit bc3f02a795d3b4faa99d37390174be2a75d091bd
Author: Dan Williams 
Date:   Tue Aug 28 22:12:10 2012 -0700

[SCSI] scsi_remove_target: fix softlockup regression on hot remove

Which was itself trying to fix a reported soft lockup issue

http://thread.gmane.org/gmane.linux.kernel/1348679

However, we believe even with this revert of the original patch, the soft
lockup problem has been fixed by

commit f2495e228fce9f9cec84367547813cbb0d6db15a
Author: James Bottomley 
Date:   Tue Jan 21 07:01:41 2014 -0800

[SCSI] dual scan thread bug fix

Thanks go to Dan Williams for tracking all this
prior history down.

Reported-by: Johannes Thumshirn 
Signed-off-by: Christoph Hellwig 
Tested-by: Johannes Thumshirn 
Reviewed-by: Johannes Thumshirn 
Fixes: bc3f02a795d3b4faa99d37390174be2a75d091bd
Signed-off-by: James Bottomley 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 drivers/scsi/scsi_sysfs.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 72ca515..05c99af 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1020,31 +1020,23 @@ static void __scsi_remove_target(struct scsi_target *starget)
 void scsi_remove_target(struct device *dev)
 {
struct Scsi_Host *shost = dev_to_shost(dev->parent);
-   struct scsi_target *starget, *last = NULL;
+   struct scsi_target *starget;
unsigned long flags;
 
-   /* remove targets being careful to lookup next entry before
-* deleting the last
-*/
+restart:
spin_lock_irqsave(shost->host_lock, flags);
list_for_each_entry(starget, &shost->__targets, siblings) {
if (starget->state == STARGET_DEL)
continue;
if (starget->dev.parent == dev || &starget->dev == dev) {
-   /* assuming new targets arrive at the end */
starget->reap_ref++;
spin_unlock_irqrestore(shost->host_lock, flags);
-   if (last)
-   scsi_target_reap(last);
-   last = starget;
__scsi_remove_target(starget);
-   spin_lock_irqsave(shost->host_lock, flags);
+   scsi_target_reap(starget);
+   goto restart;
}
}
spin_unlock_irqrestore(shost->host_lock, flags);
-
-   if (last)
-   scsi_target_reap(last);
 }
 EXPORT_SYMBOL(scsi_remove_target);
 
-- 
1.9.1
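
Note on the pattern in the patch above: once the lock is dropped inside the
loop, the iteration cursor can no longer be trusted, so the walk starts over
from the head.  The sketch below models that in user space under loud
assumptions: the list is a plain singly linked list, list_locked is a toy
flag standing in for shost->host_lock, and remove_unlocked() stands in for
the work done without the lock.  None of these names come from the kernel.

```c
#include <stddef.h>

struct node {
	int parent;		/* stands in for starget->dev.parent */
	int deleted;		/* stands in for STARGET_DEL */
	struct node *next;
};

static int list_locked;		/* toy stand-in for shost->host_lock */

/* Work that has to run unlocked.  In the real code other threads may
 * change the list during this window. */
static void remove_unlocked(struct node *n)
{
	n->deleted = 1;
}

/* Mark every live node with a matching parent as deleted, restarting
 * from the head each time the lock was dropped.  Returns the number of
 * nodes removed.  Each restart retires one node, so the loop ends. */
static int remove_matching(struct node *head, int parent)
{
	struct node *n;
	int removed = 0;

restart:
	list_locked = 1;
	for (n = head; n; n = n->next) {
		if (n->deleted)
			continue;
		if (n->parent == parent) {
			list_locked = 0;	/* lock dropped here ... */
			remove_unlocked(n);
			removed++;
			goto restart;		/* ... so the cursor is stale */
		}
	}
	list_locked = 0;
	return removed;
}
```

The cost is a rescan per removed entry, which is acceptable when removals
are rare and the list is short, as is assumed here.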



[PATCH 3.4 024/125] crypto: algif_hash - Only export and import on sockets with data

2016-10-12 Thread lizf
From: Herbert Xu 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 4afa5f9617927453ac04b24b584f6c718dfb4f45 upstream.

The hash_accept call fails to work on sockets that have not received
any data.  For some algorithm implementations it may cause crashes.

This patch fixes this by ensuring that we only export and import on
sockets that have received data.

Reported-by: Harsh Jain 
Signed-off-by: Herbert Xu 
Tested-by: Stephan Mueller 
Signed-off-by: Zefan Li 
---
 crypto/algif_hash.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 8502462..a68b56a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -192,9 +192,14 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags)
struct sock *sk2;
struct alg_sock *ask2;
struct hash_ctx *ctx2;
+   bool more;
int err;
 
-   err = crypto_ahash_export(req, state);
+   lock_sock(sk);
+   more = ctx->more;
+   err = more ? crypto_ahash_export(req, state) : 0;
+   release_sock(sk);
+
if (err)
return err;
 
@@ -205,7 +210,10 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags)
sk2 = newsock->sk;
ask2 = alg_sk(sk2);
ctx2 = ask2->private;
-   ctx2->more = 1;
+   ctx2->more = more;
+
+   if (!more)
+   return err;
 
err = crypto_ahash_import(&ctx2->req, state);
if (err) {
-- 
1.9.1
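
Note on the pattern in the patch above: the state is exported from the
parent and imported into the child only when the parent has actually
received data.  The following user-space sketch shows the same guard on a
toy hash.  All toy_* names and the arithmetic are made up for illustration;
the only parts taken from the patch are the "more" flag and the order of
the checks.

```c
#include <stdbool.h>

struct toy_hash {
	bool more;		/* true once data has been fed in */
	bool has_state;		/* true once internal state exists */
	unsigned int state;
};

static void toy_update(struct toy_hash *h, unsigned int word)
{
	if (!h->has_state) {
		h->state = 17;
		h->has_state = true;
	}
	h->state = h->state * 31 + word;
	h->more = true;
}

/* Export only makes sense once state exists; fail otherwise. */
static int toy_export(const struct toy_hash *h, unsigned int *out)
{
	if (!h->has_state)
		return -1;
	*out = h->state;
	return 0;
}

static int toy_import(struct toy_hash *h, unsigned int in)
{
	h->state = in;
	h->has_state = true;
	return 0;
}

/* Clone in the spirit of the patched hash_accept(): skip both export
 * and import when the parent has not received any data. */
static int toy_accept(const struct toy_hash *parent, struct toy_hash *child)
{
	unsigned int saved = 0;
	bool more = parent->more;
	int err = more ? toy_export(parent, &saved) : 0;

	if (err)
		return err;

	child->more = more;
	child->has_state = false;
	child->state = 0;
	if (!more)
		return 0;

	return toy_import(child, saved);
}
```

Without the guard, toy_accept() on a fresh parent would fail in
toy_export(); with it, the child simply starts out empty.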



[PATCH 3.4 017/125] ext4, jbd2: ensure entering into panic after recording an error in superblock

2016-10-12 Thread lizf
From: Daeho Jeong 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 4327ba52afd03fc4b5afa0ee1d774c9c5b0e85c5 upstream.

If an ext4 filesystem uses JBD2 journaling and an error occurs, the
journal is aborted first, the error number is then recorded in the JBD2
superblock and, finally, the system panics if the "errors=panic" mount
option is set.  In rare cases, however, this sequence is interleaved as
in the figure below, and the system panics (which means a system reset
in a mobile environment) before the error has been recorded in the
journal superblock.  In that case e2fsck cannot tell that the filesystem
failed in the previous run, and the corruption is not fixed.

Task A                        Task B
ext4_handle_error()
-> jbd2_journal_abort()
  -> __journal_abort_soft()
    -> __jbd2_journal_abort_hard()
    | -> journal->j_flags |= JBD2_ABORT;
    |
    |                         __ext4_abort()
    |                         -> jbd2_journal_abort()
    |                         | -> __journal_abort_soft()
    |                         |   -> if (journal->j_flags & JBD2_ABORT)
    |                         |           return;
    |                         -> panic()
    |
    -> jbd2_journal_update_sb_errno()

Tested-by: Hobin Woo 
Signed-off-by: Daeho Jeong 
Signed-off-by: Theodore Ts'o 
Signed-off-by: Zefan Li 
---
 fs/ext4/super.c  | 12 ++--
 fs/jbd2/journal.c|  6 +-
 include/linux/jbd2.h |  1 +
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3de888c3..5862518 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -483,9 +483,13 @@ static void ext4_handle_error(struct super_block *sb)
ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only");
sb->s_flags |= MS_RDONLY;
}
-   if (test_opt(sb, ERRORS_PANIC))
+   if (test_opt(sb, ERRORS_PANIC)) {
+   if (EXT4_SB(sb)->s_journal &&
+ !(EXT4_SB(sb)->s_journal->j_flags & JBD2_REC_ERR))
+   return;
panic("EXT4-fs (device %s): panic forced after error\n",
sb->s_id);
+   }
 }
 
 void __ext4_error(struct super_block *sb, const char *function,
@@ -659,8 +663,12 @@ void __ext4_abort(struct super_block *sb, const char *function,
jbd2_journal_abort(EXT4_SB(sb)->s_journal, -EIO);
save_error_info(sb, function, line);
}
-   if (test_opt(sb, ERRORS_PANIC))
+   if (test_opt(sb, ERRORS_PANIC)) {
+   if (EXT4_SB(sb)->s_journal &&
+ !(EXT4_SB(sb)->s_journal->j_flags & JBD2_REC_ERR))
+   return;
panic("EXT4-fs panic from previous error\n");
+   }
 }
 
 void ext4_msg(struct super_block *sb, const char *prefix, const char *fmt, ...)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index a327944..2e3063c 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1921,8 +1921,12 @@ static void __journal_abort_soft (journal_t *journal, int errno)
 
__jbd2_journal_abort_hard(journal);
 
-   if (errno)
+   if (errno) {
jbd2_journal_update_sb_errno(journal);
+   write_lock(&journal->j_state_lock);
+   journal->j_flags |= JBD2_REC_ERR;
+   write_unlock(&journal->j_state_lock);
+   }
 }
 
 /**
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 2179d78..ee8090f 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -954,6 +954,7 @@ struct journal_s
 #define JBD2_ABORT_ON_SYNCDATA_ERR 0x040   /* Abort the journal on file
 * data write error in ordered
 * mode */
+#define JBD2_REC_ERR   0x080   /* The errno in the sb has been recorded */
 
 /*
  * Function declarations for the journaling transaction and buffer
-- 
1.9.1
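
Note on the pattern in the patch above: the panic is held back until a flag
says the error number has reached the journal superblock.  The sketch below
replays the interleaving from the commit message in a single thread.  It is
a toy model under stated assumptions: the toy_* names and TOY_* flags are
invented here, there is no locking, and the "no journal" case that the real
check also handles is left out.

```c
#define TOY_ABORT	0x1u	/* models JBD2_ABORT */
#define TOY_REC_ERR	0x2u	/* models JBD2_REC_ERR */

struct toy_journal {
	unsigned int flags;
	int sb_errno;	/* error number as stored in the journal superblock */
};

/* First half of an abort: mark the journal aborted.  Returns 1 if this
 * caller did the marking, 0 if another task already had. */
static int toy_abort_begin(struct toy_journal *j)
{
	if (j->flags & TOY_ABORT)
		return 0;
	j->flags |= TOY_ABORT;
	return 1;
}

/* Second half: record the error, then publish that it is recorded. */
static void toy_abort_finish(struct toy_journal *j, int err)
{
	j->sb_errno = err;
	j->flags |= TOY_REC_ERR;
}

/* A task may panic only once the error has been recorded. */
static int toy_may_panic(const struct toy_journal *j)
{
	return (j->flags & TOY_REC_ERR) != 0;
}
```

A caller that arrives between toy_abort_begin() and toy_abort_finish() sees
the abort flag but not the record flag and therefore must not panic yet;
the task that completes the recording is the one that can go on to panic.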



[PATCH 3.4 012/125] FS-Cache: Don't override netfs's primary_index if registering failed

2016-10-12 Thread lizf
From: Kinglong Mee 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit b130ed5998e62879a66bad08931a2b5e832da95c upstream.

Only override netfs->primary_index when registration succeeds.

Signed-off-by: Kinglong Mee 
Signed-off-by: David Howells 
Signed-off-by: Al Viro 
[lizf: Backported to 3.4: there are no n_active and flags in primary_index]
Signed-off-by: Zefan Li 
---
 fs/fscache/netfs.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index 0912b90..6f4e4ed 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -22,6 +22,7 @@ static LIST_HEAD(fscache_netfs_list);
 int __fscache_register_netfs(struct fscache_netfs *netfs)
 {
struct fscache_netfs *ptr;
+   struct fscache_cookie *cookie;
int ret;
 
_enter("{%s}", netfs->name);
@@ -29,24 +30,23 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
INIT_LIST_HEAD(&netfs->link);
 
/* allocate a cookie for the primary index */
-   netfs->primary_index =
-   kmem_cache_zalloc(fscache_cookie_jar, GFP_KERNEL);
+   cookie = kmem_cache_zalloc(fscache_cookie_jar, GFP_KERNEL);
 
-   if (!netfs->primary_index) {
+   if (!cookie) {
_leave(" = -ENOMEM");
return -ENOMEM;
}
 
/* initialise the primary index cookie */
-   atomic_set(&netfs->primary_index->usage, 1);
-   atomic_set(&netfs->primary_index->n_children, 0);
+   atomic_set(&cookie->usage, 1);
+   atomic_set(&cookie->n_children, 0);
 
-   netfs->primary_index->def   = &fscache_fsdef_netfs_def;
-   netfs->primary_index->parent= &fscache_fsdef_index;
-   netfs->primary_index->netfs_data= netfs;
+   cookie->def = &fscache_fsdef_netfs_def;
+   cookie->parent  = &fscache_fsdef_index;
+   cookie->netfs_data  = netfs;
 
-   spin_lock_init(&netfs->primary_index->lock);
-   INIT_HLIST_HEAD(&netfs->primary_index->backing_objects);
+   spin_lock_init(&cookie->lock);
+   INIT_HLIST_HEAD(&cookie->backing_objects);
 
/* check the netfs type is not already present */
down_write(&fscache_addremove_sem);
@@ -57,9 +57,10 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
goto already_registered;
}
 
-   atomic_inc(&netfs->primary_index->parent->usage);
-   atomic_inc(&netfs->primary_index->parent->n_children);
+   atomic_inc(&cookie->parent->usage);
+   atomic_inc(&cookie->parent->n_children);
 
+   netfs->primary_index = cookie;
list_add(&netfs->link, &fscache_netfs_list);
ret = 0;
 
@@ -69,10 +70,8 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 already_registered:
up_write(&fscache_addremove_sem);
 
-   if (ret < 0) {
-   kmem_cache_free(fscache_cookie_jar, netfs->primary_index);
-   netfs->primary_index = NULL;
-   }
+   if (ret < 0)
+   kmem_cache_free(fscache_cookie_jar, cookie);
 
_leave(" = %d", ret);
return ret;
-- 
1.9.1
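
Note on the pattern in the patch above: the new object is built in a local
variable and stored into the caller's structure only after registration has
succeeded, so a failed call leaves the caller's pointer alone.  A small
user-space sketch of that follows.  The toy_* types, the fixed-size
registry and the return codes (-1 no memory, -2 duplicate name, -3 registry
full) are assumptions for illustration and not FS-Cache interfaces.

```c
#include <stdlib.h>
#include <string.h>

struct toy_cookie {
	int usage;
};

struct toy_netfs {
	const char *name;
	struct toy_cookie *primary_index;
};

#define TOY_MAX 4
static struct toy_netfs *toy_registry[TOY_MAX];
static int toy_count;

static int toy_register(struct toy_netfs *netfs)
{
	struct toy_cookie *cookie;
	int i;

	/* Build the cookie in a local, not in netfs->primary_index. */
	cookie = calloc(1, sizeof(*cookie));
	if (!cookie)
		return -1;
	cookie->usage = 1;

	for (i = 0; i < toy_count; i++) {
		if (strcmp(toy_registry[i]->name, netfs->name) == 0) {
			free(cookie);	/* caller's pointer is untouched */
			return -2;
		}
	}
	if (toy_count == TOY_MAX) {
		free(cookie);
		return -3;
	}

	netfs->primary_index = cookie;	/* publish only on success */
	toy_registry[toy_count++] = netfs;
	return 0;
}
```

A second registration under an existing name returns an error and leaves
whatever the caller already had in primary_index exactly as it was.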



[PATCH 3.4 015/125] mtd: mtdpart: fix add_mtd_partitions error path

2016-10-12 Thread lizf
From: Boris BREZILLON 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit e5bae86797141e4a95e42d825f737cb36d7b8c37 upstream.

If we fail to allocate a partition structure in the middle of the partition
creation process, the already allocated partitions are never removed, which
means they are still present in the partition list and their resources are
never freed.

Signed-off-by: Boris Brezillon 
Signed-off-by: Brian Norris 
Signed-off-by: Zefan Li 
---
 drivers/mtd/mtdpart.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index bf24aa7..1c96f3d 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -632,8 +632,10 @@ int add_mtd_partitions(struct mtd_info *master,
 
for (i = 0; i < nbparts; i++) {
slave = allocate_partition(master, parts + i, i, cur_offset);
-   if (IS_ERR(slave))
+   if (IS_ERR(slave)) {
+   del_mtd_partitions(master);
return PTR_ERR(slave);
+   }
 
mutex_lock(&mtd_partitions_mutex);
list_add(&slave->list, &mtd_partitions);
-- 
1.9.1
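
Note on the pattern in the patch above: when one allocation in the middle
of a loop fails, the entries added by earlier iterations are torn down
before the error is returned.  The sketch below shows this in user space.
The toy_* names are invented, each "partition" is just a heap-allocated
int, and the fail_at parameter is a test hook that forces a failure at a
chosen index; none of this is the MTD API.

```c
#include <stdlib.h>

#define TOY_MAX_PARTS 8

struct toy_master {
	int *parts[TOY_MAX_PARTS];
	int nparts;
};

/* Remove and free every partition currently attached to the master. */
static void toy_del_partitions(struct toy_master *m)
{
	while (m->nparts > 0) {
		m->nparts--;
		free(m->parts[m->nparts]);
		m->parts[m->nparts] = NULL;
	}
}

/* Add n partitions.  fail_at is the index at which allocation is forced
 * to fail, or -1 for no forced failure.  Returns 0 or -1. */
static int toy_add_partitions(struct toy_master *m, int n, int fail_at)
{
	int i;

	if (n < 0 || m->nparts + n > TOY_MAX_PARTS)
		return -1;

	for (i = 0; i < n; i++) {
		int *p = (i == fail_at) ? NULL : malloc(sizeof(*p));

		if (!p) {
			/* Roll back what earlier iterations added. */
			toy_del_partitions(m);
			return -1;
		}
		*p = i;
		m->parts[m->nparts++] = p;
	}
	return 0;
}
```

After a failed call the master holds no partitions, so a later retry starts
from a clean state and nothing is leaked.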



[PATCH 3.4 018/125] Bluetooth: ath3k: Add support of AR3012 0cf3:817b device

2016-10-12 Thread lizf
From: Dmitry Tunin 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 18e0afab8ce3f1230ce3fef52b2e73374fd9c0e7 upstream.

T: Bus=04 Lev=02 Prnt=02 Port=04 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=817b Rev=00.02
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

BugLink: https://bugs.launchpad.net/bugs/1506615

Signed-off-by: Dmitry Tunin 
Signed-off-by: Marcel Holtmann 
Signed-off-by: Zefan Li 
---
 drivers/bluetooth/ath3k.c | 2 ++
 drivers/bluetooth/btusb.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/bluetooth/ath3k.c b/drivers/bluetooth/ath3k.c
index 4b8e03f..8ff6f5c 100644
--- a/drivers/bluetooth/ath3k.c
+++ b/drivers/bluetooth/ath3k.c
@@ -94,6 +94,7 @@ static struct usb_device_id ath3k_table[] = {
{ USB_DEVICE(0x0CF3, 0x311D) },
{ USB_DEVICE(0x0cf3, 0x3121) },
{ USB_DEVICE(0x0CF3, 0x817a) },
+   { USB_DEVICE(0x0CF3, 0x817b) },
{ USB_DEVICE(0x0cf3, 0xe003) },
{ USB_DEVICE(0x0CF3, 0xE004) },
{ USB_DEVICE(0x0CF3, 0xE005) },
@@ -144,6 +145,7 @@ static struct usb_device_id ath3k_blist_tbl[] = {
{ USB_DEVICE(0x0cf3, 0x311D), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x3121), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0CF3, 0x817a), .driver_info = BTUSB_ATH3012 },
+   { USB_DEVICE(0x0CF3, 0x817b), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe004), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe005), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe003), .driver_info = BTUSB_ATH3012 },
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index bbd1e6c..2302075 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -172,6 +172,7 @@ static struct usb_device_id blacklist_table[] = {
{ USB_DEVICE(0x0cf3, 0x311d), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x3121), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x817a), .driver_info = BTUSB_ATH3012 },
+   { USB_DEVICE(0x0cf3, 0x817b), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe003), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe004), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0xe005), .driver_info = BTUSB_ATH3012 },
-- 
1.9.1



[PATCH 3.4 016/125] iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints

2016-10-12 Thread lizf
From: David Woodhouse 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit d14053b3c714178525f22660e6aaf41263d00056 upstream.

The VT-d specification says that "Software must enable ATS on endpoint
devices behind a Root Port only if the Root Port is reported as
supporting ATS transactions."

We walk up the tree to find a Root Port, but for integrated devices we
don't find one — we get to the host bridge. In that case we *should*
allow ATS. Currently we don't, which means that we are incorrectly
failing to use ATS for the integrated graphics. Fix that.

We should never break out of this loop "naturally" with bus==NULL,
since we'll always find bridge==NULL in that case (and now return 1).

So remove the check for (!bridge) after the loop, since it can never
happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
it'll oops anyway in that case, that'll do just as well.

Signed-off-by: David Woodhouse 
[lizf: Backported to 3.4:
 - adjust context
 - drop the last part of the changes of the patch]
Signed-off-by: Zefan Li 
---
 drivers/iommu/intel-iommu.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index bd400f2..99e4974 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3586,10 +3586,15 @@ found:
for (bus = dev->bus; bus; bus = bus->parent) {
struct pci_dev *bridge = bus->self;
 
-   if (!bridge || !pci_is_pcie(bridge) ||
+   /* If it's an integrated device, allow ATS */
+   if (!bridge)
+   return 1;
+   /* Connected via non-PCIe: no ATS */
+   if (!pci_is_pcie(bridge) ||
bridge->pcie_type == PCI_EXP_TYPE_PCI_BRIDGE)
return 0;
 
+   /* If we found the root port, look it up in the ATSR */
if (bridge->pcie_type == PCI_EXP_TYPE_ROOT_PORT) {
for (i = 0; i < atsru->devices_cnt; i++)
if (atsru->devices[i] == bridge)
-- 
1.9.1
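
Note on the logic in the patch above: the walk up the bus hierarchy now has
three distinct outcomes.  The sketch below models them in user space.  It
is a simplification under stated assumptions: the toy_* types are invented,
the ATSR lookup is reduced to a per-bridge in_atsr flag, and whatever the
driver does after the lookup (not visible in the hunk) is replaced by
returning the lookup result directly.

```c
#include <stddef.h>

enum toy_type {
	TOY_ROOT_PORT,
	TOY_PCIE_SWITCH,
	TOY_PCI_BRIDGE,		/* PCIe-to-PCI bridge */
	TOY_LEGACY_PCI		/* bridge that is not PCIe at all */
};

struct toy_bridge {
	enum toy_type type;
	int in_atsr;		/* toy stand-in for the ATSR device list */
};

struct toy_bus {
	struct toy_bridge *self;	/* NULL on the root bus */
	struct toy_bus *parent;
};

/* Returns 1 if ATS may be enabled for a device on this bus, else 0. */
static int toy_ats_allowed(const struct toy_bus *bus)
{
	for (; bus; bus = bus->parent) {
		const struct toy_bridge *bridge = bus->self;

		/* No bridge above us: integrated device, allow ATS. */
		if (!bridge)
			return 1;
		/* Reached through something that is not PCIe: no ATS. */
		if (bridge->type == TOY_LEGACY_PCI ||
		    bridge->type == TOY_PCI_BRIDGE)
			return 0;
		/* Found the root port: ask the (toy) ATSR. */
		if (bridge->type == TOY_ROOT_PORT)
			return bridge->in_atsr;
	}
	return 0;
}
```

Before the change the first two cases were folded into one test that
returned 0, which is why integrated endpoints were refused ATS.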



[PATCH 3.4 019/125] staging: rtl8712: Add device ID for Sitecom WLA2100

2016-10-12 Thread lizf
From: Larry Finger 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 1e6e63283691a2a9048a35d9c6c59cf0abd342e4 upstream.

This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
was checked to verify that the addition is correct.

Reported-by: Frans van de Wiel 
Signed-off-by: Larry Finger 
Cc: Frans van de Wiel 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 drivers/staging/rtl8712/usb_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/rtl8712/usb_intf.c b/drivers/staging/rtl8712/usb_intf.c
index 1b1bf38..3c4a54c 100644
--- a/drivers/staging/rtl8712/usb_intf.c
+++ b/drivers/staging/rtl8712/usb_intf.c
@@ -147,6 +147,7 @@ static struct usb_device_id rtl871x_usb_id_tbl[] = {
{USB_DEVICE(0x0DF6, 0x0058)},
{USB_DEVICE(0x0DF6, 0x0049)},
{USB_DEVICE(0x0DF6, 0x004C)},
+   {USB_DEVICE(0x0DF6, 0x006C)},
{USB_DEVICE(0x0DF6, 0x0064)},
/* Skyworth */
{USB_DEVICE(0x14b2, 0x3300)},
-- 
1.9.1



[PATCH 3.4 016/125] iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints

2016-10-12 Thread lizf
From: David Woodhouse 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit d14053b3c714178525f22660e6aaf41263d00056 upstream.

The VT-d specification says that "Software must enable ATS on endpoint
devices behind a Root Port only if the Root Port is reported as
supporting ATS transactions."

We walk up the tree to find a Root Port, but for integrated devices we
don't find one — we get to the host bridge. In that case we *should*
allow ATS. Currently we don't, which means that we are incorrectly
failing to use ATS for the integrated graphics. Fix that.

We should never break out of this loop "naturally" with bus==NULL,
since we'll always find bridge==NULL in that case (and now return 1).

So remove the check for (!bridge) after the loop, since it can never
happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
it'll oops anyway in that case, that'll do just as well.

Signed-off-by: David Woodhouse 
[lizf: Backported to 3.4:
 - adjust context
 - drop the last part of the changes of the patch]
Signed-off-by: Zefan Li 
---
 drivers/iommu/intel-iommu.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index bd400f2..99e4974 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3586,10 +3586,15 @@ found:
for (bus = dev->bus; bus; bus = bus->parent) {
struct pci_dev *bridge = bus->self;
 
-   if (!bridge || !pci_is_pcie(bridge) ||
+   /* If it's an integrated device, allow ATS */
+   if (!bridge)
+   return 1;
+   /* Connected via non-PCIe: no ATS */
+   if (!pci_is_pcie(bridge) ||
bridge->pcie_type == PCI_EXP_TYPE_PCI_BRIDGE)
return 0;
 
+   /* If we found the root port, look it up in the ATSR */
if (bridge->pcie_type == PCI_EXP_TYPE_ROOT_PORT) {
for (i = 0; i < atsru->devices_cnt; i++)
if (atsru->devices[i] == bridge)
-- 
1.9.1
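
The walk described in the commit message can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: `struct bus_node`, `struct bridge_node`, `enum bridge_kind` and `ats_allowed()` are hypothetical stand-ins, not kernel types, and the ATSR lookup is reduced to a plain pointer list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI structures; not kernel types. */
enum bridge_kind {
	KIND_ROOT_PORT,
	KIND_PCIE_SWITCH,
	KIND_PCI_BRIDGE,	/* PCIe-to-PCI bridge */
	KIND_LEGACY_PCI		/* not PCIe at all */
};

struct bridge_node {
	enum bridge_kind kind;
};

struct bus_node {
	const struct bus_node *parent;		/* NULL at the top of the tree */
	const struct bridge_node *self;		/* bridge leading to this bus, NULL on the root bus */
};

/* Returns true if ATS may be enabled for a device sitting on start_bus. */
bool ats_allowed(const struct bus_node *start_bus,
		 const struct bridge_node *const *ats_ports, size_t n_ports)
{
	const struct bus_node *bus;
	size_t i;

	assert(start_bus != NULL);

	for (bus = start_bus; bus; bus = bus->parent) {
		const struct bridge_node *bridge = bus->self;

		/* No bridge above us: integrated device, allow ATS. */
		if (!bridge)
			return true;

		/* Connected via something that is not PCIe: no ATS. */
		if (bridge->kind == KIND_LEGACY_PCI ||
		    bridge->kind == KIND_PCI_BRIDGE)
			return false;

		/* Found the root port: it must be listed as ATS-capable. */
		if (bridge->kind == KIND_ROOT_PORT) {
			for (i = 0; i < n_ports; i++)
				if (ats_ports[i] == bridge)
					return true;
			return false;
		}
	}

	/* Only reachable with a malformed tree in this sketch. */
	return false;
}
```

A device on the root bus is accepted without consulting the list, while a device behind a root port is accepted only if that port is in the list.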



[PATCH 3.4 019/125] staging: rtl8712: Add device ID for Sitecom WLA2100

2016-10-12 Thread lizf
From: Larry Finger 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 1e6e63283691a2a9048a35d9c6c59cf0abd342e4 upstream.

This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
was checked to verify that the addition is correct.

Reported-by: Frans van de Wiel 
Signed-off-by: Larry Finger 
Cc: Frans van de Wiel 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 drivers/staging/rtl8712/usb_intf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/rtl8712/usb_intf.c b/drivers/staging/rtl8712/usb_intf.c
index 1b1bf38..3c4a54c 100644
--- a/drivers/staging/rtl8712/usb_intf.c
+++ b/drivers/staging/rtl8712/usb_intf.c
@@ -147,6 +147,7 @@ static struct usb_device_id rtl871x_usb_id_tbl[] = {
{USB_DEVICE(0x0DF6, 0x0058)},
{USB_DEVICE(0x0DF6, 0x0049)},
{USB_DEVICE(0x0DF6, 0x004C)},
+   {USB_DEVICE(0x0DF6, 0x006C)},
{USB_DEVICE(0x0DF6, 0x0064)},
/* Skyworth */
{USB_DEVICE(0x14b2, 0x3300)},
-- 
1.9.1



[PATCH 3.4 011/125] FS-Cache: Increase reference of parent after registering, netfs success

2016-10-12 Thread lizf
From: Kinglong Mee 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 86108c2e34a26e4bec3c6ddb23390bf8cedcf391 upstream.

If the netfs already exists, fscache should not increase the parent's
usage and n_children counts; otherwise they are never decreased.

v2: thanks to David's suggestion,
 move increasing the parent's reference to the success path
 use kmem_cache_free() to free primary_index directly

v3: don't move "netfs->primary_index->parent = &fscache_fsdef_index;"

Signed-off-by: Kinglong Mee 
Signed-off-by: David Howells 
Signed-off-by: Al Viro 
Signed-off-by: Zefan Li 
---
 fs/fscache/netfs.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index e028b8e..0912b90 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -45,9 +45,6 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
netfs->primary_index->parent= &fscache_fsdef_index;
netfs->primary_index->netfs_data= netfs;
 
-   atomic_inc(&netfs->primary_index->parent->usage);
-   atomic_inc(&netfs->primary_index->parent->n_children);
-
spin_lock_init(&netfs->primary_index->lock);
INIT_HLIST_HEAD(&netfs->primary_index->backing_objects);
 
@@ -60,6 +57,9 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
goto already_registered;
}
 
+   atomic_inc(&netfs->primary_index->parent->usage);
+   atomic_inc(&netfs->primary_index->parent->n_children);
+
list_add(&netfs->link, &fscache_netfs_list);
ret = 0;
 
@@ -70,8 +70,7 @@ already_registered:
up_write(&fscache_addremove_sem);
 
if (ret < 0) {
-   netfs->primary_index->parent = NULL;
-   __fscache_cookie_put(netfs->primary_index);
+   kmem_cache_free(fscache_cookie_jar, netfs->primary_index);
netfs->primary_index = NULL;
}
 
-- 
1.9.1
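
The ordering this patch establishes (take the parent's references only once registration can no longer fail) can be sketched in isolation. This is a minimal illustration, not the fscache code: `struct registry`, `struct parent_index` and `register_name()` are hypothetical names, and plain ints stand in for the atomic counters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGISTERED 8

/* Plain ints stand in for the parent's atomic counters. */
struct parent_index {
	int usage;
	int n_children;
};

struct registry {
	struct parent_index parent;
	const char *names[MAX_REGISTERED];
	size_t count;
};

/* Returns 0 on success, -1 if the name is already registered or the table is full. */
int register_name(struct registry *reg, const char *name)
{
	size_t i;

	assert(reg != NULL && name != NULL);

	for (i = 0; i < reg->count; i++)
		if (strcmp(reg->names[i], name) == 0)
			return -1;	/* failure path: parent counters untouched */

	if (reg->count == MAX_REGISTERED)
		return -1;

	/* Only a successful registration pins the parent. */
	reg->parent.usage++;
	reg->parent.n_children++;
	reg->names[reg->count++] = name;
	return 0;
}
```

Because nothing was taken on the failure path, there is nothing to drop there, which is why the error branch in the patch can free the cookie directly.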



[PATCH 3.4 001/125] mac80211: fix driver RSSI event calculations

2016-10-12 Thread lizf
From: Johannes Berg 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 8ec6d97871f37e4743678ea4a455bd59580aa0f4 upstream.

The ifmgd->ave_beacon_signal value cannot be taken as is for
comparisons, it must be divided by 16 since it's represented
like that for better accuracy of the EWMA calculations. This
would lead to invalid driver RSSI events. Fix the used value.

Fixes: 615f7b9bb1f8 ("mac80211: add driver RSSI threshold events")
Signed-off-by: Johannes Berg 
Signed-off-by: Zefan Li 
---
 net/mac80211/mlme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index abc31d7..1dae142 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2384,7 +2384,7 @@ static void ieee80211_rx_mgmt_beacon(struct ieee80211_sub_if_data *sdata,
 
if (ifmgd->rssi_min_thold != ifmgd->rssi_max_thold &&
ifmgd->count_beacon_signal >= IEEE80211_SIGNAL_AVE_MIN_COUNT) {
-   int sig = ifmgd->ave_beacon_signal;
+   int sig = ifmgd->ave_beacon_signal / 16;
int last_sig = ifmgd->last_ave_beacon_signal;
 
/*
-- 
1.9.1
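
To illustrate why the division matters, here is a small standalone sketch of an average that is stored multiplied by 16. The names `struct beacon_avg`, `beacon_avg_add()` and `beacon_avg_dbm()` are made up for this example, and the weight of 3 is an assumption chosen for illustration, not a value taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define AVG_SCALE  16	/* stored value = average in dBm * 16 */
#define AVG_WEIGHT 3	/* share of the newest sample, out of 16 (assumed) */

struct beacon_avg {
	int scaled;	/* average signal, kept scaled for precision */
};

/* Fold one sample (in dBm) into the scaled running average. */
void beacon_avg_add(struct beacon_avg *avg, int sample_dbm)
{
	assert(avg != NULL);
	avg->scaled = (AVG_WEIGHT * sample_dbm * AVG_SCALE +
		       (AVG_SCALE - AVG_WEIGHT) * avg->scaled) / AVG_SCALE;
}

/* Convert back to dBm; thresholds must be compared against this value. */
int beacon_avg_dbm(const struct beacon_avg *avg)
{
	assert(avg != NULL);
	return avg->scaled / AVG_SCALE;
}
```

With a steady -60 dBm signal the stored value is -960. Comparing that raw number against a -70 dBm threshold would wrongly report the signal as below the threshold, while `beacon_avg_dbm()` gives -60.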



[PATCH 3.4 004/125] devres: fix a for loop bounds check

2016-10-12 Thread lizf
From: Dan Carpenter 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 1f35d04a02a652f14566f875aef3a6f2af4cb77b upstream.

The iomap[] array has PCIM_IOMAP_MAX (6) elements and not
DEVICE_COUNT_RESOURCE (16).  This bug was found using a static checker.
It may be that the "if (!(mask & (1 << i)))" check means we never
actually go past the end of the array in real life.

Fixes: ec04b075843d ('iomap: implement pcim_iounmap_regions()')
Signed-off-by: Dan Carpenter 
Acked-by: Tejun Heo 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 lib/devres.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/devres.c b/lib/devres.c
index 80b9c76..584c2dc 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -390,7 +390,7 @@ void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
if (!iomap)
return;
 
-   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+   for (i = 0; i < PCIM_IOMAP_MAX; i++) {
if (!(mask & (1 << i)))
continue;
 
-- 
1.9.1
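
A standalone sketch of the corrected loop, bounded by the size of the table it indexes rather than by an unrelated, larger constant. `IOMAP_MAX` and `release_regions()` are hypothetical names for this illustration; the real function also unmaps and releases each region.

```c
#include <assert.h>
#include <stddef.h>

#define IOMAP_MAX 6	/* number of table slots, like PCIM_IOMAP_MAX */

/* Clear every table slot selected by mask; returns how many were in use. */
int release_regions(void *table[IOMAP_MAX], unsigned int mask)
{
	int i, released = 0;

	assert(table != NULL);

	/* The bound is the table size, so stray high bits in mask are ignored. */
	for (i = 0; i < IOMAP_MAX; i++) {
		if (!(mask & (1u << i)))
			continue;
		if (table[i]) {
			table[i] = NULL;
			released++;
		}
	}
	return released;
}
```

Even with a mask that has bits set above bit 5, the loop never reads past the sixth slot.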



[PATCH 3.4 003/125] mwifiex: fix mwifiex_rdeeprom_read()

2016-10-12 Thread lizf
From: Dan Carpenter 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 1f9c6e1bc1ba5f8a10fcd6e99d170954d7c6d382 upstream.

There were several bugs here.

1)  The done label was in the wrong place so we didn't copy any
information out when there was no command given.

2)  We were using PAGE_SIZE as the size of the buffer instead of
"PAGE_SIZE - pos".

3)  snprintf() returns the number of characters that would have been
printed if there were enough space.  If there was not enough space
(and we had fixed the memory corruption bug #2) then it would result
in an information leak when we do simple_read_from_buffer().  I've
changed it to use scnprintf() instead.

I also removed the initialization at the start of the function, because
I thought it made the code a little more clear.

Fixes: 5e6e3a92b9a4 ('wireless: mwifiex: initial commit for Marvell mwifiex driver')
Signed-off-by: Dan Carpenter 
Acked-by: Amitkumar Karwar 
Signed-off-by: Kalle Valo 
Signed-off-by: Zefan Li 
---
 drivers/net/wireless/mwifiex/debugfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/debugfs.c b/drivers/net/wireless/mwifiex/debugfs.c
index 1a84507..e24ef9a 100644
--- a/drivers/net/wireless/mwifiex/debugfs.c
+++ b/drivers/net/wireless/mwifiex/debugfs.c
@@ -621,7 +621,7 @@ mwifiex_rdeeprom_read(struct file *file, char __user *ubuf,
(struct mwifiex_private *) file->private_data;
unsigned long addr = get_zeroed_page(GFP_KERNEL);
char *buf = (char *) addr;
-   int pos = 0, ret = 0, i;
+   int pos, ret, i;
u8 value[MAX_EEPROM_DATA];
 
if (!buf)
@@ -629,7 +629,7 @@ mwifiex_rdeeprom_read(struct file *file, char __user *ubuf,
 
if (saved_offset == -1) {
/* No command has been given */
-   pos += snprintf(buf, PAGE_SIZE, "0");
+   pos = snprintf(buf, PAGE_SIZE, "0");
goto done;
}
 
@@ -638,17 +638,17 @@ mwifiex_rdeeprom_read(struct file *file, char __user *ubuf,
  (u16) saved_bytes, value);
if (ret) {
ret = -EINVAL;
-   goto done;
+   goto out_free;
}
 
-   pos += snprintf(buf, PAGE_SIZE, "%d %d ", saved_offset, saved_bytes);
+   pos = snprintf(buf, PAGE_SIZE, "%d %d ", saved_offset, saved_bytes);
 
for (i = 0; i < saved_bytes; i++)
-   pos += snprintf(buf + strlen(buf), PAGE_SIZE, "%d ", value[i]);
-
-   ret = simple_read_from_buffer(ubuf, count, ppos, buf, pos);
+   pos += scnprintf(buf + pos, PAGE_SIZE - pos, "%d ", value[i]);
 
 done:
+   ret = simple_read_from_buffer(ubuf, count, ppos, buf, pos);
+out_free:
free_page(addr);
return ret;
 }
-- 
1.9.1
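
Points 2 and 3 of the commit message can be reproduced in userspace. The sketch below is an illustration only: `bounded_printf()` is a hypothetical stand-in that mimics the scnprintf() convention of returning the number of characters actually stored, and `format_values()` is a made-up helper showing the "buf + pos, size - pos" pattern.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Userspace stand-in for scnprintf(): returns the number of characters
 * actually stored in buf, not counting the terminating NUL.
 */
int bounded_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL);
	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

/* Append each value as "%d " at buf + pos, never writing past the end. */
int format_values(char *buf, size_t size, const unsigned char *values, int count)
{
	int pos = 0, i;

	for (i = 0; i < count; i++)
		pos += bounded_printf(buf + pos, size - pos, "%d ", values[i]);
	return pos;
}
```

Since `pos` can never exceed `size - 1`, the length later handed to a copy-out routine always describes bytes that were really written. With plain snprintf() the sum could grow past the buffer size.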



Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes

2016-10-12 Thread Vincent Guittot
On 8 October 2016 at 13:49, Mike Galbraith  wrote:
> On Sat, 2016-10-08 at 13:37 +0200, Vincent Guittot wrote:
>> On 8 October 2016 at 10:39, Ingo Molnar  wrote:
>> >
>> > * Peter Zijlstra  wrote:
>> >
>> > > On Fri, Oct 07, 2016 at 03:38:23PM -0400, Joseph Salisbury wrote:
>> > > > Hello Peter,
>> > > >
>> > > > A kernel bug report was opened against Ubuntu [0].  After a
>> > > > kernel
>> > > > bisect, it was found that reverting the following commit
>> > > > resolved this bug:
>> > > >
>> > > > commit 3d30544f02120b884bba2a9466c87dba980e3be5
>> > > > Author: Peter Zijlstra 
>> > > > Date:   Tue Jun 21 14:27:50 2016 +0200
>> > > >
>> > > > sched/fair: Apply more PELT fixes
>>
>> This patch only speeds up the update of task group load in order to
>> reflect the new load balance but It should not change the final value
>> and as a result the final behavior. I will try to reproduce it in my
>> target later today
>
> FWIW, I tried and failed w/wo autogroup on 4.8 and master.

Me too

Is it possible to get a dump of /proc/sched_debug while the problem occurs?

Vincent

> -Mike


Re: [PATCH] iwlwifi: pcie: reduce "unsupported splx" to a warning

2016-10-12 Thread Luca Coelho
On Tue, 2016-10-11 at 23:32 -0500, Chris Rorvick wrote:
> On Tue, Oct 11, 2016 at 5:11 AM, Paul Bolle  wrote:
> > For what it's worth, on my machine I have twenty (!) SPLX entries, all
> > reading:
> > Name (SPLX, Package (0x04)
> > {
> > Zero,
> > Package (0x03)
> > {
> > 0x8000,
> > 0x8000,
> > 0x8000
> > },
> > 
> > Package (0x03)
> > {
> >0x8000,
> >0x8000,
> >0x8000
> > },
> > 
> > Package (0x03)
> > {
> > 0x8000,
> > 0x8000,
> > 0x8000
> > }
> > })
> 
> 
> I actually see exactly the same on my Dell XPS 13 (9350) when I  use
> acpidump, etc.  I typed the entry I included in the commit log by hand
> based on what the driver gets back from the SPLC method (I added a
> function to dump the returned object.)

Okay... Actually this is a structure in the BIOS and the actual method
we call is SPLC.  The SPLC method may return one item from this table,
or something entirely different, possibly one of the three values
depending on a configuration option or so.

Can you find and send me the actual SPLC method that we call, from
your BIOS?

--
Cheers,
Luca.


Re: [RFC PATCH 00/11] pci: support for configurable PCI endpoint

2016-10-12 Thread Christoph Hellwig
On Mon, Sep 26, 2016 at 11:38:41AM +0530, Kishon Vijay Abraham I wrote:
> > Ok, so in theory there can be other hardware (and quite likely is)
> > that supports multiple functions, and we can extend the framework
> > to support them without major obstacles, but your hardware doesn't,
> > so you kept it simple with one hardcoded function, right?
> 
> right, PCIe can have upto 8 functions. So the issues with the current framework
> has to be fixed. I don't expect major obstacles with this as of now.

I wouldn't be too worried about that.  We have two kinds of functions in
PCIe: physical functions, or virtual functions using SR-IOV.

For the first one we pretty much just need the controller driver to
report them separately as there is almost no interaction between
functions.

SR-IOV support will be more interesting as the physical function
controls creation of the associated virtual functions.  I'd like to
defer that problem until we get hold of a software programmable
controller that supports SR-IOV and has open documentation.  (That
being said, if someone has a pointer to such a beast send it my way!)

> > We should still find out whether it's important that you can have
> > a single PCI function with a software multi-function support of some
> > sort. We'd still be limited to six BARs in total, and would also need
> > something to identify those sub-functions, so implementing that might
> > get quite hairy.
> > 
> > Possibly this could be done at a higher level, e.g. by implementing
> > a PCI-virtio multiplexer that can host multiple virtio based devices
> > inside of a single PCI function. If we think that would be a good idea,
> > we should make sure the configfs interface is extensible enough to
> > handle that.
> 
> Okay. So here the main function (actual PCI function) *can* perform the work of
> virtio muliplexer if the platform wants to support sub-functions or it can be a
> normal PCI function. right?

I really don't think we should be worried about this multiplexer.  It's
not something real PCIe devices do (sane ones anyway, the rest is
handled by ad-hoc multiplexers), and we should avoid creating our own
magic peripherals for it.

> > One use case I have in mind for this is to have a PCI function that
> > can use virtio to provide rootfs (virtio-blk or 9pfs), network
> > and console to the system that implements the PCI function (note
> > that this is the opposite direction of what almost everyone else
> > uses PCI devices for).
> 
> Do you mean the virtio should actually be in the host side? Even here the
> system that implements PCI function should have multiple functions right? (one
> for network, other for console etc..). So there should be a virtio multiplexer
> both in the host side and in the device side?

We already support virtio over physical PCIe buses to support Intel MIC
devices.  Take a look at drivers/misc/mic/bus/vop_bus.c and
drivers/misc/mic/vop (yes, what a horrible place for that code, not my
fault).


Re: [PATCH] padata: add helper function for queue length

2016-10-12 Thread Jason A. Donenfeld
Hi Steffen,

On Fri, Oct 7, 2016 at 5:15 AM, Steffen Klassert
 wrote:
> Why you want to have this?

I'm working on some bufferbloat/queue code that could benefit from
knowing how many items are currently in flight. The goal is to always
keep padata busy, but never with more jobs than absolutely necessary.
The model is CoDel.

Regards,
Jason


[PATCH v2 4/4] Add R3MWAIT to CPU features

2016-10-12 Thread Grzegorz Andrejczuk
Add cpu feature for ring 3 monitor/mwait.

Change-Id: Iba4d20639efd8d3637d37db9294cbc43a98f009a
Signed-off-by: Grzegorz Andrejczuk 
---
 arch/x86/include/asm/cpufeatures.h | 2 ++
 arch/x86/kernel/cpu/common.c   | 3 +++
 arch/x86/kernel/cpu/scattered.c| 5 +++++
 3 files changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 92a8308..9caf9c4 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -71,6 +71,8 @@
 #define X86_FEATURE_RECOVERY   ( 2*32+ 0) /* CPU in recovery mode */
 #define X86_FEATURE_LONGRUN( 2*32+ 1) /* Longrun power control */
 #define X86_FEATURE_LRTI   ( 2*32+ 3) /* LongRun table interface */
+/* non architectural Intel-defined CPU features not present in CPUID */
+#define X86_FEATURE_PHIR3MWAIT (2*32+ 4)
 
 /* Other features, Linux-defined mapping, word 3 */
 /* This range is used for feature bits which conflict or are synthesized */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 93ffaa5..15fe27f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1108,6 +1108,9 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 #endif
/* The boot/hotplug time assigment got cleared, restore it */
c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id);
+
+   if (cpu_has(c, X86_FEATURE_PHIR3MWAIT))
+   elf_hwcap2 |= HWCAP2_PHIR3MWAIT;
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 8cb57df..e4ff3d0 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -29,6 +29,7 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
u32 max_level;
u32 regs[4];
const struct cpuid_bit *cb;
+   u64 misc_thd_enable;
 
static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_INTEL_PT, CR_EBX,25, 0x00000007, 0 },
@@ -54,4 +55,8 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
if (regs[cb->reg] & (1 << cb->bit))
set_cpu_cap(c, cb->feature);
}
+
+   rdmsrl(MSR_PHI_MISC_THD_FEATURE, misc_thd_enable);
+   if ((misc_thd_enable & MSR_PHI_MISC_THD_FEATURE_R3MWAIT) != 0)
+   set_cpu_cap(c, X86_FEATURE_PHIR3MWAIT);
 }
-- 
2.5.1
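
For readers unfamiliar with the `(2*32+ 4)` notation, the sketch below shows how such a value selects one bit in an array of 32-bit capability words, and how a flag read from a 64-bit register value can be turned into that capability. Everything here is illustrative: `set_cap()`, `has_cap()`, `detect_r3mwait()` and `NCAP_WORDS` are hypothetical, and placing the flag at bit 1 of the register value is an assumption, since the mask definition is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCAP_WORDS 4			/* illustrative, not the kernel's count */
#define FEATURE_R3MWAIT (2 * 32 + 4)	/* word 2, bit 4, as in the patch */

/* Assumed for illustration: the flag is bit 1 of the register value. */
#define R3MWAIT_MASK (UINT64_C(1) << 1)

/* Set the capability bit that a "word*32 + bit" feature number names. */
void set_cap(uint32_t caps[NCAP_WORDS], unsigned int feature)
{
	assert(caps != NULL && feature / 32 < NCAP_WORDS);
	caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}

bool has_cap(const uint32_t caps[NCAP_WORDS], unsigned int feature)
{
	assert(caps != NULL && feature / 32 < NCAP_WORDS);
	return (caps[feature / 32] >> (feature % 32)) & 1u;
}

/* Record the feature if the register value has the flag set. */
void detect_r3mwait(uint32_t caps[NCAP_WORDS], uint64_t msr_value)
{
	if ((msr_value & R3MWAIT_MASK) != 0)
		set_cap(caps, FEATURE_R3MWAIT);
}
```

A register value of 0x2 sets bit 4 of word 2, while a value of 0x1 leaves the capability words unchanged.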



[PATCH v2 1/4] Add R3MWAIT register and bit to msr-info.h

2016-10-12 Thread Grzegorz Andrejczuk
Intel Xeon Phi x200 (codenamed Knights Landing) has MSR
MISC_THD_FEATURE_ENABLE 0x140.

Setting bit 1 (the second bit) of this MSR lets the MONITOR and MWAIT
instructions execute in ring 3 without raising an invalid-opcode exception.

This commit adds the register, prefixed with PHI, and its R3MWAIT bit to
msr-index.h.
Reference:
https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait

Change-Id: If3b14c78f4e66d734e5a00921023a8c7cafc0cf3
Signed-off-by: Grzegorz Andrejczuk 
---
 arch/x86/include/asm/msr-index.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 56f4c66..df9d8d3 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -540,6 +540,11 @@
 #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT   39
 #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE   (1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)
 
+/* Intel Xeon Phi x200 ring 3 MONITOR/MWAIT */
+#define MSR_PHI_MISC_THD_FEATURE   0x0140
+#define MSR_PHI_MISC_THD_FEATURE_R3MWAIT_BIT   1
+#define MSR_PHI_MISC_THD_FEATURE_R3MWAIT   (1ULL << MSR_PHI_MISC_THD_FEATURE_R3MWAIT_BIT)
+
 #define MSR_IA32_TSC_DEADLINE  0x06E0
 
 /* P4/Xeon+ specific */
-- 
2.5.1



Re: [PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-10-12 Thread David Woodhouse
On Mon, 2016-09-19 at 14:18 +0200, Joerg Roedel wrote:
> [Cc'ing David]
> 
> On Mon, Sep 12, 2016 at 10:49:11AM +0800, Xunlei Pang wrote:
> > 
> > According to the VT-d spec, the size of a pasid (state) entry is 8 bytes,
> > i.e. 2^3, and the number of pasid (state) entries is 2^(ecap_pss + 1).
> > 
> > Thus the size of the pasid (state) table, as a power of 2, should be
> > ecap_pss(iommu->ecap) plus "1+3=4" rather than plus 7.
> > 
> > Signed-off-by: Xunlei Pang 
> > ---
> >  drivers/iommu/intel-svm.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 8ebb353..cfa75c2 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
> > struct page *pages;
> > int order;
> >  
> > -   order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
> > +   order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
> > if (order < 0)
> > order = 0;
> 
> The patch seems to be correct, but I'll let David comment on it first.

Yes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now.

Acked-by: David Woodhouse 
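
To make the arithmetic concrete, here is a small userspace sketch of the
corrected calculation. It is not kernel code: the helper names
pasid_table_order() and pasid_table_bytes() are made up for illustration,
and only the formula and the 8-byte entry size are taken from the patch
description.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel function: page allocation order
 * for a PASID table.
 *   pss        - PSS capability value (the table has 2^(pss + 1) entries)
 *   page_shift - log2 of the page size (12 for 4 KiB pages)
 * Each entry is 8 bytes = 2^3, so log2(table bytes) = pss + 1 + 3.
 */
static int pasid_table_order(int pss, int page_shift)
{
    int order = pss + 4 - page_shift;

    return order < 0 ? 0 : order;
}

/* Table size in bytes for the same PSS value, for cross-checking. */
static uint64_t pasid_table_bytes(int pss)
{
    return (UINT64_C(1) << (pss + 1)) * 8;
}
```

With 4 KiB pages and a PSS value of 19 (full 20-bit PASIDs) this gives
order 11, i.e. an 8 MiB table, where the old "+ 7" would have asked for
order 14.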


-- 
dwmw2



Re: [PATCH 7/10] mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC

2016-10-12 Thread Ziji Hu
Hi Adrian,

On 2016/10/11 20:39, Adrian Hunter wrote:
> On 07/10/16 18:22, Gregory CLEMENT wrote:
>> From: Ziji Hu 
>>
>> Marvell Xenon eMMC/SD/SDIO Host Controller contains PHY.
>> Three types of PHYs are supported.
>>
>> Add support to multiple types of PHYs init and configuration.
>> Add register definitions of PHYs.
>>
>> Signed-off-by: Hu Ziji 
>> Reviewed-by: Gregory CLEMENT 
>> Signed-off-by: Gregory CLEMENT 
>> ---
>>  MAINTAINERS|1 +-
>>  drivers/mmc/host/Makefile  |2 +-
>>  drivers/mmc/host/sdhci-xenon-phy.c | 1141 +-
>>  drivers/mmc/host/sdhci-xenon-phy.h |  157 -
>>  drivers/mmc/host/sdhci-xenon.c |4 +-
>>  drivers/mmc/host/sdhci-xenon.h |   17 +-
>>  6 files changed, 1321 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/mmc/host/sdhci-xenon-phy.c
>>  create mode 100644 drivers/mmc/host/sdhci-xenon-phy.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 859420e5dfd3..b5673c2ee5f2 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -7583,6 +7583,7 @@ M: Ziji Hu 
>>  L:  linux-...@vger.kernel.org
>>  S:  Supported
>>  F:  drivers/mmc/host/sdhci-xenon.*
>> +F:  drivers/mmc/host/sdhci-xenon-phy.*
>>  F:  Documentation/devicetree/bindings/mmc/marvell,sdhci-xenon.txt
>>  
>>  MATROX FRAMEBUFFER DRIVER
>> diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
>> index 75eaf743486c..4f2854556ff7 100644
>> --- a/drivers/mmc/host/Makefile
>> +++ b/drivers/mmc/host/Makefile
>> @@ -82,4 +82,4 @@ ifeq ($(CONFIG_CB710_DEBUG),y)
>>  endif
>>  
>>  obj-$(CONFIG_MMC_SDHCI_XENON)   += sdhci-xenon-driver.o
>> -sdhci-xenon-driver-y+= sdhci-xenon.o
>> +sdhci-xenon-driver-y+= sdhci-xenon.o sdhci-xenon-phy.o
>> diff --git a/drivers/mmc/host/sdhci-xenon-phy.c b/drivers/mmc/host/sdhci-xenon-phy.c
>> new file mode 100644
>> index ..4eb8fea1bec9
>> --- /dev/null
>> +++ b/drivers/mmc/host/sdhci-xenon-phy.c
> 
> 
> 
>> +static int __xenon_emmc_delay_adj_test(struct mmc_card *card)
>> +{
>> +int err;
>> +u8 *ext_csd = NULL;
>> +
>> +err = mmc_get_ext_csd(card, &ext_csd);
>> +kfree(ext_csd);
>> +
>> +return err;
>> +}
>> +
>> +static int __xenon_sdio_delay_adj_test(struct mmc_card *card)
>> +{
>> +struct mmc_command cmd = {0};
>> +int err;
>> +
>> +cmd.opcode = SD_IO_RW_DIRECT;
>> +cmd.flags = MMC_RSP_R5 | MMC_CMD_AC;
>> +
>> +err = mmc_wait_for_cmd(card->host, &cmd, 0);
>> +if (err)
>> +return err;
>> +
>> +if (cmd.resp[0] & R5_ERROR)
>> +return -EIO;
>> +if (cmd.resp[0] & R5_FUNCTION_NUMBER)
>> +return -EINVAL;
>> +if (cmd.resp[0] & R5_OUT_OF_RANGE)
>> +return -ERANGE;
>> +return 0;
>> +}
>> +
>> +static int __xenon_sd_delay_adj_test(struct mmc_card *card)
>> +{
>> +struct mmc_command cmd = {0};
>> +int err;
>> +
>> +cmd.opcode = MMC_SEND_STATUS;
>> +cmd.arg = card->rca << 16;
>> +cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
>> +
>> +err = mmc_wait_for_cmd(card->host, &cmd, 0);
>> +return err;
>> +}
>> +
>> +static int xenon_delay_adj_test(struct mmc_card *card)
>> +{
>> +WARN_ON(!card);
>> +WARN_ON(!card->host);
>> +
>> +if (mmc_card_mmc(card))
>> +return __xenon_emmc_delay_adj_test(card);
>> +else if (mmc_card_sd(card))
>> +return __xenon_sd_delay_adj_test(card);
>> +else if (mmc_card_sdio(card))
>> +return __xenon_sdio_delay_adj_test(card);
>> +else
>> +return -EINVAL;
>> +}
> 
> So you are issuing commands from the ->set_ios() callback.  I would want to
> get Ulf's OK for that before going further.
> 
Yes, you are correct.
In some speed modes, the Xenon SDHC has to send a series of transfers to
search for the best sampling point in the PHY delay line.
It is similar to a tuning process.
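
As a rough illustration of what such a search can look like, the sketch
below scans every delay tap with a test transfer and picks the middle of
the longest passing window. This is a generic approach assumed for the
example, not necessarily what the Xenon driver does; probe_fn,
find_sampling_point(), struct window and window_probe() are all
hypothetical names.

```c
#include <stddef.h>

/*
 * Hypothetical probe callback: returns 0 if a test transfer succeeds
 * at the given delay tap, non-zero otherwise.
 */
typedef int (*probe_fn)(unsigned int tap, void *ctx);

/*
 * Scan taps [0, ntaps) and return the midpoint of the longest run of
 * passing taps, or -1 if no tap passes.
 */
static int find_sampling_point(unsigned int ntaps, probe_fn probe, void *ctx)
{
    int best_start = -1, best_len = 0;
    int run_start = -1, run_len = 0;
    unsigned int tap;

    for (tap = 0; tap < ntaps; tap++) {
        if (probe(tap, ctx) == 0) {
            if (run_len == 0)
                run_start = (int)tap;
            run_len++;
            if (run_len > best_len) {
                best_len = run_len;
                best_start = run_start;
            }
        } else {
            run_len = 0;
        }
    }

    if (best_len == 0)
        return -1;
    return best_start + (best_len - 1) / 2;
}

/* Example probe for illustration: passes only for taps in [lo, hi]. */
struct window {
    unsigned int lo, hi;
};

static int window_probe(unsigned int tap, void *ctx)
{
    const struct window *w = ctx;

    return (tap >= w->lo && tap <= w->hi) ? 0 : -1;
}
```

In a real driver the probe would be one of the test commands quoted above
(for example the SEND_STATUS or EXT_CSD read), which is why the commands
end up being issued from ->set_ios().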

> One thing: you will need to ensure you don't trigger HS400 re-tuning
> because it will call back into ->set_ios().
> 
Could you please explain what you mean by "HS400 re-tuning" in more detail?
In the current MMC driver, "HS400 re-tuning" goes back to HS200, executes
HS200 tuning and comes back to HS400.
I'm sure our Xenon SDHC will not execute it.

However, in the coming eMMC 5.2, there is a real HS400 re-tuning, in which
tuning can be executed directly in HS400 mode.
Our Xenon SDHC will not trigger this HS400 re-tuning either.
But since there is no such feature in the MMC driver so far, I cannot give
you a 100% guarantee now.

> And you have the problem that you need to get a reference to the card before
> the card device has been added.  As I wrote in response to the previous
> patch, you should get Ulf's help with that too.
> 
Sure.
I will get the card_candidate issue solved first.
Thank you again for your review and help.

Thank you.

Best regards,
Hu Ziji
> 


Re: [PATCH 24/54] md/raid1: Improve another size determination in setup_conf()

2016-10-12 Thread Jes Sorensen
Dan Carpenter  writes:
> Compare:
>
>   foo = kmalloc(sizeof(*foo), GFP_KERNEL);
>
> This says you are allocating enough space for foo.  It can be reviewed
> by looking at one line.  If you change the type of foo it will still
> work.
>
>   foo = kmalloc(sizeof(struct whatever), GFP_KERNEL);
>
> There isn't enough information to say if this is correct.  If you change
> the type of foo then you have to update the allocation as well.
>
> It's not a super common type of bug, but I see it occasionally.

I know what you are saying, but the latter in my book is easier to read
and reminds you what the type is when you review the code.
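
For reference, the failure mode Dan describes can be reproduced in a few
lines of userspace C. The struct names conf_old and conf_new and both
helpers are invented for this sketch; it only compares the number of
bytes each form would request after the type of the pointer has changed.

```c
#include <stddef.h>

struct conf_old { int raid_disks; };
struct conf_new { int raid_disks; long extra[8]; };

/*
 * Bytes requested by:  conf = kmalloc(sizeof(*conf), GFP_KERNEL);
 * The size follows the declared type of 'conf' automatically.
 * sizeof does not evaluate its operand, so the NULL pointer is safe.
 */
static size_t request_by_deref(void)
{
    struct conf_new *conf = NULL;

    return sizeof(*conf);
}

/*
 * Bytes requested by:  conf = kmalloc(sizeof(struct conf_old), GFP_KERNEL);
 * after 'conf' was changed to struct conf_new but the sizeof was missed.
 */
static size_t request_by_stale_type(void)
{
    return sizeof(struct conf_old);
}
```

The second form still compiles after the type change, which is why the
mismatch has to be caught by review rather than by the compiler.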

The point is that this comes down to personal preference. Stating that the
former is the right way, or making that a rule and using checkpatch to
harass people with patches to change it, is bogus.

Jes


[GIT PULL] xfs: shared data extents support for 4.9-rc1

2016-10-12 Thread Dave Chinner
Hi Linus,

This is the second part of the XFS updates for this merge cycle.
This pullreq contains the new shared data extents feature for XFS,
and can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-reflink-for-linus-4.9-rc1

The full pull request output is below.

Given the complexity and size of this change I am expecting - like
the addition of reverse mapping last cycle - that there will be some
follow-up bug fixes and cleanups around the -rc3 stage for issues
that I'm sure will show up once the code hits a wider userbase.

What it is:

At the most basic level we are simply adding shared data extents to
XFS - i.e. a single extent on disk can now have multiple owners. To
do this we have to add new on-disk features to both track the shared
extents and the number of times they've been shared. This is done by
the new "refcount" btree that sits in every allocation group. When
we share or unshare an extent, this tree gets updated.

Along with this new tree, the reverse mapping tree needs to be
updated to track each owner of a shared extent. This also needs to
be updated on every share/unshare operation. These interactions at
extent allocation and freeing time have complex ordering and
recovery constraints, so there's a significant amount of new
intent-based transaction code to ensure that operations are
performed atomically from both the runtime and integrity/crash
recovery perspectives.

We also need to break sharing when writes hit a shared extent - this
is where the new copy-on-write implementation comes in. We allocate
new storage and copy the original data along with the overwrite data
into the new location.  We only do this for data as we don't share
metadata at all - each inode has its own metadata that tracks the
shared data extents, the extents undergoing CoW and its own private
extents.
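
As a toy illustration of the share and break-sharing steps described
above (and nothing like the real XFS implementation, which works on
btrees, transactions and delayed allocation), the sketch below models one
extent with an owner count. All names here (struct extent, struct
toy_file, toy_reflink(), toy_write()) are invented for the example.

```c
#include <stdlib.h>
#include <string.h>

#define EXTENT_BYTES 16

/* Toy stand-in for an on-disk extent: data plus an owner count. */
struct extent {
    int refcount;
    char data[EXTENT_BYTES];
};

/* Toy stand-in for a file that maps exactly one extent. */
struct toy_file {
    struct extent *ext;
};

/* Share: the destination maps the same extent; only the count changes. */
static void toy_reflink(struct toy_file *dst, const struct toy_file *src)
{
    dst->ext = src->ext;
    dst->ext->refcount++;
}

/*
 * Write: if the extent is shared, copy it to new storage first so the
 * other owners keep the old contents, then apply the overwrite.
 * Returns 0 on success, -1 on a bad range or allocation failure.
 */
static int toy_write(struct toy_file *f, size_t off, const char *buf,
                     size_t len)
{
    if (off > EXTENT_BYTES || len > EXTENT_BYTES - off)
        return -1;

    if (f->ext->refcount > 1) {
        struct extent *copy = malloc(sizeof(*copy));

        if (!copy)
            return -1;
        memcpy(copy->data, f->ext->data, EXTENT_BYTES);
        copy->refcount = 1;
        f->ext->refcount--;     /* this file no longer owns the original */
        f->ext = copy;
    }

    memcpy(f->ext->data + off, buf, len);
    return 0;
}
```

The refcount update on share and on write is the part that the new
refcount btree tracks on disk; the copy in toy_write() corresponds to
allocating new storage and combining the original data with the
overwrite.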

Of course, being XFS, nothing is simple - we use delayed allocation
for CoW similar to how we use it for normal writes. ENOSPC is a
significant issue here - we build on the reservation code added
in 4.8-rc1 with the reverse mapping feature to ensure we don't get
spurious ENOSPC issues part way through a CoW operation. These
mechanisms also help minimise fragmentation due to repeated CoW
operations.  To further reduce fragmentation overhead, we've also
introduced a CoW extent size hint, which indicates how large a
region we should allocate when we execute a CoW operation.

With all this functionality in place, we can hook up
.copy_file_range, .clone_file_range and .dedupe_file_range and we
gain all the capabilities of reflink and other VFS-provided
functionality that enables manipulation of shared extents. We also
added a fallocate mode that explicitly unshares a range of a file,
which we implemented as an explicit CoW of all the shared extents in
a file.

As such, it's a huge chunk of new functionality with new on-disk
format features and internal infrastructure. It warns at mount time
that it is an experimental feature and may eat data (as we do with
all new on-disk features until they stabilise).  We have not
released userspace support for it yet - userspace support currently
requires download from Darrick's xfsprogs repo and build from
source, so the access to this feature is really developer/tester
only at this point. Initial userspace support will be released at
the same time the kernel with this code in it is released.

The new code causes 5-6 new failures with xfstests - these aren't
serious functional failures but things like the output of tests changing
slightly due to perturbations in layouts, space usage, etc.  OTOH,
we've added 150+ new tests to xfstests that specifically exercise
this new functionality so it's got far better test coverage than any
functionality we've previously added to XFS.

Darrick has done a pretty amazing job getting us to this stage, and
special mention also needs to go to Christoph (review, testing,
improvements and bug fixes) and Brian (caught several intricate
bugs during review) for the effort they've also put in.

Thanks,

-Dave.

--
The following changes since commit 155cd433b516506df065866f3d974661f6473572:

  Merge branch 'xfs-4.9-log-recovery-fixes' into for-next (2016-10-03 09:56:28 +1100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-reflink-for-linus-4.9-rc1

for you to fetch changes up to feac470e3642e8956ac9b7f14224e6b301b9219d:

  xfs: convert COW blocks to real blocks before unwritten extent conversion (2016-10-11 09:03:19 +1100)


xfs: reflink update for 4.9-rc1

 __________________________________
< XFS has gained super CoW powers! >
 ----------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Included in this update:
- unshare range (FALLOC_FL_UNSHARE) support for fallocate
- copy-on-write extent size hints 

[PATCH v2 3/4] Add hwcap2 for x86

2016-10-12 Thread Grzegorz Andrejczuk
Add an hwcap2 attribute for x86.
Reserve the first bit (bit 0) of HWCAP2 to expose Xeon Phi ring 3
MONITOR/MWAIT. With this, userspace applications can detect whether the
ring 3 MONITOR/MWAIT instructions are available.

Change-Id: I37d0354d1e2b9594d7feebc2bacda30b68163efe
Signed-off-by: Grzegorz Andrejczuk 
---
 arch/x86/include/asm/elf.h| 7 +++
 arch/x86/include/uapi/asm/hwcap.h | 7 +++
 arch/x86/kernel/cpu/common.c  | 3 +++
 3 files changed, 17 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/hwcap.h

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index e7f155c..a3f7856 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -258,6 +258,13 @@ extern int force_personality32;
 
 #define ELF_HWCAP  (boot_cpu_data.x86_capability[CPUID_1_EDX])
 
+extern unsigned int elf_hwcap2;
+
+/* HWCAP2 supplies kernel enabled CPU feature, so that the application
+   can know that it can safely use them. The bits are defined in
+   uapi/asm/hwcap.h. */
+#define ELF_HWCAP2 elf_hwcap2
+
 /* This yields a string that ld.so will use to load implementation
specific libraries for optimization.  This is more specific in
intent than poking at uname or /proc/cpuinfo.
diff --git a/arch/x86/include/uapi/asm/hwcap.h b/arch/x86/include/uapi/asm/hwcap.h
new file mode 100644
index 000..d1f4f98
--- /dev/null
+++ b/arch/x86/include/uapi/asm/hwcap.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_HWCAP_H
+#define _ASM_HWCAP_H 1
+
+/* Kernel enabled Ring 3 MWAIT for Xeon Phi*/
+#define HWCAP2_PHIR3MWAIT  (1 << 0)
+/* upto bit 31 free */
+#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bcc9ccc..93ffaa5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,6 +52,8 @@
 
 #include "cpu.h"
 
+unsigned elf_hwcap2 __read_mostly;
+
 /* all of these masks are initialized in setup_cpu_local_masks() */
 cpumask_var_t cpu_initialized_mask;
 cpumask_var_t cpu_callout_mask;
-- 
2.5.1
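
On the userspace side, detection would presumably go through the ELF
auxiliary vector. The sketch below is an assumption about how an
application might use it, not part of the patch: it relies on glibc's
getauxval() and AT_HWCAP2, and redefines HWCAP2_PHIR3MWAIT locally with
the value proposed in this patch. The two helper names are made up.

```c
#include <sys/auxv.h>

/*
 * Bit value copied from the hwcap.h proposed in this patch; defined
 * locally because this sketch does not assume that header is installed.
 */
#ifndef HWCAP2_PHIR3MWAIT
#define HWCAP2_PHIR3MWAIT (1 << 0)
#endif

/* Pure helper: test the bit in a given HWCAP2 word. */
static int hwcap2_has_r3mwait(unsigned long hwcap2)
{
    return (hwcap2 & HWCAP2_PHIR3MWAIT) != 0;
}

/*
 * Query the running process's auxiliary vector. getauxval() returns 0
 * when AT_HWCAP2 is absent, which reads as "feature not available".
 */
static int ring3_mwait_available(void)
{
    return hwcap2_has_r3mwait(getauxval(AT_HWCAP2));
}
```

An application would call ring3_mwait_available() once at startup and
fall back to a different wait mechanism when it returns 0.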



[PATCH v2 2/4] Add enabling of the R3 MWAIT during boot for KNL

2016-10-12 Thread Grzegorz Andrejczuk
If the processor is an Intel Xeon Phi, enable the user-level MWAIT feature.
Enabling this feature suppresses the invalid-opcode error when
MONITOR/MWAIT is executed from ring 3.

Change-Id: I1c7defb99296b022790a068a6c725b3e860cd68c
Signed-off-by: Grzegorz Andrejczuk 
---
 arch/x86/kernel/cpu/intel.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index fcd484d..ac6df08 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -61,6 +61,14 @@ void check_mpx_erratum(struct cpuinfo_x86 *c)
}
 }
 
+static int phir3mwait = 1;
+static int __init phir3mwait_disable(char *value)
+{
+   phir3mwait = 0;
+   return 1;
+}
+__setup("intel-phir3mwait=disable", phir3mwait_disable);
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
u64 misc_enable;
@@ -211,6 +219,25 @@ static void early_init_intel(struct cpuinfo_x86 *c)
}
 
check_mpx_erratum(c);
+
+   /*
+    * Set ring 3 MONITOR/MWAIT for all threads
+    * when the CPU is Xeon Phi Family x200.
+    * This can be disabled with the intel-phir3mwait=disable cmdline switch.
+    * We preserve the reserved values and set only the 2nd bit.
+    * Ref:
+    * https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait
+    */
+   if (c->x86 == 6 &&
+       c->x86_model == INTEL_FAM6_XEON_PHI_KNL &&
+       phir3mwait) {
+       u64 prev;
+
+       rdmsrl(MSR_PHI_MISC_THD_FEATURE, prev);
+       if ((prev & MSR_PHI_MISC_THD_FEATURE_R3MWAIT) == 0)
+           wrmsrl(MSR_PHI_MISC_THD_FEATURE,
+                  prev | MSR_PHI_MISC_THD_FEATURE_R3MWAIT);
+   }
 }
 
 #ifdef CONFIG_X86_32
-- 
2.5.1
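
The read-modify-write in this hunk can be checked in isolation with a
small userspace model. This is only a sketch: the bit definition is
copied from patch 1/4, while r3mwait_new_value() and
r3mwait_needs_write() are hypothetical helpers that stand in for the
rdmsrl()/wrmsrl() pair, with 'enabled' playing the role of the
phir3mwait flag.

```c
#include <stdint.h>

/* Values copied from patch 1/4 of this series. */
#define MSR_PHI_MISC_THD_FEATURE_R3MWAIT_BIT 1
#define MSR_PHI_MISC_THD_FEATURE_R3MWAIT \
    (UINT64_C(1) << MSR_PHI_MISC_THD_FEATURE_R3MWAIT_BIT)

/*
 * Hypothetical pure helper: given the current MSR value and the
 * opt-out flag, return the value the MSR should hold afterwards.
 * All other (reserved) bits are carried over unchanged.
 */
static uint64_t r3mwait_new_value(uint64_t prev, int enabled)
{
    if (!enabled)
        return prev;
    return prev | MSR_PHI_MISC_THD_FEATURE_R3MWAIT;
}

/* A write is only needed when the value actually changes. */
static int r3mwait_needs_write(uint64_t prev, int enabled)
{
    return r3mwait_new_value(prev, enabled) != prev;
}
```

Note that "2nd bit" in the comment means bit index 1, so the mask is
0x2, and booting with intel-phir3mwait=disable corresponds to
enabled == 0 here.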



[PATCH v2 3/4] Add hwcap2 for x86

2016-10-12 Thread Grzegorz Andrejczuk
Add an ELF hwcap2 attribute for x86.
Reserve bit 0 of HWCAP2 to expose Xeon Phi ring 3 MONITOR/MWAIT.
With this, userspace applications can detect that the ring 3 MONITOR/MWAIT
instructions are available.

Change-Id: I37d0354d1e2b9594d7feebc2bacda30b68163efe
Signed-off-by: Grzegorz Andrejczuk 
---
 arch/x86/include/asm/elf.h| 7 +++
 arch/x86/include/uapi/asm/hwcap.h | 7 +++
 arch/x86/kernel/cpu/common.c  | 3 +++
 3 files changed, 17 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/hwcap.h

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index e7f155c..a3f7856 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -258,6 +258,13 @@ extern int force_personality32;
 
 #define ELF_HWCAP  (boot_cpu_data.x86_capability[CPUID_1_EDX])
 
+extern unsigned int elf_hwcap2;
+
+/* HWCAP2 supplies kernel enabled CPU feature, so that the application
+   can know that it can safely use them. The bits are defined in
+   uapi/asm/hwcap.h. */
+#define ELF_HWCAP2 elf_hwcap2
+
 /* This yields a string that ld.so will use to load implementation
specific libraries for optimization.  This is more specific in
intent than poking at uname or /proc/cpuinfo.
diff --git a/arch/x86/include/uapi/asm/hwcap.h b/arch/x86/include/uapi/asm/hwcap.h
new file mode 100644
index 0000000..d1f4f98
--- /dev/null
+++ b/arch/x86/include/uapi/asm/hwcap.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_HWCAP_H
+#define _ASM_HWCAP_H 1
+
+/* Kernel enabled Ring 3 MWAIT for Xeon Phi*/
+#define HWCAP2_PHIR3MWAIT  (1 << 0)
+/* upto bit 31 free */
+#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bcc9ccc..93ffaa5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,6 +52,8 @@
 
 #include "cpu.h"
 
+unsigned elf_hwcap2 __read_mostly;
+
 /* all of these masks are initialized in setup_cpu_local_masks() */
 cpumask_var_t cpu_initialized_mask;
 cpumask_var_t cpu_callout_mask;
-- 
2.5.1
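
For illustration, a userspace program could test the proposed bit roughly as
follows. This is a sketch under assumptions: it relies on the kernel passing
ELF_HWCAP2 to the process as AT_HWCAP2, it copies the HWCAP2_PHIR3MWAIT value
from this patch rather than from an installed header, and the helper names
are hypothetical. getauxval() is the glibc (2.16 and later) interface.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP2 */

/* Bit proposed by this patch in arch/x86/include/uapi/asm/hwcap.h. */
#define HWCAP2_PHIR3MWAIT (1UL << 0)

/* Pure helper: test the proposed bit in an AT_HWCAP2 word. */
static inline bool hwcap2_has_r3mwait(unsigned long hwcap2)
{
	return (hwcap2 & HWCAP2_PHIR3MWAIT) != 0;
}

/* Ask the running kernel. getauxval() returns 0 when AT_HWCAP2 is
 * not present, which reads as "feature not advertised". */
static inline bool cpu_has_r3mwait(void)
{
	return hwcap2_has_r3mwait(getauxval(AT_HWCAP2));
}
```

An application would call cpu_has_r3mwait() once at startup and fall back to
a polling or futex-based wait when it returns false.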



[PATCH v2 0/4] Enabling Ring 3 MONITOR/MWAIT feature for Knights Landing

2016-10-12 Thread Grzegorz Andrejczuk
These patches enable the Intel Xeon Phi x200 feature that allows the
MONITOR/MWAIT instructions to be used in ring 3 (userspace). The patches set
MSR 0x140 on all logical CPUs, then expose the feature as a CPU feature flag
and introduce an ELF HWCAP2 capability for x86.
Reference:
https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait

v2:
Check MSR before wrmsrl
Shortened names
Used Word 3 for feature init_scattered_cpuid_features()
Fixed commit messages

Grzegorz Andrejczuk (4):
  Add R3MWAIT register and bit to msr-info.h
  Add enabling of the R3 MWAIT during boot for KNL
  Add hwcap2 for x86
  Add R3MWAIT to CPU features

 arch/x86/include/asm/cpufeatures.h |  2 ++
 arch/x86/include/asm/elf.h |  7 +++
 arch/x86/include/asm/msr-index.h   |  5 +
 arch/x86/include/uapi/asm/hwcap.h  |  7 +++
 arch/x86/kernel/cpu/common.c   |  6 ++
 arch/x86/kernel/cpu/intel.c| 27 +++
 arch/x86/kernel/cpu/scattered.c|  5 +
 7 files changed, 59 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/hwcap.h

-- 
2.5.1



Re: [PATCH] char/tpm: Check return code of wait_for_tpm_stat

2016-10-12 Thread Jarkko Sakkinen
On Tue, Oct 11, 2016 at 08:01:09PM +0200, Peter Huewe wrote:
> 
> 
> Hi
> On 11 October 2016 19:13:13 CEST, Jason Gunthorpe wrote:
> >On Tue, Oct 11, 2016 at 03:01:01PM +0300, Jarkko Sakkinen wrote:
> >> From: Peter Huewe 
> >> 
> >> In some weird cases it might be possible that the TPM does not set
> >> STS.VALID within the given timeout time (or ever) but sets STS.EXPECT
> >> (STS=0x0C) In this case the driver gets stuck in the while loop of
> >> tpm_tis_send_data and loops endlessly.
> >
> >Doesn't that exchange mean the TPM has lost synchronization with the
> >driver? Or maybe it crashed executing a command or something..
> 
> I saw that in the field on quite a few (similar) systems with our LPC TPMs,
> so it affects end users.
> Yes, it is caused by some desynchronization or something similar.
> 
> If you manually send a commandReady by mmapping the memory region, you can
> unstick the driver, and the situation was never seen again on that system.
> 
> The exact reason why this happens is not yet known, but the driver should
> definitely not be stuck in an endless loop (which also turns the application
> into a zombie) in that case; it should bail out as defined in the TIS
> protocol. The next access sends the commandReady, which cures the
> desynchronization.

Even as a sanity check, return codes should be checked, so in
any case I lean towards applying this patch. It makes the
driver more robust.

/Jarkko
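
As a rough illustration of the point discussed above (not the actual patch,
which is not quoted in this thread), a send loop that checks the result of
each wait and bails out on timeout might look like this. All names here are
hypothetical stand-ins; the wait callback plays the role of
wait_for_tpm_stat().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for wait_for_tpm_stat():
 * returns 0 on success or a negative errno such as -ETIME. */
typedef int (*wait_fn)(void *ctx);

/* Demo stubs: a device that always answers, and one that never does. */
static int wait_ok(void *ctx)      { (void)ctx; return 0; }
static int wait_timeout(void *ctx) { (void)ctx; return -ETIME; }

/* Send len bytes in bursts; stop as soon as a wait fails instead of
 * looping forever on a device that never reports the expected status. */
static int send_bursts(wait_fn wait, void *ctx, int len, int burst)
{
	int sent = 0;

	if (burst <= 0)
		return -EINVAL;

	while (sent < len) {
		int rc = wait(ctx);

		if (rc < 0)
			return rc;	/* propagate the timeout to the caller */

		sent += (len - sent < burst) ? (len - sent) : burst;
	}
	return sent;
}
```

With the check in place, a desynchronized device turns into an error return
that the caller can handle, rather than an endless loop.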


Re: GPU-DRM-Savage: Less function calls in savage_bci_cmdbuf() after error detection

2016-10-12 Thread SF Markus Elfring
>> Date: Thu, 18 Aug 2016 21:28:58 +0200
>>
>> The kfree() function was called in a few cases by the
>> savage_bci_cmdbuf() function during error handling
>> even if a passed variable contained a null pointer.
>>
>> Adjust jump targets according to the Linux coding style convention.
>>
>> Signed-off-by: Markus Elfring 
> 
> Not sure this is worth it, I'll pass. Patch 1 merged.

Unfortunately, it seems that selecting only one update step
from this small patch series has had unwanted consequences.

Will the update suggestion “[patch] drm/savage: dereferencing an error pointer”
by Dan Carpenter (from today) trigger further software development discussions?

https://patchwork.kernel.org/patch/9372127/
https://lkml.kernel.org/r/<20161012062227.GU12841@mwanda>


Will an update step like “[PATCH 2/2] GPU-DRM-Savage: Less function calls in
savage_bci_cmdbuf() after error detection” (from 2016-08-18) become worth
reconsidering?

https://patchwork.kernel.org/patch/9289183/
https://lkml.kernel.org/r/

Regards,
Markus


[PATCH v2][RESEND] seq_file: don't set read position for invalid iterator

2016-10-12 Thread Tomasz Majchrzak
If a kernfs file is empty on the first read, successive read operations
using the same file descriptor will return no data, even when data is
available. The default kernfs 'seq_next' implementation advances the iterator
position even when the next object is not there. Kernfs 'seq_start' for
subsequent requests will then not return an iterator, as the position is
already on the second object.

This bug prevents monitoring of badblocks sysfs files from MD raid.
They are initially empty, but if data appears at some stage, userspace is
not able to read it. The bug doesn't affect any released applications, but
the fix is necessary for upcoming bad block support for external metadata in
MD raid.

Signed-off-by: Tomasz Majchrzak 
Reviewed-by: Dan Williams 
---
 fs/seq_file.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 6dc4296..74197f4 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -235,7 +235,7 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	p = m->op->start(m, &pos);
 	while (1) {
 		err = PTR_ERR(p);
-		if (!p || IS_ERR(p))
+		if (IS_ERR_OR_NULL(p))
 			break;
 		err = m->op->show(m, p);
 		if (err < 0)
@@ -244,7 +244,8 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 			m->count = 0;
 		if (unlikely(!m->count)) {
 			p = m->op->next(m, p, &pos);
-			m->index = pos;
+			if (!IS_ERR_OR_NULL(p))
+				m->index = pos;
 			continue;
 		}
 		if (m->count < m->size)
-- 
1.8.3.1
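
The behaviour described in the commit message can be reproduced with a small
userspace model. This is a simplified sketch, not kernel code: the struct and
function names are hypothetical, start() is assumed to yield a valid iterator
only at position 0, and next() is assumed to always advance the position and
never return a valid iterator, as the message describes for kernfs.

```c
#include <stdbool.h>

/* Minimal model of the state that matters for this bug. */
struct seq_model {
	bool have_data;	/* does show() currently produce output? */
	long index;	/* saved read position, like m->index */
};

/* start(): an iterator exists only at position 0 (single-record file). */
static bool model_start(long *pos) { return *pos == 0; }

/* next(): always advances the position, never yields another record. */
static bool model_next(long *pos) { ++*pos; return false; }

/* One read() call. Returns the number of records shown.
 * fixed == false models the old code, fixed == true the patched code. */
static int model_read(struct seq_model *m, bool fixed)
{
	long pos = m->index;
	bool valid = model_start(&pos);

	while (valid) {
		int count = m->have_data ? 1 : 0;	/* output of show() */

		if (count == 0) {
			valid = model_next(&pos);
			/* Old code stores pos unconditionally; the patch
			 * stores it only for a valid iterator. */
			if (!fixed || valid)
				m->index = pos;
			continue;
		}
		return count;
	}
	return 0;
}
```

In the old model, the first empty read moves index to 1, so a later read
never reaches show() again; in the patched model, index stays at 0 and the
data is returned once it appears.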



Re: [Intel-gfx] drm/i915: WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock)

2016-10-12 Thread Paul Bolle
On Wed, 2016-10-12 at 14:08 +0300, Joonas Lahtinen wrote:
> Bisecting the offending commit between v4.8 and v4.8.1 would be a good
> start.

That would be between v4.7 and v4.8. (I guess my report was ambiguous.)

That might take some time, because bisecting always takes a long time,
and especially since hitting this WARNING sometimes takes over an hour.
Anyhow, please prod me if I stay silent for too long.

Thanks,


Paul Bolle


[PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-12 Thread Michal Hocko
From: Michal Hocko 

Compaction has been disabled for GFP_NOFS and GFP_NOIO requests since
direct compaction was introduced by 56de7263fcf3 ("mm: compaction:
direct compact when a high-order allocation fails"). The main reason
is that the migration of page cache pages might recurse back into the fs/io
layer and we could potentially deadlock. This is overly conservative
because all anonymous memory can be migrated in the GFP_NOFS context
just fine. This might be a large portion of the memory in many/most
workloads.

Remove the GFP_NOFS restriction and make sure that we skip all fs pages
(those with a mapping) while isolating pages to be migrated. We cannot
consider clean fs pages because they might need a metadata update so
only isolate pages without any mapping for nofs requests.
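
The isolation rule described above can be sketched as a small predicate. This
is an illustration only, not the patch itself: the struct and function names
are hypothetical, the fs_mapping field stands for "this page belongs to a
filesystem mapping", and the flag values are the 0x40 and 0x80 used by the
tracepoint filter later in this message, which appear to correspond to
__GFP_IO and __GFP_FS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values taken from the tracepoint filter in this message;
 * assumed (not confirmed here) to be __GFP_IO and __GFP_FS. */
#define MODEL_GFP_IO 0x40u
#define MODEL_GFP_FS 0x80u

/* Hypothetical page model: fs_mapping is NULL for pages that do not
 * belong to a filesystem mapping (e.g. anonymous memory). */
struct page_model {
	const void *fs_mapping;
};

/* May this page be isolated for migration under the given gfp mask?
 * For NOFS requests, only pages without a mapping are considered. */
static inline bool may_isolate(const struct page_model *page,
			       unsigned int gfp_mask)
{
	if (!(gfp_mask & MODEL_GFP_FS) && page->fs_mapping != NULL)
		return false;
	return true;
}
```

Requests that allow filesystem recursion are unaffected by the check; only
the NOFS case is restricted to pages without a mapping.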

The effect of this patch will probably be very limited in many/most
workloads because higher-order GFP_NOFS requests are quite rare,
although different configurations might lead to very different results.
David Chinner has mentioned a heavy metadata workload with 64kB blocks,
about which, to quote him:
"
Unfortunately, there was an era of cargo cult configuration tweaks
in the Ceph community that has resulted in a large number of
production machines with XFS filesystems configured this way. And a
lot of them store large numbers of small files and run under
significant sustained memory pressure.

I slowly working towards getting rid of these high order allocations
and replacing them with the equivalent number of single page
allocations, but I haven't got that (complex) change working yet.
"

We can do the following to simulate that workload:
$ mkfs.xfs -f -n size=64k 
$ mount  /mnt/scratch
$ time ./fs_mark  -D  1  -S0  -n  10  -s  0  -L  32 \
-d  /mnt/scratch/0  -d  /mnt/scratch/1 \
-d  /mnt/scratch/2  -d  /mnt/scratch/3 \
-d  /mnt/scratch/4  -d  /mnt/scratch/5 \
-d  /mnt/scratch/6  -d  /mnt/scratch/7 \
-d  /mnt/scratch/8  -d  /mnt/scratch/9 \
-d  /mnt/scratch/10  -d  /mnt/scratch/11 \
-d  /mnt/scratch/12  -d  /mnt/scratch/13 \
-d  /mnt/scratch/14  -d  /mnt/scratch/15

and indeed it hammers the system with many high-order GFP_NOFS requests, as
per a simple tracepoint filter during the load:
$ echo '!(gfp_flags & 0x80) && (gfp_flags & 0x40)' > $TRACE_MNT/events/kmem/mm_page_alloc/filter
I am getting
5287609 order=0
 37 order=1
1594905 order=2
3048439 order=3
6699207 order=4
  66645 order=5

My testing was done in a kvm guest so performance numbers should be
taken with a grain of salt but there seems to be a difference when the
patch is applied:

* Original kernel
FSUse%  Count  Size  Files/sec  App Overhead
 1  1600   4300.1 20745838
 3  3200   4239.9 23849857
 5  4800   4243.4 25939543
 6  6400   4248.4 19514050
 8  8000   4262.1 20796169
 9  9600   4257.6 21288675
11 11200   4259.7 19375120
13 12800   4220.7 22734141
14 14400   4238.5 31936458
16 16000   4231.5 23409901
18 17600   4045.3 23577700
19 19200   2783.4 58299526
21 20800   2678.2 40616302
23 22400   2693.5 83973996

and xfs complaining about memory allocation not making progress
[ 2304.372647] XFS: fs_mark(3289) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)
[ 2304.443323] XFS: fs_mark(3285) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)
[ 4796.772477] XFS: fs_mark(3424) possible memory allocation deadlock size 46936 in kmem_alloc (mode:0x2408240)
[ 4796.775329] XFS: fs_mark(3423) possible memory allocation deadlock size 51416 in kmem_alloc (mode:0x2408240)
[ 4797.388808] XFS: fs_mark(3424) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)

* Patched kernel
FSUse%  Count  Size  Files/sec  App Overhead
 1  1600   4289.1 19243934
 3  3200   4241.6 32828865
 5  4800   4248.7 32884693
 6  6400   4314.4 19608921
 8  8000   4269.9 24953292
 9  9600   4270.7 33235572
11 11200   4346.4 40817101
13 12800   4285.3 29972397
14 14400   4297.2 20539765
16 16000   4219.6 18596767
18 17600   4273.8 49611187
 

Re: [PATCH 6/10] mmc: sdhci-xenon: Add Marvell Xenon SDHC core functionality

2016-10-12 Thread Ziji Hu
Hi Adrian,

Thank you very much for your review.
I will fix the typo first.

On 2016/10/11 20:37, Adrian Hunter wrote:
> On 07/10/16 18:22, Gregory CLEMENT wrote:
>> From: Ziji Hu 
>>
>> Add Xenon eMMC/SD/SDIO host controller core functionality.
>> Add Xenon specific intialization process.
>> Add Xenon specific mmc_host_ops APIs.
>> Add Xenon specific register definitions.
>>
>> Add CONFIG_MMC_SDHCI_XENON support in drivers/mmc/host/Kconfig.
>>
>> Marvell Xenon SDHC conforms to SD Physical Layer Specification
>> Version 3.01 and is designed according to the guidelines provided
>> in the SD Host Controller Standard Specification Version 3.00.
>>
>> Signed-off-by: Hu Ziji 
>> Reviewed-by: Gregory CLEMENT 
>> Signed-off-by: Gregory CLEMENT 
> 
> I looked at a couple of things but you need to sort out the issues with
> card_candidate before going further.
> 
Understood.
I will improve the card_candidate handling. Please check the details below.

>> ---

>> +
>> +static int xenon_emmc_signal_voltage_switch(struct mmc_host *mmc,
>> +struct mmc_ios *ios)
>> +{
>> +unsigned char voltage = ios->signal_voltage;
>> +
>> +if ((voltage == MMC_SIGNAL_VOLTAGE_330) ||
>> +(voltage == MMC_SIGNAL_VOLTAGE_180))
>> +return __emmc_signal_voltage_switch(mmc, voltage);
>> +
>> +dev_err(mmc_dev(mmc), "Unsupported signal voltage: %d\n",
>> +voltage);
>> +return -EINVAL;
>> +}
>> +
>> +static int xenon_start_signal_voltage_switch(struct mmc_host *mmc,
>> + struct mmc_ios *ios)
>> +{
>> +struct sdhci_host *host = mmc_priv(mmc);
>> +struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>> +struct sdhci_xenon_priv *priv = sdhci_pltfm_priv(pltfm_host);
>> +
>> +/*
>> + * Before SD/SDIO set signal voltage, SD bus clock should be
>> + * disabled. However, sdhci_set_clock will also disable the Internal
>> + * clock in mmc_set_signal_voltage().
>> + * If Internal clock is disabled, the 3.3V/1.8V bit can not be updated.
>> + * Thus here manually enable internal clock.
>> + *
>> + * After switch completes, it is unnecessary to disable internal clock,
>> + * since keeping internal clock active obeys SD spec.
>> + */
>> +enable_xenon_internal_clk(host);
>> +
>> +if (priv->card_candidate) {
> 
> mmc_power_up() calls __mmc_set_signal_voltage() calls
> host->ops->start_signal_voltage_switch so priv->card_candidate could be an
> invalid reference to an old card.
> 
> So that's not going to work if the card changes - not only for removable
> cards but even for eMMC if init fails and retries.
> 
As you point out, this piece of code has defects, even though it
actually works on multiple Marvell platforms, unless the eMMC card is removable.

I can add a property to explicitly indicate the eMMC type in the DTS.
Then the card_candidate access can be removed here.
Does that sound more reasonable to you?

>> +if (mmc_card_mmc(priv->card_candidate))
>> +return xenon_emmc_signal_voltage_switch(mmc, ios);
> 
> So if all you need to know is whether it is a eMMC, why can't DT tell you?
> 
I can add an eMMC type property in the DTS to remove the card_candidate
access here.

>> +}
>> +
>> +return sdhci_start_signal_voltage_switch(mmc, ios);
>> +}
>> +
>> +/*
>> + * After determining which slot is used for SDIO,
>> + * some additional task is required.
>> + */
>> +static void xenon_init_card(struct mmc_host *mmc, struct mmc_card *card)
>> +{
>> +struct sdhci_host *host = mmc_priv(mmc);
>> +u32 reg;
>> +u8 slot_idx;
>> +struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>> +struct sdhci_xenon_priv *priv = sdhci_pltfm_priv(pltfm_host);
>> +
>> +/* Link the card for delay adjustment */
>> +priv->card_candidate = card;
> 
> You really need a better way to get the card.  I suggest you take up the
> issue with Ulf.  One possibility is to have mmc core set host->card = card
> much earlier.
> 
Could you please tell me whether any issue related to card_candidate still
exists once the card_candidate access is removed from
xenon_start_signal_voltage_switch() above?
It seems that by the time init_card is called, the card structure has already
been updated and is stable in the MMC/SD/SDIO initialization sequence.
May I keep it here?

>> +/* Set tuning functionality of this slot */
>> +xenon_slot_tuning_setup(host);
>> +
>> +slot_idx = priv->slot_idx;
>> +if (!mmc_card_sdio(card)) {
>> +/* Re-enable the Auto-CMD12 cap flag. */
>> +host->quirks |= SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12;
>> +host->flags |= SDHCI_AUTO_CMD12;
>> +
>> +/* Clear SDIO Card Inserted indication */
>> +reg = sdhci_readl(host, SDHC_SYS_CFG_INFO);
>> +reg &= ~(1 << (slot_idx + SLOT_TYPE_SDIO_SHIFT));

Re: [PATCH RESEND] ARM: dts: keystone-k2*: Increase SPI Flash partition size for U-Boot

2016-10-12 Thread Russell King - ARM Linux
On Wed, Oct 12, 2016 at 04:30:28PM +0530, Vignesh R wrote:
> Hi,
> 
> On Monday 10 October 2016 08:01 PM, Russell King - ARM Linux wrote:
> > On Mon, Oct 10, 2016 at 07:41:41PM +0530, Vignesh R wrote:
> >> U-Boot SPI Boot image is now more than 512KB for Keystone2 devices and
> >> cannot fit into existing partition. So, increase the SPI Flash partition
> >> for U-Boot to 1MB for all Keystone2 devices.
> >>
> >> Signed-off-by: Vignesh R 
> >> ---
> >>
> >> This was submitted to v4.9 merge window but was never picked up:
> >> https://patchwork.kernel.org/patch/9135023/
> > 
> > I think you need to explain why it's safe to change the layout of the
> > flash partitions like this.
> > 
> > - What is this "misc" partition?
> > 
> 
> This partition seems to have existed from the very beginning.  I believe this
> is just a spare area of flash that can be used as per end-user
> requirement, either to store a small filesystem or a kernel. Copying
> Murali, who added the above partition, in case he has any input here.
> 
> > - Why is it safe to move the "misc" partition in this way?
> > 
> > - Do users need to do anything with data stored in the "misc" partition
> >   when changing kernels?
> > 
> 
> MTD layer will take care of most abstractions (like start address etc).
> Will add a note in commit message informing about the reduction in size
> of the partition.
> 
> > If the "misc" partition is simply unused space on the flash device, why
> > list it in DT?
> > 
> 
> If the unused space is not listed in the DT, then no /dev/mtdX
> node is created for the unused section. The user will then have to manually
> edit the DT in order to get the node and mount it. Instead, let's make it
> available by default.

So, taken all together, your argument is:

- We want a user partition
- It's okay to destroy the data in the user's partition by moving it
  around randomly between kernel versions.

The two do not naturally go together at all.  You're messing with user
expectations in ways you should not be.  This really is not an acceptable
approach.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
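
To make the objection concrete, here is a small sketch of the arithmetic, under stated assumptions: "misc" starts immediately after the U-Boot partition and runs to the end of the flash, and the flash is 16 MiB. Neither assumption is taken from the patch; only the 512KB to 1MB change is. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One flash partition: byte offset and size, as in a DT "reg" pair. */
struct part {
	uint32_t offset;
	uint32_t size;
};

/* Assumed flash size, for illustration only. */
#define FLASH_SIZE	(16u * 1024u * 1024u)
#define KIB(n)		((uint32_t)(n) * 1024u)

/* "misc" partition that follows a U-Boot partition of uboot_size bytes. */
static struct part misc_after_uboot(uint32_t uboot_size)
{
	struct part misc;

	assert(uboot_size < FLASH_SIZE);
	misc.offset = uboot_size;
	misc.size = FLASH_SIZE - uboot_size;
	return misc;
}

/* Translate an offset inside a partition to an absolute flash address. */
static uint32_t flash_addr(struct part p, uint32_t off_in_part)
{
	assert(off_in_part < p.size);
	return p.offset + off_in_part;
}
```

With these assumptions, `misc_after_uboot(KIB(512))` starts at 0x80000 and `misc_after_uboot(KIB(1024))` starts at 0x100000. Offset 0 of the partition therefore maps to different flash bytes before and after the change, and the first 512 KiB of the old "misc" now lies inside the U-Boot partition. MTD hides the start address from the user, but it does not move the data.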


Re: [rtc-linux] [PATCH] rtc: Add support for maxim dallas rtc max-6917

2016-10-12 Thread Alexandre Belloni
Hi,

Seeing this has the same register map as max6916, please use regmap to
abstract the accesses and handle both with the same driver. You can have
a look at rtc-ds3232.c for an example.
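
As a rough userspace illustration of why an access abstraction lets one driver serve two chips with the same register map (this is NOT the kernel regmap API, only a model of the idea): the chip-independent code sees nothing but a read callback, so it does not care how the register is actually reached. `reg_bus`, `fake_chip` and the function names are hypothetical; the seconds register offset 0x01 is taken from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a register-access abstraction (not the kernel regmap API). */
struct reg_bus {
	int (*read)(void *ctx, uint8_t reg, uint8_t *val);
	void *ctx;
};

/* Register offset as given in the quoted patch. */
#define RTC_REG_SECS 0x01

/* Chip-independent code: talks to the hardware only through the bus. */
static int rtc_read_secs_raw(const struct reg_bus *bus, uint8_t *secs)
{
	assert(bus != NULL && bus->read != NULL && secs != NULL);
	return bus->read(bus->ctx, RTC_REG_SECS, secs);
}

/* Hypothetical backend: a plain array standing in for a chip on some bus. */
struct fake_chip {
	uint8_t regs[16];
};

static int fake_read(void *ctx, uint8_t reg, uint8_t *val)
{
	struct fake_chip *chip = ctx;

	if (reg >= sizeof(chip->regs))
		return -1;	/* register out of range */
	*val = chip->regs[reg];
	return 0;
}
```

Two `reg_bus` instances pointing at different backends can both be passed to `rtc_read_secs_raw()`; only the backend would differ between chips or bus types.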

On 12/10/2016 at 01:33:32 -0700, VENKAT PRASHANTH B U wrote :
> This is a patch to add support for
> maxim dallas rtc max6917.
> 
> Signed-off-by: Venkat Prashanth B U 
> ---
> ---
>  drivers/rtc/Kconfig   |   9 +
>  drivers/rtc/Makefile  |   1 +
>  drivers/rtc/rtc-max6917.c | 406 ++
>  3 files changed, 416 insertions(+)
> 
> diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
> index e215f50..2163606 100644
> --- a/drivers/rtc/Kconfig
> +++ b/drivers/rtc/Kconfig
> @@ -277,6 +277,15 @@ config RTC_DRV_MAX6900
> This driver can also be built as a module. If so, the module
> will be called rtc-max6900.
>  
> +config RTC_DRV_MAX6917
> + tristate "Maxim MAX6917"
> + help
> +   If you say yes here you will get support for the
> +   Maxim MAX6917 I2C RTC chip.
> +
> +   This driver can also be built as a module. If so, the module
> +   will be called rtc-max6917.
> +
>  config RTC_DRV_MAX8907
>   tristate "Maxim MAX8907"
>   depends on MFD_MAX8907
> diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile
> index 7cf7ad5..29332fb 100644
> --- a/drivers/rtc/Makefile
> +++ b/drivers/rtc/Makefile
> @@ -87,6 +87,7 @@ obj-$(CONFIG_RTC_DRV_M48T86)+= rtc-m48t86.o
>  obj-$(CONFIG_RTC_DRV_MAX6900)+= rtc-max6900.o
>  obj-$(CONFIG_RTC_DRV_MAX6902)+= rtc-max6902.o
>  obj-$(CONFIG_RTC_DRV_MAX6916)+= rtc-max6916.o
> +obj-$(CONFIG_RTC_DRV_MAX6917)+= rtc-max6917.o
>  obj-$(CONFIG_RTC_DRV_MAX77686)   += rtc-max77686.o
>  obj-$(CONFIG_RTC_DRV_MAX8907)+= rtc-max8907.o
>  obj-$(CONFIG_RTC_DRV_MAX8925)+= rtc-max8925.o
> diff --git a/drivers/rtc/rtc-max6917.c b/drivers/rtc/rtc-max6917.c
> index e69de29..1176384 100644
> --- a/drivers/rtc/rtc-max6917.c
> +++ b/drivers/rtc/rtc-max6917.c
> @@ -0,0 +1,406 @@
> + /* rtc-max6917.c
> + *
> + * Driver for MAXIM  max6917  I2C-Compatible Real Time Clock
> + *
> + * Author : Venkat Prashanth B U 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + */
> +
> + #include 
> + #include 
> + #include 
> + #include 
> + #include 
> + #include 
> + #include 
> + #include 
> + #include 
> +
> + #define MAX6917_REG_SECS        0x01    /* 00-59 */
> + #define MAX6917_REG_MIN         0x02    /* 00-59 */
> + #define MAX6917_REG_HOUR        0x03    /* 00-23, or 1-12{am,pm} */
> + #define MAX6917_REG_WDAY        0x04    /* 01-07 */
> + #define MAX6917_REG_MDAY        0x05    /* 01-31 */
> + #define MAX6917_REG_MONTH       0x06    /* 01-12 */
> + #define MAX6917_REG_YEAR        0x07    /* 00-99 */
> + #define MAX6917_REG_CONTROL     0x08
> + #define MAX6917_REG_STATUS      0x0c
> + #define MAX6917_REG_ALARM       0x0a
> + #define MAX6917_BURST_LEN       8       /* can burst r/w first 8 regs */
> + #define MAX6917_REG_CENTURY     9       /* century */
> + #define MAX6917_REG_LEN         10
> + #define MAX6917_REG_CT_WP       (1 << 7)        /* Write Protect */
> + /*
> + * register read/write commands
> + */
> + #define MAX6917_REG_CONTROL_WRITE       0x8e
> + #define MAX6917_REG_CENTURY_WRITE       0x92
> + #define MAX6917_REG_CENTURY_READ        0x93
> + #define MAX6917_REG_RESERVED_READ       0x96
> + #define MAX6917_REG_BURST_WRITE         0xbe
> + #define MAX6917_REG_BURST_READ          0xbf
> +
> + #define MAX6917_IDLE_TIME_AFTER_WRITE   3       /* specification says 2.5 mS */
> +
> + static struct i2c_driver max6917_driver;
> +
> + struct max6917
> + {
> + u8 offset;  /* register's offset */
> + u8 regs[11];
> + u16 nvram_offset;
> + struct bin_attribute *nvram;
> + unsigned long flags;
> + #define HAS_NVRAM   0   /* bit 0 == sysfs file active */
> + #define HAS_ALARM   1   /* bit 1 == irq claimed */
> + struct i2c_client *client;
> + struct rtc_device *rtc;
> + s32 (*read_block_data) (const struct i2c_client * client, u8 command,
> + u8 length, u8 * values);
> + s32 (*write_block_data) (const struct i2c_client * client, u8 command,
> + u8 length, const u8 * values);
> + };
> +
> + struct chip_desc
> + {
> + unsigned alarm:1;
> + u16 nvram_offset;
> + u16 
