Re: Sym2 scsi hang on boot on sparc64
> > Bisection (on PA-RISC) points to:
> >
> > 71e75c97f97a9645d25fbf3d8e4165a558f18747 is the first bad commit
> > commit 71e75c97f97a9645d25fbf3d8e4165a558f18747
> > Author: Christoph Hellwig h...@lst.de
> > Date: Fri Apr 11 19:07:01 2014 +0200
> >
> >     scsi: convert device_busy to atomic_t
>
> That's fixed upstream: commit 480cadc2b7e0fa2bbab20141efb547dfe0c3707c

Yes, works for both sparc64 and parisc.

--
Meelis Roos (mr...@linux.ee)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Sym2 scsi hang on boot on sparc64
On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64 test
> machines. E220R and E420R are with onboard 53c875, V210 is with
> onboard 53c1010 and all behave the same.
>
> Any ideas where to dig deeper? Bisection might be nontrivial, because
> of sparc64 changes that are OK on 3.17-rc1 again - but is possible if
> nothing else helps.

We've got a parisc with an 875 as a root SCSI bus ... I haven't got
around to building for it yet, but I might find time to try today.

> [ 164.639697] PCI: Enabling device: (:00:03.0), cmd 147
> [ 164.705076] sym0: 875 rev 0x14 at pci :00:03.0 irq 13
> [ 164.858446] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
> [ 164.935031] sym0: SCSI BUS has been reset.
> [ 164.983113] scsi host0: sym-2.2.3
> [ 165.026358] PCI: Enabling device: (:00:03.1), cmd 3
> [ 165.089634] sym1: 875 rev 0x14 at pci :00:03.1 irq 14
> [ 165.242820] sym1: No NVRAM, ID 7, Fast-20, SE, parity checking
> [ 165.319227] sym1: SCSI BUS has been reset.
> [ 165.367281] scsi host1: sym-2.2.3

Does it detect drives in the bit you cut? I ask because one of the
symptoms of a misrouted irq is random problems with bring up. However,
if anything is detected, then the irq must be OK.

James

> [ 388.835999] INFO: task swapper/0:1 blocked for more than 120 seconds.
> [ 388.912181] Not tainted 3.17.0-rc1 #46
> [ 388.963187] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 389.056953] swapper/0 D 00483958 7584 1 0 0x2000100
> [ 389.148575] Call Trace:
> [ 389.177747] [0082e5fc] schedule+0x1c/0x80
> [ 389.235024] [00483958] async_synchronize_cookie_domain+0x58/0x100
> [ 389.317301] [00483a28] async_synchronize_full+0x8/0x20
> [ 389.388133] [006ebe04] wait_for_device_probe+0x64/0x80
> [ 389.458938] [009dcffc] prepare_namespace+0x4/0x1b8
> [ 389.525590] [009dcbac] kernel_init_freeable+0x1c0/0x1d8
> [ 389.597450] [008298e4] kernel_init+0x4/0x100
> [ 389.657868] [004060c4] ret_from_fork+0x1c/0x2c
> [ 389.720324] [] (null)
> [ 389.775518] no locks held by swapper/0/1.
Re: Sym2 scsi hang on boot on sparc64
> > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64 test
> > machines. E220R and E420R are with onboard 53c875, V210 is with
> > onboard 53c1010 and all behave the same.
> >
> > Any ideas where to dig deeper? Bisection might be nontrivial, because
> > of sparc64 changes that are OK on 3.17-rc1 again - but is possible if
> > nothing else helps.
>
> We've got a parisc with an 875 as a root SCSI bus ... I haven't got
> around to building for it yet, but I might find time to try today.

Come to think of it, I have a couple of parisc with 875 too, will try.

> > [ 164.639697] PCI: Enabling device: (:00:03.0), cmd 147
> > [ 164.705076] sym0: 875 rev 0x14 at pci :00:03.0 irq 13
> > [ 164.858446] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
> > [ 164.935031] sym0: SCSI BUS has been reset.
> > [ 164.983113] scsi host0: sym-2.2.3
> > [ 165.026358] PCI: Enabling device: (:00:03.1), cmd 3
> > [ 165.089634] sym1: 875 rev 0x14 at pci :00:03.1 irq 14
> > [ 165.242820] sym1: No NVRAM, ID 7, Fast-20, SE, parity checking
> > [ 165.319227] sym1: SCSI BUS has been reset.
> > [ 165.367281] scsi host1: sym-2.2.3
>
> Does it detect drives in the bit you cut? I ask because one of the
> symptoms of a misrouted irq is random problems with bring up. However,
> if anything is detected, then the irq must be OK.

No, nothing scsi related - rtc detection etc.

> > [ 388.835999] INFO: task swapper/0:1 blocked for more than 120 seconds.
> > [ 388.912181] Not tainted 3.17.0-rc1 #46
> > [ 388.963187] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 389.056953] swapper/0 D 00483958 7584 1 0 0x2000100
> > [ 389.148575] Call Trace:
> > [ 389.177747] [0082e5fc] schedule+0x1c/0x80
> > [ 389.235024] [00483958] async_synchronize_cookie_domain+0x58/0x100
> > [ 389.317301] [00483a28] async_synchronize_full+0x8/0x20
> > [ 389.388133] [006ebe04] wait_for_device_probe+0x64/0x80
> > [ 389.458938] [009dcffc] prepare_namespace+0x4/0x1b8
> > [ 389.525590] [009dcbac] kernel_init_freeable+0x1c0/0x1d8
> > [ 389.597450] [008298e4] kernel_init+0x4/0x100
> > [ 389.657868] [004060c4] ret_from_fork+0x1c/0x2c
> > [ 389.720324] [] (null)
> > [ 389.775518] no locks held by swapper/0/1.

--
Meelis Roos (mr...@linux.ee)
Re: Sym2 scsi hang on boot on sparc64
> On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64 test
> > machines. E220R and E420R are with onboard 53c875, V210 is with
> > onboard 53c1010 and all behave the same.
> >
> > Any ideas where to dig deeper? Bisection might be nontrivial, because
> > of sparc64 changes that are OK on 3.17-rc1 again - but is possible if
> > nothing else helps.
>
> We've got a parisc with an 875 as a root SCSI bus ... I haven't got
> around to building for it yet, but I might find time to try today.

Same on parisc:

sym0: 1010-66 rev 0x1 at pci :20:01.0 irq 22
sym0: PA-RISC Firmware, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi host0: sym-2.2.3
random: nonblocking pool is initialized

and hangs here. So hopefully it is reproducible for you.

--
Meelis Roos (mr...@linux.ee)
Re: Sym2 scsi hang on boot on sparc64
On Tue, 2014-08-19 at 17:37 +0300, Meelis Roos wrote:
> > On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> > > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64 test
> > > machines. E220R and E420R are with onboard 53c875, V210 is with
> > > onboard 53c1010 and all behave the same.
> > >
> > > Any ideas where to dig deeper? Bisection might be nontrivial,
> > > because of sparc64 changes that are OK on 3.17-rc1 again - but is
> > > possible if nothing else helps.
> >
> > We've got a parisc with an 875 as a root SCSI bus ... I haven't got
> > around to building for it yet, but I might find time to try today.
>
> Same on parisc:
>
> sym0: 1010-66 rev 0x1 at pci :20:01.0 irq 22
> sym0: PA-RISC Firmware, ID 7, Fast-80, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi host0: sym-2.2.3
> random: nonblocking pool is initialized
>
> and hangs here. So hopefully it is reproducible for you.

And also independent of the sparc changes. The only other change in the
window you quote is 64 bit luns.

James
Re: Sym2 scsi hang on boot on sparc64
Hi,

On Tue, Aug 19, 2014 at 09:47:35AM -0500, James Bottomley wrote:
> On Tue, 2014-08-19 at 17:37 +0300, Meelis Roos wrote:
> > > On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> > > > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64
> > > > test machines. E220R and E420R are with onboard 53c875, V210 is
> > > > with onboard 53c1010 and all behave the same.
> > > >
> > > > Any ideas where to dig deeper? Bisection might be nontrivial,
> > > > because of sparc64 changes that are OK on 3.17-rc1 again - but
> > > > is possible if nothing else helps.
> > >
> > > We've got a parisc with an 875 as a root SCSI bus ... I haven't
> > > got around to building for it yet, but I might find time to try
> > > today.
> >
> > Same on parisc:
> >
> > sym0: 1010-66 rev 0x1 at pci :20:01.0 irq 22
> > sym0: PA-RISC Firmware, ID 7, Fast-80, LVD, parity checking
> > sym0: SCSI BUS has been reset.
> > scsi host0: sym-2.2.3
> > random: nonblocking pool is initialized
> >
> > and hangs here. So hopefully it is reproducible for you.
>
> And also independent of the sparc changes. The only other change in
> the window you quote is 64 bit luns.

Bisection (on PA-RISC) points to:

71e75c97f97a9645d25fbf3d8e4165a558f18747 is the first bad commit
commit 71e75c97f97a9645d25fbf3d8e4165a558f18747
Author: Christoph Hellwig h...@lst.de
Date: Fri Apr 11 19:07:01 2014 +0200

    scsi: convert device_busy to atomic_t

A.
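For readers unfamiliar with how a bisection arrives at a "first bad commit": the idea is a binary search between a known-good and a known-bad point. The sketch below is only an illustration over a linear, numbered history; real kernel history is a graph, and the names first_bad, is_bad_fn and hangs_from_seven are made up for this example. It also relies on the assumption that everything after the first bad entry is bad, which is the assumption the original report worried the intermediate sparc64 breakage might violate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: true if the kernel built at history index i hangs. */
typedef bool (*is_bad_fn)(size_t i);

/*
 * Return the index of the first bad entry, assuming index `good` is known
 * good, index `bad` is known bad, good < bad, and every entry after the
 * first bad one is also bad.
 */
size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid))
            bad = mid;   /* the regression is at mid or earlier */
        else
            good = mid;  /* the regression is after mid */
    }
    return bad;
}

/* Illustrative predicate only: pretend the hang was introduced at index 7. */
bool hangs_from_seven(size_t i)
{
    return i >= 7;
}
```

Each step halves the range, so a window of a few thousand commits needs on the order of a dozen test boots, provided every intermediate kernel can actually be built and booted on the test machine.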
Re: Sym2 scsi hang on boot on sparc64
On Tue, Aug 19, 2014 at 11:17:48PM +0300, Aaro Koskinen wrote:
> Hi,
>
> On Tue, Aug 19, 2014 at 09:47:35AM -0500, James Bottomley wrote:
> > On Tue, 2014-08-19 at 17:37 +0300, Meelis Roos wrote:
> > > > On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> > > > > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64
> > > > > test machines. E220R and E420R are with onboard 53c875, V210
> > > > > is with onboard 53c1010 and all behave the same.
> > > > >
> > > > > Any ideas where to dig deeper? Bisection might be nontrivial,
> > > > > because of sparc64 changes that are OK on 3.17-rc1 again - but
> > > > > is possible if nothing else helps.
> > > >
> > > > We've got a parisc with an 875 as a root SCSI bus ... I haven't
> > > > got around to building for it yet, but I might find time to try
> > > > today.
> > >
> > > Same on parisc:
> > >
> > > sym0: 1010-66 rev 0x1 at pci :20:01.0 irq 22
> > > sym0: PA-RISC Firmware, ID 7, Fast-80, LVD, parity checking
> > > sym0: SCSI BUS has been reset.
> > > scsi host0: sym-2.2.3
> > > random: nonblocking pool is initialized
> > >
> > > and hangs here. So hopefully it is reproducible for you.
> >
> > And also independent of the sparc changes. The only other change in
> > the window you quote is 64 bit luns.
>
> Bisection (on PA-RISC) points to:
>
> 71e75c97f97a9645d25fbf3d8e4165a558f18747 is the first bad commit
> commit 71e75c97f97a9645d25fbf3d8e4165a558f18747
> Author: Christoph Hellwig h...@lst.de
> Date: Fri Apr 11 19:07:01 2014 +0200
>
>     scsi: convert device_busy to atomic_t

I guess you need this fix:

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9c44392..ce62e87 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue *q)
 	blk_requeue_request(q, req);
 	atomic_dec(&sdev->device_busy);
 out_delay:
-	if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
+	if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 }

James already sent it to Linus.

Sam
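The one-character change in the diff inverts the condition under which scsi_request_fn() asks for a delayed queue rerun. The following is a minimal userspace sketch of the two conditions, not kernel code: struct model_sdev and both helper functions are hypothetical stand-ins, C11 atomics replace the kernel's atomic_t, and a plain bool replaces scsi_device_blocked().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of state the diff looks at. */
struct model_sdev {
    atomic_int device_busy;  /* number of commands currently outstanding */
    bool blocked;            /* stand-in for scsi_device_blocked(sdev) */
};

/* Condition before the fix: true only while commands are in flight. */
bool should_delay_buggy(struct model_sdev *sdev)
{
    return atomic_load(&sdev->device_busy) && !sdev->blocked;
}

/* Condition after the fix: true only when the device is idle and not blocked. */
bool should_delay_fixed(struct model_sdev *sdev)
{
    return !atomic_load(&sdev->device_busy) && !sdev->blocked;
}
```

For an idle, unblocked device the pre-fix condition is false and the fixed one is true. A plausible reading of the hang, offered here as an assumption rather than something stated in the thread, is that with nothing in flight no command completion arrives to restart the queue, so the delayed rerun is the only thing that would.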
Re: Sym2 scsi hang on boot on sparc64
On Tue, 2014-08-19 at 23:17 +0300, Aaro Koskinen wrote:
> Hi,
>
> On Tue, Aug 19, 2014 at 09:47:35AM -0500, James Bottomley wrote:
> > On Tue, 2014-08-19 at 17:37 +0300, Meelis Roos wrote:
> > > > On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
> > > > > 3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64
> > > > > test machines. E220R and E420R are with onboard 53c875, V210
> > > > > is with onboard 53c1010 and all behave the same.
> > > > >
> > > > > Any ideas where to dig deeper? Bisection might be nontrivial,
> > > > > because of sparc64 changes that are OK on 3.17-rc1 again - but
> > > > > is possible if nothing else helps.
> > > >
> > > > We've got a parisc with an 875 as a root SCSI bus ... I haven't
> > > > got around to building for it yet, but I might find time to try
> > > > today.
> > >
> > > Same on parisc:
> > >
> > > sym0: 1010-66 rev 0x1 at pci :20:01.0 irq 22
> > > sym0: PA-RISC Firmware, ID 7, Fast-80, LVD, parity checking
> > > sym0: SCSI BUS has been reset.
> > > scsi host0: sym-2.2.3
> > > random: nonblocking pool is initialized
> > >
> > > and hangs here. So hopefully it is reproducible for you.
> >
> > And also independent of the sparc changes. The only other change in
> > the window you quote is 64 bit luns.
>
> Bisection (on PA-RISC) points to:
>
> 71e75c97f97a9645d25fbf3d8e4165a558f18747 is the first bad commit
> commit 71e75c97f97a9645d25fbf3d8e4165a558f18747
> Author: Christoph Hellwig h...@lst.de
> Date: Fri Apr 11 19:07:01 2014 +0200
>
>     scsi: convert device_busy to atomic_t

That's fixed upstream:

commit 480cadc2b7e0fa2bbab20141efb547dfe0c3707c
Author: Guenter Roeck li...@roeck-us.net
Date: Sun Aug 10 05:54:25 2014 -0700

    scsi: Fix qemu boot hang problem

Could you try with a kernel that has that fix?

Thanks,

James
Re: Sym2 scsi hang on boot on sparc64
Hi,

On Tue, Aug 19, 2014 at 03:37:18PM -0500, James Bottomley wrote:
> On Tue, 2014-08-19 at 23:17 +0300, Aaro Koskinen wrote:
> > Bisection (on PA-RISC) points to:
> >
> > 71e75c97f97a9645d25fbf3d8e4165a558f18747 is the first bad commit
> > commit 71e75c97f97a9645d25fbf3d8e4165a558f18747
> > Author: Christoph Hellwig h...@lst.de
> > Date: Fri Apr 11 19:07:01 2014 +0200
> >
> >     scsi: convert device_busy to atomic_t
>
> That's fixed upstream:
>
> commit 480cadc2b7e0fa2bbab20141efb547dfe0c3707c
> Author: Guenter Roeck li...@roeck-us.net
> Date: Sun Aug 10 05:54:25 2014 -0700
>
>     scsi: Fix qemu boot hang problem
>
> Could you try with a kernel that has that fix?

Yes, the box boots now fine with the fix.

Thanks,

A.