Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-09 Thread Johannes Weiner
On Mon, Jan 09, 2017 at 09:30:05PM +0100, Jan Kara wrote: > On Sat 07-01-17 21:02:00, Johannes Weiner wrote: > > On Tue, Jan 03, 2017 at 01:28:25PM +0100, Jan Kara wrote: > > > On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > > > > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote:

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-09 Thread Jan Kara
On Sat 07-01-17 21:02:00, Johannes Weiner wrote: > On Tue, Jan 03, 2017 at 01:28:25PM +0100, Jan Kara wrote: > > On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > > > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote: > > > > On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-07 Thread Linus Torvalds
On Sat, Jan 7, 2017 at 6:02 PM, Johannes Weiner wrote: > > Linus? Andrew? Looks fine to me. Will apply. Linus

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-07 Thread Johannes Weiner
On Tue, Jan 03, 2017 at 01:28:25PM +0100, Jan Kara wrote: > On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote: > > > On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner wrote: > > > > On Thu, Dec 22, 2016 at 12:22:27PM -0800,

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-04 Thread Laurence Oberman
el Mailing List" , "Lee Duncan" > , open-is...@googlegroups.com, > "Linux SCSI List" , linux-bl...@vger.kernel.org, > "Christoph Hellwig" , > "Andrea Arcangeli" > Sent: Wednesday, January 4, 2017 10:26:09 AM > Subject: Re: [4.10, panic, regres

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-04 Thread Laurence Oberman
"Lee Duncan" , > open-is...@googlegroups.com, "Linux SCSI List" > , linux-bl...@vger.kernel.org, "Christoph > Hellwig" , "Jan Kara" > , "Andrea Arcangeli" > Sent: Tuesday, January 3, 2017 7:28:25 AM > Subject: Re: [4.10, panic, regression]

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-04 Thread Christoph Hellwig
On Sat, Dec 24, 2016 at 02:17:26PM +0100, Hannes Reinecke wrote: > Christoph, do you have a pointer to your patchset? Here is a pointer to the current one after splitting it into properly bisectable chunks. Besides proper changelogs the biggest item left is fixing up dm-mpath to not allocate its

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-03 Thread Jan Kara
On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote: > > On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner wrote: > > > On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > > > > On Wed, 21 Dec 2016, Linus Torvalds wr

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-02 Thread Johannes Weiner
On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote: > On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner wrote: > > On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > > > On Wed, 21 Dec 2016, Linus Torvalds wrote: > > > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinne

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Sat, Dec 24, 2016 at 02:17:26PM +0100, Hannes Reinecke wrote: > Christoph, do you have a pointer to your patchset? > Not that I'll be able to do any meaningful work until next year, but having > a look would be nice. Just to get a feeling where you want to head to; I > might be able to work on

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Hannes Reinecke
On 12/24/2016 11:07 AM, Christoph Hellwig wrote: > On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote: > > Ugh. This patch is nasty. > It's the same SCSI has done for ages - except that it uses a separate kmalloc for the sense buffer. > > I think we should just fix blk_execute_rq() instead.

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote: > Ugh. This patch is nasty. It's the same SCSI has done for ages - except that it uses a separate kmalloc for the sense buffer. > I think we should just fix blk_execute_rq() instead. As you found out below it's not just blk_execute_

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-24 Thread Christoph Hellwig
On Fri, Dec 23, 2016 at 07:45:45PM -0700, Jens Axboe wrote: > It's not that it's technically hard to fix up, it's more that it's a > pain in the ass to have to do it. For instance, for blk_execute_rq(), we > either should enforce that the caller allocates it dynamically and then > free it, or we ne

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-23 Thread Jens Axboe
On 12/23/2016 12:42 PM, Linus Torvalds wrote: > On Fri, Dec 23, 2016 at 2:00 AM, Christoph Hellwig wrote: >> >> From: Christoph Hellwig >> Date: Fri, 23 Dec 2016 10:57:06 +0100 >> Subject: virtio_blk: avoid DMA to stack for the sense buffer >> >> Most users of BLOCK_PC requests allocate the sense

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-23 Thread Linus Torvalds
On Fri, Dec 23, 2016 at 2:00 AM, Christoph Hellwig wrote: > > From: Christoph Hellwig > Date: Fri, 23 Dec 2016 10:57:06 +0100 > Subject: virtio_blk: avoid DMA to stack for the sense buffer > > Most users of BLOCK_PC requests allocate the sense buffer on the stack, > so to avoid DMA to the stack c

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-23 Thread Christoph Hellwig
On Thu, Dec 22, 2016 at 04:03:56PM -0800, Chris Leech wrote: > Of course, looks like I've screwed up my bisect run on this so I'm still > taking a look. It triggers for me with 'hdparm -B /dev/vda' but may > also depend on kernel configuration. > > I started with the fedora rawhide config with a

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-23 Thread Johannes Weiner
On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner wrote: > On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > > On Wed, 21 Dec 2016, Linus Torvalds wrote: > > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > > I unmounted the fs, mkfs'd it again, ran the > > > > wo

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Johannes Weiner
On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > On Wed, 21 Dec 2016, Linus Torvalds wrote: > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > I unmounted the fs, mkfs'd it again, ran the > > > workload again and about a minute in this fired: > > > > > > [628867.607417]

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Dave Chinner
On Fri, Dec 23, 2016 at 09:33:36AM +1100, Dave Chinner wrote: > On Fri, Dec 23, 2016 at 09:15:00AM +1100, Dave Chinner wrote: > > On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote: > > > Ok, so the numa issue was a red herring. With that fixed: > > > > > > On Thu, Dec 22, 2016 at 1:06

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Jens Axboe
On 12/22/2016 02:10 PM, Linus Torvalds wrote: > Ok, so the numa issue was a red herring. With that fixed: > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner wrote: >> >> Better, but still bad. average files/s is not up to 200k files/s, >> so still a good 10-15% off where it should be. xfs_repair i

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Chris Leech
On Fri, Dec 23, 2016 at 07:53:50AM +0800, Ming Lei wrote: > On Fri, Dec 23, 2016 at 2:50 AM, Chris Leech wrote: > > I'm not reproducing any problems with xfstests running over iscsi_tcp > > right now. Two 10G luns exported from an LIO target, attached directly > > to a test VM as sda/sdb and xfst

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Ming Lei
On Fri, Dec 23, 2016 at 2:50 AM, Chris Leech wrote: > On Thu, Dec 22, 2016 at 05:50:12PM +1100, Dave Chinner wrote: >> On Wed, Dec 21, 2016 at 09:46:37PM -0800, Linus Torvalds wrote: >> > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: >> > > >> > > There may be deeper issues. I just started

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Dave Chinner
On Fri, Dec 23, 2016 at 09:15:00AM +1100, Dave Chinner wrote: > On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote: > > Ok, so the numa issue was a red herring. With that fixed: > > > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner wrote: > > > > > > Better, but still bad. average fil

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Dave Chinner
On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote: > Ok, so the numa issue was a red herring. With that fixed: > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner wrote: > > > > Better, but still bad. average files/s is not up to 200k files/s, > > so still a good 10-15% off where it sh

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Linus Torvalds
Ok, so the numa issue was a red herring. With that fixed: On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner wrote: > > Better, but still bad. average files/s is not up to 200k files/s, > so still a good 10-15% off where it should be. xfs_repair is back > down to 10-15% off where it should be, too. bu

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Dave Chinner
On Fri, Dec 23, 2016 at 07:42:40AM +1100, Dave Chinner wrote: > On Thu, Dec 22, 2016 at 09:24:12AM -0800, Linus Torvalds wrote: > > On Wed, Dec 21, 2016 at 10:28 PM, Dave Chinner wrote: > > > > > > This sort of thing is normally indicative of a memory reclaim or > > > lock contention problem. Prof

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Dave Chinner
On Thu, Dec 22, 2016 at 09:24:12AM -0800, Linus Torvalds wrote: > On Wed, Dec 21, 2016 at 10:28 PM, Dave Chinner wrote: > > > > This sort of thing is normally indicative of a memory reclaim or > > lock contention problem. Profile showed unusual spinlock contention, > > but then I realised there wa

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Hugh Dickins
On Wed, 21 Dec 2016, Linus Torvalds wrote: > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > > There may be deeper issues. I just started running scalability tests > > (e.g. 16-way fsmark create tests) and about a minute in I got a > > directory corruption reported - something I hadn't

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Thomas Gleixner
On Thu, 22 Dec 2016, Linus Torvalds wrote: > On Wed, Dec 21, 2016 at 10:28 PM, Dave Chinner wrote: > > > > This sort of thing is normally indicative of a memory reclaim or > > lock contention problem. Profile showed unusual spinlock contention, > > but then I realised there was only one kswapd thr

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Chris Leech
On Thu, Dec 22, 2016 at 05:50:12PM +1100, Dave Chinner wrote: > On Wed, Dec 21, 2016 at 09:46:37PM -0800, Linus Torvalds wrote: > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > > > > There may be deeper issues. I just started running scalability tests > > > (e.g. 16-way fsmark create

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-22 Thread Linus Torvalds
On Wed, Dec 21, 2016 at 10:28 PM, Dave Chinner wrote: > > This sort of thing is normally indicative of a memory reclaim or > lock contention problem. Profile showed unusual spinlock contention, > but then I realised there was only one kswapd thread running. > Yup, sure enough, it's caused by a maj

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Dave Chinner
On Wed, Dec 21, 2016 at 09:46:37PM -0800, Linus Torvalds wrote: > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > > There may be deeper issues. I just started running scalability tests > > (e.g. 16-way fsmark create tests) and about a minute in I got a > > directory corruption reported

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Christoph Hellwig
On Thu, Dec 22, 2016 at 05:30:46PM +1100, Dave Chinner wrote: > > For "normal" bios the for_each_segment loop iterates over bi_vcnt, > > so it will be ignored anyway. That being said both I and the lists > > got CCed halfway through the thread and I haven't seen the original > > report, so I'm not

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Dave Chinner
On Thu, Dec 22, 2016 at 07:18:27AM +0100, Christoph Hellwig wrote: > On Wed, Dec 21, 2016 at 03:19:15PM -0800, Linus Torvalds wrote: > > Looking around a bit, the only even halfway suspicious scatterlist > > initialization thing I see is commit f9d03f96b988 ("block: improve > > handling of the magi

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Dave Chinner
On Thu, Dec 22, 2016 at 04:13:22PM +1100, Dave Chinner wrote: > On Wed, Dec 21, 2016 at 04:13:03PM -0800, Chris Leech wrote: > > On Wed, Dec 21, 2016 at 03:19:15PM -0800, Linus Torvalds wrote: > > > Hi, > > > > > > On Wed, Dec 21, 2016 at 2:16 PM, Dave Chinner wrote: > > > > On Fri, Dec 16, 2016

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Christoph Hellwig
On Wed, Dec 21, 2016 at 03:19:15PM -0800, Linus Torvalds wrote: > Looking around a bit, the only even halfway suspicious scatterlist > initialization thing I see is commit f9d03f96b988 ("block: improve > handling of the magic discard payload") which used to have a magic > hack wrt !bio->bi_vcnt, an

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Linus Torvalds
On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > There may be deeper issues. I just started running scalability tests > (e.g. 16-way fsmark create tests) and about a minute in I got a > directory corruption reported - something I hadn't seen in the dev > cycle at all. By "in the dev cycle

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Dave Chinner
On Wed, Dec 21, 2016 at 04:13:03PM -0800, Chris Leech wrote: > On Wed, Dec 21, 2016 at 03:19:15PM -0800, Linus Torvalds wrote: > > Hi, > > > > On Wed, Dec 21, 2016 at 2:16 PM, Dave Chinner wrote: > > > On Fri, Dec 16, 2016 at 10:59:06AM -0800, Chris Leech wrote: > > >> Thanks Dave, > > >> > > >>

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Chris Leech
On Wed, Dec 21, 2016 at 03:19:15PM -0800, Linus Torvalds wrote: > Hi, > > On Wed, Dec 21, 2016 at 2:16 PM, Dave Chinner wrote: > > On Fri, Dec 16, 2016 at 10:59:06AM -0800, Chris Leech wrote: > >> Thanks Dave, > >> > >> I'm hitting a bug at scatterlist.h:140 before I even get any iSCSI > >> modul

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Linus Torvalds
Hi, On Wed, Dec 21, 2016 at 2:16 PM, Dave Chinner wrote: > On Fri, Dec 16, 2016 at 10:59:06AM -0800, Chris Leech wrote: >> Thanks Dave, >> >> I'm hitting a bug at scatterlist.h:140 before I even get any iSCSI >> modules loaded (virtio block) so there's something else going on in the >> current me

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-21 Thread Dave Chinner
On Fri, Dec 16, 2016 at 10:59:06AM -0800, Chris Leech wrote: > Thanks Dave, > > I'm hitting a bug at scatterlist.h:140 before I even get any iSCSI > modules loaded (virtio block) so there's something else going on in the > current merge window. I'll keep an eye on it and make sure there's > nothi

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-16 Thread Chris Leech
Thanks Dave, I'm hitting a bug at scatterlist.h:140 before I even get any iSCSI modules loaded (virtio block) so there's something else going on in the current merge window. I'll keep an eye on it and make sure there's nothing iSCSI needs fixing for. Chris On Thu, Dec 15, 2016 at 09:29:53AM +11

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-14 Thread Dave Chinner
On Thu, Dec 15, 2016 at 09:24:11AM +1100, Dave Chinner wrote: > Hi folks, > > Just updated my test boxes from 4.9 to a current Linus 4.10 merge > window kernel to test the XFS merge I am preparing for Linus. > Unfortunately, all my test VMs using iscsi failed pretty much > instantly on the first m

[4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2016-12-14 Thread Dave Chinner
Hi folks, Just updated my test boxes from 4.9 to a current Linus 4.10 merge window kernel to test the XFS merge I am preparing for Linus. Unfortunately, all my test VMs using iscsi failed pretty much instantly on the first mount of an iscsi device: [ 159.372704] XFS (sdb): EXPERIMENTAL reverse m