On Thu, 2016-12-01 at 08:40 -0500, Martin K. Petersen wrote:
> > "Ewan" == Ewan D Milne writes:
...
> Specifically, the problem appears to be caused by the removal of
> the setting of bio->bi_bdev, which would previously be set to NULL.
> If I add:
>
> diff --git
On Fri, 2016-12-02 at 15:10 +0100, Hannes Reinecke wrote:
> On 12/02/2016 02:29 PM, Ewan D. Milne wrote:
> > On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
> >> On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> >>> Specifically, the problem appears to be caused by
thumsh...@suse.de>, "Laurence Oberman"
> <lober...@redhat.com>, "Eyal Ben David" <bde...@gmail.com>,
> dgilb...@interlog.com, linux-scsi@vger.kernel.org
> Sent: Friday, December 2, 2016 9:10:01 AM
> Subject: Re: SG does not ignore dxferp (direct io
On 12/02/2016 02:29 PM, Ewan D. Milne wrote:
On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
Specifically, the problem appears to be caused by the removal of
the setting of bio->bi_bdev, which would previously be
On Fri, 2016-12-02 at 04:21 -0800, Christoph Hellwig wrote:
> On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> > Specifically, the problem appears to be caused by the removal of
> > the setting of bio->bi_bdev, which would previously be set to NULL.
> > If I add:
>
> Very
On Thu, Dec 01, 2016 at 08:40:31AM -0500, Martin K. Petersen wrote:
> Specifically, the problem appears to be caused by the removal of
> the setting of bio->bi_bdev, which would previously be set to NULL.
> If I add:
Very odd. For one I would expect it to be NULL anyway, second
I don't see why
> "Ewan" == Ewan D Milne writes:
>> I think what we need to understand is what caused the regression in
>> the first place, I probably should have been bisecting the original
>> failure rather than trying to find where it started working.
Ewan> Bisecting leads to this
On Fri, 2016-11-25 at 12:56 -0500, Ewan Milne wrote:
> I think what we need to understand is what caused the regression in the
> first place, I probably should have been bisecting the original failure
> rather than trying to find where it started working.
>
Bisecting leads to this commit:
On Fri, Nov 25, 2016 at 09:46:05AM -0500, Laurence Oberman wrote:
[...]
>
> Johannes, you are reproducing another race in your test I think.
>
This might actually be true. Disclaimer: I've been hunting that one for quite some
time and now have a reproducer for it, not just a downstream bug report.
erlog.com,
> linux-scsi@vger.kernel.org
> Sent: Friday, November 25, 2016 12:56:16 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
>
> >> ---
> >>
> >> In other words, this commit made the bad behavior go away in 4.8.
> >> Need
>> ---
>>
>> In other words, this commit made the bad behavior go away in 4.8.
>> Need to look at this in more detail, it doesn't appear as if this patch
>> was intended to fix such a problem.
>>
>> -Ewan
>
>Are you sure it did? I can reproduce copy_to_user() errors with 4.8 as well.
>Using the
terlog.com,
> linux-scsi@vger.kernel.org
> Sent: Friday, November 25, 2016 7:36:34 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
>
> On Fri, Nov 25, 2016 at 1:53 PM, Johannes Thumshirn <jthumsh...@suse.de>
> wrote:
> > On Fri, Nov 25, 2016 at 01:20:
On Fri, Nov 25, 2016 at 1:53 PM, Johannes Thumshirn wrote:
> On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
>> Note that sg_mmap_read does not parse the SCSI sense, so the script
>> might fail for other reasons (some SCSI error) and think it's a zero
>> byte
On Fri, Nov 25, 2016 at 12:53:17PM +0100, Johannes Thumshirn wrote:
> On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
> > Note that sg_mmap_read does not parse the SCSI sense, so the script
> > might fail for other reasons (some SCSI error) and think it's a zero
> > byte corruption.
On Fri, Nov 25, 2016 at 01:20:34PM +0200, Eyal Ben David wrote:
> Note that sg_mmap_read does not parse the SCSI sense, so the script
> might fail for other reasons (some SCSI error) and think it's a zero
> byte corruption.
But SCSI generic checks for errors and returns -EINVAL on CHECK_CONDITION
Note that sg_mmap_read does not parse the SCSI sense, so the script
might fail for other reasons (some SCSI error) and think it's a zero
byte corruption.
If you think an improved version could help (compare results within
the program + parse senses) I can help.
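A minimal sketch of what "parse senses" could look like in a test program. This is not the code from sg_mmap_read; the helper names sg_cmd_failed and decode_sense and the struct sense_info are made up for illustration. It only assumes the sg_io_hdr_t fields and SG_INFO_* masks from <scsi/sg.h> and the standard fixed (0x70/0x71) and descriptor (0x72/0x73) sense layouts.

```c
#include <scsi/sg.h>

/* Hypothetical result type for this sketch. */
struct sense_info {
    int valid;              /* 1 if key/asc/ascq were decoded */
    unsigned char key;      /* sense key */
    unsigned char asc;      /* additional sense code */
    unsigned char ascq;     /* additional sense code qualifier */
};

/* Returns 1 if the sg driver flagged any non-zero status for this
 * request, 0 if everything was reported as OK. */
int sg_cmd_failed(const sg_io_hdr_t *hdr)
{
    return (hdr->info & SG_INFO_OK_MASK) != SG_INFO_OK;
}

/* Decode sense key / ASC / ASCQ from a sense buffer of 'len' valid
 * bytes (hdr->sb_len_wr after the ioctl). */
struct sense_info decode_sense(const unsigned char *sb, unsigned char len)
{
    struct sense_info s = {0, 0, 0, 0};
    unsigned char code;

    if (sb == 0 || len < 1)
        return s;

    code = sb[0] & 0x7f;
    if ((code == 0x70 || code == 0x71) && len >= 14) {
        /* fixed format sense data */
        s.key = sb[2] & 0x0f;
        s.asc = sb[12];
        s.ascq = sb[13];
        s.valid = 1;
    } else if ((code == 0x72 || code == 0x73) && len >= 4) {
        /* descriptor format sense data */
        s.key = sb[1] & 0x0f;
        s.asc = sb[2];
        s.ascq = sb[3];
        s.valid = 1;
    }
    return s;
}
```

With something like this, a script could tell a real SCSI error (sg_cmd_failed() returns 1 and the sense decodes) apart from a request that completed cleanly but returned zeroed data.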
On Fri, Nov 25, 2016 at 10:07 AM,
On Wed, Nov 23, 2016 at 03:22:04PM -0500, Ewan Milne wrote:
[...]
> ---
>
> In other words, this commit made the bad behavior go away in 4.8.
> Need to look at this in more detail, it doesn't appear as if this patch
> was intended to fix such a problem.
>
> -Ewan
Are you sure it did? I can
sh...@suse.de>, dgilb...@interlog.com,
> > "Laurence Oberman" <lober...@redhat.com>,
> > linux-scsi@vger.kernel.org
> > Sent: Tuesday, November 22, 2016 3:55:44 PM
> > Subject: Re: SG does not ignore dxferp (direct io + mmap)
> >
> > On Tue, Nov 22, 2016
hat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 3:55:44 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
>
> On Tue, Nov 22, 2016 at 8:30 PM, Ewan D. Milne <emi...@redhat.com> wrote:
> >
> > I see the behavior (z
On Tue, Nov 22, 2016 at 8:30 PM, Ewan D. Milne wrote:
>
> I see the behavior (zero byte) on the 4.4.34, 4.5.7, 4.6.7, and 4.7.10
> -stable kernels. But not (of course) on 4.8.10 -stable.
>
> It doesn't look like the sg driver, might be something in the mmap code?
A kernel
hat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 1:30:07 PM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
>
> On Tue, 2016-11-22 at 09:37 +0100, Johannes Thumshirn wrote:
> > On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
On Tue, 2016-11-22 at 09:37 +0100, Johannes Thumshirn wrote:
> On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> > On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> > > There was also this change which seems closer to the problem area:
> > >
> > > commit
> @Eyal is this with a physical or virtual host? And what kind of HBA do you
> have? Just in case it makes a difference.
All physical hosts.
The original problem was detected on a QLogic FC HBA (Ubuntu 16.04).
To check if this is transport related, we reproduced the 0 byte
corruption on iSCSI too.
On Tue, Nov 22, 2016 at 10:31:23AM -0500, Laurence Oberman wrote:
[...]
> It's not failing on 4.8-rc2, so I am working on tracing the ioctl SG_DXFER_FROM_DEV
> as well and will bisect this weekend and/or add some kernel probes.
> Just need to get some time.
> Johannes or Ewan may beat me to it.
Sounds
hat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, November 22, 2016 8:48:06 AM
> Subject: Re: SG does not ignore dxferp (direct io + mmap)
>
> Same problem on Fedora 23
>
> $ uname -r
> 4.7.10-100.fc23.x86_64
>
> $ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | od -t
Same problem on Fedora 23
$ uname -r
4.7.10-100.fc23.x86_64
$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | od -t x1
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
...
$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 -m | od -t x1
000 eb 63 90 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0
...
$ sudo
On Mon, Nov 21, 2016 at 01:24:02PM -0500, Ewan Milne wrote:
> On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> > There was also this change which seems closer to the problem area:
> >
> > commit 461c7fa126794157484dca48e88effa4963e3af3
> > Author: Kirill A. Shutemov
On Mon, 2016-11-21 at 12:34 -0500, Douglas Gilbert wrote:
> There was also this change which seems closer to the problem area:
>
> commit 461c7fa126794157484dca48e88effa4963e3af3
> Author: Kirill A. Shutemov
> Date: Tue Feb 2 16:57:35 2016 -0800
>
>
is the source code:
== cut here ==
2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redhat.com>:
- Original Message -
From: "Eyal Ben David" <bde...@gmail.com>
To: linux-scsi@vger.kernel.org
Sent: Sunday, November 20, 2016 11:02:49 AM
Subject: SG does not ig
On Mon, 2016-11-21 at 16:15 +0100, Johannes Thumshirn wrote:
>
> FWIW:
> jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 | hexdump
> 000 c033 d08e 00bc 8e7c 8ec0 bed8 7c00 00bf
> 010 b906 0200 f3fc 50a4 1c68 cb06 b9fb 0004
> 020 bebd 8007 007e 7c00 0f0b 0e85 8301 10c5
>
I am using the distro kernels and don't know if they have the patch.
Our IO testing utility has used the same pattern (mmap + non-null dxferp) for
a long time, on RHEL 6.x, 7.x and Ubuntu 12.04, 14.04 without a problem,
long before the patch was applied.
Thanks,
Eyal
2016-11-21 17:44 GMT+02:00 Johannes
On Mon, Nov 21, 2016 at 04:15:52PM +0100, Johannes Thumshirn wrote:
> On Mon, Nov 21, 2016 at 04:55:29PM +0200, Eyal Ben David wrote:
> > Thanks for your reply,
> >
> > On RHEL system it does not occur.
> >
> > So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
> > with kernel
On Mon, Nov 21, 2016 at 04:55:29PM +0200, Eyal Ben David wrote:
> Thanks for your reply,
>
> On RHEL system it does not occur.
>
> So far I have seen the problem on Ubuntu 16.04 and Fedora 22 (both
> with kernel 4.4.x)
FWIW:
jthumshirn@linux-x5ow:~$ sudo ./sg_mmap_read -d /dev/sg0 -l 0 |
- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: emi...@redhat.com
> Cc: "Laurence Oberman" <lober...@redhat.com>, dgilb...@interlog.com,
> linux-scsi@vger.kernel.org
> Sent: Monday, November 21, 2016 9:55:29 AM
>
>
>
>
>> 2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redhat.com>:
>> >
>> >
>> > - Original Message -
>> >> From: "Eyal Ben David" <bde...@gmail.com>
>> >> To: linux-scsi@vger.kernel.org
:
> > >
> > >
> > > ----- Original Message -----
> > >> From: "Eyal Ben David" <bde...@gmail.com>
> > >> To: linux-scsi@vger.kernel.org
> > >> Sent: Sunday, November 20, 2016 11:02:49 AM
> > >> Subject: SG d
e code:
>
> == cut here ==
>
> 2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redhat.com>:
> >
> >
> > - Original Message -
> >> From: "Eyal Ben David" <bde...@gmail.com>
> >> To: linux-scsi@vger.kernel.org
>
            return 1;
        }
    }
    /* both a device and an LBA are required */
    if (!device || lba == -1) {
        fprintf(stderr, "command line error: missing device or lba\n");
        return 1;
    }
    scsi_read_block(device, lba, fmmap, fmmap_bug);
    return 0;
}
2016-11-21 2:04 GMT+02:00 Laurence Oberman <lober...@redh
- Original Message -
> From: "Eyal Ben David" <bde...@gmail.com>
> To: linux-scsi@vger.kernel.org
> Sent: Sunday, November 20, 2016 11:02:49 AM
> Subject: SG does not ignore dxferp (direct io + mmap)
>
> Hi all,
>
> We have some IO utility tha
Hi all,
We have an IO utility that performs its IOs using sg and direct IO with mmap.
Our current systems are Ubuntu 14.04 and RHEL 6 and 7.
The IO utility always sets dxferp to either the mmap address or the address
of another allocation (valloc).
Setting dxferp was harmless since SG is supposed to ignore the address