Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Jens Axboe

On 02/09/2016 10:19 AM, Christoph Hellwig wrote:

Updated version below:

---
 From d63251560cf2670badbc86c83502502f29c087e0 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Tue, 9 Feb 2016 18:11:32 +0100
Subject: nvme: fix Kconfig description for BLK_DEV_NVME_SCSI

Signed-off-by: Christoph Hellwig 
---
  drivers/nvme/host/Kconfig | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index 59307f8..68fa858 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -17,8 +17,9 @@ config BLK_DEV_NVME_SCSI
  and block devices nodes, as well a a translation for a small
  number of selected SCSI commands to NVMe commands to the NVMe
  driver.  If you don't know what this means you probably want
- to say N here, and if you know what it means you probably
- want to say N as well.
+ to say N here, unless you run a distro that abuses the SCSI
+ emulation to provide stable device names for mount by id like
+ some OpenSuSE and SLES versions.


Thanks, looks good to me, I'll fold it in for the current series.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Christoph Hellwig
Updated version below:

---
From d63251560cf2670badbc86c83502502f29c087e0 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Tue, 9 Feb 2016 18:11:32 +0100
Subject: nvme: fix Kconfig description for BLK_DEV_NVME_SCSI

Signed-off-by: Christoph Hellwig 
---
 drivers/nvme/host/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index 59307f8..68fa858 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -17,8 +17,9 @@ config BLK_DEV_NVME_SCSI
  and block devices nodes, as well a a translation for a small
  number of selected SCSI commands to NVMe commands to the NVMe
  driver.  If you don't know what this means you probably want
- to say N here, and if you know what it means you probably
- want to say N as well.
+ to say N here, unless you run a distro that abuses the SCSI
+ emulation to provide stable device names for mount by id like
+ some OpenSuSE and SLES versions.
 
 config NVME_FABRICS
tristate
-- 
2.1.4
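
To see which way a given kernel was built, one can look for the option in the kernel's config file. The following is only a rough sketch: it assumes the usual `.config` syntax (`CONFIG_FOO=y` or `# CONFIG_FOO is not set`), and the helper name `option_state` and the sample text are made up for illustration. Where the config text comes from (for example a file under /boot, or /proc/config.gz) varies by distro and is left to the reader.

```python
import re

OPTION = "CONFIG_BLK_DEV_NVME_SCSI"


def option_state(config_text, option=OPTION):
    """Return the option's value ('y', 'm', ...), 'n' if explicitly
    not set, or None if the option is not mentioned at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        # fullmatch keeps CONFIG_BLK_DEV_NVME=y from matching by prefix
        match = re.fullmatch(rf"{re.escape(option)}=(.*)", line)
        if match:
            return match.group(1)
    return None


# Hypothetical excerpt of a config that took the new default
sample = """\
CONFIG_BLK_DEV_NVME=y
# CONFIG_BLK_DEV_NVME_SCSI is not set
"""
print(option_state(sample))
```

A result of "n" or None on a system whose udev rules depend on the SCSI emulation would point at the situation described in the new help text.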



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Jens Axboe

On 02/09/2016 10:12 AM, Christoph Hellwig wrote:

Does this look reasonable?

---
 From 7843fae979df3fc14007735f54cc6bb2f6f66dc5 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Tue, 9 Feb 2016 18:11:32 +0100
Subject: nvme: fix Kconfig description for BLK_DEV_NVME_SCSI

Signed-off-by: Christoph Hellwig 
---
  drivers/nvme/host/Kconfig | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index 59307f8..2e24156 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -17,8 +17,8 @@ config BLK_DEV_NVME_SCSI
  and block devices nodes, as well a a translation for a small
  number of selected SCSI commands to NVMe commands to the NVMe
  driver.  If you don't know what this means you probably want
- to say N here, and if you know what it means you probably
- want to say N as well.
+ to say N here, unless you run a distro that abuses this for
+ stable device names like some OpenSuSE and SLES versions.


Yep, that looks a lot more reasonable to me. Might be worth including 
that it impacts the mount-by-id on those distros.


--
Jens Axboe



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Christoph Hellwig
Does this look reasonable?

---
From 7843fae979df3fc14007735f54cc6bb2f6f66dc5 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Tue, 9 Feb 2016 18:11:32 +0100
Subject: nvme: fix Kconfig description for BLK_DEV_NVME_SCSI

Signed-off-by: Christoph Hellwig 
---
 drivers/nvme/host/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index 59307f8..2e24156 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -17,8 +17,8 @@ config BLK_DEV_NVME_SCSI
  and block devices nodes, as well a a translation for a small
  number of selected SCSI commands to NVMe commands to the NVMe
  driver.  If you don't know what this means you probably want
- to say N here, and if you know what it means you probably
- want to say N as well.
+ to say N here, unless you run a distro that abuses this for
+ stable device names like some OpenSuSE and SLES versions.
 
 config NVME_FABRICS
tristate
-- 
2.1.4



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread James Bottomley
On Tue, 2016-02-09 at 13:50 +0100, Christoph Hellwig wrote:
> Jens,
> 
> do you want a 'default y' patch or just a better description?  I'd be
> happy to send either one.

Since it only appears to be SUSE and they've now been told, better
description is fine.

James




Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Jens Axboe
On Feb 9, 2016, at 5:50 AM, Christoph Hellwig  wrote:
> 
> Jens,
> 
> do you want a 'default y' patch or just a better description?  I'd be
> happy to send either one.

A better description  

-- 
Jens Axboe


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-09 Thread Christoph Hellwig
Jens,

do you want a 'default y' patch or just a better description?  I'd be
happy to send either one.


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Hannes Reinecke
On 02/07/2016 05:04 PM, James Bottomley wrote:
> On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
>> Keith said it should be on by default, and I promised him to change
>> it once we run into problems, which I guess this counts as.
>>
>> But just curious:  what distro are you using?  Upstream systemd
>> explicitly rejected using scsi_id for NVMe here:
>>
>>  https://github.com/systemd/systemd/issues/1453
>>
>> and all my test systems don't do this either.
> 
> This was SUSE (in my case, openSUSE Leap).  I just checked the source
> package; they patch the by-id rules back in for NVME:
> 
> # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe-devices.patch 
> (bsc#944132)
> Patch1101:  1101-rules-persistent-device-names-for-NVMe-devices.patch
> 
> The bugzilla is giving access denied for bug id 944132, so it's likely
> some proprietary vendor problem.  The patch has no preamble, so it's
> hard to tell what they were thinking.
> 
They didn't think at all. The abovementioned bug just states 'by-id
symlinks for NVMe drives are missing'.
And they fixed it by adding the respective rules (using sg_inq) to udev.

There's no mention of any NVMe-specific sysfs attributes whatsoever.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Jens Axboe

On 02/07/2016 09:04 AM, James Bottomley wrote:

> On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
>> Keith said it should be on by default, and I promised him to change
>> it once we run into problems, which I guess this counts as.
>>
>> But just curious:  what distro are you using?  Upstream systemd
>> explicitly rejected using scsi_id for NVMe here:
>>
>>   https://github.com/systemd/systemd/issues/1453
>>
>> and all my test systems don't do this either.
>
> This was SUSE (in my case, openSUSE Leap).  I just checked the source
> package; they patch the by-id rules back in for NVME:
>
> # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe-devices.patch (bsc#944132)
> Patch1101:  1101-rules-persistent-device-names-for-NVMe-devices.patch
>
> The bugzilla is giving access denied for bug id 944132, so it's likely
> some proprietary vendor problem.  The patch has no preamble, so it's
> hard to tell what they were thinking.


I run root-on-nvme on my laptop, and I haven't observed any problems. 
Generally I hate for options to default y unless absolutely necessary, 
it's a sure fire way to feature creep your kernel without noticing. I 
don't think getting all hot about this issue is fair, if the only known 
case is suse.


If anything, let's make the description better. It's trying to be funny, 
it'd be better if it was descriptive and covered this case as well.


--
Jens Axboe



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Hannes Reinecke
On 02/08/2016 12:07 AM, James Bottomley wrote:
> On Sun, 2016-02-07 at 15:28 -0700, Jens Axboe wrote:
>> On 02/07/2016 09:04 AM, James Bottomley wrote:
>>> On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
>>>> Keith said it should be on by default, and I promised him to
>>>> change it once we run into problems, which I guess this counts as.
>>>>
>>>> But just curious:  what distro are you using?  Upstream systemd
>>>> explicitly rejected using scsi_id for NVMe here:
>>>>
>>>>   https://github.com/systemd/systemd/issues/1453
>>>>
>>>> and all my test systems don't do this either.
>>>
>>> This was SUSE (in my case, openSUSE Leap).  I just checked the
>>> source
>>> package; they patch the by-id rules back in for NVME:
>>>
>>> # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe
>>> -devices.patch (bsc#944132)
>>> Patch1101:  1101-rules-persistent-device-names-for-NVMe
>>> -devices.patch
>>>
>>> The bugzilla is giving access denied for bug id 944132, so it's
>>> likely
>>> some proprietary vendor problem.  The patch has no preamble, so
>>> it's
>>> hard to tell what they were thinking.
>>
>> I run root-on-nvme on my laptop, and I haven't observed any problems.
> 
> Me too apparently.  It looks like this problem may be SUSE specific
> unless another distro has enabled this.  I can see why they would: you
> do need persistent names for devices, even NVMe ones.
> 
>> Generally I hate for options to default y unless absolutely 
>> necessary, it's a sure fire way to feature creep your kernel without 
>> noticing. I don't think getting all hot about this issue is fair, if 
>> the only known case is suse.
> 
> Well, OK, I'm annoyed because it was a systemd system which means
> debugging boot failures are excruciatingly difficult so it took me a
> week and a half to find out what the problem was.  Perhaps I was a bit
> rash to label this as an easily foreseen problem.
> 
> I opened a bug against SUSE to tell them to turn it on:
> 
> https://bugzilla.opensuse.org/show_bug.cgi?id=965497
> 
> The second problem is that there's currently no way to transition to
> using the serial attribute the way the udev 60-persistent-storage.rules
> are written, so if distros have some by-id hack, it will have to be
> maintained for a while.  I annotated the already closed bug on this in
> systemd with the rules that work for me.
> 
Why, but you can.
That's precisely what I did with the transition to sg_inq; I've
added a new set of rules (55-sg_inq.rules and 59-sg-symlinks.rules)
which will override the values from 60-persistent-storage.rules.

Do we have defined sysfs attributes for NVMe devices nowadays?
If so I'd be willing to create/send some sysfs rules for them.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Christoph Hellwig
On Sun, Feb 07, 2016 at 03:07:21PM -0800, James Bottomley wrote:
> > I run root-on-nvme on my laptop, and I haven't observed any problems.
> 
> Me too apparently.  It looks like this problem may be SUSE specific
> unless another distro has enabled this.  I can see why they would: you
> do need persistent names for devices, even NVMe ones.

I don't have root on nvme, just my xfstests device, but I still didn't
see the problem, neither did my various nvme development setups.

> I opened a bug against SUSE to tell them to turn it on:
> 
> https://bugzilla.opensuse.org/show_bug.cgi?id=965497
> 
> The second problem is that there's currently no way to transition to
> using the serial attribute the way the udev 60-persistent-storage.rules
> are written, so if distros have some by-id hack, it will have to be
> maintained for a while.  I annotated the already closed bug on this in
> systemd with the rules that work for me.

We now expose the NVMe serial and NGUID, out of which the evpd page is
mangled depending on the NVMe spec version that the device supports, as
sysfs attributes.  Distros can do the same mangling if they want to
support their old ids.


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Keith Busch
On Mon, Feb 08, 2016 at 04:19:13PM +0100, Hannes Reinecke wrote:
> Ok, so what about having a 'wwid' attribute which provides combined
> information (like scsi has)?

That looks like the sensible thing to do. Thanks for the pointer.

Going forward, I will solicit more feedback from scsi developers
so NVMe's external attributes better align with storage that already
solved our issues. I agree with Christoph that we never should have
relied on SCSI translations for NVMe, but we don't want to reinvent
generic solutions either.


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread James Bottomley
On Mon, 2016-02-08 at 08:32 +0100, Hannes Reinecke wrote:
> On 02/08/2016 12:07 AM, James Bottomley wrote:
> > On Sun, 2016-02-07 at 15:28 -0700, Jens Axboe wrote:
> > > On 02/07/2016 09:04 AM, James Bottomley wrote:
> > > > On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
> > > > > Keith said it should be on by default, and I promised him to
> > > > > change
> > > > > it once we run into problems, which I guess this counts as.
> > > > > 
> > > > > But just curious:  what distro are you using?  Upstream
> > > > > systemd
> > > > > explicitly rejected using scsi_id for NVMe here:
> > > > > 
> > > > >   https://github.com/systemd/systemd/issues/1453
> > > > > 
> > > > > and all my test systems don't do this either.
> > > > 
> > > > This was SUSE (in my case, openSUSE Leap).  I just checked the
> > > > source
> > > > package; they patch the by-id rules back in for NVME:
> > > > 
> > > > # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe
> > > > -devices.patch (bsc#944132)
> > > > Patch1101:  1101-rules-persistent-device-names-for-NVMe
> > > > -devices.patch
> > > > 
> > > > The bugzilla is giving access denied for bug id 944132, so it's
> > > > likely
> > > > some proprietary vendor problem.  The patch has no preamble, so
> > > > it's
> > > > hard to tell what they were thinking.
> > > 
> > > I run root-on-nvme on my laptop, and I haven't observed any
> > > problems.
> > 
> > Me too apparently.  It looks like this problem may be SUSE specific
> > unless another distro has enabled this.  I can see why they would:
> > you
> > do need persistent names for devices, even NVMe ones.
> > 
> > > Generally I hate for options to default y unless absolutely 
> > > necessary, it's a sure fire way to feature creep your kernel
> > > without 
> > > noticing. I don't think getting all hot about this issue is fair,
> > > if 
> > > the only known case is suse.
> > 
> > Well, OK, I'm annoyed because it was a systemd system which means
> > debugging boot failures are excruciatingly difficult so it took me
> > a
> > week and a half to find out what the problem was.  Perhaps I was a
> > bit
> > rash to label this as an easily foreseen problem.
> > 
> > I opened a bug against SUSE to tell them to turn it on:
> > 
> > https://bugzilla.opensuse.org/show_bug.cgi?id=965497
> > 
> > The second problem is that there's currently no way to transition
> > to
> > using the serial attribute the way the udev 60-persistent
> > -storage.rules
> > are written, so if distros have some by-id hack, it will have to be
> > maintained for a while.  I annotated the already closed bug on this
> > in
> > systemd with the rules that work for me.
> > 
> Why, but you can.
> That's precisely what I did with the transition to sg_inq; I've
> added a new set of rules (55-sg_inq.rules and 59-sg-symlinks.rules)
> which will override the values from 60-persistent-storage.rules.
> 
> Do we have defined sysfs attributes for NVMe devices nowadays?
> If so I'd be willing to create/send some sysfs rules for them.

This is the one I finally settled on (for 60-persistent-storage) for
SUSE.  It keeps everything working across old and new kernels.

# nvme
SUBSYSTEMS=="nvme", ATTRS{serial}=="?*", PROGRAM="/usr/bin/echo $attr{serial}", ENV{ID_SERIAL}="%c", ENV{ID_BUS}="nvme"

I already complained to Keith that the echo is required to strip the
leading whitespace which I don't think should be in the serial
attribute.
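
To illustrate what the echo round trip buys, here is a minimal sketch of the same normalisation. It assumes that running the value through echo amounts to splitting it on whitespace and rejoining the words with single spaces; the function name `nvme_udev_env` and the example serial are made up, and skipping a whitespace-only serial is a choice of this sketch rather than something the rule does.

```python
def nvme_udev_env(raw_serial):
    """Mimic the ENV assignments of the rule for one raw serial string."""
    # Split on any whitespace and rejoin: drops leading/trailing padding.
    serial = " ".join(raw_serial.split())
    if not serial:
        # Nothing usable left; this sketch chooses to set no properties.
        return {}
    return {"ID_SERIAL": serial, "ID_BUS": "nvme"}


# Hypothetical serial as it might be read from sysfs, with padding
print(nvme_udev_env("      EXAMPLESERIAL123\n"))
```

Without the stripping step the padding would end up in ID_SERIAL and therefore in any link name derived from it.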

James



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Hannes Reinecke
On 02/08/2016 04:12 PM, Keith Busch wrote:
> On Mon, Feb 08, 2016 at 11:13:50AM +0100, Christoph Hellwig wrote:
>> On Mon, Feb 08, 2016 at 12:01:16PM +0200, Sagi Grimberg wrote:
>>>
>>>> Do we have defined sysfs attributes for NVMe devices nowadays?
>>>
>>> /sys/block/nvme0n1/uuid
>>
>> That's only supported for NVMe 1.1 and higher devices, and optional.
>> For older or stupid devices we need to support the algorithm based
>> on the serial attribute from nvme_fill_device_id_scsi_string() in
>> drivers/nvme/host/scsi.c.
> 
> It's even worse. NGUID was defined for 1.2 devices and higher. 1.1
> devices should have EUI-64 at:
>  
>   /sys/block/nvmeXnY/eui
> 
> 1.2 devices will have either uuid or eui (or both).
> 
> The majority of devices in circulation today are 1.0, and need to concat
> these three entries to make a unique identifier:
> 
>   /sys/block/nvmeXnY/device/serial
>   /sys/block/nvmeXnY/device/model
>   /sys/block/nvmeXnY/nsid

Ok, so what about having a 'wwid' attribute which provides combined
information (like scsi has)?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Keith Busch
On Mon, Feb 08, 2016 at 11:13:50AM +0100, Christoph Hellwig wrote:
> On Mon, Feb 08, 2016 at 12:01:16PM +0200, Sagi Grimberg wrote:
> >
> >> Do we have defined sysfs attributes for NVMe devices nowadays?
> >
> > /sys/block/nvme0n1/uuid
> 
> That's only supported for NVMe 1.1 and higher devices, and optional.
> For older or stupid devices we need to support the algorithm based
> on the serial attribute from nvme_fill_device_id_scsi_string() in
> drivers/nvme/host/scsi.c.

It's even worse. NGUID was defined for 1.2 devices and higher. 1.1
devices should have EUI-64 at:
 
  /sys/block/nvmeXnY/eui

1.2 devices will have either uuid or eui (or both).

The majority of devices in circulation today are 1.0, and need to concat
these three entries to make a unique identifier:

  /sys/block/nvmeXnY/device/serial
  /sys/block/nvmeXnY/device/model
  /sys/block/nvmeXnY/nsid
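
The fallback order described above could be sketched in userspace roughly as follows. This is an illustration only, not what any udev rule or the kernel does: the preference of uuid over eui, the "uuid."/"eui." prefixes and the "_" separator are assumptions of the sketch, any non-empty value is treated as valid, and the demo runs against a fake directory tree rather than a real /sys.

```python
import tempfile
from pathlib import Path


def read_attr(path):
    """Return the stripped contents of a sysfs-style file, or None."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    return text or None


def nvme_identifier(block_dir):
    """Pick an identifier for one namespace dir (e.g. /sys/block/nvme0n1)."""
    block = Path(block_dir)
    # Assumption of this sketch: prefer uuid, then eui, when present.
    for name in ("uuid", "eui"):
        value = read_attr(block / name)
        if value:
            return f"{name}.{value}"
    # Fallback for 1.0 devices: concatenate serial, model and nsid.
    parts = [
        read_attr(block / "device" / "serial"),
        read_attr(block / "device" / "model"),
        read_attr(block / "nsid"),
    ]
    if all(parts):
        return "_".join(parts)  # separator is a hypothetical choice
    return None


# Demo on a fake tree with made-up values
with tempfile.TemporaryDirectory() as tmp:
    ns = Path(tmp) / "nvme0n1"
    (ns / "device").mkdir(parents=True)
    (ns / "device" / "serial").write_text("  EXAMPLESERIAL\n")
    (ns / "device" / "model").write_text("Example Model\n")
    (ns / "nsid").write_text("1\n")
    legacy_id = nvme_identifier(ns)   # only the 1.0-style attributes exist
    (ns / "eui").write_text("0011223344556677\n")
    eui_id = nvme_identifier(ns)      # eui now takes precedence

print(legacy_id)
print(eui_id)
```

Having to encode this per-version logic in every consumer is the kind of duplication a single combined attribute would avoid.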


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Christoph Hellwig
On Mon, Feb 08, 2016 at 12:01:16PM +0200, Sagi Grimberg wrote:
>
>> Do we have defined sysfs attributes for NVMe devices nowadays?
>
> /sys/block/nvme0n1/uuid

That's only supported for NVMe 1.1 and higher devices, and optional.
For older or stupid devices we need to support the algorithm based
on the serial attribute from nvme_fill_device_id_scsi_string() in
drivers/nvme/host/scsi.c.

Which btw is incorrect as it doesn't identify namespaces properly,
and thus the association of 0 is an actively harmful lie.


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-08 Thread Sagi Grimberg

> Do we have defined sysfs attributes for NVMe devices nowadays?

/sys/block/nvme0n1/uuid

> If so I'd be willing to create/send some sysfs rules for them.

That'd be great!


Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-07 Thread James Bottomley
On Sun, 2016-02-07 at 15:28 -0700, Jens Axboe wrote:
> On 02/07/2016 09:04 AM, James Bottomley wrote:
> > On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
> > > Keith said it should be on by default, and I promised him to
> > > change
> > > it once we run into problems, which I guess this counts as.
> > > 
> > > But just curious:  what distro are you using?  Upstream systemd
> > > explicitly rejected using scsi_id for NVMe here:
> > > 
> > >   https://github.com/systemd/systemd/issues/1453
> > > 
> > > and all my test systems don't do this either.
> > 
> > This was SUSE (in my case, openSUSE Leap).  I just checked the
> > source
> > package; they patch the by-id rules back in for NVME:
> > 
> > # PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe
> > -devices.patch (bsc#944132)
> > Patch1101:  1101-rules-persistent-device-names-for-NVMe
> > -devices.patch
> > 
> > The bugzilla is giving access denied for bug id 944132, so it's
> > likely
> > some proprietary vendor problem.  The patch has no preamble, so
> > it's
> > hard to tell what they were thinking.
> 
> I run root-on-nvme on my laptop, and I haven't observed any problems.

Me too apparently.  It looks like this problem may be SUSE specific
unless another distro has enabled this.  I can see why they would: you
do need persistent names for devices, even NVMe ones.

> Generally I hate for options to default y unless absolutely 
> necessary, it's a sure fire way to feature creep your kernel without 
> noticing. I don't think getting all hot about this issue is fair, if 
> the only known case is suse.

Well, OK, I'm annoyed because it was a systemd system which means
debugging boot failures is excruciatingly difficult, so it took me a
week and a half to find out what the problem was.  Perhaps I was a bit
rash to label this as an easily foreseen problem.

I opened a bug against SUSE to tell them to turn it on:

https://bugzilla.opensuse.org/show_bug.cgi?id=965497

The second problem is that there's currently no way to transition to
using the serial attribute the way the udev 60-persistent-storage.rules
are written, so if distros have some by-id hack, it will have to be
maintained for a while.  I annotated the already closed bug on this in
systemd with the rules that work for me.

> If anything, let's make the description better. It's trying to be
> funny, it'd be better if it was descriptive and covered this case as
> well.

The problem with this is that when moving to new kernels, distro
maintainers don't read the new option help texts, they just take the
defaults.  However, I checked the only other distribution I use
(debian) and they don't have a nvme persistent ID hack, so if someone
checked ubuntu and Red Hat, I think all the majors are now covered and
perhaps there's no need to do anything more.

James



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-07 Thread James Bottomley
On Sun, 2016-02-07 at 10:22 +0100, Christoph Hellwig wrote:
> Keith said it should be on by default, and I promised him to change
> it once we run into problems, which I guess this counts as.
> 
> But just curious:  what distro are you using?  Upstream systemd
> explicitly rejected using scsi_id for NVMe here:
> 
>   https://github.com/systemd/systemd/issues/1453
> 
> and all my test systems don't do this either.

This was SUSE (in my case, openSUSE Leap).  I just checked the source
package; they patch the by-id rules back in for NVME:

# PATCH-FIX-SUSE 1101-rules-persistent-device-names-for-NVMe-devices.patch (bsc#944132)
Patch1101:  1101-rules-persistent-device-names-for-NVMe-devices.patch

The bugzilla is giving access denied for bug id 944132, so it's likely
some proprietary vendor problem.  The patch has no preamble, so it's
hard to tell what they were thinking.

James



Re: complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-07 Thread Christoph Hellwig
Keith said it should be on by default, and I promised him to change
it once we run into problems, which I guess this counts as.

But just curious:  what distro are you using?  Upstream systemd
explicitly rejected using scsi_id for NVMe here:

https://github.com/systemd/systemd/issues/1453

and all my test systems don't do this either.


complete boot failure in 4.5-rc1 caused by nvme: make SG_IO support optional

2016-02-06 Thread James Bottomley
The reason is fairly obvious: the default for the new option
BLK_DEV_NVME_SCSI is N and all the distribution kernels (and me when
testing) take the default options (I checked in the OBS kernel builds
and this is true).

The net result is that scsi_id from udev no longer works on nvme disks
and that means that the /dev/disk/by-id links are all gone in 4.5-rc1.
If this happens to be how you name your disks in fstab or crypttab,
you can't boot.
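
A quick way to tell whether a system is exposed is to look for by-id paths in those two files and check that the links still exist. The sketch below assumes the usual whitespace-separated table format with `#` comment lines; the function names and the sample device name are invented for illustration.

```python
import os


def by_id_references(table_text):
    """Collect /dev/disk/by-id/... fields from fstab- or crypttab-style text."""
    refs = []
    for line in table_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for field in stripped.split():
            if field.startswith("/dev/disk/by-id/"):
                refs.append(field)
    return refs


def missing_links(refs, exists=os.path.exists):
    """Return the referenced paths that do not currently exist."""
    return [ref for ref in refs if not exists(ref)]


# Hypothetical fstab excerpt; the device name is made up
sample = """\
# <device> <mountpoint> <type> <options> <dump> <pass>
/dev/disk/by-id/nvme-EXAMPLE_SERIAL-part2 / ext4 defaults 0 1
UUID=0000-0000 /boot vfat defaults 0 2
"""
refs = by_id_references(sample)
print(refs)
print(missing_links(refs))
```

On a real system one would pass in the contents of /etc/fstab and /etc/crypttab; any path reported as missing is a candidate for the boot failure described here.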

If you're going to break an ABI in this way, you really have to plan
for it in userspace.  How are NVMe disk ids supposed to be exported
now?  Does udev need a nvme_id program to do this?  Until there's an
infrastructure ready to work in this way, we need to unconditionally
enable BLK_DEV_NVME_SCSI.

James
