[openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-09 Thread Walter A. Boring IV

Hey folks,
   One of the challenges we have faced with the ability to attach a 
single volume to multiple instances is how to correctly detach that 
volume.  The issue is a bit complex, but I'll try to explain the 
problem, and then describe one approach to solving one part of the 
detach puzzle.


Problem:
  When a volume is attached to multiple instances on the same host, 
there are two scenarios.


  1) Some Cinder drivers export a new target for every attachment on a 
compute host.  This means that you will get a new unique volume path on 
a host, which is then handed off to the VM instance.


  2) Other Cinder drivers export a single target for all instances on a 
compute host.  This means that every instance on a single host will 
reuse the same host volume path.



When a user issues a request to detach a volume, the workflow boils down 
to first calling os-brick's connector.disconnect_volume before calling 
Cinder's terminate_connection and detach. disconnect_volume's job is to 
remove the local volume from the host OS and close any sessions.
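
As a minimal sketch of that ordering (the cinderclient handle and the 
stored connection_info/device_info below are assumptions about what Nova 
already holds, not exact Nova code):

    # A minimal sketch of the detach ordering described above.  `cinder` is
    # assumed to be an already-authenticated python-cinderclient handle;
    # `connection_info`, `device_info` and `connector_properties` stand in
    # for whatever Nova recorded when the volume was attached.
    from os_brick.initiator import connector as brick_connector

    def detach_volume(cinder, volume_id, connection_info, device_info,
                      connector_properties):
        # 1) os-brick: remove the local block device and close any
        #    initiator sessions for this volume on the host.
        conn = brick_connector.InitiatorConnector.factory(
            connection_info['driver_volume_type'], root_helper='sudo')
        conn.disconnect_volume(connection_info['data'], device_info)

        # 2) Cinder: tell the backend this host no longer needs the export.
        cinder.volumes.terminate_connection(volume_id, connector_properties)

        # 3) Cinder: mark the attachment as gone.
        cinder.volumes.detach(volume_id)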


There is no problem under scenario 1.  Each disconnect_volume only 
affects the attached volume in question and doesn't affect any other VM 
using that same volume, because they are using a different path that has 
shown up on the host.  It's a different target exported from the Cinder 
backend/array.


The problem comes under scenario 2, where that single volume is shared 
for every instance on the same compute host.  Nova needs to be careful 
and not call disconnect_volume if it's a shared volume; otherwise the 
first disconnect_volume call will nuke every instance's access to that 
volume.



Proposed solution:
  Nova needs to determine whether the volume that's being detached is a 
shared or non-shared volume.  Here is one way to determine that.


  Every Cinder volume has a list of its attachments.  Each of those 
attachments contains the instance_uuid that the volume is attached 
to.  I presume Nova can find which of the volume attachments are on the 
same host.  Then Nova can call Cinder's initialize_connection for each 
of those attachments to get the target's connection_info dictionary.  
This connection_info dictionary describes how to connect to the target 
on the cinder backend.  If the target is shared, then the 
connection_info dicts for each attachment on that host will be 
identical.  Nova would then know that it's a shared target, and only 
call os-brick's disconnect_volume if it's the last attachment on that 
host.  I think at most 2 calls to cinder's initialize_connection 
would suffice to determine if the volume is a shared target.  This would 
only need to be done if the volume is multi-attach capable and if there 
is more than one attachment on the same host where the detach is happening.
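
As a rough illustration of that check (not existing Nova code; the 
attachment field names and the host lookup are assumptions):

    # A rough sketch of the shared-target check described above.  `cinder` is
    # an authenticated python-cinderclient handle, `connector` is this host's
    # os-brick connector properties, and `local_instance_uuids` is the set of
    # instance UUIDs Nova knows are on this compute host.
    def volume_target_is_shared(cinder, volume_id, connector,
                                local_instance_uuids):
        volume = cinder.volumes.get(volume_id)
        # Attachments belonging to instances on this compute host (the field
        # name for the instance id may differ by API version).
        local = [a for a in volume.attachments
                 if a.get('server_id') in local_instance_uuids]
        if len(local) < 2:
            return False  # nothing else on this host; disconnect is safe

        # Two initialize_connection calls should be enough: a backend that
        # shares one target per host hands back identical connection_info
        # both times, while a backend that exports a target per attachment
        # does not.  (On some backends initialize_connection has side
        # effects, so a real implementation would need more care than this.)
        first = cinder.volumes.initialize_connection(volume_id, connector)
        second = cinder.volumes.initialize_connection(volume_id, connector)
        return first == second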


Walt

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-09 Thread Ildikó Váncsa
Hi Walt,

Thanks for starting this thread. It is a good summary of the issue and the 
proposal also looks feasible to me.

I have a quick, hopefully not too wild idea based on the earlier discussions we 
had. We considered earlier storing the target identifier together with the 
other items of the attachment info. The problem with this idea is that when we 
call initialize_connection from Nova, Cinder does not get the relevant 
information, such as the instance_id, needed to do this. This means we cannot 
do it using the functionality we have today.

My idea here is to extend the Cinder API so that Nova can send the missing 
information after a successful attach. Nova should have all the information 
including the 'target', which means that it could update the attachment 
information through the new Cinder API.

It would mean that when we request the volume info from Cinder at detach time, 
the 'attachments' list would contain all the required information for each 
attachment the volume has. If we don't have the 'target' information for any 
reason, we can still use the approach described below as a fallback. This 
approach could even be used in the case of live migration, I think.

The Cinder API extension would need to be added with a new microversion to 
avoid problems with older Cinder versions talking to new Nova.

The advantage of this direction is that we can reduce the round trips to Cinder 
at detach time. The round trip after a successful attach should not impact 
normal operation: if it fails, the only consequence is that we need to use the 
fallback method to detach properly. This would still affect only multi-attached 
volumes where we have more than one attachment on the same host. By having the 
information stored in Cinder as well, we can also avoid removing a target while 
there are still active attachments connected to it.
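
To make the idea concrete, a purely hypothetical sketch of what such an 
attachment update could carry; the endpoint, field names and microversion 
below are illustrative only, since this API does not exist yet:

    # Hypothetical request body for the proposed "update attachment" call.
    # Everything here (URL, field names, microversion) is illustrative only.
    import json

    attachment_update = {
        "attachment": {
            "instance_uuid": "11111111-2222-3333-4444-555555555555",
            "host": "compute-01",
            # host connector info gathered by os-brick at attach time
            "connector": {
                "initiator": "iqn.1994-05.com.redhat:compute-01",
                "host": "compute-01",
                "multipath": False,
            },
            # the 'target' / connection_info handed out at attach time
            "connection_info": {
                "driver_volume_type": "iscsi",
                "data": {"target_iqn": "iqn.2010-10.org.openstack:volume-x",
                         "target_lun": 1},
            },
        }
    }
    # e.g. PUT /v2/{project_id}/volumes/{volume_id}/attachments/{attachment_id}
    # guarded by a new Cinder API microversion, as described above.
    print(json.dumps(attachment_update, indent=2))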

What do you think?

Thanks,
Ildikó


> -Original Message-
> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> Sent: February 09, 2016 20:50
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
> call os-brick's connector.disconnect_volume
> 
> Hey folks,
> One of the challenges we have faced with the ability to attach a single 
> volume to multiple instances, is how to correctly detach that
> volume.  The issue is a bit complex, but I'll try and explain the problem, 
> and then describe one approach to solving one part of the
> detach puzzle.
> 
> Problem:
>When a volume is attached to multiple instances on the same host.
> There are 2 scenarios here.
> 
>1) Some Cinder drivers export a new target for every attachment on a 
> compute host.  This means that you will get a new unique
> volume path on a host, which is then handed off to the VM instance.
> 
>2) Other Cinder drivers export a single target for all instances on a 
> compute host.  This means that every instance on a single host, will
> reuse the same host volume path.
> 
> 
> When a user issues a request to detach a volume, the workflow boils down to 
> first calling os-brick's connector.disconnect_volume
> before calling Cinder's terminate_connection and detach. disconnect_volume's 
> job is to remove the local volume from the host OS
> and close any sessions.
> 
> There is no problem under scenario 1.  Each disconnect_volume only affects 
> the attached volume in question and doesn't affect any
> other VM using that same volume, because they are using a different path that 
> has shown up on the host.  It's a different target
> exported from the Cinder backend/array.
> 
> The problem comes under scenario 2, where that single volume is shared for 
> every instance on the same compute host.  Nova needs
> to be careful and not call disconnect_volume if it's a shared volume, 
> otherwise the first disconnect_volume call will nuke every
> instance's access to that volume.
> 
> 
> Proposed solution:
>Nova needs to determine if the volume that's being detached is a shared or 
> non shared volume.  Here is one way to determine that.
> 
>Every Cinder volume has a list of it's attachments.  In those attachments 
> it contains the instance_uuid that the volume is attached to.
> I presume Nova can find which of the volume attachments are on the same host. 
>  Then Nova can call Cinder's initialize_connection for
> each of those attachments to get the target's connection_info dictionary.
> This connection_info dictionary describes how to connect to the target on the 
> cinder backend.  If the target is shared, then each of the
> connection_info dicts for each attachment on that host

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-09 Thread Walter A. Boring IV

On 02/09/2016 02:04 PM, Ildikó Váncsa wrote:

Hi Walt,

Thanks for starting this thread. It is a good summary of the issue and the 
proposal also looks feasible to me.

I have a quick, hopefully not too wild idea based on the earlier discussions we 
had. We were considering earlier to store the target identifier together with 
the other items of the attachment info. The problem with this idea is that when 
we call initialize_connection from Nova, Cinder does not get the relevant 
information, like instance_id, to be able to do this. This means we cannot do 
that using the functionality we have today.

My idea here is to extend the Cinder API so that Nova can send the missing 
information after a successful attach. Nova should have all the information 
including the 'target', which means that it could update the attachment 
information through the new Cinder API.
I think what we need to do is allow the connector to be passed at 
os-attach time.   Then Cinder can save it in the attachment's table entry.


We will also need a new cinder API to allow that attachment to be 
updated during live migration, or the connector for the attachment will 
get stale and incorrect.
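
For reference, the connector that would be stored is the dict os-brick 
already builds for the host today; a hedged sketch (keyword details vary 
between os-brick releases):

    # Sketch: gathering the host connector properties that would be saved in
    # the attachment's table entry.  Exact keyword arguments vary between
    # os-brick releases, so treat this as illustrative.
    from os_brick.initiator import connector

    props = connector.get_connector_properties(
        root_helper='sudo',      # privilege escalation helper
        my_ip='192.0.2.10',      # the compute host's IP used for connections
        multipath=False,
        enforce_multipath=False)

    # Typically includes keys such as 'initiator', 'ip', 'host', 'os_type',
    # 'multipath' and, on FC hosts, 'wwpns'/'wwnns'.
    print(props)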


Walt


It would mean that when we request for the volume info from Cinder at detach 
time the 'attachments' list would contain all the required information for each 
attachments the volume has. If we don't have the 'target' information because 
of any reason we can still use the approach described below as fallback. This 
approach could even be used in case of live migration I think.

The Cinder API extension would need to be added with a new microversion to 
avoid problems with older Cinder versions talking to new Nova.

The advantage of this direction is that we can reduce the round trips to Cinder 
at detach time. The round trip after a successful attach should not have an 
impact on the normal operation as if that fails the only issue we have is we 
need to use the fall back method to be able to detach properly. This would 
still affect only multiattached volumes, where we have more than one 
attachments on the same host. By having the information stored in Cinder as 
well we can also avoid removing a target when there are still active 
attachments connected to it.

What do you think?

Thanks,
Ildikó



-Original Message-
From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
Sent: February 09, 2016 20:50
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call 
os-brick's connector.disconnect_volume

Hey folks,
 One of the challenges we have faced with the ability to attach a single 
volume to multiple instances, is how to correctly detach that
volume.  The issue is a bit complex, but I'll try and explain the problem, and 
then describe one approach to solving one part of the
detach puzzle.

Problem:
When a volume is attached to multiple instances on the same host.
There are 2 scenarios here.

1) Some Cinder drivers export a new target for every attachment on a 
compute host.  This means that you will get a new unique
volume path on a host, which is then handed off to the VM instance.

2) Other Cinder drivers export a single target for all instances on a 
compute host.  This means that every instance on a single host, will
reuse the same host volume path.


When a user issues a request to detach a volume, the workflow boils down to 
first calling os-brick's connector.disconnect_volume
before calling Cinder's terminate_connection and detach. disconnect_volume's 
job is to remove the local volume from the host OS
and close any sessions.

There is no problem under scenario 1.  Each disconnect_volume only affects the 
attached volume in question and doesn't affect any
other VM using that same volume, because they are using a different path that 
has shown up on the host.  It's a different target
exported from the Cinder backend/array.

The problem comes under scenario 2, where that single volume is shared for 
every instance on the same compute host.  Nova needs
to be careful and not call disconnect_volume if it's a shared volume, otherwise 
the first disconnect_volume call will nuke every
instance's access to that volume.


Proposed solution:
Nova needs to determine if the volume that's being detached is a shared or 
non shared volume.  Here is one way to determine that.

Every Cinder volume has a list of it's attachments.  In those attachments 
it contains the instance_uuid that the volume is attached to.
I presume Nova can find which of the volume attachments are on the same host.  
Then Nova can call Cinder's initialize_connection for
each of those attachments to get the target's connection_info dictionary.
This connection_info dictionary describes how to connect to the target on the 
cinder backend.  If the ta

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-09 Thread Ildikó Váncsa
Hi Walt,

> -Original Message-
> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> Sent: February 09, 2016 23:15
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
> call os-brick's connector.disconnect_volume
> 
> On 02/09/2016 02:04 PM, Ildikó Váncsa wrote:
> > Hi Walt,
> >
> > Thanks for starting this thread. It is a good summary of the issue and the 
> > proposal also looks feasible to me.
> >
> > I have a quick, hopefully not too wild idea based on the earlier 
> > discussions we had. We were considering earlier to store the target
> identifier together with the other items of the attachment info. The problem 
> with this idea is that when we call initialize_connection
> from Nova, Cinder does not get the relevant information, like instance_id, to 
> be able to do this. This means we cannot do that using
> the functionality we have today.
> >
> > My idea here is to extend the Cinder API so that Nova can send the missing 
> > information after a successful attach. Nova should have
> all the information including the 'target', which means that it could update 
> the attachment information through the new Cinder API.
> I think we need to do is to allow the connector to be passed at
> os-attach time.   Then cinder can save it in the attachment's table entry.
> 
> We will also need a new cinder API to allow that attachment to be updated 
> during live migration, or the connector for the attachment
> will get stale and incorrect.

When I said below that it would be good for live migration as well, I meant 
that the update is part of the API.

Ildikó

> 
> Walt
> >
> > It would mean that when we request for the volume info from Cinder at 
> > detach time the 'attachments' list would contain all the
> required information for each attachments the volume has. If we don't have 
> the 'target' information because of any reason we can
> still use the approach described below as fallback. This approach could even 
> be used in case of live migration I think.
> >
> > The Cinder API extension would need to be added with a new microversion to 
> > avoid problems with older Cinder versions talking to
> new Nova.
> >
> > The advantage of this direction is that we can reduce the round trips to 
> > Cinder at detach time. The round trip after a successful
> attach should not have an impact on the normal operation as if that fails the 
> only issue we have is we need to use the fall back method
> to be able to detach properly. This would still affect only multiattached 
> volumes, where we have more than one attachments on the
> same host. By having the information stored in Cinder as well we can also 
> avoid removing a target when there are still active
> attachments connected to it.
> >
> > What do you think?
> >
> > Thanks,
> > Ildikó
> >
> >
> >> -Original Message-
> >> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> >> Sent: February 09, 2016 20:50
> >> To: OpenStack Development Mailing List (not for usage questions)
> >> Subject: [openstack-dev] [Nova][Cinder] Multi-attach, determining
> >> when to call os-brick's connector.disconnect_volume
> >>
> >> Hey folks,
> >>  One of the challenges we have faced with the ability to attach a
> >> single volume to multiple instances, is how to correctly detach that
> >> volume.  The issue is a bit complex, but I'll try and explain the problem, 
> >> and then describe one approach to solving one part of the
> detach puzzle.
> >>
> >> Problem:
> >> When a volume is attached to multiple instances on the same host.
> >> There are 2 scenarios here.
> >>
> >> 1) Some Cinder drivers export a new target for every attachment
> >> on a compute host.  This means that you will get a new unique volume path 
> >> on a host, which is then handed off to the VM
> instance.
> >>
> >> 2) Other Cinder drivers export a single target for all instances
> >> on a compute host.  This means that every instance on a single host, will 
> >> reuse the same host volume path.
> >>
> >>
> >> When a user issues a request to detach a volume, the workflow boils
> >> down to first calling os-brick's connector.disconnect_volume before
> >> calling Cinder's terminate_connection and detach. disconnect_volume's job 
> >> is to remove the local volume from the host OS and
> close any sessions.

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread John Griffith
On Tue, Feb 9, 2016 at 3:23 PM, Ildikó Váncsa 
wrote:

> Hi Walt,
>
> > -Original Message-
> > From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> > Sent: February 09, 2016 23:15
> > To: openstack-dev@lists.openstack.org
> > Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining
> when to call os-brick's connector.disconnect_volume
> >
> > On 02/09/2016 02:04 PM, Ildikó Váncsa wrote:
> > > Hi Walt,
> > >
> > > Thanks for starting this thread. It is a good summary of the issue and
> the proposal also looks feasible to me.
> > >
> > > I have a quick, hopefully not too wild idea based on the earlier
> discussions we had. We were considering earlier to store the target
> > identifier together with the other items of the attachment info. The
> problem with this idea is that when we call initialize_connection
> > from Nova, Cinder does not get the relevant information, like
> instance_id, to be able to do this. This means we cannot do that using
> > the functionality we have today.
> > >
> > > My idea here is to extend the Cinder API so that Nova can send the
> missing information after a successful attach. Nova should have
> > all the information including the 'target', which means that it could
> update the attachment information through the new Cinder API.
> > I think we need to do is to allow the connector to be passed at
> > os-attach time.   Then cinder can save it in the attachment's table
> entry.
> >
> > We will also need a new cinder API to allow that attachment to be
> updated during live migration, or the connector for the attachment
> > will get stale and incorrect.
>
> When saying below that it will be good for live migration as well I meant
> that the update is part of the API.
>
> Ildikó
>
> >
> > Walt
> > >
> > > It would mean that when we request for the volume info from Cinder at
> detach time the 'attachments' list would contain all the
> > required information for each attachments the volume has. If we don't
> have the 'target' information because of any reason we can
> > still use the approach described below as fallback. This approach could
> even be used in case of live migration I think.
> > >
> > > The Cinder API extension would need to be added with a new
> microversion to avoid problems with older Cinder versions talking to
> > new Nova.
> > >
> > > The advantage of this direction is that we can reduce the round trips
> to Cinder at detach time. The round trip after a successful
> > attach should not have an impact on the normal operation as if that
> fails the only issue we have is we need to use the fall back method
> > to be able to detach properly. This would still affect only
> multiattached volumes, where we have more than one attachments on the
> > same host. By having the information stored in Cinder as well we can
> also avoid removing a target when there are still active
> > attachments connected to it.
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Ildikó
> > >
> > >
> > >> -Original Message-
> > >> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> > >> Sent: February 09, 2016 20:50
> > >> To: OpenStack Development Mailing List (not for usage questions)
> > >> Subject: [openstack-dev] [Nova][Cinder] Multi-attach, determining
> > >> when to call os-brick's connector.disconnect_volume
> > >>
> > >> Hey folks,
> > >>  One of the challenges we have faced with the ability to attach a
> > >> single volume to multiple instances, is how to correctly detach that
> > >> volume.  The issue is a bit complex, but I'll try and explain the
> problem, and then describe one approach to solving one part of the
> > detach puzzle.
> > >>
> > >> Problem:
> > >> When a volume is attached to multiple instances on the same host.
> > >> There are 2 scenarios here.
> > >>
> > >> 1) Some Cinder drivers export a new target for every attachment
> > >> on a compute host.  This means that you will get a new unique volume
> path on a host, which is then handed off to the VM
> > instance.
> > >>
> > >> 2) Other Cinder drivers export a single target for all instances
> > >> on a compute host.  This means that every instance on a single host,
> will reuse the same host volume path.
> > >>
> > >

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread Sean McGinnis
On Wed, Feb 10, 2016 at 03:30:42PM -0700, John Griffith wrote:
> On Tue, Feb 9, 2016 at 3:23 PM, Ildikó Váncsa 
> wrote:
> 
> >
> ​This may still be in fact the easiest way to handle this.  The only other
> thing I am still somewhat torn on here is that maybe Nova should be doing
> ref-counting WRT shared connections and NOT send the detach in that
> scenario to begin with?
> 
> In the case of unique targets per-attach we already just "work", but if you
> are using the same target/attachment on a compute node for multiple
> instances, then you probably should keep track of that on the users end and
> not remove it while in use.  That seems like the more "correct" way to deal
> with this, but ​maybe that's just me.  Keep in mind we could also do the
> same ref-counting on the Cinder side if we so choose.

This is where I've been pushing too. It seems odd to me that the storage
domain should need to track how the volume is being used by the
consumer. Whether it is attached to one instance, 100 instances, or the
host just likes to keep it around as a pet, from the storage perspective
I don't know why we should care.

Looking beyond Nova usage, does Cinder now need to start tracking
information about containers? Bare metal hosts? Apps that are associated
with LUNs? These all seem like concepts that the storage component
shouldn't need to know or care about.

I know there's some history here and it may not be as easy as that. But
just wanted to state my opinion that in an ideal world (which I
recognize we don't live in) this should not be Cinder's concern.

> 
> We talked about this at mid-cycle with the Nova team and I proposed
> independent targets for each connection on Cinder's side.  We can still do
> that IMO but that doesn't seem to be a very popular idea.

John, I don't think folks are against this idea as a concept. I think
the problem is I don't believe all storage vendors can support exposing
new targets for the same volume for each attachment.

> 
> My point here is just that it seems like there might be a way to fix this
> without breaking compatibility in the API.  Thoughts?



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread John Griffith
On Wed, Feb 10, 2016 at 3:59 PM, Sean McGinnis 
wrote:

> On Wed, Feb 10, 2016 at 03:30:42PM -0700, John Griffith wrote:
> > On Tue, Feb 9, 2016 at 3:23 PM, Ildikó Váncsa <
> ildiko.van...@ericsson.com>
> > wrote:
> >
> > >
> > ​This may still be in fact the easiest way to handle this.  The only
> other
> > thing I am still somewhat torn on here is that maybe Nova should be doing
> > ref-counting WRT shared connections and NOT send the detach in that
> > scenario to begin with?
> >
> > In the case of unique targets per-attach we already just "work", but if
> you
> > are using the same target/attachment on a compute node for multiple
> > instances, then you probably should keep track of that on the users end
> and
> > not remove it while in use.  That seems like the more "correct" way to
> deal
> > with this, but ​maybe that's just me.  Keep in mind we could also do the
> > same ref-counting on the Cinder side if we so choose.
>
> This is where I've been pushing too. It seems odd to me that the storage
> domain should need to track how the volume is being used by the
> consumer. Whether it is attached to one instance, 100 instances, or the
> host just likes to keep it around as a pet, from the storage perspective
> I don't know why we should care.
>
> Looking beyond Nova usage, does Cinder now need to start tracking
> information about containers? Bare metal hosts? Apps that are associated
> with LUNs. It just seems like concepts that the storage component
> shouldn't need to know or care about.
>
>
Well said, I agree.


> I know there's some history here and it may not be as easy as that. But
> just wanted to state my opinion that in an ideal world (which I
> recognize we don't live in) this should not be Cinder's concern.
>
> >
> > We talked about this at mid-cycle with the Nova team and I proposed
> > independent targets for each connection on Cinder's side.  We can still
> do
> > that IMO but that doesn't seem to be a very popular idea.
>
> John, I don't think folks are against this idea as a concept. I think
> the problem is I don't believe all storage vendors can support exposing
> new targets for the same volume for each attachment.
>
Ahh, well that's a very valid reason to take a different approach.


>
> >
> > My point here is just that it seems like there might be a way to fix this
> > without breaking compatibility in the API.  Thoughts?
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread Fox, Kevin M
I think part of the issue is that whether to refcount or not is Cinder-driver 
specific, and only Cinder knows whether it should be done.

But if Cinder told Nova that particular multiattach endpoints must be 
refcounted, that might resolve the issue?
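
One hypothetical shape for such a hint; the flag below is invented for 
illustration, not an existing Cinder field:

    # Hypothetical: Cinder could flag shared targets in the connection_info
    # it returns, and Nova would refcount only when told to.  The
    # 'shared_target' key is invented for illustration.
    connection_info = {
        "driver_volume_type": "iscsi",
        "data": {
            "target_iqn": "iqn.2010-10.org.openstack:volume-x",
            "target_lun": 1,
            "shared_target": True,   # "refcount me, don't just drop me"
        },
    }

    def should_disconnect(connection_info, attachments_left_on_host):
        # Non-shared targets can always be disconnected; shared ones only
        # when this host's last user of the target is going away.
        if not connection_info["data"].get("shared_target"):
            return True
        return attachments_left_on_host == 0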

Thanks,
Kevin

From: Sean McGinnis [sean.mcgin...@gmx.com]
Sent: Wednesday, February 10, 2016 2:59 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
call os-brick's connector.disconnect_volume

On Wed, Feb 10, 2016 at 03:30:42PM -0700, John Griffith wrote:
> On Tue, Feb 9, 2016 at 3:23 PM, Ildikó Váncsa 
> wrote:
>
> >
> ​This may still be in fact the easiest way to handle this.  The only other
> thing I am still somewhat torn on here is that maybe Nova should be doing
> ref-counting WRT shared connections and NOT send the detach in that
> scenario to begin with?
>
> In the case of unique targets per-attach we already just "work", but if you
> are using the same target/attachment on a compute node for multiple
> instances, then you probably should keep track of that on the users end and
> not remove it while in use.  That seems like the more "correct" way to deal
> with this, but ​maybe that's just me.  Keep in mind we could also do the
> same ref-counting on the Cinder side if we so choose.

This is where I've been pushing too. It seems odd to me that the storage
domain should need to track how the volume is being used by the
consumer. Whether it is attached to one instance, 100 instances, or the
host just likes to keep it around as a pet, from the storage perspective
I don't know why we should care.

Looking beyond Nova usage, does Cinder now need to start tracking
information about containers? Bare metal hosts? Apps that are associated
with LUNs. It just seems like concepts that the storage component
shouldn't need to know or care about.

I know there's some history here and it may not be as easy as that. But
just wanted to state my opinion that in an ideal world (which I
recognize we don't live in) this should not be Cinder's concern.

>
> We talked about this at mid-cycle with the Nova team and I proposed
> independent targets for each connection on Cinder's side.  We can still do
> that IMO but that doesn't seem to be a very popular idea.

John, I don't think folks are against this idea as a concept. I think
the problem is I don't believe all storage vendors can support exposing
new targets for the same volume for each attachment.

>
> My point here is just that it seems like there might be a way to fix this
> without breaking compatibility in the API.  Thoughts?



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread Sean McGinnis
On Wed, Feb 10, 2016 at 11:16:28PM +, Fox, Kevin M wrote:
> I think part of the issue is whether to count or not is cinder driver 
> specific and only cinder knows if it should be done or not.
> 
> But if cinder told nova that particular multiattach endpoints must be 
> refcounted, that might resolve the issue?
> 
> Thanks,
> Kevin

In this case (the point John and I were making, at least) it doesn't
matter. Nothing is driver specific, so it wouldn't matter which backend
is being used.

If a volume is needed, request it to be attached. When it is no longer
needed, tell Cinder to take it away. Simple as that.


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread Fox, Kevin M
But the issue is that, when told to detach, some of the drivers do bad things. 
Is it then the driver's job to refcount to fix the issue, or is it Nova's to 
refcount so that it doesn't call the release before all users are done with it? 
I think solving it in the middle, in Cinder, is probably not the right place to 
track it; but if it's to be solved on Nova's side, Nova needs to know when it 
needs to do it, and Cinder might have to relay some extra info from the backend.

Either way, on the driver side there probably needs to be a mechanism for the 
driver to say either that it can refcount properly, so it's multiattach 
compatible (or that Nova should refcount), or to default to never allowing 
multiattach, so existing drivers don't break.

Thanks,
Kevin

From: Sean McGinnis [sean.mcgin...@gmx.com]
Sent: Wednesday, February 10, 2016 3:25 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
call os-brick's connector.disconnect_volume

On Wed, Feb 10, 2016 at 11:16:28PM +, Fox, Kevin M wrote:
> I think part of the issue is whether to count or not is cinder driver 
> specific and only cinder knows if it should be done or not.
>
> But if cinder told nova that particular multiattach endpoints must be 
> refcounted, that might resolve the issue?
>
> Thanks,
> Kevin

I this case (the point John and I were making at least) it doesn't
matter. Nothing is driver specific, so it wouldn't matter which backend
is being used.

If a volume is needed, request it to be attached. When it is no longer
needed, tell Cinder to take it away. Simple as that.



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread John Griffith
On Wed, Feb 10, 2016 at 5:12 PM, Fox, Kevin M  wrote:

> But the issue is, when told to detach, some of the drivers do bad things.
> then, is it the driver's issue to refcount to fix the issue, or is it
> nova's to refcount so that it doesn't call the release before all users are
> done with it? I think solving it in the middle, in cinder's probably not
> the right place to track it, but if its to be solved on nova's side, nova
> needs to know when it needs to do it. But cinder might have to relay some
> extra info from the backend.
>
> Either way, On the driver side, there probably needs to be a mechanism on
> the driver to say it either can refcount properly so its multiattach
> compatible (or that nova should refcount), or to default to not allowing
> multiattach ever, so existing drivers don't break.
>
> Thanks,
> Kevin
> 
> From: Sean McGinnis [sean.mcgin...@gmx.com]
> Sent: Wednesday, February 10, 2016 3:25 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when
> to call os-brick's connector.disconnect_volume
>
> On Wed, Feb 10, 2016 at 11:16:28PM +, Fox, Kevin M wrote:
> > I think part of the issue is whether to count or not is cinder driver
> specific and only cinder knows if it should be done or not.
> >
> > But if cinder told nova that particular multiattach endpoints must be
> refcounted, that might resolve the issue?
> >
> > Thanks,
> > Kevin
>
> I this case (the point John and I were making at least) it doesn't
> matter. Nothing is driver specific, so it wouldn't matter which backend
> is being used.
>
> If a volume is needed, request it to be attached. When it is no longer
> needed, tell Cinder to take it away. Simple as that.
>
>

​Hey Kevin,

So I think what Sean M pointed out is still valid in your case.  It's not
really that some drivers do bad things; the problem is actually the way
attach/detach works in OpenStack as a whole.  The original design (which we
haven't strayed very far from) was that you could only attach a single
resource to a single compute node.  That was it; there was no concept of
multi-attach, etc.

Now, however, folks want to introduce multi-attach, which means all of the
old assumptions the code was written on and designed around are kinda
"bad assumptions" now.  It's true, as you pointed out, that there are some
drivers that behave or deal with targets in a way that makes things
complicated, but they're completely in line with the SCSI standards and
aren't doing anything *wrong*.

The point Sean M and I were trying to make is that for the specific use
case of a single volume being attached to a compute node, BUT being passed
through to more than one Instance it might be worth looking at just
ensuring that Compute Node doesn't call detach unless it's *done* with all
of the Instances that it was passing that volume through to.
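
A minimal sketch of that compute-node-side bookkeeping, with an invented 
key and in-memory storage (a real implementation would need a stable target 
identifier and persistence across compute restarts):

    # Hedged sketch of per-host refcounting for a shared target.  The key is
    # an invented (target_iqn, lun) tuple chosen purely for illustration.
    import collections
    import threading

    class SharedTargetTracker(object):
        def __init__(self):
            self._lock = threading.Lock()
            self._refs = collections.Counter()

        def connect(self, key):
            """Return True if this is the first user (actually connect)."""
            with self._lock:
                self._refs[key] += 1
                return self._refs[key] == 1

        def disconnect(self, key):
            """Return True if this was the last user (safe to disconnect)."""
            with self._lock:
                self._refs[key] -= 1
                if self._refs[key] <= 0:
                    del self._refs[key]
                    return True
                return False

    tracker = SharedTargetTracker()
    key = ("iqn.2010-10.org.openstack:volume-x", 1)
    if tracker.connect(key):
        pass   # first user: call os-brick connect_volume
    if tracker.disconnect(key):
        pass   # last user gone: call os-brick disconnect_volume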

You're absolutely right, there are some *weird* things that a couple of
vendors do with targets in the case of like replication where they may
actually create a new target and attach; those sorts of things are
ABSOLUTELY Cinder's problem and Nova should not have to know anything about
that as a consumer of the Target.

My view is that maybe we should look at addressing the multiple use of a
single target case in Nova, and then absolutely figure out how to make
things work correctly on the Cinder side for all the different behaviors
that may occur on the Cinder side from the various vendors.

Make sense?

John
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-10 Thread Avishay Traeger
I think Sean and John are headed in the right direction.  Nova and Cinder need
to be more decoupled in the area of volume attachments.

I think some of the mess here is due to different Cinder backend behavior -
with some Cinder backends you actually attach volumes to a host (e.g., FC,
iSCSI), with some you attach to a VM (e.g., Ceph), and with some you attach
an entire pool of volumes to a host (e.g., NFS).  I think this difference
should all be contained in the Nova drivers that do the attachments.

On Thu, Feb 11, 2016 at 6:06 AM, John Griffith 
wrote:

>
>
> On Wed, Feb 10, 2016 at 5:12 PM, Fox, Kevin M  wrote:
>
>> But the issue is, when told to detach, some of the drivers do bad things.
>> then, is it the driver's issue to refcount to fix the issue, or is it
>> nova's to refcount so that it doesn't call the release before all users are
>> done with it? I think solving it in the middle, in cinder's probably not
>> the right place to track it, but if its to be solved on nova's side, nova
>> needs to know when it needs to do it. But cinder might have to relay some
>> extra info from the backend.
>>
>> Either way, On the driver side, there probably needs to be a mechanism on
>> the driver to say it either can refcount properly so its multiattach
>> compatible (or that nova should refcount), or to default to not allowing
>> multiattach ever, so existing drivers don't break.
>>
>> Thanks,
>> Kevin
>> 
>> From: Sean McGinnis [sean.mcgin...@gmx.com]
>> Sent: Wednesday, February 10, 2016 3:25 PM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining
>> when to call os-brick's connector.disconnect_volume
>>
>> On Wed, Feb 10, 2016 at 11:16:28PM +, Fox, Kevin M wrote:
>> > I think part of the issue is whether to count or not is cinder driver
>> specific and only cinder knows if it should be done or not.
>> >
>> > But if cinder told nova that particular multiattach endpoints must be
>> refcounted, that might resolve the issue?
>> >
>> > Thanks,
>> > Kevin
>>
>> I this case (the point John and I were making at least) it doesn't
>> matter. Nothing is driver specific, so it wouldn't matter which backend
>> is being used.
>>
>> If a volume is needed, request it to be attached. When it is no longer
>> needed, tell Cinder to take it away. Simple as that.
>>
>>
>
> ​Hey Kevin,
>
> So I think what Sean M pointed out is still valid in your case.  It's not
> really that some drivers do bad things, the problem is actually the way
> attach/detach works in OpenStack as a whole.  The original design (which we
> haven't strayed very far from) was that you could only attach a single
> resource to a single compute node.  That was it, there was no concept of
> multi-attach etc.
>
> Now however folks want to introduce multi-attach, which means all of the
> old assumptions that the code was written on and designed around are kinda
> "bad assumptions" now.  It's true, as you pointed out however that there
> are some drivers that behave or deal with targets in a way that makes
> things complicated, but they're completely inline with the scsi standards
> and aren't doing anything *wrong*.
>
> The point Sean M and I were trying to make is that for the specific use
> case of a single volume being attached to a compute node, BUT being passed
> through to more than one Instance it might be worth looking at just
> ensuring that Compute Node doesn't call detach unless it's *done* with all
> of the Instances that it was passing that volume through to.
>
> You're absolutely right, there are some *weird* things that a couple of
> vendors do with targets in the case of like replication where they may
> actually create a new target and attach; those sorts of things are
> ABSOLUTELY Cinder's problem and Nova should not have to 

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-11 Thread Ildikó Váncsa
Hi,

As far as I can see, volume attachments are handled at the attachment level in 
Cinder today, as opposed to the host level. How the volume is technically 
exposed to a host is another question, but conceptually Cinder is the ultimate 
source of truth regarding how many attachments a volume has and which driver 
takes care of that volume.

In this sense, in my understanding, what you are suggesting below would need a 
redesign and refactoring so that the concept and the implementation are in line 
with each other. We talked about this at the beginning of Mitaka, and as far as 
I can remember this was the outcome of that discussion too.

I don't think tracking the connector info is Nova's responsibility, nor is 
keeping track of which backend provides the volume to it. I think we need to 
find a solution within the current concept and architecture and then refactor 
toward the intended design. We cannot use what we have today while solving our 
issues according to what we would like to have; that mismatch would only bring 
additional complexity to these modules.

Would the new API and the connector info record in the Cinder database cause 
any problems conceptually and/or technically?

Thanks,
Ildikó

> -Original Message-
> From: Avishay Traeger [mailto:avis...@stratoscale.com]
> Sent: February 11, 2016 07:43
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
> call os-brick's connector.disconnect_volume
> 
> I think Sean and John are in the right direction.  Nova and Cinder need to be 
> more decoupled in the area of volume attachments.
> 
> I think some of the mess here is due to different Cinder backend behavior - 
> with some Cinder backends you actually attach volumes to
> a host (e.g., FC, iSCSI), with some you attach to a VM (e.g., Ceph), and with 
> some you attach an entire pool of volumes to a host (e.g.,
> NFS).  I think this difference should all be contained in the Nova drivers 
> that do the attachments.
> 
> On Thu, Feb 11, 2016 at 6:06 AM, John Griffith  
> wrote:
> 
> 
> 
> 
>   On Wed, Feb 10, 2016 at 5:12 PM, Fox, Kevin M  
> wrote:
> 
> 
>   But the issue is, when told to detach, some of the drivers do 
> bad things. then, is it the driver's issue to
> refcount to fix the issue, or is it nova's to refcount so that it doesn't 
> call the release before all users are done with it? I think solving it in
> the middle, in cinder's probably not the right place to track it, but if its 
> to be solved on nova's side, nova needs to know when it needs
> to do it. But cinder might have to relay some extra info from the backend.
> 
>   Either way, On the driver side, there probably needs to be a 
> mechanism on the driver to say it either can
> refcount properly so its multiattach compatible (or that nova should 
> refcount), or to default to not allowing multiattach ever, so
> existing drivers don't break.
> 
>   Thanks,
>   Kevin
>   
>   From: Sean McGinnis [sean.mcgin...@gmx.com]
>           Sent: Wednesday, February 10, 2016 3:25 PM
>           To: OpenStack Development Mailing List (not for usage questions)
>   Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, 
> determining when to call os-brick's
> connector.disconnect_volume
> 
> 
>   On Wed, Feb 10, 2016 at 11:16:28PM +, Fox, Kevin M wrote:
>   > I think part of the issue is whether to count or not is 
> cinder driver specific and only cinder knows if it
> should be done or not.
>   >
>   > But if cinder told nova that particular multiattach endpoints 
> must be refcounted, that might resolve the
> issue?
>   >
>   > Thanks,
>   > Kevin
> 
>   I this case (the point John and I were making at least) it 
> doesn't
>   matter. Nothing is driver specific, so it wouldn't matter which 
> backend
>   is being used.
> 
>   If a volume is needed, request it to be attached. When it is no 
> longer
>   needed, tell Cinder to take it away. Simple as that.
> 
> 
>   

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-11 Thread Daniel P. Berrange
On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
> Hey folks,
>One of the challenges we have faced with the ability to attach a single
> volume to multiple instances, is how to correctly detach that volume.  The
> issue is a bit complex, but I'll try and explain the problem, and then
> describe one approach to solving one part of the detach puzzle.
> 
> Problem:
>   When a volume is attached to multiple instances on the same host. There
> are 2 scenarios here.
> 
>   1) Some Cinder drivers export a new target for every attachment on a
> compute host.  This means that you will get a new unique volume path on a
> host, which is then handed off to the VM instance.
> 
>   2) Other Cinder drivers export a single target for all instances on a
> compute host.  This means that every instance on a single host, will reuse
> the same host volume path.


This problem isn't actually new. It is a problem we already have in Nova
even with single attachments per volume.  e.g., with NFS and SMBFS there
is a single mount set up on the host, which can serve up multiple volumes.
We have to avoid unmounting that until no VM is using any volume provided
by that mount point. Except we pretend the problem doesn't exist and just
try to unmount every single time a VM stops, and rely on the kernel
failing umount() with EBUSY.  Except this has a race condition if one VM
is stopping right as another VM is starting.

There is a patch up to try to solve this for SMBFS:

   https://review.openstack.org/#/c/187619/

but I don't really much like it, because it only solves it for one
driver.

I think we need a general solution that solves the problem for all
cases, including multi-attach.

AFAICT, the only real answer here is to have nova record more info
about volume attachments, so it can reliably decide when it is safe
to release a connection on the host.
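
A rough sketch of that kind of decision over per-attachment records Nova 
might keep; the record fields here are assumptions, not existing Nova data 
structures:

    # Hedged sketch: decide from locally recorded attachment info whether any
    # other VM on this host still uses the same exported target (or mount).
    # `bdms` stands in for whatever Nova persists per attachment on the host.
    import json

    def safe_to_disconnect(bdms, connection_info, detaching_volume_id):
        target = json.dumps(connection_info['data'], sort_keys=True)
        for bdm in bdms:
            if bdm['volume_id'] == detaching_volume_id:
                continue
            other = json.dumps(bdm['connection_info']['data'], sort_keys=True)
            if other == target:
                return False   # another attachment still uses this target
        return True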


> Proposed solution:
>   Nova needs to determine if the volume that's being detached is a shared or
> non shared volume.  Here is one way to determine that.
> 
>   Every Cinder volume has a list of it's attachments.  In those attachments
> it contains the instance_uuid that the volume is attached to.  I presume
> Nova can find which of the volume attachments are on the same host.  Then
> Nova can call Cinder's initialize_connection for each of those attachments
> to get the target's connection_info dictionary.  This connection_info
> dictionary describes how to connect to the target on the cinder backend.  If
> the target is shared, then each of the connection_info dicts for each
> attachment on that host will be identical.  Then Nova would know that it's a
> shared target, and then only call os-brick's disconnect_volume, if it's the
> last attachment on that host.  I think at most 2 calls to cinder's
> initialize_connection would suffice to determine if the volume is a shared
> target.  This would only need to be done if the volume is multi-attach
> capable and if there are more than 1 attachments on the same host, where the
> detach is happening.

As above, we need to solve this more generally than just multi-attach,
even single-attach is flawed today.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-11 Thread Avishay Traeger
On Thu, Feb 11, 2016 at 12:23 PM, Daniel P. Berrange 
wrote:

> As above, we need to solve this more generally than just multi-attach,
> even single-attach is flawed today.
>

Agreed.  This is what I was getting at.  Because we have at least 3
different types of attach being handled the same way, we are getting into
tricky situations. (3 types: iSCSI/FC attach volume to host, Ceph attach
volume to VM, NFS attach pool to host)
Multi-attach just makes a bad situation worse.

-- 
*Avishay Traeger, PhD*
*System Architect*

Mobile: +972 54 447 1475
E-mail: avis...@stratoscale.com



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-11 Thread Walter A. Boring IV
There seem to be a few discussions going on here with regard to detaches.  One 
is what to do on the Nova side with calling os-brick's 
disconnect_volume, and another is when to call (or not call) Cinder's 
terminate_connection and detach.


My original post was simply to discuss a mechanism to try and figure out 
the first problem: when should Nova call os-brick to remove the local 
volume, prior to calling Cinder to do something?

Nova needs to know if it's safe to call disconnect_volume or not. Cinder 
already tracks each attachment, and it can return the connection_info 
for each attachment with a call to initialize_connection.   If 2 of 
those connection_info dicts are the same, it's a shared volume/target.  
Don't call disconnect_volume if there are any more of those left.


On the Cinder side of things, when terminate_connection or detach is called, 
the volume manager can find the list of attachments for a volume and 
compare that to the attachments on a host.  The problem is, Cinder 
doesn't track the host along with the instance_uuid in the attachments 
table.  I plan on allowing that as an API change after microversions 
land, so we know how many times a volume is attached/used on a 
particular host.  The driver can decide what to do with it at 
terminate_connection/detach time.  This helps account for
the differences in each of the Cinder backends, which we will never get 
all aligned to the same model.  Each array/backend handles attachments 
differently, and only the driver knows whether it's safe to remove the target, 
depending on how many attachments/usages it has
on the host itself.  This is the same thing as a reference counter, 
which we don't need, because we have the count in the attachments table 
once we allow setting the host and the instance_uuid at the same time.
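
A hedged sketch of that driver-side decision, assuming the attachments 
table gains both fields; the field names are illustrative:

    # Hedged sketch of the driver-side decision once attachments record both
    # instance_uuid and host.  `attachments` stands in for the volume's
    # attachment rows.
    def remove_target_on_terminate(attachments, host, detaching_instance_uuid):
        remaining = [a for a in attachments
                     if a.get('attached_host') == host
                     and a.get('instance_uuid') != detaching_instance_uuid]
        # Only drop the export/target when no other attachment on this host
        # still relies on it; individual backends can refine this further.
        return len(remaining) == 0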


Walt

On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:

Hey folks,
One of the challenges we have faced with the ability to attach a single
volume to multiple instances, is how to correctly detach that volume.  The
issue is a bit complex, but I'll try and explain the problem, and then
describe one approach to solving one part of the detach puzzle.

Problem:
   When a volume is attached to multiple instances on the same host. There
are 2 scenarios here.

   1) Some Cinder drivers export a new target for every attachment on a
compute host.  This means that you will get a new unique volume path on a
host, which is then handed off to the VM instance.

   2) Other Cinder drivers export a single target for all instances on a
compute host.  This means that every instance on a single host, will reuse
the same host volume path.


This problem isn't actually new. It is a problem we already have in Nova
even with single attachments per volume.  eg, with NFS and SMBFS there
is a single mount setup on the host, which can serve up multiple volumes.
We have to avoid unmounting that until no VM is using any volume provided
by that mount point. Except we pretend the problem doesn't exist and just
try to unmount every single time a VM stops, and rely on the kernel
failing umout() with EBUSY.  Except this has a race condition if one VM
is stopping right as another VM is starting

There is a patch up to try to solve this for SMBFS:

https://review.openstack.org/#/c/187619/

but I don't really much like it, because it only solves it for one
driver.

I think we need a general solution that solves the problem for all
cases, including multi-attach.

AFAICT, the only real answer here is to have nova record more info
about volume attachments, so it can reliably decide when it is safe
to release a connection on the host.



Proposed solution:
   Nova needs to determine if the volume that's being detached is a shared or
non shared volume.  Here is one way to determine that.

   Every Cinder volume has a list of it's attachments.  In those attachments
it contains the instance_uuid that the volume is attached to.  I presume
Nova can find which of the volume attachments are on the same host.  Then
Nova can call Cinder's initialize_connection for each of those attachments
to get the target's connection_info dictionary.  This connection_info
dictionary describes how to connect to the target on the cinder backend.  If
the target is shared, then each of the connection_info dicts for each
attachment on that host will be identical.  Then Nova would know that it's a
shared target, and then only call os-brick's disconnect_volume, if it's the
last attachment on that host.  I think at most 2 calls to cinder's
initialize_connection would suffice to determine if the volume is a shared
target.  This would only need to be done if the volume is multi-attach
capable and if there are more than 1 attachments on the same host, where the
detach is happening.

As above, we need to solve this more generally than just multi-attach,
even single-attach is flawed today.

Regards,
Daniel




Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-12 Thread Ildikó Váncsa
Hi Walt,

Thanks for describing the bigger picture.

In my opinion, when we have microversion support available in Cinder, that 
will give us a bit of freedom and also the possibility to handle these 
difficulties.

Regarding terminate_connection, we will have issues with live_migration as it 
is today. We need to figure out what information would be best to feed back to 
Cinder from Nova, and therefore what API we would need once we are able to 
introduce it in a safe way. I still see benefit in storing the 
connection_info for the attachments.

Also, I think multiattach support should be disabled for the problematic 
drivers, like lvm, until we have a solution for proper detach on the whole 
call chain.
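
For what that guard might look like, here is a hypothetical check; the 
capability flag and all names are invented for illustration, not an 
existing Cinder/Nova interface:

    # Hypothetical guard -- the capability name is invented for illustration.

    class MultiAttachNotSupported(Exception):
        pass

    def check_multiattach_allowed(volume, backend_capabilities):
        """Refuse a second attachment unless the backend has declared that
        its detach path is safe for shared targets."""
        already_attached = bool(volume.get('attachments'))
        detach_safe = backend_capabilities.get('shared_target_detach_safe',
                                               False)
        if already_attached and not detach_safe:
            raise MultiAttachNotSupported(
                "backend has not declared a safe detach path for shared "
                "targets; refusing a second attachment of volume %s"
                % volume.get('id'))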

Best Regards,
Ildikó

> -Original Message-
> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
> Sent: February 11, 2016 18:31
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to 
> call os-brick's connector.disconnect_volume
> 
> There seems to be a few discussions going on here wrt to detaches.   One
> is what to do on the Nova side with calling os-brick's disconnect_volume, and 
> also when to or not to call Cinder's
> terminate_connection and detach.
> 
> My original post was simply to discuss a mechanism to try and figure out the 
> first problem.  When should nova call brick to remove the
> local volume, prior to calling Cinder to do something.
> 
> Nova needs to know if it's safe to call disconnect_volume or not. Cinder 
> already tracks each attachment, and it can return the
> connection_info
> for each attachment with a call to initialize_connection.   If 2 of
> those connection_info dicts are the same, it's a shared volume/target.
> Don't call disconnect_volume if there are any more of those left.
> 
> On the Cinder side of things, if terminate_connection, detach is called, the 
> volume manager can find the list of attachments for a
> volume, and compare that to the attachments on a host.  The problem is, 
> Cinder doesn't track the host along with the instance_uuid in
> the attachments table.  I plan on allowing that as an API change after 
> microversions lands, so we know how many times a volume is
> attached/used on a particular host.  The driver can decide what to do with it 
> at
> terminate_connection, detach time. This helps account for
> the differences in each of the Cinder backends, which we will never get all 
> aligned to the same model.  Each array/backend handles
> attachments different and only the driver knows if it's safe to remove the 
> target or not, depending on how many attachments/usages
> it has
> on the host itself.   This is the same thing as a reference counter,
> which we don't need, because we have the count in the attachments table, once 
> we allow setting the host and the instance_uuid at
> the same time.
> 
> Walt
> > On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
> >> Hey folks,
> >> One of the challenges we have faced with the ability to attach a
> >> single volume to multiple instances, is how to correctly detach that
> >> volume.  The issue is a bit complex, but I'll try and explain the
> >> problem, and then describe one approach to solving one part of the detach 
> >> puzzle.
> >>
> >> Problem:
> >>When a volume is attached to multiple instances on the same host.
> >> There are 2 scenarios here.
> >>
> >>1) Some Cinder drivers export a new target for every attachment on
> >> a compute host.  This means that you will get a new unique volume
> >> path on a host, which is then handed off to the VM instance.
> >>
> >>2) Other Cinder drivers export a single target for all instances
> >> on a compute host.  This means that every instance on a single host,
> >> will reuse the same host volume path.
> >
> > This problem isn't actually new. It is a problem we already have in
> > Nova even with single attachments per volume.  eg, with NFS and SMBFS
> > there is a single mount setup on the host, which can serve up multiple 
> > volumes.
> > We have to avoid unmounting that until no VM is using any volume
> > provided by that mount point. Except we pretend the problem doesn't
> > exist and just try to unmount every single time a VM stops, and rely
> > on the kernel failing umout() with EBUSY.  Except this has a race
> > condition if one VM is stopping right as another VM is starting
> >
> > There is a patch up to try to solve this for SMBFS:
> >
> > https://review.openstack.org/#/c/

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-12 Thread John Griffith
On Thu, Feb 11, 2016 at 10:31 AM, Walter A. Boring IV  wrote:

> There seems to be a few discussions going on here wrt to detaches.   One
> is what to do on the Nova side with calling os-brick's disconnect_volume,
> and also when to or not to call Cinder's terminate_connection and detach.
>
> My original post was simply to discuss a mechanism to try and figure out
> the first problem.  When should nova call brick to remove
> the local volume, prior to calling Cinder to do something.
> ​
>


> Nova needs to know if it's safe to call disconnect_volume or not. Cinder
> already tracks each attachment, and it can return the connection_info for
> each attachment with a call to initialize_connection.   If 2 of those
> connection_info dicts are the same, it's a shared volume/target.  Don't
> call disconnect_volume if there are any more of those left.
>
> On the Cinder side of things, if terminate_connection, detach is called,
> the volume manager can find the list of attachments for a volume, and
> compare that to the attachments on a host.  The problem is, Cinder doesn't
> track the host along with the instance_uuid in the attachments table.  I
> plan on allowing that as an API change after microversions lands, so we
> know how many times a volume is attached/used on a particular host.  The
> driver can decide what to do with it at terminate_connection, detach time.
>This helps account for
> the differences in each of the Cinder backends, which we will never get
> all aligned to the same model.  Each array/backend handles attachments
> different and only the driver knows if it's safe to remove the target or
> not, depending on how many attachments/usages it has
> on the host itself.   This is the same thing as a reference counter, which
> we don't need, because we have the count in the attachments table, once we
> allow setting the host and the instance_uuid at the same time.
>
Not trying to drag this out or be difficult, I promise.  But this seems
like it is in fact the same problem, and I'm not exactly following; if you
store the info on the compute side during the attach phase, why would you
need/want to then create a split brain scenario and have Cinder do any sort
of tracking on the detach side of things?

Like the earlier posts said, just don't call terminate_connection if you
don't want to really terminate the connection?  I'm sorry, I'm just not
following the logic of why Cinder should track this and interfere with
things?  It's supposed to be providing a service to consumers and "do what
it's told" even if it's told to do the wrong thing.


> Walt
>
> On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
>>
>>> Hey folks,
>>> One of the challenges we have faced with the ability to attach a
>>> single
>>> volume to multiple instances, is how to correctly detach that volume.
>>> The
>>> issue is a bit complex, but I'll try and explain the problem, and then
>>> describe one approach to solving one part of the detach puzzle.
>>>
>>> Problem:
>>>When a volume is attached to multiple instances on the same host.
>>> There
>>> are 2 scenarios here.
>>>
>>>1) Some Cinder drivers export a new target for every attachment on a
>>> compute host.  This means that you will get a new unique volume path on a
>>> host, which is then handed off to the VM instance.
>>>
>>>2) Other Cinder drivers export a single target for all instances on a
>>> compute host.  This means that every instance on a single host, will
>>> reuse
>>> the same host volume path.
>>>
>>
>> This problem isn't actually new. It is a problem we already have in Nova
>> even with single attachments per volume.  eg, with NFS and SMBFS there
>> is a single mount setup on the host, which can serve up multiple volumes.
>> We have to avoid unmounting that until no VM is using any volume provided
>> by that mount point. Except we pretend the problem doesn't exist and just
>> try to unmount every single time a VM stops, and rely on the kernel
>> failing umout() with EBUSY.  Except this has a race condition if one VM
>> is stopping right as another VM is starting
>>
>> There is a patch up to try to solve this for SMBFS:
>>
>> https://review.openstack.org/#/c/187619/
>>
>> but I don't really much like it, because it only solves it for one
>> driver.
>>
>> I think we need a general solution that solves the problem for all
>> cases, including multi-attach.
>>
>> AFAICT, the only real answer here is to have nova record more info
>> about volume attachments, so it can reliably decide when it is safe
>> to release a connection on the host.
>>
>>
>> Proposed solution:
>>>Nova needs to determine if the volume that's being detached is a
>>> shared or
>>> non shared volume.  Here is one way to determine that.
>>>
>>>Every Cinder volume has a list of it's attachments.  In those
>>> attachments
>>> it contains the instance_uuid that the volume is attached to.  I presume
>>> Nova can find which of the volume attachments are o

Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-16 Thread Walter A. Boring IV

On 02/12/2016 04:35 PM, John Griffith wrote:



On Thu, Feb 11, 2016 at 10:31 AM, Walter A. Boring IV 
<walter.bor...@hpe.com> wrote:


There seems to be a few discussions going on here wrt to
detaches.   One is what to do on the Nova side with calling
os-brick's disconnect_volume, and also when to or not to call
Cinder's terminate_connection and detach.

My original post was simply to discuss a mechanism to try and
figure out the first problem.  When should nova call brick to remove
the local volume, prior to calling Cinder to do something.


Nova needs to know if it's safe to call disconnect_volume or not.
Cinder already tracks each attachment, and it can return the
connection_info for each attachment with a call to
initialize_connection.   If 2 of those connection_info dicts are
the same, it's a shared volume/target.  Don't call
disconnect_volume if there are any more of those left.

On the Cinder side of things, if terminate_connection, detach is
called, the volume manager can find the list of attachments for a
volume, and compare that to the attachments on a host.  The
problem is, Cinder doesn't track the host along with the
instance_uuid in the attachments table.  I plan on allowing that
as an API change after microversions lands, so we know how many
times a volume is attached/used on a particular host.  The driver
can decide what to do with it at terminate_connection, detach
time. This helps account for
the differences in each of the Cinder backends, which we will
never get all aligned to the same model.  Each array/backend
handles attachments different and only the driver knows if it's
safe to remove the target or not, depending on how many
attachments/usages it has
on the host itself.   This is the same thing as a reference
counter, which we don't need, because we have the count in the
attachments table, once we allow setting the host and the
instance_uuid at the same time.

​ Not trying to drag this out or be difficult I promise. But, this 
seems like it is in fact the same problem, and I'm not exactly 
following; if you store the info on the compute side during the attach 
phase, why would you need/want to then create a split brain scenario 
and have Cinder do any sort of tracking on the detach side of things?


Like the earlier posts said, just don't call terminate_connection if 
you don't want to really terminate the connection?  I'm sorry, I'm 
just not following the logic of why Cinder should track this and 
interfere with things?  It's supposed to be providing a service to 
consumers and "do what it's told" even if it's told to do the wrong thing.


The only reason to store the connector information on the cinder 
attachments side is for the few use cases when there is no way to get 
that connector any more, such as nova evacuate and force detach, where 
nova has no information about the original attachment because the 
instance is gone.  Cinder backends still need the connector at 
terminate_connection time to find the right exports/targets to remove.
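
A sketch of how a stored connector could be used on that path; field and 
helper names are illustrative only, not the actual Cinder code:

    # Illustrative only: fall back to the connector saved at attach time
    # when the caller (evacuate / force detach) can no longer provide one.

    def detach_with_stored_connector(volume, attachment, driver,
                                     connector=None):
        # Normal detach: Nova supplies the connector it built via os-brick.
        # Evacuate / force detach: the instance is gone, so use the copy
        # that was stored on the attachment when the volume was attached.
        connector = connector or attachment.get('connector')
        if connector is None:
            raise ValueError("no connector available for attachment %s"
                             % attachment.get('id'))
        driver.terminate_connection(volume, connector)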


Walt


Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume

2016-02-22 Thread John Garbutt
So just attempting to read through this thread, I think I hear:

Problems:

1. multi-attach breaks the assumption that made detach work
2. live-migrate already breaks with some drivers, due to not fully
understanding the side effects of all API calls.
3. evacuate and shelve issues also related


Solution ideas:

1. New export/target for every volume connection
* pro: simple
* con: that doesn't work for all drivers (?)

2. Nova works out when to disconnect volume on host
* pro: no cinder API changes (i.e. no upgrade issue)
* con: adds complexity in Nova
* con: requires all nodes to run fixed code before multi-attach is safe
* con: doesn't help with the live-migrate and evacuate issues anyways?

3. Give Cinder all the info, so it knows what has to happen
* pro: seems to give cinder the info to stop API users doing bad things
* pro: more robust API particularly useful with multiple nova, and
with baremetal, etc
* con: Need cinder micro-versions to do this API change and work across upgrade


So from where I am sat:
1: doesn't work for everyone
2: doesn't fix all the problems we need to fix
3: will take a long time

If so, it feels like we need solution 3, regardless, to solve wider issues.
We only need solution 2 if solution 3 will block multi-attach for too long.
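
As a purely illustrative example of what "all the info" in solution 3 
could mean, an attachment record might carry something like the 
following (not an actual Cinder API or schema; all values are made up):

    # Not an actual Cinder API or schema -- just an illustration of the
    # information the backend would need to decide, by itself, whether a
    # terminate_connection should really tear the target down.
    example_attachment = {
        'volume_id': 'vol-uuid',
        'instance_uuid': 'instance-uuid',
        'attached_host': 'compute-1',      # which compute node holds it
        'connector': {                     # saved at attach time
            'host': 'compute-1',
            'initiator': 'iqn.1993-08.org.debian:01:abcdef',
            'multipath': False,
        },
    }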

Am I missing something in that summary?

Thanks,
johnthetubaguy

On 12 February 2016 at 20:26, Ildikó Váncsa  wrote:
> Hi Walt,
>
> Thanks for describing the bigger picture.
>
> In my opinion when we will have microversion support available in Cinder that 
> will give us a bit of a freedom and also possibility to handle these 
> difficulties.
>
> Regarding terminate_connection we will have issues with live_migration as it 
> is today. We need to figure out what information would be best to feed back 
> to Cinder from Nova, so we should figure out what API we would need after we 
> are able to introduce it in a safe way. I still see benefit in storing the 
> connection_info for the attachments.
>
> Also I think the multiattach support should be disable for the problematic 
> drivers like lvm, until we don't have a solution for proper detach on the 
> whole call chain.
>
> Best Regards,
> Ildikó
>
>> -Original Message-
>> From: Walter A. Boring IV [mailto:walter.bor...@hpe.com]
>> Sent: February 11, 2016 18:31
>> To: openstack-dev@lists.openstack.org
>> Subject: Re: [openstack-dev] [Nova][Cinder] Multi-attach, determining when 
>> to call os-brick's connector.disconnect_volume
>>
>> There seems to be a few discussions going on here wrt to detaches.   One
>> is what to do on the Nova side with calling os-brick's disconnect_volume, 
>> and also when to or not to call Cinder's
>> terminate_connection and detach.
>>
>> My original post was simply to discuss a mechanism to try and figure out the 
>> first problem.  When should nova call brick to remove the
>> local volume, prior to calling Cinder to do something.
>>
>> Nova needs to know if it's safe to call disconnect_volume or not. Cinder 
>> already tracks each attachment, and it can return the
>> connection_info
>> for each attachment with a call to initialize_connection.   If 2 of
>> those connection_info dicts are the same, it's a shared volume/target.
>> Don't call disconnect_volume if there are any more of those left.
>>
>> On the Cinder side of things, if terminate_connection, detach is called, the 
>> volume manager can find the list of attachments for a
>> volume, and compare that to the attachments on a host.  The problem is, 
>> Cinder doesn't track the host along with the instance_uuid in
>> the attachments table.  I plan on allowing that as an API change after 
>> microversions lands, so we know how many times a volume is
>> attached/used on a particular host.  The driver can decide what to do with 
>> it at
>> terminate_connection, detach time. This helps account for
>> the differences in each of the Cinder backends, which we will never get all 
>> aligned to the same model.  Each array/backend handles
>> attachments different and only the driver knows if it's safe to remove the 
>> target or not, depending on how many attachments/usages
>> it has
>> on the host itself.   This is the same thing as a reference counter,
>> which we don't need, because we have the count in the attachments table, 
>> once we allow setting the host and the instance_uuid at
>> the same time.
>>
>> Walt
>> > On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
>> >> Hey folks,
>> >> One of the challenges we have faced with the ability to attach a
>> >&g