Re: how to protect rbd from multiple simultaneous mapping

2013-01-24 Thread Josh Durgin

On 01/24/2013 05:30 AM, Ugis wrote:

Hi,

I have an rbd which contains a non-cluster filesystem. If this rbd is
mapped+mounted on one host, it should not be mapped+mounted on another
host simultaneously.
How can I protect such an rbd from being mapped on another host?

At the ceph level, is the only option to use "lock add [image-name]
[lock-id]" and check for the existence of this lock on the other client, or
is it possible to protect an rbd so that on other clients the "rbd map
<image>" command would just fail with something like Permission denied,
without using arbitrary locks? In other words, can one limit the number
of clients that may map a certain rbd?


This is what the lock commands were added for. The lock add command
will exit non-zero if the image is already locked, so you can run
something like:

rbd lock add [image-name] [lock-id] && rbd map [image-name]

to avoid mapping an image that's in use elsewhere.

The lock-id is user-defined, so you could (for example) use the
hostname of the machine mapping the image to tell where it's
in use.
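
As a minimal illustration (not an rbd feature, just a wrapper script;
it assumes the image is in the default pool and uses the hostname as
the lock-id), the whole thing could go in a small shell script:

  #!/bin/sh
  # map-locked.sh <image>: map an rbd image only if we can take the
  # advisory lock first.
  set -e
  IMAGE="$1"
  LOCK_ID="$(hostname)"

  # "rbd lock add" exits non-zero if the image is already locked,
  # so the map step only runs when we actually obtained the lock.
  if rbd lock add "$IMAGE" "$LOCK_ID"; then
      rbd map "$IMAGE"
  else
      echo "image $IMAGE is locked elsewhere:" >&2
      rbd lock list "$IMAGE" >&2
      exit 1
  fi

Remember to remove the lock (rbd lock remove) after unmapping,
otherwise the next "rbd lock add" on any host will fail.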

Josh


Re: how to protect rbd from multiple simultaneous mapping

2013-01-24 Thread Mandell Degerness
The advisory locks are nice, but it would be really nice to have the
fencing.  If a node is temporarily off the network and a heartbeat
monitor attempts to bring up a service on a different node, there is
no way to ensure that the first node will not write data to the rbd
after the rbd is mounted on the second node.  It would be nice if, on
seeing that an advisory lock exists, you could tell ceph "Do not
accept data from node X until further notice".

On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin  wrote:
> On 01/24/2013 05:30 AM, Ugis wrote:
>>
>> Hi,
>>
>> I have an rbd which contains a non-cluster filesystem. If this rbd is
>> mapped+mounted on one host, it should not be mapped+mounted on another
>> host simultaneously.
>> How can I protect such an rbd from being mapped on another host?
>>
>> At the ceph level, is the only option to use "lock add [image-name]
>> [lock-id]" and check for the existence of this lock on the other client, or
>> is it possible to protect an rbd so that on other clients the "rbd map
>> <image>" command would just fail with something like Permission denied,
>> without using arbitrary locks? In other words, can one limit the number
>> of clients that may map a certain rbd?
>
>
> This is what the lock commands were added for. The lock add command
> will exit non-zero if the image is already locked, so you can run
> something like:
>
> rbd lock add [image-name] [lock-id] && rbd map [image-name]
>
> to avoid mapping an image that's in use elsewhere.
>
> The lock-id is user-defined, so you could (for example) use the
> hostname of the machine mapping the image to tell where it's
> in use.
>
> Josh
>


Re: how to protect rbd from multiple simultaneous mapping

2013-01-24 Thread Sage Weil
On Thu, 24 Jan 2013, Mandell Degerness wrote:
> The advisory locks are nice, but it would be really nice to have the
> fencing.  If a node is temporarily off the network and a heartbeat
> monitor attempts to bring up a service on a different node, there is
> no way to ensure that the first node will not write data to the rbd
> after the rbd is mounted on the second node.  It would be nice if, on
> seeing that an advisory lock exists, you could tell ceph "Do not
> accept data from node X until further notice".

Just a reminder: you can use the information from the locks to fence.  The 
basic process is:

 - identify old rbd lock holder (rbd lock list <image>)
 - blacklist old owner (ceph osd blacklist add <client addr>)
 - break old rbd lock (rbd lock remove <image> <lock-id> <locker>)
 - lock rbd image on new host (rbd lock add <image> <lock-id>)
 - map rbd image on new host

The oddity here is that the old VM can in theory continue to write up 
until the OSD hears about the blacklist via the internal gossip.  This is 
okay because the act of the new VM touching any part of the image (and the 
OSD that stores it) ensures that that OSD gets the blacklist information.  
So on XFS, for example, the act of replaying the XFS journal ensures that 
any attempt by the old VM to write to the journal will get EIO.
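
Put together, a rough shell sketch of the sequence above might look
like this (the image name and the parsing of "rbd lock list" output
are only illustrative assumptions, not a supported interface):

  #!/bin/sh
  set -e
  IMAGE=myimage
  NEW_LOCK_ID="$(hostname)"

  # 1. Identify the old lock holder; assume the last line of the lock
  #    listing has the form "<locker> <lock-id> <address>".
  LINE="$(rbd lock list "$IMAGE" | tail -n 1)"
  LOCKER="$(echo "$LINE" | awk '{print $1}')"
  OLD_LOCK_ID="$(echo "$LINE" | awk '{print $2}')"
  ADDR="$(echo "$LINE" | awk '{print $3}')"

  # 2. Blacklist the old owner so the OSDs stop accepting its writes.
  ceph osd blacklist add "$ADDR"

  # 3. Break the old lock, take it for this host, and map the image.
  rbd lock remove "$IMAGE" "$OLD_LOCK_ID" "$LOCKER"
  rbd lock add "$IMAGE" "$NEW_LOCK_ID"
  rbd map "$IMAGE"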

sage



> 
> On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin  wrote:
> > On 01/24/2013 05:30 AM, Ugis wrote:
> >>
> >> Hi,
> >>
> >> I have an rbd which contains a non-cluster filesystem. If this rbd is
> >> mapped+mounted on one host, it should not be mapped+mounted on another
> >> host simultaneously.
> >> How can I protect such an rbd from being mapped on another host?
> >>
> >> At the ceph level, is the only option to use "lock add [image-name]
> >> [lock-id]" and check for the existence of this lock on the other client, or
> >> is it possible to protect an rbd so that on other clients the "rbd map
> >> <image>" command would just fail with something like Permission denied,
> >> without using arbitrary locks? In other words, can one limit the number
> >> of clients that may map a certain rbd?
> >
> >
> > This is what the lock commands were added for. The lock add command
> > will exit non-zero if the image is already locked, so you can run
> > something like:
> >
> > rbd lock add [image-name] [lock-id] && rbd map [image-name]
> >
> > to avoid mapping an image that's in use elsewhere.
> >
> > The lock-id is user-defined, so you could (for example) use the
> > hostname of the machine mapping the image to tell where it's
> > in use.
> >
> > Josh
> >


Re: how to protect rbd from multiple simultaneous mapping

2013-01-25 Thread Wido den Hollander

On 01/25/2013 11:47 AM, Ugis wrote:

This could work, thanks!

P.S. Is there a way to tell which client has mapped a certain rbd if no
"rbd lock" is used?


What you could do is this:

$ rbd lock add myimage `hostname`

That way you know which client locked the image.
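
And "rbd lock list" then shows who holds it from any client. The
output below is only indicative (the exact columns vary by version,
and the host name and address are made up):

  $ rbd lock list myimage
  There is 1 exclusive lock on this image.
  Locker      ID       Address
  client.4123 host-01  192.168.0.10:0/1003456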

Wido


It would be useful to see that info in the output of "rbd info <image>".
An attribute for rbd like "max_map_count_allowed" would probably be
useful in the future - just to make sure an rbd is not mapped from multiple
clients when it must not be. I suppose it can actually happen if multiple
admins work with the same rbds from multiple clients and no strict "rbd
lock add .." procedure is followed.

Ugis


2013/1/25 Sage Weil :

On Thu, 24 Jan 2013, Mandell Degerness wrote:

The advisory locks are nice, but it would be really nice to have the
fencing.  If a node is temporarily off the network and a heartbeat
monitor attempts to bring up a service on a different node, there is
no way to ensure that the first node will not write data to the rbd
after the rbd is mounted on the second node.  It would be nice if, on
seeing that an advisory lock exists, you could tell ceph "Do not
accept data from node X until further notice".


Just a reminder: you can use the information from the locks to fence.  The
basic process is:

  - identify old rbd lock holder (rbd lock list <image>)
  - blacklist old owner (ceph osd blacklist add <client addr>)
  - break old rbd lock (rbd lock remove <image> <lock-id> <locker>)
  - lock rbd image on new host (rbd lock add <image> <lock-id>)
  - map rbd image on new host

The oddity here is that the old VM can in theory continue to write up
until the OSD hears about the blacklist via the internal gossip.  This is
okay because the act of the new VM touching any part of the image (and the
OSD that stores it) ensures that that OSD gets the blacklist information.
So on XFS, for example, the act of replaying the XFS journal ensures that
any attempt by the old VM to write to the journal will get EIO.

sage





On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin  wrote:

On 01/24/2013 05:30 AM, Ugis wrote:


Hi,

I have an rbd which contains a non-cluster filesystem. If this rbd is
mapped+mounted on one host, it should not be mapped+mounted on another
host simultaneously.
How can I protect such an rbd from being mapped on another host?

At the ceph level, is the only option to use "lock add [image-name]
[lock-id]" and check for the existence of this lock on the other client, or
is it possible to protect an rbd so that on other clients the "rbd map
<image>" command would just fail with something like Permission denied,
without using arbitrary locks? In other words, can one limit the number
of clients that may map a certain rbd?



This is what the lock commands were added for. The lock add command
will exit non-zero if the image is already locked, so you can run
something like:

 rbd lock add [image-name] [lock-id] && rbd map [image-name]

to avoid mapping an image that's in use elsewhere.

The lock-id is user-defined, so you could (for example) use the
hostname of the machine mapping the image to tell where it's
in use.

Josh



Re: how to protect rbd from multiple simultaneous mapping

2013-01-25 Thread Andrey Korolyov
On Fri, Jan 25, 2013 at 4:52 PM, Ugis  wrote:
> I mean if you map an rbd and do not use the "rbd lock .." command, can you
> tell which client has mapped a certain rbd anyway?
>
> Ugis

Assume you have an indistinguishable L3 segment, NAT for example, and
clients accessing the cluster over it - there is no way for the cluster to
tell who exactly did something (meaning, the mapping). The lock mechanism is
enough to fulfill your request, anyway.

>
> 2013/1/25 Wido den Hollander :
>> On 01/25/2013 11:47 AM, Ugis wrote:
>>>
>>> This could work, thanks!
>>>
>>> P.S. Is there a way to tell which client has mapped a certain rbd if no
>>> "rbd lock" is used?
>>
>>
>> What you could do is this:
>>
>> $ rbd lock add myimage `hostname`
>>
>> That way you know which client locked the image.
>>
>> Wido
>>
>>
>>> It would be useful to see that info in the output of "rbd info <image>".
>>> An attribute for rbd like "max_map_count_allowed" would probably be
>>> useful in the future - just to make sure an rbd is not mapped from multiple
>>> clients when it must not be. I suppose it can actually happen if multiple
>>> admins work with the same rbds from multiple clients and no strict "rbd
>>> lock add .." procedure is followed.
>>>
>>> Ugis
>>>
>>>
>>> 2013/1/25 Sage Weil :

 On Thu, 24 Jan 2013, Mandell Degerness wrote:
>
> The advisory locks are nice, but it would be really nice to have the
> fencing.  If a node is temporarily off the network and a heartbeat
> monitor attempts to bring up a service on a different node, there is
> no way to ensure that the first node will not write data to the rbd
> after the rbd is mounted on the second node.  It would be nice if, on
> seeing that an advisory lock exists, you could tell ceph "Do not
> accept data from node X until further notice".


 Just a reminder: you can use the information from the locks to fence.  The
 basic process is:

   - identify old rbd lock holder (rbd lock list <image>)
   - blacklist old owner (ceph osd blacklist add <client addr>)
   - break old rbd lock (rbd lock remove <image> <lock-id> <locker>)
   - lock rbd image on new host (rbd lock add <image> <lock-id>)
   - map rbd image on new host

 The oddity here is that the old VM can in theory continue to write up
 until the OSD hears about the blacklist via the internal gossip.  This is
 okay because the act of the new VM touching any part of the image (and the
 OSD that stores it) ensures that that OSD gets the blacklist information.
 So on XFS, for example, the act of replaying the XFS journal ensures that
 any attempt by the old VM to write to the journal will get EIO.

 sage



>
> On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin 
> wrote:
>>
>> On 01/24/2013 05:30 AM, Ugis wrote:
>>>
>>>
>>> Hi,
>>>
>>> I have an rbd which contains a non-cluster filesystem. If this rbd is
>>> mapped+mounted on one host, it should not be mapped+mounted on another
>>> host simultaneously.
>>> How can I protect such an rbd from being mapped on another host?
>>>
>>> At the ceph level, is the only option to use "lock add [image-name]
>>> [lock-id]" and check for the existence of this lock on the other client, or
>>> is it possible to protect an rbd so that on other clients the "rbd map
>>> <image>" command would just fail with something like Permission denied,
>>> without using arbitrary locks? In other words, can one limit the number
>>> of clients that may map a certain rbd?
>>
>>
>>
>> This is what the lock commands were added for. The lock add command
>> will exit non-zero if the image is already locked, so you can run
>> something like:
>>
>>  rbd lock add [image-name] [lock-id] && rbd map [image-name]
>>
>> to avoid mapping an image that's in use elsewhere.
>>
>> The lock-id is user-defined, so you could (for example) use the
>> hostname of the machine mapping the image to tell where it's
>> in use.
>>
>> Josh
>>

Re: how to protect rbd from multiple simultaneous mapping

2013-01-25 Thread Sage Weil
On Fri, 25 Jan 2013, Andrey Korolyov wrote:
> On Fri, Jan 25, 2013 at 4:52 PM, Ugis  wrote:
> > I mean if you map an rbd and do not use the "rbd lock .." command, can you
> > tell which client has mapped a certain rbd anyway?

Not yet.  We need to add the ability to list watchers in librados, which 
will then let us infer that information.

> Assume you have an indistinguishable L3 segment, NAT for example, and
> clients accessing the cluster over it - there is no way for the cluster to
> tell who exactly did something (meaning, the mapping). The lock mechanism is
> enough to fulfill your request, anyway.

The addrs listed by the lock list are entity_addr_t's, which include an 
IP, port, and a nonce that uniquely identifies the client.  It won't get 
confused by NAT.  Note that you can blacklist either a full IP or an 
individual entity_addr_t.
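
For example (the address below is made up; an optional expiry in
seconds can also be appended to the add command):

  # blacklist one specific client instance (ip:port/nonce)
  ceph osd blacklist add 192.168.0.10:0/1003456

  # or blacklist everything coming from that IP
  ceph osd blacklist add 192.168.0.10

  # inspect and clear entries later
  ceph osd blacklist ls
  ceph osd blacklist rm 192.168.0.10:0/1003456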

But, as mentioned above, you can't list users who didn't use the locking 
(yet).

sage


> 
> >
> > 2013/1/25 Wido den Hollander :
> >> On 01/25/2013 11:47 AM, Ugis wrote:
> >>>
> >>> This could work, thanks!
> >>>
> >>> P.S. Is there a way to tell which client has mapped a certain rbd if no
> >>> "rbd lock" is used?
> >>
> >>
> >> What you could do is this:
> >>
> >> $ rbd lock add myimage `hostname`
> >>
> >> That way you know which client locked the image.
> >>
> >> Wido
> >>
> >>
> >>> It would be useful to see that info in the output of "rbd info <image>".
> >>> An attribute for rbd like "max_map_count_allowed" would probably be
> >>> useful in the future - just to make sure an rbd is not mapped from multiple
> >>> clients when it must not be. I suppose it can actually happen if multiple
> >>> admins work with the same rbds from multiple clients and no strict "rbd
> >>> lock add .." procedure is followed.
> >>>
> >>> Ugis
> >>>
> >>>
> >>> 2013/1/25 Sage Weil :
> 
>  On Thu, 24 Jan 2013, Mandell Degerness wrote:
> >
> > The advisory locks are nice, but it would be really nice to have the
> > fencing.  If a node is temporarily off the network and a heartbeat
> > monitor attempts to bring up a service on a different node, there is
> > no way to ensure that the first node will not write data to the rbd
> > after the rbd is mounted on the second node.  It would be nice if, on
> > seeing that an advisory lock exists, you could tell ceph "Do not
> > accept data from node X until further notice".
> 
> 
>  Just a reminder: you can use the information from the locks to fence.  The
>  basic process is:
> 
>    - identify old rbd lock holder (rbd lock list <image>)
>    - blacklist old owner (ceph osd blacklist add <client addr>)
>    - break old rbd lock (rbd lock remove <image> <lock-id> <locker>)
>    - lock rbd image on new host (rbd lock add <image> <lock-id>)
>    - map rbd image on new host
> 
>  The oddity here is that the old VM can in theory continue to write up
>  until the OSD hears about the blacklist via the internal gossip.  This is
>  okay because the act of the new VM touching any part of the image (and the
>  OSD that stores it) ensures that that OSD gets the blacklist information.
>  So on XFS, for example, the act of replaying the XFS journal ensures that
>  any attempt by the old VM to write to the journal will get EIO.
> 
>  sage
> 
> 
> 
> >
> > On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin 
> > wrote:
> >>
> >> On 01/24/2013 05:30 AM, Ugis wrote:
> >>>
> >>>
> >>> Hi,
> >>>
> >>> I have an rbd which contains a non-cluster filesystem. If this rbd is
> >>> mapped+mounted on one host, it should not be mapped+mounted on another
> >>> host simultaneously.
> >>> How can I protect such an rbd from being mapped on another host?
> >>>
> >>> At the ceph level, is the only option to use "lock add [image-name]
> >>> [lock-id]" and check for the existence of this lock on the other client, or
> >>> is it possible to protect an rbd so that on other clients the "rbd map
> >>> <image>" command would just fail with something like Permission denied,
> >>> without using arbitrary locks? In other words, can one limit the number
> >>> of clients that may map a certain rbd?
> >>
> >>
> >>
> >> This is what the lock commands were added for. The lock add command
> >> will exit non-zero if the image is already locked, so you can run
> >> something like:
> >>
> >>  rbd lock add [image-name] [lock-id] && rbd map [image-name]
> >>
> >> to avoid mapping an image that's in use elsewhere.
> >>
> >> The lock-id is user-defined, so you could (for example) use the
> >> hostname of the machine mapping the image to tell where it's
> >> in use.
> >>
> >> Josh
> >>

Re: how to protect rbd from multiple simultaneous mapping

2013-01-25 Thread Andrey Korolyov
On Fri, Jan 25, 2013 at 7:51 PM, Sage Weil  wrote:
> On Fri, 25 Jan 2013, Andrey Korolyov wrote:
>> On Fri, Jan 25, 2013 at 4:52 PM, Ugis  wrote:
>> > I mean if you map an rbd and do not use the "rbd lock .." command, can you
>> > tell which client has mapped a certain rbd anyway?
>
> Not yet.  We need to add the ability to list watchers in librados, which
> will then let us infer that information.
>
>> Assume you have an indistinguishable L3 segment, NAT for example, and
>> clients accessing the cluster over it - there is no way for the cluster to
>> tell who exactly did something (meaning, the mapping). The lock mechanism is
>> enough to fulfill your request, anyway.
>
> The addrs listed by the lock list are entity_addr_t's, which include an
> IP, port, and a nonce that uniquely identifies the client.  It won't get
> confused by NAT.  Note that you can blacklist either a full IP or an
> individual entity_addr_t.
>
> But, as mentioned above, you can't list users who didn't use the locking
> (yet).

Yep, I meant the impossibility of mapping a source address to a specific
client in this case: it is possible to say that some client mapped the
image, but not exactly which one with a specific identity (since clients
use the same credentials in the less-distinguishable case). A client with
root privileges could be extended to send the DMI UUID, which is more or
less persistent, but that is generally a bad idea since a client may be
non-root and still need a persistent identity.

>
> sage
>
>
>>
>> >
>> > 2013/1/25 Wido den Hollander :
>> >> On 01/25/2013 11:47 AM, Ugis wrote:
>> >>>
>> >>> This could work, thanks!
>> >>>
>> >>> P.S. Is there a way to tell which client has mapped a certain rbd if no
>> >>> "rbd lock" is used?
>> >>
>> >>
>> >> What you could do is this:
>> >>
>> >> $ rbd lock add myimage `hostname`
>> >>
>> >> That way you know which client locked the image.
>> >>
>> >> Wido
>> >>
>> >>
>> >>> It would be useful to see that info in the output of "rbd info <image>".
>> >>> An attribute for rbd like "max_map_count_allowed" would probably be
>> >>> useful in the future - just to make sure an rbd is not mapped from multiple
>> >>> clients when it must not be. I suppose it can actually happen if multiple
>> >>> admins work with the same rbds from multiple clients and no strict "rbd
>> >>> lock add .." procedure is followed.
>> >>>
>> >>> Ugis
>> >>>
>> >>>
>> >>> 2013/1/25 Sage Weil :
>> 
>>  On Thu, 24 Jan 2013, Mandell Degerness wrote:
>> >
>> > The advisory locks are nice, but it would be really nice to have the
>> > fencing.  If a node is temporarily off the network and a heartbeat
>> > monitor attempts to bring up a service on a different node, there is
>> > no way to ensure that the first node will not write data to the rbd
>> > after the rbd is mounted on the second node.  It would be nice if, on
>> > seeing that an advisory lock exists, you could tell ceph "Do not
>> > accept data from node X until further notice".
>> 
>> 
>>  Just a reminder: you can use the information from the locks to fence.  The
>>  basic process is:
>> 
>>    - identify old rbd lock holder (rbd lock list <image>)
>>    - blacklist old owner (ceph osd blacklist add <client addr>)
>>    - break old rbd lock (rbd lock remove <image> <lock-id> <locker>)
>>    - lock rbd image on new host (rbd lock add <image> <lock-id>)
>>    - map rbd image on new host
>> 
>>  The oddity here is that the old VM can in theory continue to write up
>>  until the OSD hears about the blacklist via the internal gossip.  This is
>>  okay because the act of the new VM touching any part of the image (and the
>>  OSD that stores it) ensures that that OSD gets the blacklist information.
>>  So on XFS, for example, the act of replaying the XFS journal ensures that
>>  any attempt by the old VM to write to the journal will get EIO.
>> 
>>  sage
>> 
>> 
>> 
>> >
>> > On Thu, Jan 24, 2013 at 11:50 AM, Josh Durgin 
>> > wrote:
>> >>
>> >> On 01/24/2013 05:30 AM, Ugis wrote:
>> >>>
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have an rbd which contains a non-cluster filesystem. If this rbd is
>> >>> mapped+mounted on one host, it should not be mapped+mounted on another
>> >>> host simultaneously.
>> >>> How can I protect such an rbd from being mapped on another host?
>> >>>
>> >>> At the ceph level, is the only option to use "lock add [image-name]
>> >>> [lock-id]" and check for the existence of this lock on the other client, or
>> >>> is it possible to protect an rbd so that on other clients the "rbd map
>> >>> <image>" command would just fail with something like Permission denied,
>> >>> without using arbitrary locks? In other words, can one limit the number
>> >>> of clients that may map a certain rbd?
>> >>
>> >>
>> >>
>> >> This is what the lock commands were added for. The lock add command
>> >> will exit non-zero if the image is already locked, so you can