Re: geo replication

2013-10-14 Thread John McClean
Is there any more information on an ETA?





Re: Geo-replication with RBD

2013-03-31 Thread Lorieri
Hi Slawomir,

I'm not a Ceph specialist, nor a developer, but I think the RADOS object
store API and Eleanor Cawthon's paper could be a possible solution for
radosgw replication.

Automatic recovery for RBD would be impracticable due to the size of the
clusters. Some databases solved this by giving a global ID to
transactions, but I believe that would break some Ceph rules. If you look
at Amazon, they replicate databases using the database technology, not
by replicating the storage. If Ceph creates a transaction log and the
internet goes down for a few days, you would have to be able to save all
the transactions until it comes back, and then you would have to be
able to catch up.

But for your radosgw I believe it is possible to reproduce an
efficient transaction log by moving the logic and computation from
your embedded Perl to the librados API (I'm not sure I have the name
right; I mean the one that lets you put some logic inside the OSDs) to
populate a list of transactions stored inside Ceph, as described here:
http://ceph.com/papers/CawthonKeyValueStore.pdf. It may reduce the
sysadmin mistakes you mentioned. The problem is that Perl, nginx and AMQP
are much simpler than RADOS and C.
If replication stalls, the key-value list reduces the replication work
because it aggregates updated objects with their last state; it also
makes it easy to deal with deleted objects, parallel copies and
bucket prioritization. If the replica data center serves read-only
requests, then with a little more complexity it would be possible to
replicate objects on demand by checking the transaction log before
serving an object, until replication catches up to an acceptable
delay.

For a complete data-center recovery, it would be nice to have tools to
simplify some operations; for example, you could take from the CRUSH map
one server of each branch, move them to the lost datacenter, set
them all as primary, replicate the data and wait for recovery from
the journals. It is a huge operation, but it makes sense for a lot of
companies, and I know some that did something similar for big RAID
systems.

For lots of users like me, replication, with its risks, would be a
valuable and manageable feature, and maybe it could be a separate project,
less strict about the fundamentals of Ceph.



On Wed, Feb 20, 2013 at 2:19 PM, Sage Weil s...@inktank.com wrote:

 On Wed, 20 Feb 2013, Sławomir Skowron wrote:
  Like I say, yes. Right now it is the only option to migrate data from one
  cluster to another, and for now it must be enough, with some automated features.
 
  But is there any timeline, or any brainstorming in Ceph internal
  meetings, about any possible replication at the block level, or something
  like that?

 I would like to get this in for cuttlefish (0.61).  See #4207 for the
 underlying rados bits.  We also need to settle the file format discussion;
 any input there would be appreciated!

 sage


 
  On 20 Feb 2013, at 17:33, Sage Weil s...@inktank.com wrote:
 
   On Wed, 20 Feb 2013, Sławomir Skowron wrote:
   My requirement is to have full disaster recovery, business continuity,
   and failover of automated services onto a second datacenter, and not on
   the same Ceph cluster.
   The datacenters have a dedicated 10GbE link for communication, and there
   is an option to expand the cluster across two datacenters, but that is
   not what I mean.
   There are advantages to this option, like fast snapshots and fast
   switchover of services, but there are some problems.
  
   When we talk about disaster recovery I mean that the whole storage cluster
   has problems, not only the services on top of the storage. I am thinking
   about a bug, or an admin mistake, that makes the cluster inaccessible in
   every copy, or an upgrade that corrupts data, or an upgrade that is
   disruptive for services - auto failover of services into another DC
   before upgrading the cluster.
  
   If the cluster had a solution to replicate data in RBD images to the next
   cluster, then only the data would be migrated, and when disaster comes
   there would be no need to work from the last imported snapshot (there can
   be a constantly imported snapshot minutes, or an hour, behind
   production), but from the data as of now. And when we have an automated
   solution to recover the DB clusters (one of the app services on top of
   RBD) on new datacenter infrastructure, then we have a real disaster
   recovery solution.
  
   That's why we made an S3 API layer synchronization to another DC and to
   Amazon, and only RBD is left.
  
   Have you read the thread from Jens last week, 'snapshot, clone and mount a
   VM-Image'?  Would this type of capability capture your requirements?
  
   sage
  
  
   On 19 Feb 2013, at 10:23, Sébastien Han
   han.sebast...@gmail.com wrote:
  
   Hi,
  
   First of all, I have some questions about your setup:
  
   * What are your requirements?
   * Are the DCs far from each other?
  
   If they are reasonably close to each other, you can set up a single
   cluster with replicas across both DCs and manage the RBD devices with
   Pacemaker.
  
   

Re: Geo-replication with RBD

2013-02-20 Thread Sławomir Skowron
Like I say, yes. Right now it is the only option to migrate data from one
cluster to another, and for now it must be enough, with some automated features.

But is there any timeline, or any brainstorming in Ceph internal
meetings, about any possible replication at the block level, or something
like that?

On 20 Feb 2013, at 17:33, Sage Weil s...@inktank.com wrote:

 On Wed, 20 Feb 2013, Sławomir Skowron wrote:
 My requirement is to have full disaster recovery, business continuity,
 and failover of automated services onto a second datacenter, and not on
 the same Ceph cluster.
 The datacenters have a dedicated 10GbE link for communication, and there
 is an option to expand the cluster across two datacenters, but that is
 not what I mean.
 There are advantages to this option, like fast snapshots and fast
 switchover of services, but there are some problems.

 When we talk about disaster recovery I mean that the whole storage cluster
 has problems, not only the services on top of the storage. I am thinking
 about a bug, or an admin mistake, that makes the cluster inaccessible in
 every copy, or an upgrade that corrupts data, or an upgrade that is
 disruptive for services - auto failover of services into another DC
 before upgrading the cluster.

 If the cluster had a solution to replicate data in RBD images to the next
 cluster, then only the data would be migrated, and when disaster comes
 there would be no need to work from the last imported snapshot (there can
 be a constantly imported snapshot minutes, or an hour, behind
 production), but from the data as of now. And when we have an automated
 solution to recover the DB clusters (one of the app services on top of
 RBD) on new datacenter infrastructure, then we have a real disaster
 recovery solution.

 That's why we made an S3 API layer synchronization to another DC and to
 Amazon, and only RBD is left.

 Have you read the thread from Jens last week, 'snapshot, clone and mount a
 VM-Image'?  Would this type of capability capture your requirements?

 sage


 On 19 Feb 2013, at 10:23, Sébastien Han
 han.sebast...@gmail.com wrote:

 Hi,

 First of all, I have some questions about your setup:

 * What are your requirements?
 * Are the DCs far from each other?

 If they are reasonably close to each other, you can set up a single
 cluster with replicas across both DCs and manage the RBD devices with
 Pacemaker.

 Cheers.

 --
 Regards,
 Sébastien Han.


 On Mon, Feb 18, 2013 at 3:20 PM, Sławomir Skowron szi...@gmail.com wrote:
 Hi, sorry for the very late response, but I was sick.

 Our case is to have a failover RBD instance in another cluster. We are
 storing block device images for some services, like databases. We need
 to have two clusters, synchronized, for a quick failover if the first
 cluster goes down, for an upgrade with restart, or many other cases.

 Volumes come in many sizes (1-500GB):
 external block devices for KVM VMs, like EBS.

 On Mon, Feb 18, 2013 at 3:07 PM, Sławomir Skowron szi...@gmail.com wrote:
 Hi, sorry for the very late response, but I was sick.

 Our case is to have a failover RBD instance in another cluster. We are
 storing block device images for some services, like databases. We need to
 have two clusters, synchronized, for a quick failover if the first cluster
 goes down, for an upgrade with restart, or many other cases.

 Volumes come in many sizes (1-500GB):
 external block devices for KVM VMs, like EBS.


 On Fri, Feb 1, 2013 at 12:27 AM, Neil Levine neil.lev...@inktank.com
 wrote:

 Skowron,

 Can you go into a bit more detail on your specific use-case? What type
 of data are you storing in rbd (type, volume)?

 Neil

 On Wed, Jan 30, 2013 at 10:42 PM, Skowron Sławomir
 slawomir.skow...@grupaonet.pl wrote:
 I am making a new thread, because I think it's a different case.

 We have managed async geo-replication of the S3 service between two Ceph
 clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API.
 I would love to see native RGW geo-replication with the features described
 in the other thread.

 There is another case. What about RBD replication? It's much more
 complicated, and for disaster recovery much more important, just like in
 enterprise storage arrays.
 One cluster in two DCs does not solve the problem, because we need safety
 in data consistency, and isolation.
 Are you thinking about this case?

 Regards
 Slawomir Skowron




 --
 -
 Regards

 Sławek sZiBis Skowron



 --
 -
 Regards

 Sławek sZiBis Skowron

Re: Geo-replication with RBD

2013-02-20 Thread Sage Weil
On Wed, 20 Feb 2013, Sławomir Skowron wrote:
 Like I say, yes. Right now it is the only option to migrate data from one
 cluster to another, and for now it must be enough, with some automated features.
 
 But is there any timeline, or any brainstorming in Ceph internal
 meetings, about any possible replication at the block level, or something
 like that?

I would like to get this in for cuttlefish (0.61).  See #4207 for the 
underlying rados bits.  We also need to settle the file format discussion; 
any input there would be appreciated!

sage
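
A minimal sketch of periodic snapshot-based replication of the kind
discussed above, assuming the incremental 'rbd export-diff' /
'rbd import-diff' commands (added around the Cuttlefish timeframe); the
image, pool and config-file names are illustrative only:

    import subprocess
    import time

    IMAGE = 'rbd/db-volume-01'            # illustrative image name
    SRC_CONF = '/etc/ceph/dc1.conf'       # primary cluster
    DST_CONF = '/etc/ceph/dc2.conf'       # standby cluster

    def rbd(conf, *args):
        subprocess.check_call(['rbd', '-c', conf] + list(args))

    def replicate_once(prev_snap=None):
        # Take a new snapshot, ship only the delta since prev_snap, and
        # return the new snapshot name so the next run continues from it.
        snap = 'repl-%d' % int(time.time())
        rbd(SRC_CONF, 'snap', 'create', '%s@%s' % (IMAGE, snap))

        diff = '/tmp/%s.diff' % snap
        args = ['export-diff', '%s@%s' % (IMAGE, snap), diff]
        if prev_snap:
            args[1:1] = ['--from-snap', prev_snap]
        rbd(SRC_CONF, *args)

        # The standby image must already contain prev_snap; import-diff
        # applies the delta and creates the new snapshot there as well.
        rbd(DST_CONF, 'import-diff', diff, IMAGE)
        return snap

Run from cron, this gives a standby image that lags production by roughly
the snapshot interval, matching the "constantly imported snapshot"
behaviour described earlier in the thread.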


 
  On 20 Feb 2013, at 17:33, Sage Weil s...@inktank.com wrote:
 
  On Wed, 20 Feb 2013, Sławomir Skowron wrote:
  My requirement is to have full disaster recovery, business continuity,
  and failover of automated services onto a second datacenter, and not on
  the same Ceph cluster.
  The datacenters have a dedicated 10GbE link for communication, and there
  is an option to expand the cluster across two datacenters, but that is
  not what I mean.
  There are advantages to this option, like fast snapshots and fast
  switchover of services, but there are some problems.
 
  When we talk about disaster recovery I mean that the whole storage cluster
  has problems, not only the services on top of the storage. I am thinking
  about a bug, or an admin mistake, that makes the cluster inaccessible in
  every copy, or an upgrade that corrupts data, or an upgrade that is
  disruptive for services - auto failover of services into another DC
  before upgrading the cluster.
 
  If the cluster had a solution to replicate data in RBD images to the next
  cluster, then only the data would be migrated, and when disaster comes
  there would be no need to work from the last imported snapshot (there can
  be a constantly imported snapshot minutes, or an hour, behind
  production), but from the data as of now. And when we have an automated
  solution to recover the DB clusters (one of the app services on top of
  RBD) on new datacenter infrastructure, then we have a real disaster
  recovery solution.
 
  That's why we made an S3 API layer synchronization to another DC and to
  Amazon, and only RBD is left.
 
  Have you read the thread from Jens last week, 'snapshot, clone and mount a
  VM-Image'?  Would this type of capability capture your requirements?
 
  sage
 
 
  On 19 Feb 2013, at 10:23, Sébastien Han
  han.sebast...@gmail.com wrote:
 
  Hi,
 
  First of all, I have some questions about your setup:
 
  * What are your requirements?
  * Are the DCs far from each other?
 
  If they are reasonably close to each other, you can set up a single
  cluster with replicas across both DCs and manage the RBD devices with
  Pacemaker.
 
  Cheers.
 
  --
  Regards,
  Sébastien Han.
 
 
  On Mon, Feb 18, 2013 at 3:20 PM, Sławomir Skowron szi...@gmail.com wrote:
  Hi, sorry for the very late response, but I was sick.
 
  Our case is to have a failover RBD instance in another cluster. We are
  storing block device images for some services, like databases. We need
  to have two clusters, synchronized, for a quick failover if the first
  cluster goes down, for an upgrade with restart, or many other cases.
 
  Volumes come in many sizes (1-500GB):
  external block devices for KVM VMs, like EBS.
 
  On Mon, Feb 18, 2013 at 3:07 PM, Sławomir Skowron szi...@gmail.com wrote:
  Hi, sorry for the very late response, but I was sick.
 
  Our case is to have a failover RBD instance in another cluster. We are
  storing block device images for some services, like databases. We need to
  have two clusters, synchronized, for a quick failover if the first cluster
  goes down, for an upgrade with restart, or many other cases.
 
  Volumes come in many sizes (1-500GB):
  external block devices for KVM VMs, like EBS.
 
 
  On Fri, Feb 1, 2013 at 12:27 AM, Neil Levine neil.lev...@inktank.com
  wrote:
 
  Skowron,
 
  Can you go into a bit more detail on your specific use-case? What type
  of data are you storing in rbd (type, volume)?
 
  Neil
 
  On Wed, Jan 30, 2013 at 10:42 PM, Skowron Sławomir
  slawomir.skow...@grupaonet.pl wrote:
  I am making a new thread, because I think it's a different case.
 
  We have managed async geo-replication of the S3 service between two Ceph
  clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API.
  I would love to see native RGW geo-replication with the features described
  in the other thread.
 
  There is another case. What about RBD replication? It's much more
  complicated, and for disaster recovery much more important, just like in
  enterprise storage arrays.
  One cluster in two DCs does not solve the problem, because we need safety
  in data consistency, and isolation.
  Are you thinking about this case?
 
  Regards
  Slawomir Skowron

Re: Geo-replication with RBD

2013-02-19 Thread Sébastien Han
Hi,

First of all, I have some questions about your setup:

* What are your requirements?
* Are the DCs far from each other?

If they are reasonably close to each other, you can set up a single
cluster with replicas across both DCs and manage the RBD devices with
Pacemaker.

Cheers.

--
Regards,
Sébastien Han.


On Mon, Feb 18, 2013 at 3:20 PM, Sławomir Skowron szi...@gmail.com wrote:
 Hi, sorry for the very late response, but I was sick.

 Our case is to have a failover RBD instance in another cluster. We are
 storing block device images for some services, like databases. We need
 to have two clusters, synchronized, for a quick failover if the first
 cluster goes down, for an upgrade with restart, or many other cases.

 Volumes come in many sizes (1-500GB):
 external block devices for KVM VMs, like EBS.

 On Mon, Feb 18, 2013 at 3:07 PM, Sławomir Skowron szi...@gmail.com wrote:
 Hi, sorry for the very late response, but I was sick.

 Our case is to have a failover RBD instance in another cluster. We are
 storing block device images for some services, like databases. We need to
 have two clusters, synchronized, for a quick failover if the first cluster
 goes down, for an upgrade with restart, or many other cases.

 Volumes come in many sizes (1-500GB):
 external block devices for KVM VMs, like EBS.


 On Fri, Feb 1, 2013 at 12:27 AM, Neil Levine neil.lev...@inktank.com
 wrote:

 Skowron,

 Can you go into a bit more detail on your specific use-case? What type
 of data are you storing in rbd (type, volume)?

 Neil

 On Wed, Jan 30, 2013 at 10:42 PM, Skowron Sławomir
 slawomir.skow...@grupaonet.pl wrote:
  I am making a new thread, because I think it's a different case.
 
  We have managed async geo-replication of the S3 service between two Ceph
  clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API.
  I would love to see native RGW geo-replication with the features described
  in the other thread.
 
  There is another case. What about RBD replication? It's much more
  complicated, and for disaster recovery much more important, just like in
  enterprise storage arrays.
  One cluster in two DCs does not solve the problem, because we need safety
  in data consistency, and isolation.
  Are you thinking about this case?
 
  Regards
  Slawomir Skowron




 --
 -
 Regards

 Sławek sZiBis Skowron



 --
 -
 Regards

 Sławek sZiBis Skowron


Re: Geo-replication with RBD

2013-02-18 Thread Sławomir Skowron
Hi, now I can respond, after being sick.

Nginx is compiled with Perl or Lua support. Inside the nginx configuration
there is a hook for Perl code, or Lua code, as you prefer. This code works
inline. We tried driving this from logs, but it's not a
good idea. The inline option has the advantage that we can reject
PUTs if AMQP is not working, and we don't need to resync all
requests. If an outage lasts too long, we can disable the queue and go
direct, without AMQP, and resync offline from logs with a simple admin
tool.

This inline functionality works only on DELETE and PUT; the rest are
skipped. Every DELETE and PUT has its own queue, with its own priorities and
custom info in the header for calculating the time of synchronization.
The nginx side only puts data into queues, and all
data go into our S3 (Ceph) and Amazon S3 via nginx, with
almost the same configuration, distributed by Puppet.

In every datacenter we have a bunch of workers consuming from the
queues dedicated to that location, and they sync the data
from source to destination. If data can't be fetched from the source,
the info goes into an error queue, and that queue is re-checked for
some time.
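
A minimal sketch of one such worker, assuming a pika (AMQP) consumer and
boto3 S3 clients; the queue name, endpoints, message format and error
handling are assumptions, not the actual code described here:

    import json

    import boto3   # any S3 client works; radosgw speaks the S3 API
    import pika    # AMQP client

    src = boto3.client('s3', endpoint_url='http://rgw.dc1.example.com')
    dst = boto3.client('s3', endpoint_url='http://rgw.dc2.example.com')

    def handle(channel, method, properties, body):
        # Expected message shape: {"op": "PUT"|"DELETE", "bucket": ..., "key": ...}
        msg = json.loads(body)
        try:
            if msg['op'] == 'DELETE':
                dst.delete_object(Bucket=msg['bucket'], Key=msg['key'])
            else:
                obj = src.get_object(Bucket=msg['bucket'], Key=msg['key'])
                dst.put_object(Bucket=msg['bucket'], Key=msg['key'],
                               Body=obj['Body'].read())
            channel.basic_ack(delivery_tag=method.delivery_tag)
        except Exception:
            # Could not fetch or copy: reject without requeueing so a
            # dead-letter/error queue (if configured) can re-check it later.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)

    conn = pika.BlockingConnection(pika.ConnectionParameters('amqp.dc2.example.com'))
    ch = conn.channel()
    ch.queue_declare(queue='s3-repl-dc2', durable=True)
    ch.basic_consume(queue='s3-repl-dc2', on_message_callback=handle)
    ch.start_consuming()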

I am in the middle of writing an article about this, but my sickness
has slowed the process down slightly.


On Thu, Jan 31, 2013 at 10:50 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
 2013/1/31 Sławomir Skowron szi...@gmail.com:
 We are using nginx on top of rgw. In nginx we managed to create logic for
 using AMQP and async operations via queues. Then workers on every side
 get data from their own queue and copy data from source to
 destination via the S3 API. It works for PUT/DELETE, and works
 automatically when production moves to another location.

 I don't know much about messaging, are you able to share some
 configuration or more details?



-- 
-
Regards

Sławek sZiBis Skowron



Re: Geo-replication with RBD

2013-02-18 Thread Sławomir Skowron
Hi, sorry for the very late response, but I was sick.

Our case is to have a failover RBD instance in another cluster. We are
storing block device images for some services, like databases. We need
to have two clusters, synchronized, for a quick failover if the first
cluster goes down, for an upgrade with restart, or many other cases.

Volumes come in many sizes (1-500GB):
external block devices for KVM VMs, like EBS.

On Mon, Feb 18, 2013 at 3:07 PM, Sławomir Skowron szi...@gmail.com wrote:
 Hi, sorry for the very late response, but I was sick.

 Our case is to have a failover RBD instance in another cluster. We are
 storing block device images for some services, like databases. We need to
 have two clusters, synchronized, for a quick failover if the first cluster
 goes down, for an upgrade with restart, or many other cases.

 Volumes come in many sizes (1-500GB):
 external block devices for KVM VMs, like EBS.


 On Fri, Feb 1, 2013 at 12:27 AM, Neil Levine neil.lev...@inktank.com
 wrote:

 Skowron,

 Can you go into a bit more detail on your specific use-case? What type
 of data are you storing in rbd (type, volume)?

 Neil

 On Wed, Jan 30, 2013 at 10:42 PM, Skowron Sławomir
 slawomir.skow...@grupaonet.pl wrote:
  I am making a new thread, because I think it's a different case.
 
  We have managed async geo-replication of the S3 service between two Ceph
  clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API.
  I would love to see native RGW geo-replication with the features described
  in the other thread.
 
  There is another case. What about RBD replication? It's much more
  complicated, and for disaster recovery much more important, just like in
  enterprise storage arrays.
  One cluster in two DCs does not solve the problem, because we need safety
  in data consistency, and isolation.
  Are you thinking about this case?
 
  Regards
  Slawomir Skowron




 --
 -
 Regards

 Sławek sZiBis Skowron



--
-
Regards

Sławek sZiBis Skowron


Re: Geo-replication with RBD

2013-01-31 Thread Sławomir Skowron
We are using nginx on top of rgw. In nginx we managed to create logic
for using AMQP and async operations via queues. Then workers on
every side get data from their own queue and copy data from
source to destination via the S3 API. It works for PUT/DELETE, and works
automatically when production moves to another location.

On Thu, Jan 31, 2013 at 9:25 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
 2013/1/31 Skowron Sławomir slawomir.skow...@grupaonet.pl:
 We have managed async geo-replication of the S3 service between two Ceph
 clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API. I
 would love to see native RGW geo-replication with the features described in the other thread.

 how did you do this?



-- 
-
Regards

Sławek sZiBis Skowron


Re: Geo-replication with RBD

2013-01-31 Thread Neil Levine
Skowron,

Can you go into a bit more detail on your specific use-case? What type
of data are you storing in rbd (type, volume)?

Neil

On Wed, Jan 30, 2013 at 10:42 PM, Skowron Sławomir
slawomir.skow...@grupaonet.pl wrote:
 I am making a new thread, because I think it's a different case.

 We have managed async geo-replication of the S3 service between two Ceph
 clusters in two DCs, and to Amazon S3 as a third. All this via the S3 API. I
 would love to see native RGW geo-replication with the features described in the other thread.

 There is another case. What about RBD replication? It's much more
 complicated, and for disaster recovery much more important, just like in
 enterprise storage arrays.
 One cluster in two DCs does not solve the problem, because we need safety in
 data consistency, and isolation.
 Are you thinking about this case?

 Regards
 Slawomir Skowron


Re: Geo-replication with RADOS GW

2013-01-30 Thread John Nielsen
On Jan 28, 2013, at 11:32 AM, Gregory Farnum g...@inktank.com wrote:

 On Monday, January 28, 2013 at 9:54 AM, Ben Rowland wrote:
 Hi,
 
 I'm considering using Ceph to create a cluster across several data
 centres, with the strict requirement that writes should go to both
 DCs. This seems possible by specifying rules in the CRUSH map, with
 an understood latency hit resulting from purely synchronous writes.
 
 The part I'm unsure about is how the RADOS GW fits into this picture.
 For high availability (and to improve best-case latency on reads),
 we'd want to run a gateway in each data centre. However, the first
 paragraph of the following post suggests this is not possible:
 
 http://article.gmane.org/gmane.comp.file-systems.ceph.devel/12238
 
 Is there a hard restriction on how many radosgw instances can run
 across the cluster, or is the point of the above post more about a
 performance hit?
 
 It's talking about the performance hit. Most people can't afford data-center 
 level connectivity between two different buildings. ;) If you did have a Ceph 
 cluster split across two DCs (with the bandwidth to support them), this will 
 work fine. There aren't any strict limits on the number of gateways you stick 
 on a cluster, just the scaling costs associated with cache invalidation 
 notifications.
 
 
 It seems to me it should be possible to run more
 than one radosgw, particularly if each instance communicates with a
 local OSD which can proxy reads/writes to the primary (which may or
 may not be DC-local).
 
 They aren't going to do this, though — each gateway will communicate with the 
 primaries directly.

I don't know what the timeline is, but Yehuda proposed recently the idea of 
master and slave zones (subsets of a cluster) and other changes to facilitate 
rgw geo-replication and disaster recovery. See this message:
http://article.gmane.org/gmane.comp.file-systems.ceph.devel/12238

If/when that comes to fruition it would open a lot of possibilities for the 
kind of scenario you're talking about. (Yes, I'm looking forward to it. :) )

JN



Re: Geo-replication with RADOS GW

2013-01-28 Thread Gregory Farnum
On Monday, January 28, 2013 at 9:54 AM, Ben Rowland wrote:
 Hi,
  
 I'm considering using Ceph to create a cluster across several data
 centres, with the strict requirement that writes should go to both
 DCs. This seems possible by specifying rules in the CRUSH map, with
 an understood latency hit resulting from purely synchronous writes.
  
 The part I'm unsure about is how the RADOS GW fits into this picture.
 For high availability (and to improve best-case latency on reads),
 we'd want to run a gateway in each data centre. However, the first
 paragraph of the following post suggests this is not possible:
  
 http://article.gmane.org/gmane.comp.file-systems.ceph.devel/12238
  
 Is there a hard restriction on how many radosgw instances can run
 across the cluster, or is the point of the above post more about a
 performance hit?

It's talking about the performance hit. Most people can't afford data-center 
level connectivity between two different buildings. ;) If you did have a Ceph 
cluster split across two DCs (with the bandwidth to support them), this will work 
fine. There aren't any strict limits on the number of gateways you stick on a 
cluster, just the scaling costs associated with cache invalidation 
notifications.

  
 It seems to me it should be possible to run more
 than one radosgw, particularly if each instance communicates with a
 local OSD which can proxy reads/writes to the primary (which may or
 may not be DC-local).

They aren't going to do this, though — each gateway will communicate with the 
primaries directly.
-Greg
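
A rule of the kind Ben describes (one replica in each datacenter) might
look roughly like the following in a decompiled CRUSH map; the bucket names
and hierarchy are assumptions, not taken from this thread:

    # assumes the CRUSH hierarchy declares 'datacenter' buckets under 'default'
    rule replicated_across_dcs {
            ruleset 1
            type replicated
            min_size 2
            max_size 4
            step take default
            step chooseleaf firstn 0 type datacenter
            step emit
    }

With a pool size of 2, 'chooseleaf firstn 0 type datacenter' places one OSD
under each datacenter bucket, so every write is acknowledged by both sites;
that is where the synchronous-write latency hit mentioned above comes from.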



Re: geo replication

2013-01-10 Thread Gregory Farnum
On Wed, Jan 9, 2013 at 1:33 PM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
 2013/1/9 Mark Kampe mark.ka...@inktank.com:
 Asynchronous RADOS replication is definitely on our list,
 but more complex and farther out.

 Do you have any ETA?
 1 month? 6 months? 1 year?

No, but definitely closer to 1 year than either of the other options
at this point.
-Greg