Re: Improving responsiveness of KVM guests on Ceph storage

2012-12-31 Thread Andrey Korolyov
On Mon, Dec 31, 2012 at 3:12 AM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
 Hi Andrey,

 Thanks for your reply!


 You may try to play with SCHED_RT. I have found it hard to use
 myself, but you can achieve your goal by adding small RT slices via
 the ``cpu'' cgroup to vcpu/emulator threads; it dramatically increases
 overall VM responsiveness.


 I'm not quite sure I understand your suggestion.

 Do you mean that you set the process priority to real-time on each qemu-kvm
 process, and then use cgroups cpu.rt_runtime_us / cpu.rt_period_us to
 restrict the amount of CPU time those processes can receive?

 I'm not sure how that would apply here, as I have only one qemu-kvm process
 and it is not non-responsive because of the lack of allocated CPU time
 slices - but rather because some I/Os take a long time to complete, and
 other I/Os apparently have to wait for those to complete.

Yep, I meant the same. Of course it won't help with only one VM; RT
may help in more concurrent cases :)
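
Just to illustrate what I mean by small RT slices - a rough sketch only,
where the cgroup path, the values and the qemu process name are just
examples for illustration:

  # give a per-VM cgroup a small realtime budget (cgroup v1 ``cpu'' controller)
  mkdir -p /sys/fs/cgroup/cpu/vm1
  echo 1000000 > /sys/fs/cgroup/cpu/vm1/cpu.rt_period_us    # 1 s period
  echo 50000   > /sys/fs/cgroup/cpu/vm1/cpu.rt_runtime_us   # 50 ms RT slice
  # move the qemu-kvm vcpu/emulator threads into it and schedule them SCHED_RR
  for tid in $(ls /proc/$(pidof qemu-kvm)/task); do
      echo $tid > /sys/fs/cgroup/cpu/vm1/tasks
      chrt -r -p 1 $tid
  done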

 threads. Of course, some Ceph tuning like writeback cache and large
 journal may help you too, I'm speaking primarily of VM performance by


 I have been considering the journal as something where I could improve
 performance by tweaking the setup. I have set aside 10 GB of space for the
 journal, but I'm not sure if this is too little - or if the size really
 doesn't matter that much when it is on the same mdraid as the data itself.

 Is there a tool that can tell me how much of my journal space is
 actually being used?

 I.e., I'm looking for something that could tell me whether increasing the
 size of the journal or placing it on a separate (SSD) disk could solve my
 problem.

If I understood right, you have an md device holding both journal and
filestore? What type of RAID do you have here? Of course you'll need a
separate device (for experimental purposes, a fast disk may be enough)
for the journal, and if you have any type of redundant storage under the
filestore partition, you may also change it to simple RAID0, or even
separate disks, and create one OSD over every disk (keep an eye on the
journal device's throughput, which must be equal to the sum of the speeds
of all filestore devices, so a commodity-type SSD covers about two
100MB/s disks, for example). I have a ``pure'' disk setup in my dev
environment built on quite old desktop-class machines, and one rsync
process may hang a VM for a short time, despite using a dedicated SATA
disk for the journal.
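
To make that concrete, a ceph.conf sketch of the layout I mean (device
names and paths here are made up, adjust them to your hardware):

  [osd.0]
      host = node1
      osd data = /var/lib/ceph/osd/ceph-0    ; xfs on /dev/sdb
      osd journal = /dev/sdf1                ; partition on the shared journal SSD
  [osd.1]
      host = node1
      osd data = /var/lib/ceph/osd/ceph-1    ; xfs on /dev/sdc
      osd journal = /dev/sdf2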

 How do I change the size of the writeback cache when using qemu-kvm like I
 do?

 Does setting rbd cache size in ceph.conf have any effect on qemu-kvm, where
 the drive is defined as:

   format=rbd,file=rbd:data/image1:rbd_cache=1,if=virtio

What cache_size/max_dirty values do you have inside ceph.conf and which
qemu version do you use? The default values are good enough to prevent
pushing I/O spikes down to the physical storage, but for long I/O-intensive
tasks increasing the cache may help the OS align writes more smoothly. Also
you don't need to set rbd_cache explicitly in the disk config with
qemu 1.2 and newer releases; for older ones
http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg02500.html
should be applied.
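
The cache knobs I mean live in the [client] section of ceph.conf,
something along these lines (the numbers are only an example, not a
recommendation):

  [client]
      rbd cache = true
      rbd cache size = 67108864          ; 64 MB instead of the 32 MB default
      rbd cache max dirty = 50331648     ; 48 MB
      rbd cache max dirty age = 2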


 --
 Jens Kristian Søgaard, Mermaid Consulting ApS,
 j...@mermaidconsulting.dk,
 http://www.mermaidconsulting.com/


Re: ceph for small cluster?

2012-12-31 Thread Wido den Hollander

Hi,

On 12/30/2012 10:38 PM, Miles Fidelman wrote:

Hi Folks,

I'm wondering how ceph would work in a small cluster that supports a mix
of engineering and modest production (email, lists, web server for
several small communities).

Specifically, we have a rack with 4 medium-horsepower servers, each with
4 disk drives, running Xen (debian dom0 and domUs) - all linked together
w/ 4 gigE ethernets.

Currently, 2 of the servers are running a high-availability
configuration, using DRBD to mirror specific volumes, and pacemaker for
failover.

For a while, I've been looking for a way to replace DRBD with something
that would mirror across more than 2 servers - so that we could migrate
VMs arbitrarily - and that will work without splitting up compute vs.
storage nodes (for the short term, at least, we're stuck with rack space
and server limitations).

The thing that looks closest to filling the bill is Sheepdog (at least
architecturally) - but it only provides a KVM interface. GlusterFS,
xTreemFS, and Ceph keep coming up as possibles - with ceph's rbd
interface looking like the easiest to integrate.

Which leads me to two questions:

- On a theoretical level, does using ceph as a storage pool for this
kind of small cluster make any sense? (Notably, I'd see running an OSD,
an MDS, a MON, and client DomUs on each of the 4 nodes, using LVM to
pool all the storage; it seems like folks recommend XFS as a production
filesystem.)



Yes, that could work. But you have to keep in mind that OSDs can spike 
in both CPU and memory when they have to do recovery work for a failed 
node/OSD.


Also, with RBD you don't need an MDS. As a last note, you should always 
have an odd number of monitors. So run a monitor on 3 of the 4 machines.


The monitors work by a voting principle where they need a majority. An 
odd number is best in that situation.
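
For example, something like this in ceph.conf (hostnames and addresses
are of course placeholders):

  [mon.a]
      host = node1
      mon addr = 192.168.0.1:6789
  [mon.b]
      host = node2
      mon addr = 192.168.0.2:6789
  [mon.c]
      host = node3
      mon addr = 192.168.0.3:6789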



- On a practical level, has anybody tried building this kind of small
cluster, and if so, what kind of results have you had?



I've built some small Ceph clusters, sometimes with just 3 nodes. It works, 
but you have to keep in mind that when one node in a 4 node cluster 
fails you will lose 25% of the capacity.


This will lead to a heavy recovery within the Ceph cluster which will 
put a lot of pressure on the Gbit links and on the CPUs and memory of the 
nodes.


With RBD you might want to consider adding an SSD for the journaling of 
the OSDs; that will give you a pretty nice performance boost.


Wido


Comments and suggestions please!

Thank you very much,

Miles Fidelman




what could go wrong with two clusters on the same network?

2012-12-31 Thread Xiaopong Tran

Hi,

If I run two clusters on the same network, each with its own set of
monitors and config files (assuming that we didn't make any error
in the config files), would there be anything wrong with that?

Ceph seems to be quite chatty, so would they mess up their
messages?

Just want to make sure, and just want to know if anyone is
doing that.

Thanks

Xiaopong


Re: automatic repair of inconsistent pg?

2012-12-31 Thread Stefan Priebe

On 31.12.2012 02:10, Samuel Just wrote:

Are you using xfs?  If so, what mount options?


Yes,
noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k

Stefan



 On Dec 30, 2012 1:28 PM, Stefan Priebe s.pri...@profihost.ag wrote:
 
  On 30.12.2012 19:17, Samuel Just wrote:
 
  This is somewhat more likely to have been a bug in the replication logic
  (there were a few fixed between 0.53 and 0.55).  Had there been any
  recent osd failures?
 
  Yes I was stressing Ceph with failures (power, link, disk, ...).
 
  Stefan
 
  On Dec 24, 2012 10:55 PM, Sage Weil s...@inktank.com wrote:
 
  On Tue, 25 Dec 2012, Stefan Priebe wrote:
Hello list,
   
today I got the following ceph status output:
2012-12-25 02:57:00.632945 mon.0 [INF] pgmap v1394388: 7632 pgs: 7631 active+clean, 1 active+clean+inconsistent; 151 GB data, 307 GB used, 5028 GB / 5336 GB avail
   
   
I then grepped the inconsistent pg by:
# ceph pg dump - | grep inconsistent
3.ccf   10  0   0   0   41037824    155930  155930  active+clean+inconsistent   2012-12-25 01:51:35.318459  6243'2107   6190'9847   [14,42] [14,42] 6243'2107   2012-12-25 01:51:35.318436  6007'2074   2012-12-23 01:51:24.386366
   
and initiated a repair:
#  ceph pg repair 3.ccf
instructing pg 3.ccf on osd.14 to repair
   
The log output then was:
2012-12-25 02:56:59.056382 osd.14 [ERR] 3.ccf osd.42 missing 1c602ccf/rbd_data.4904d6b8b4567.0b84/head//3
2012-12-25 02:56:59.056385 osd.14 [ERR] 3.ccf osd.42 missing ceb55ccf/rbd_data.48cc66b8b4567.1538/head//3
2012-12-25 02:56:59.097989 osd.14 [ERR] 3.ccf osd.42 missing dba6bccf/rbd_data.4797d6b8b4567.15ad/head//3
2012-12-25 02:56:59.097991 osd.14 [ERR] 3.ccf osd.42 missing a4deccf/rbd_data.45f956b8b4567.03d5/head//3
2012-12-25 02:56:59.098022 osd.14 [ERR] 3.ccf repair 4 missing, 0 inconsistent objects
2012-12-25 02:56:59.098046 osd.14 [ERR] 3.ccf repair 4 errors, 4 fixed

Why doesn't ceph repair this automatically? How could this happen at all?
 
  We just made some fixes to repair in next (it was broken sometime
  between ~0.53 and 0.55).  The latest next should repair it.  In general
  we don't repair automatically lest we inadvertently propagate bad data
  or paper over a bug.

  As for the original source of the missing objects... I'm not sure.  There
  were some fixed races related to backfill that could lead to an object
  being missed, but Sam would know more about how likely that actually is.
 
  sage
 



Re: Improving responsiveness of KVM guests on Ceph storage

2012-12-31 Thread Andrey Korolyov
On Mon, Dec 31, 2012 at 2:58 PM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
 Hi Andrey,


 If I understood right, you have an md device holding both journal and
 filestore? What type of RAID do you have here?


 Yes, same md device holding both journal and filestore. It is a raid5.

Ahem, of course you need to reassemble it to something faster :)


 Of course you'll need a
 separate device (for experimental purposes, a fast disk may be enough)
 for the journal


 Is there a way to tell if the journal is the bottleneck without actually
 adding such an extra device?

In theory, yes - but your setup is already dying under a high amount of
write seeks, so it may not be necessary. Also, I don't see a good way
to measure the bottleneck when one disk device is used for both filestore
and journal - with separate devices, you can measure maximum values
using fio and compare them to the actual numbers calculated from
/proc/diskstats; the ``all-in-one'' case seems obviously hard to measure,
even if you are able to log writes to the journal file and filestore files
separately without significant overhead.
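
For the separated case, a rough sketch of the measurement I mean (device
and file names are placeholders; be careful, fio against a raw device is
destructive):

  # sequential write ceiling of the candidate journal device
  fio --name=seqwrite --filename=/mnt/journaltest/fio.tmp --size=4G --bs=4M \
      --rw=write --direct=1 --ioengine=libaio --iodepth=16 --runtime=60
  # compare against what actually hits the disk (field 10 = sectors written)
  awk '$3 == "sdf" { print $10 * 512, "bytes written" }' /proc/diskstats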


 filestore partition, you may also change it to simple RAID0, or even
 separate disks, and create one OSD over every disk (keep an eye on


 I have only 3 OSDs with 4 disks each. I was afraid that it would be too
 brittle as a RAID0, and if I created separate OSDs for each disk, it would
 stall the file system due to recovery if a server crashes.

No, it isn't too bad in most cases. The recovery process does not affect
operations on the rbd storage beyond a small performance degradation, so
you may split your RAID setup into lightweight RAID0. It depends: on a
plain SATA controller, software RAID0 under one OSD will do a better job
than 2 separate OSDs with one disk each; on a cache-backed controller,
separate OSDs are preferable, until the controller is no longer able to
align writes due to the overall write bandwidth.



 What cache_size/max_dirty values do you have inside ceph.conf


 I haven't set them explicitly, so I imagine the cache_size is 32 MB and the
 max_dirty is 24 MB.


 and which

 qemu version do you use?


 Using the default 0.15 version in Fedora 16.


 tasks increasing the cache may help the OS align writes more smoothly. Also
 you don't need to set rbd_cache explicitly in the disk config with
 qemu 1.2 and newer releases; for older ones
 http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg02500.html
 should be applied.


 I read somewhere that I needed to enable it specifically for older qemu-kvm
 versions, which I did like this:

   format=rbd,file=rbd:data/image1:rbd_cache=1,if=virtio

 However now I read in the docs for qemu-rbd that it needs to be set like
 this:

   format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback

 I'm not sure if 1 and true are interpreted the same way?

 I'll try using true and see if I get any noticeable changes in behaviour.

 The link you sent me seems to indicate that I need to compile my own version
 of qemu-kvm to be able to test this?


No, there are no significant changes from 0.15 to the current version,
and your options will work just fine. So the general recommendation is
to remove redundancy from your disk backend and then move the journal
out to a separate disk or SSD.
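
As for the drive definition, something along these lines should be
equivalent to what you already have (just a sketch combining the two
forms you quoted; pool and image names are placeholders):

  -drive format=raw,file=rbd:data/image1:rbd_cache=true,cache=writeback,if=virtio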



 --
 Jens Kristian Søgaard, Mermaid Consulting ApS,
 j...@mermaidconsulting.dk,
 http://www.mermaidconsulting.com/


Re: ceph for small cluster?

2012-12-31 Thread Miles Fidelman

Matt, Thanks for the comments.  A follow-up if I might (inline):

Matthew Roy wrote:
What I'm not doing that you'd need to test is running VMs on the same 
servers as storage. I'd be careful about mounting RBD volumes on the 
OSDs; you can run into a kernel deadlock trying to write things out to 
physical disk when trying to write to the mounted volume. Mounts 
inside VMs should be okay. 


I was thinking of pinning one CPU to Dom0 and running the OSD 
from there, and mounting RBD volumes only in DomUs.  And leaving a bit 
of disk space outside the OSD for booting and Dom0.
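
For reference, the pinning I have in mind would be done on the hypervisor 
command line, something like this on a Debian dom0 (a sketch only; exact 
options depend on the Xen version):

  # /etc/default/grub, then run update-grub and reboot
  GRUB_CMDLINE_XEN="dom0_max_vcpus=1 dom0_vcpus_pin"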


Which raises another question: how are you combining drives within each 
OSD (raid, lvm, ?).


Thanks again,

Miles


--
In theory, there is no difference between theory and practice.
In practice, there is.    Yogi Berra



Re: ceph for small cluster?

2012-12-31 Thread Miles Fidelman


Wido, Thanks for the comment, a follow-up if I might (below)?

Wido den Hollander wrote:


I've built some small Ceph clusters, sometimes with just 3 nodes. It works,
but you have to keep in mind that when one node in a 4 node cluster
fails you will lose 25% of the capacity.

This will lead to a heavy recovery within the Ceph cluster which will
put a lot of pressure on the Gbit links and on the CPUs and memory of
the nodes.

With RBD you might want to consider adding an SSD for the journaling
of the OSDs; that will give you a pretty nice performance boost.


Would not journalling alone, say on a separate hard disk volume, help
with recovery?

Thanks,

Miles

--
In theory, there is no difference between theory and practice.
In practice, there is.    Yogi Berra



Re: rbd map command hangs for 15 minutes during system start up

2012-12-31 Thread Alex Elder
On 12/26/2012 03:36 PM, Alex Elder wrote:
 On 12/26/2012 11:45 AM, Nick Bartos wrote:
 Here's a log with a hang on the updated branch:

 https://gist.github.com/raw/4381750/772476e1bae1e6366347a223f34aa6c440b92765/rdb-hang-1356543132.log
 
 OK, new naming scheme.  Please try:  wip-nick-1

Now that we've got this resolved, I've created an updated
stable branch with ceph-related bug fixes, based on the
latest 3.5 stable branch, 3.5.7.  It contains a bunch of
other bug fixes that the branch you had been working with
did not have.

I'm starting my own testing with this branch now.  But it
would be great if you'd give it a try as well, since I
know you're a real user of this code base.

It's available as branch linux-3.5.7-ceph on the
ceph-client git repository.  Thanks a lot.

-Alex

 
 I added another simple fix, but then collapsed three commits
 into one, and added one more (somewhat unrelated).
 
 I've done simple testing with this and will subject it to
 more rigorous testing shortly.  I wanted to make it available
 to you quickly though.
 
   -Alex
 

 On Thu, Dec 20, 2012 at 1:59 PM, Alex Elder el...@inktank.com wrote:
 On 12/20/2012 11:48 AM, Nick Bartos wrote:
 Unfortunately, we still have a hang:

 https://gist.github.com/4347052/download

 The saga continues, and each time we get a little more
 information.  Please try branch: wip-nick-newerest

 Thank you.

 -Alex


 On Wed, Dec 19, 2012 at 2:42 PM, Alex Elder el...@inktank.com wrote:
 On 12/19/2012 03:25 PM, Alex Elder wrote:
 On 12/18/2012 12:05 PM, Nick Bartos wrote:
 I've added the output of ps -ef in addition to triggering a trace
 when a hang is detected.  Not much is generally running at that point,
 but you can have a look:

 https://gist.github.com/raw/4330223/2f131ee312ee43cb3d8c307a9bf2f454a7edfe57/rbd-hang-1355851498.txt

 This helped a lot.  I updated the bug with a little more info.

 http://tracker.newdream.net/issues/3519

 I also think I have now found something that could explain what you
 are seeing, and am developing a fix.  I'll provide you an update
 as soon as I have tested what I come up with, almost certainly
 this afternoon.

 Nick, I have a new branch for you to try with a new fix in place.
 As you might have predicted, it's named wip-nick-newest.

 Please give it a try to see if it resolved the hang you've
 been seeing and let me know how it goes.  If it continues
 to hang, please provide the logs as you have before, it's
 been very helpful.

 Thanks a lot.

 -Alex

   -Alex

 Is it possible that there is some sort of deadlock going on?  We are
 doing the rbd maps (and subsequent filesystem mounts) on the same
 systems which are running the ceph-osd and ceph-mon processes.  To get
 around the 'sync' deadlock problem, we are using a patch from Sage
 which ignores system-wide syncs on filesystems mounted with the
 'mand' option (and we mount the underlying osd filesystems with
 'mand').  However I am wondering if there is potential for other types
 of deadlocks in this environment.

 Also, we recently saw an rbd hang in a much older version, running
 kernel 3.5.3 with only the sync hack patch, along side ceph 0.48.1.
 It's possible that this issue was around for some time, just the
 recent patches made it happen more often (and thus more reproducible)
 for us.


 On Tue, Dec 18, 2012 at 8:09 AM, Alex Elder el...@inktank.com wrote:
 On 12/17/2012 11:12 AM, Nick Bartos wrote:
 Here's a log with the rbd debugging enabled:

 https://gist.github.com/raw/4319962/d9690fd92c169198efc5eecabf275ef1808929d2/rbd-hang-test-1355763470.log

 On Fri, Dec 14, 2012 at 10:03 AM, Alex Elder el...@inktank.com 
 wrote:
 On 12/14/2012 10:53 AM, Nick Bartos wrote:
 Yes I was only enabling debugging for libceph.  I'm adding debugging
 for rbd as well.  I'll do a repro later today when a test cluster
 opens up.

 Excellent, thank you.   -Alex

 I looked through these debugging messages.  Looking only at the
 rbd debugging, what I see seems to indicate that rbd is idle at
 the point the hang seems to start.  This suggests that the hang
 is not due to rbd itself, but rather whatever it is that might
 be responsible for using the rbd image once it has been mapped.

 Is that possible?  I don't know what process you have that is
 mapping the rbd image, and what is supposed to be the next thing
 it does.  (I realize this may not make a lot of sense, given
 a patch in rbd seems to have caused the hang to begin occurring.)

 Also note that the debugging information available (i.e., the
 lines in the code that can output debugging information) may
 well be incomplete.  So if you don't find anything it may be
 necessary to provide you with another update which might include
 more debugging.

 Anyway, could you provide a little more context about what
 is going on sort of 

Re: what could go wrong with two clusters on the same network?

2012-12-31 Thread Wido den Hollander

Hi,

On 12/31/2012 11:17 AM, Xiaopong Tran wrote:

Hi,

If I run two clusters on the same network, each with its own set of
monitors and config files (assuming that we didn't make any error
in the config files), would there be anything wrong with that?

Ceph seems to be quite chatty, so would they mess up their
messages?



Ceph is chatty, but it doesn't pollute your network.

OSDs only talk to other OSDs they learn from the monitors.

Ceph doesn't use broadcast or multicast in your local network, so it's 
safe to run multiple Ceph clusters in one subnet/VLAN.


Just make sure you use cephx (enabled by default in 0.55) so that you 
don't accidentally connect to the wrong cluster.
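
For example, each cluster gets its own fsid, keyring and config file, and 
clients pick the cluster explicitly (a sketch with made-up values):

  ; /etc/ceph/cluster-a.conf
  [global]
      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
      auth supported = cephx
      keyring = /etc/ceph/cluster-a.keyring

  # point a client at one cluster explicitly
  ceph -c /etc/ceph/cluster-a.conf -s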


Wido


Just want to make sure, and just want to know if anyone is
doing that.

Thanks

Xiaopong



Re: automatic repair of inconsistent pg?

2012-12-31 Thread Samuel Just
The ceph-osd relies on fs barriers for correctness.  You will want to
remove the nobarrier option to prevent future corruption.
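
For example, the fstab entry for the OSD filesystem would then look 
something like this (device and mount point are placeholders); dropping 
nobarrier re-enables write barriers, which is the xfs default:

  /dev/sdb1  /var/lib/ceph/osd/ceph-14  xfs  noatime,nodiratime,logbufs=8,logbsize=256k  0 0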
-Sam

On Mon, Dec 31, 2012 at 3:59 AM, Stefan Priebe s.pri...@profihost.ag wrote:
 On 31.12.2012 02:10, Samuel Just wrote:

 Are you using xfs?  If so, what mount options?


 Yes,
 noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k

 Stefan


  On Dec 30, 2012 1:28 PM, Stefan Priebe s.pri...@profihost.ag wrote:
  
   On 30.12.2012 19:17, Samuel Just wrote:
  
   This is somewhat more likely to have been a bug in the replication
 logic
   (there were a few fixed between 0.53 and 0.55).  Had there been any
   recent osd failures?
  
   Yes I was stressing Ceph with failures (power, link, disk, ...).
  
   Stefan
  
   On Dec 24, 2012 10:55 PM, Sage Weil s...@inktank.com wrote:
  
   On Tue, 25 Dec 2012, Stefan Priebe wrote:
 Hello list,

 today I got the following ceph status output:
 2012-12-25 02:57:00.632945 mon.0 [INF] pgmap v1394388: 7632 pgs: 7631 active+clean, 1 active+clean+inconsistent; 151 GB data, 307 GB used, 5028 GB / 5336 GB avail


 I then grepped the inconsistent pg by:
 # ceph pg dump - | grep inconsistent
 3.ccf   10  0   0   0   41037824    155930  155930  active+clean+inconsistent   2012-12-25 01:51:35.318459  6243'2107   6190'9847   [14,42] [14,42] 6243'2107   2012-12-25 01:51:35.318436  6007'2074   2012-12-23 01:51:24.386366

 and initiated a repair:
 #  ceph pg repair 3.ccf
 instructing pg 3.ccf on osd.14 to repair

 The log output then was:
 2012-12-25 02:56:59.056382 osd.14 [ERR] 3.ccf osd.42 missing 1c602ccf/rbd_data.4904d6b8b4567.0b84/head//3
 2012-12-25 02:56:59.056385 osd.14 [ERR] 3.ccf osd.42 missing ceb55ccf/rbd_data.48cc66b8b4567.1538/head//3
 2012-12-25 02:56:59.097989 osd.14 [ERR] 3.ccf osd.42 missing dba6bccf/rbd_data.4797d6b8b4567.15ad/head//3
 2012-12-25 02:56:59.097991 osd.14 [ERR] 3.ccf osd.42 missing a4deccf/rbd_data.45f956b8b4567.03d5/head//3
 2012-12-25 02:56:59.098022 osd.14 [ERR] 3.ccf repair 4 missing, 0 inconsistent objects
 2012-12-25 02:56:59.098046 osd.14 [ERR] 3.ccf repair 4 errors, 4 fixed

 Why doesn't ceph repair this automatically? How could this happen at all?
  
   We just made some fixes to repair in next (it was broken sometime
   between ~0.53 and 0.55).  The latest next should repair it.  In general
   we don't repair automatically lest we inadvertently propagate bad data
   or paper over a bug.

   As for the original source of the missing objects... I'm not sure.  There
   were some fixed races related to backfill that could lead to an object
   being missed, but Sam would know more about how likely that actually is.
  
   sage
  


v0.56 released

2012-12-31 Thread Sage Weil
We're bringing in the new year with a new release, v0.56, which will form 
the basis of the next stable series, bobtail. There is little in the way 
of new functionality since v0.55, as we've been focusing primarily on 
stability, performance, and upgradability from the previous argonaut 
stable series (v0.48.x). If you are a current argonaut user, you can 
either upgrade now, or watch the Inktank blog for the bobtail announcement 
after some additional testing has been completed. If you are a v0.55 or 
v0.55.1 user, we recommend upgrading now.

Notable changes since v0.55 include:

 * librbd: fixes for read-only pools for image cloning
 * osd: fix for mixing argonaut and post-v0.54 OSDs
 * osd: some recovery tuning
 * osd: fix for several scrub, recovery, and watch/notify races/bugs
 * osd: fix pool_stat_t backward compatibility with pre-v0.41 clients
 * osd: experimental split support
 * mkcephfs: misc fixes for fs initialization, mounting
 * radosgw: usage and op logs off by default
 * radosgw: keystone authentication off by default
 * upstart: only enabled when 'upstart' file exists in daemon data 
   directory
 * mount.fuse.ceph: allow mounting of ceph-fuse via /etc/fstab
 * config: always complain about config parsing errors
 * mon: fixed memory leaks, misc bugs
 * mds: many misc fixes

Notable changes since v0.48.2 (argonaut):

 * auth: authentication is now on by default; see release notes!
 * osd: improved threading, small io performance
 * osd: deep scrubbing (verify object data)
 * osd: chunky scrubs (more efficient)
 * osd: improved performance during recovery
 * librbd: cloning support
 * librbd: fine-grained striping support
 * librbd: better caching
 * radosgw: improved Swift and S3 API coverage (POST, multi-object delete, 
   striping)
 * radosgw: OpenStack Keystone integration
 * radosgw: efficient usage stats aggregation (for billing)
 * crush: improvements in distribution (still off by default; see CRUSH 
   tunables)
 * ceph-fuse, mds: general stability improvements
 * release RPMs for OpenSUSE, SLES, Fedora, RHEL, CentOS
 * tons of bug fixes and small improvements across the board

If you are upgrading from v0.55, there are no special upgrade 
instructions. If you are upgrading from an older version, please read the 
release notes. Authentication is now enabled by default, and if you do not 
adjust your ceph.conf accordingly before upgrading, the system will not 
come up by itself.
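
For example, if you really do want to keep running without authentication 
after the upgrade, something along these lines in ceph.conf should do it 
(a sketch; see the release notes for the authoritative option names):

  [global]
      auth cluster required = none
      auth service required = none
      auth client required = none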

You can get this release from the usual locations:

 * Git at git://github.com/ceph/ceph.git
 * Tarball at http://ceph.com/download/ceph-0.56.tar.gz
 * For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian
 * For RPMs, see http://ceph.com/docs/master/install/rpm
