Re: on disk encryption

2013-02-01 Thread Christian Brunner
As a special use-case I would propose to do the encryption inside the qemu-rbd driver (similar to the qcow2 driver). This would also encrypt the network traffic from the KVM host to the osd. I know that this is probably not the generic approach for on disk encryption, but it would provide some
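
To make the idea concrete (an illustrative sketch, not the proposed patch; encrypt_block() is a hypothetical helper standing in for a real per-image cipher such as AES-XTS): if the block driver encrypts every buffer before handing it to librados, only ciphertext ever travels from the KVM host to the OSDs:

    #include <rados/librados.h>
    #include <stdlib.h>
    #include <errno.h>

    /* Hypothetical helper: encrypts len bytes of in into out, keyed per
     * image and tweaked by the offset (e.g. AES-XTS). */
    void encrypt_block(char *out, const char *in, size_t len, uint64_t off);

    static int encrypted_write(rados_ioctx_t io, const char *oid,
                               const char *buf, size_t len, uint64_t off)
    {
        char *cipher = (char *) malloc(len);
        if (!cipher)
            return -ENOMEM;
        encrypt_block(cipher, buf, len, off);
        /* Only ciphertext crosses the network and hits the OSD's disk. */
        int r = rados_write(io, oid, cipher, len, off);
        free(cipher);
        return r;
    }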

Re: Ceph and KVM live migration

2012-07-02 Thread Christian Brunner
(untested). Christian From 36314693f8b9be1f3c77621543adf01d7c51cb88 Mon Sep 17 00:00:00 2001 From: Christian Brunner c...@muc.de Date: Tue, 19 Jun 2012 12:23:38 +0200 Subject: [PATCH] libvirt: allow migration for network protocols Live migration should be possible with most (all ?) network protocols
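
With such a patch in place, live migration of an RBD-backed guest becomes the ordinary libvirt invocation (hostnames are examples; both hosts must be able to reach the ceph cluster):

    virsh migrate --live myvm qemu+ssh://desthost/system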

Re: Ceph on btrfs 3.4rc

2012-05-24 Thread Christian Brunner
Same thing here. I've tried really hard, but even after 12 hours I wasn't able to get a single warning from btrfs. I think you cracked it! Thanks, Christian 2012/5/24 Martin Mailand mar...@tuxadero.com: Hi, the ceph cluster is running under heavy load for the last 13 hours without a

Re: Ceph on btrfs 3.4rc

2012-05-22 Thread Christian Brunner
2012/5/21 Miao Xie mi...@cn.fujitsu.com: Hi Josef, On Fri, 18 May 2012 15:01:05 -0400, Josef Bacik wrote: diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 9b9b15f..492c74f 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -57,9 +57,6 @@ struct

Re: Designing a cluster guide

2012-05-21 Thread Christian Brunner
2012/5/20 Tim O'Donovan t...@icukhosting.co.uk: - High performance Block Storage (RBD)   Many large SATA SSDs for the storage (probably in a RAID5 config)   stec zeusram ssd drive for the journal How do you think standard SATA disks would perform in comparison to this, and is a separate

Re: Designing a cluster guide

2012-05-21 Thread Christian Brunner
2012/5/21 Stefan Priebe - Profihost AG s.pri...@profihost.ag: On 2012-05-20 10:31, Christian Brunner wrote: That's exactly what I thought too but then you need a separate ceph / rbd cluster for each type. Which will result in a minimum of: 3x mon servers per type 4x osd servers per type

Re: Designing a cluster guide

2012-05-20 Thread Christian Brunner
2012/5/20 Stefan Priebe s.pri...@profihost.ag: On 2012-05-19 18:15, Alexandre DERUMIER wrote: Hi, For your journal, if you have money, you can use stec zeusram ssd drive. (around 2000€ /8GB / 10 iops read/write with 4k block). I'm using them with zfs san, they rock for journal.

Re: Designing a cluster guide

2012-05-20 Thread Christian Brunner
2012/5/20 Stefan Priebe s.pri...@profihost.ag: On 2012-05-20 10:19, Christian Brunner wrote: - Cheap Object Storage (S3):   Many 3,5'' SATA Drives for the storage (probably in a RAID config)   A small and cheap SSD for the journal - Basic Block Storage (RBD):   Many 2,5'' SATA Drives

Re: is rados block cluster production ready ?

2012-05-18 Thread Christian Brunner
2012/5/18 Alexandre DERUMIER aderum...@odiso.com: Hi, I'm going to build a rados block cluster for my kvm hypervisors. Is it already production ready ? (stable, no crash) We are using 0.45 in production. Recent ceph versions are quite stable (although we had some trouble with excessive

Re: is rados block cluster production ready ?

2012-05-18 Thread Christian Brunner
2012/5/18 Alexandre DERUMIER aderum...@odiso.com: Hi Christian, thanks for your response. We are using 0.45 in production. Recent ceph versions are quite stable (although we had some trouble with excessive logging and a full log partition lately, which caused our cluster to halt). excessive

Re: Ceph on btrfs 3.4rc

2012-05-17 Thread Christian Brunner
2012/5/17 Josef Bacik jo...@redhat.com: On Thu, May 17, 2012 at 05:12:55PM +0200, Martin Mailand wrote: Hi Josef, no there was nothing above. Here the is another dmesg output. Hrm ok give this a try and hopefully this is it, still couldn't reproduce. Thanks, Josef Well, I hate to say it,

[PATCH 1/2] rbd: allow importing from stdin

2012-05-16 Thread Christian Brunner
This patch allows importing images from stdin with the following command: rbd import --size=<size in MB> - [dest-image] Signed-off-by: Christian Brunner c...@muc.de --- src/rbd.cc | 37 + 1 files changed, 25 insertions(+), 12 deletions(-) diff --git a/src
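
In practice this allows streaming an image straight from another tool into rados without a temporary file, e.g. (device and image names are examples):

    dd if=/dev/vg0/vm-disk bs=1M | rbd import --size=10240 - vm-disk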

[PATCH 2/2] rbd: skip empty blocks during import

2012-05-16 Thread Christian Brunner
Check for empty blocks while importing an image. When a read block consists only of zeroes, the import simply skips the write. This way non-sparse images will become sparse rbd images. Signed-off-by: Christian Brunner c...@muc.de --- src/rbd.cc | 44
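
The zero test itself needs no extra state; a minimal sketch of the kind of check involved (assumed form, the real code lives in src/rbd.cc):

    #include <string.h>
    #include <stddef.h>

    // True if buf is entirely zero, i.e. the write can be skipped and the
    // destination rbd image stays sparse at this offset.
    static bool block_is_zero(const char *buf, size_t len)
    {
        if (len == 0)
            return true;
        // buf[0] == 0 plus buf matching itself shifted by one byte
        // implies every byte is zero.
        return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
    }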

[PATCH 1/2 v2] rbd: allow importing from stdin

2012-05-16 Thread Christian Brunner
This patch allows importing images from stdin with the following command: rbd import --size=<size in MB> - [dest-image] v1 -> v2: stat stdin as well Signed-off-by: Christian Brunner c...@muc.de --- src/rbd.cc | 26 +++--- 1 files changed, 19 insertions(+), 7 deletions

Re: Ceph on btrfs 3.4rc

2012-05-04 Thread Christian Brunner
2012/5/3 Josef Bacik jo...@redhat.com: On Thu, May 03, 2012 at 09:38:27AM -0700, Josh Durgin wrote: On Thu, 3 May 2012 11:20:53 -0400, Josef Bacik jo...@redhat.com wrote: On Thu, May 03, 2012 at 08:17:43AM -0700, Josh Durgin wrote: Yeah all that was in the right place, I rebooted and I

Re: Ceph on btrfs 3.4rc

2012-04-30 Thread Christian Brunner
2012/4/29 tsuna tsuna...@gmail.com: On Fri, Apr 20, 2012 at 8:09 AM, Christian Brunner christ...@brunner-muc.de wrote: After running ceph on XFS for some time, I decided to try btrfs again. Performance with the current for-linux-min branch and big metadata is much better. I've heard

Re: Ceph on btrfs 3.4rc

2012-04-25 Thread Christian Brunner
On 24 April 2012 18:26, Sage Weil s...@newdream.net wrote: On Tue, 24 Apr 2012, Josef Bacik wrote: On Fri, Apr 20, 2012 at 05:09:34PM +0200, Christian Brunner wrote: After running ceph on XFS for some time, I decided to try btrfs again. Performance with the current for-linux-min branch

Re: Ceph on btrfs 3.4rc

2012-04-23 Thread Christian Brunner
On 20 April 2012 17:09, Christian Brunner christ...@brunner-muc.de wrote: After running ceph on XFS for some time, I decided to try btrfs again. Performance with the current for-linux-min branch and big metadata is much better. The only problem (?) I'm still seeing is a warning that seems

Inconsistent rbd header

2012-03-16 Thread Christian Brunner
This is probably going in the same direction as the report by Oliver Francke. Ceph is reporting an inconsistent PG. Running a scrub on the PG gave me the following messages: 2012-03-16 12:55:17.287415 log 2012-03-16 12:55:12.179529 osd.14 10.255.0.63:6818/2014 34280 : [ERR] 2.117 osd.0: soid
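
For context: once scrub has flagged a PG inconsistent, the usual next step (after checking which replica holds the bad copy) is to let the primary repair it, e.g. for the PG named in the log above:

    ceph pg repair 2.117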

protocol version mismatch

2012-03-02 Thread Christian Brunner
Hi, I've just updated our ceph-cluster from 0.39 to 0.42.2. Now I'm seeing a lot of messages like these in the OSD log: Mar 2 10:15:43 os00 osd.003[15096]: 7f0b9831c700 -- 10.255.0.60:6821/15095 >> 10.255.0.61:6812/22474 pipe(0x2e74100 sd=47 pgs=0 cs=0 l=0).connect protocol version mismatch, my 9

Re: about attaching rbd volume from instance on KVM

2012-02-06 Thread Christian Brunner
Libvirt is trying to set security labels even for network shares. This will not work. I think this is fixed in newer libvirt versions. For older versions you can try this patch: http://www.redhat.com/archives/libvir-list/2011-May/msg01446.html Regards, Christian 2012/2/4 Masuko Tomoya
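
In newer libvirt versions the same effect can also be achieved per domain, without patching, by turning labeling off in the domain XML (assuming a libvirt build with <seclabel> support):

    <seclabel type='none'/>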

ceph on XFS

2012-01-27 Thread Christian Brunner
Hi, reading the list archives, I get the impression that XFS is the second best alternative to btrfs. But when I start a ceph-osd on an XFS volume, there is still a big warning: WARNING: not btrfs or ext3. We don't currently support file systems other than btrfs and ext3

Btrfs slowdown with ceph (how to reproduce)

2012-01-20 Thread Christian Brunner
As you might know, I have been seeing btrfs slowdowns in our ceph cluster for quite some time. Even with the latest btrfs code for 3.3 I'm still seeing these problems. To make things reproducible, I've now written a small test that imitates ceph's behavior: On a freshly created btrfs filesystem
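
The general shape of such a test (a rough sketch under assumptions, not Christian's actual program): many small synced writes plus periodic snapshots, which is roughly what a ceph OSD does to its filestore:

    # rough sketch: small synced writes + periodic snapshots on a fresh btrfs
    mkfs.btrfs /dev/sdX
    mount /dev/sdX /mnt/test
    btrfs subvolume create /mnt/test/current
    i=0
    while [ $i -lt 100000 ]; do
        dd if=/dev/zero of=/mnt/test/current/obj.$((i % 500)) \
           bs=4k count=1 conv=fsync 2>/dev/null
        if [ $((i % 1000)) -eq 0 ]; then
            btrfs subvolume snapshot /mnt/test/current /mnt/test/snap.$i
        fi
        i=$((i + 1))
    done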

Re: BTRFS Warning

2011-12-21 Thread Christian Brunner
2011/12/21 Jens Rehpöhler jens.rehpoeh...@filoo.de: On 2011-12-19 19:51, Gregory Farnum wrote: On Mon, Dec 19, 2011 at 12:28 AM, Jens Rehpöhler jens.rehpoeh...@filoo.de wrote: Good morning!! I got the following warning as soon as I use btrfs as underlying filesystem (for stability

Re: Understanding Ceph

2011-12-18 Thread Christian Brunner
Hi Bill, 2011/12/18 Bill Hastings bllhasti...@gmail.com: I am trying to get my feet wet with Ceph and RADOS. My aim is to use it as a block device for KVM instances. My understanding is that virtual disks get striped at 1 MB boundaries by default. Does that mean that there are going to be
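
As a worked example of what that striping means for object counts (using the 4 MB object size commonly cited as the rbd default; the 1 MB figure above may stem from a non-default order): a 40 GB image maps to 40960 MB / 4 MB = 10240 rados objects, and the object holding byte offset X is simply object number floor(X / 4 MiB) under the image's object prefix.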

Re: libvirtd + rbd - stale kvm after migrate

2011-12-08 Thread Christian Brunner
Hi Florian, live migration with rbd images usually works fine. A few recommendations: - You should not map the image on the host, while using it in a VM with the qemu driver. - For testing I would remove the ISO-Image from your VM. (Not sure if that matters). Also I'm not using cephx

Re: ceph and ext4

2011-12-08 Thread Christian Brunner
2011/11/15 Andreas Dilger adil...@whamcloud.com: Coincidentally, we have someone working on those patches again. The main obstacle for accepting the previous patch as-is was that Ted wanted to add support for medium-sized xattrs that are addressed as a string of blocks, instead of via an

Re: 'ceph -w' in 0.39

2011-12-07 Thread Christian Brunner
2011/12/5 Sage Weil s...@newdream.net: Hi Christian, On Mon, 5 Dec 2011, Christian Brunner wrote: I've just updated to 0.39. Everything seems to be fine, except one minor thing I noticed: 'ceph -w' output stops after a few minutes. With debug ms = 1 it ends with these lines: 2011-12-05

'ceph -w' in 0.39

2011-12-05 Thread Christian Brunner
I've just updated to 0.39. Everything seems to be fine, except one minor thing I noticed: 'ceph -w' output stops after a few minutes. With debug ms = 1 it ends with these lines: 2011-12-05 14:45:52.939300 7fc700637700 -- 10.255.0.21:0/14145 <== mon.2 10.255.0.22:6789/0 315

pginfo updates

2011-11-24 Thread Christian Brunner
I'm running a btrfs-debug patch on one of our nodes. This patch prints calls to btrfs_orphan_add. I'm still waiting for the problem the patch was intended to trace, but in the logs I found something ceph-related I don't understand: When I look at the btrfs_orphan_add messages there is one inode

Re: pginfo updates

2011-11-24 Thread Christian Brunner
write to the PG. I'm surprised that one PG has so much more activity than the others, but that's why that inode has so much activity. What are you doing with this installation, and roughly how many PGs are on the node? On Thu, Nov 24, 2011 at 3:33 AM, Christian Brunner c...@muc.de wrote: I'm

Re: pginfo updates

2011-11-24 Thread Christian Brunner
Sorry, just one more question: Why is pginfo truncated every time a write is performed to the PG? Thanks, Christian 2011/11/24 Christian Brunner c...@muc.de: Hmmm, OK. - This really seems to belong to one virtual machine that is writing to a single object over and over again. Thanks

Re: pginfo updates

2011-11-24 Thread Christian Brunner
2011/11/24 Christian Brunner c...@muc.de: Sorry, just one more question: Why is pginfo truncated every time a write is performed to the PG? Forget about my question. Sorry for the noise, Christian -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body

Recommended btrfs mount options

2011-11-22 Thread Christian Brunner
Reading the latest pull request by Chris Mason, I was wondering about the recommended mount options for an OSD filesystem. In the past I came across the following btrfs options that were used in conjunction with ceph: - nodatacow: To avoid fragmentation. I think this one makes sense when you
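
For reference, such options end up either on the mount command line or in ceph.conf via 'osd mount options btrfs'; an illustrative (not recommended-as-is) invocation:

    mount -o noatime,nodatacow /dev/sdb1 /srv/osd.0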

Re: btrfs-work master repo

2011-11-14 Thread Christian Brunner
] ---[ end trace 6f256945fc353904 ]--- To aid debugging, I can also send you an ftrace with btrfs events and function_graph enabled. Just tell me if you need it. Thanks, Christian 2011/11/13 Christian Brunner c...@muc.de: Hi Josef, I've patched one of our ceph nodes with the stuff in your master repo

btrfs-work master repo

2011-11-13 Thread Christian Brunner
Hi Josef, I've patched one of our ceph nodes with the stuff in your master repo (including lxo's patches). The WARNING in inode.c seems to be gone now, but load is still going up after a day. Another thing I've witnessed is that I'm getting a new warning when I umount the filesystem (see

Re: OSD hit suicide timeout

2011-11-11 Thread Christian Brunner
2011/11/11 Sage Weil s...@newdream.net: Hi Christian, Do you have a core file?  Can you dump the thread stack traces so we can see if it got hung up on a syscall or somewhere internally (thread apply all bt)? I'm not sure if it's from btrfs, but there is no kernel warning at that time. I'm

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-27 Thread Christian Brunner
2011/10/27 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 01:56:48PM +0200, Christian Brunner wrote: 2011/10/24 Josef Bacik jo...@redhat.com: On Mon, Oct 24, 2011 at 10:06:49AM -0700, Sage Weil wrote: [adding linux-btrfs to cc] Josef, Chris, any ideas on the below issues

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-26 Thread Christian Brunner
2011/10/26 Sage Weil s...@newdream.net: On Wed, 26 Oct 2011, Christian Brunner wrote: Christian, have you tweaked those settings in your ceph.conf?  It would be something like 'journal dio = false'.  If not, can you verify that directio shows true when the journal is initialized from

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-26 Thread Christian Brunner
2011/10/26 Christian Brunner c...@muc.de: 2011/10/25 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 04:15:45PM -0400, Chris Mason wrote: On Tue, Oct 25, 2011 at 11:05:12AM -0400, Josef Bacik wrote: On Tue, Oct 25, 2011 at 04:25:02PM +0200, Christian Brunner wrote: Attached

PG stuck in scrubbing

2011-10-25 Thread Christian Brunner
Here is another problem I've seen. Unfortunately I do not have any debug output and it's not reproducible. While removing an image with rbd rm I noticed that rbd stopped making progress. When I looked with ceph -w I saw a PG that was in state active+clean+scrubbing: 2011-10-25 14:01:34.198961

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-25 Thread Christian Brunner
2011/10/25 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 01:56:48PM +0200, Christian Brunner wrote: 2011/10/24 Josef Bacik jo...@redhat.com: On Mon, Oct 24, 2011 at 10:06:49AM -0700, Sage Weil wrote: [adding linux-btrfs to cc] Josef, Chris, any ideas on the below issues

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-25 Thread Christian Brunner
2011/10/25 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 04:25:02PM +0200, Christian Brunner wrote: 2011/10/25 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 01:56:48PM +0200, Christian Brunner wrote: [...] In our Ceph-OSD server we have 4 disks with 4 btrfs filesystems

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-25 Thread Christian Brunner
2011/10/25 Sage Weil s...@newdream.net: On Tue, 25 Oct 2011, Josef Bacik wrote: At this point it seems like the biggest problem with latency in ceph-osd is not related to btrfs, the latency seems to all be from the fact that ceph-osd is fsyncing a block dev for whatever reason. There is one

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-25 Thread Christian Brunner
2011/10/25 Josef Bacik jo...@redhat.com: On Tue, Oct 25, 2011 at 04:15:45PM -0400, Chris Mason wrote: On Tue, Oct 25, 2011 at 11:05:12AM -0400, Josef Bacik wrote: On Tue, Oct 25, 2011 at 04:25:02PM +0200, Christian Brunner wrote: Attached is a perf-report. I have included the whole

Re: ceph on non-btrfs file systems

2011-10-24 Thread Christian Brunner
Thanks for explaining this. I don't have any objections against btrfs as an OSD filesystem. Even the fact that there is no btrfs-fsck doesn't scare me, since I can use the ceph replication to recover a lost btrfs filesystem. The only problem I have is that btrfs is not stable on our side and I

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-24 Thread Christian Brunner
2011/10/24 Chris Mason chris.ma...@oracle.com: On Mon, Oct 24, 2011 at 03:51:47PM -0400, Josef Bacik wrote: On Mon, Oct 24, 2011 at 10:06:49AM -0700, Sage Weil wrote: [adding linux-btrfs to cc] Josef, Chris, any ideas on the below issues? On Mon, 24 Oct 2011, Christian Brunner wrote

rbd snap rm error

2011-10-21 Thread Christian Brunner
When I want to delete a snapshot, everything is working, but there is a strange error message: # rbd create --size=100 image # rbd snap create --snap=image-snap1 image # rbd snap ls image 8 image-snap1 104857600 # rbd snap rm --snap=image-snap1 image 2011-10-21 11:35:58.506393

Re: OSD blocked for more than 120 seconds

2011-10-17 Thread Christian Brunner
2011/10/15 Martin Mailand mar...@tuxadero.com: Hi Christian, I have a very similar experience, I also used josef's tree and btrfs snaps = 0, the next problem I had then was excessive fragmentation, so I used this patch http://marc.info/?l=linux-btrfs&m=131495014823121&w=2, and changed the

Re: OSD blocked for more than 120 seconds

2011-10-17 Thread Christian Brunner
, Christian Brunner wrote: Our bug report with RedHat didn't make any progress for a long time, but last week RedHat made two suggestions: - If you configure ceph with 'filestore flusher = false', do you see any different behavior? - If you mount with -o noauto_da_alloc does it change anything

Re: OSD blocked for more than 120 seconds

2011-10-15 Thread Christian Brunner
I'm not seeing the same problem, but I've experienced something similar: As you might know, I had serious performance problems with btrfs some months ago; after that, I switched to ext4 and had other problems there. Last Saturday I decided to give josef's current btrfs git repo a try in our ceph

Re: Btrfs High IO-Wait

2011-10-11 Thread Christian Brunner
I think this is related to the sync issues. You could try josef's git tree: git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git Since yesterday I'm using it in our ceph cluster and it seems to do a better job. Regards, Christian 2011/10/9 Martin Mailand mar...@tuxadero.com:

OSD crashes

2011-10-11 Thread Christian Brunner
Here is another one... I've now run mkcephfs and started importing all our data from a backup. However after two days, two of our OSDs are crashing right after the start again. It all started with a 'hit suicide timeout'. Now I can't start it any longer. Here is what I have in the logs. I'm

OSD: no current directory

2011-10-11 Thread Christian Brunner
Maybe this one is easier: One of our OSDs isn't starting, because there is no current directory. What I have are three snap directories. total 0 -rw-r--r-- 1 root root 37 Oct 9 15:57 ceph_fsid -rw-r--r-- 1 root root  8 Oct 9 15:57 fsid -rw-r--r-- 1 root root 21 Oct 9 15:57 magic

Re: OSD: no current directory

2011-10-11 Thread Christian Brunner
2011/10/11 Sage Weil s...@newdream.net: On Tue, 11 Oct 2011, Christian Brunner wrote: Maybe this one is easier: One of our OSDs isn't starting, because there is no current directory. What I have are three snap directories. total 0 -rw-r--r-- 1 root root   37 Oct  9 15:57 ceph_fsid -rw-r--r

Re: OSD: no current directory

2011-10-11 Thread Christian Brunner
2011/10/11 Sage Weil s...@newdream.net: On Tue, 11 Oct 2011, Christian Brunner wrote: 2011/10/11 Sage Weil s...@newdream.net: On Tue, 11 Oct 2011, Christian Brunner wrote: Maybe this one is easier: One of our OSDs isn't starting, because there is no current directory. What I have

OSD::disk_tp timeout

2011-10-08 Thread Christian Brunner
Hi, I've upgraded ceph from 0.32 to 0.36 yesterday. Now I have a totally screwed ceph cluster. :( What bugs me most is the fact that OSDs become unresponsive frequently. The process is eating a lot of cpu and I can see the following messages in the log: Oct  8 22:30:05 os00 osd.000[31688]:

Re: [PATCH 2/2] rbd: add an option for md5 checksumming (v2)

2011-08-26 Thread Christian Brunner
2011/8/25 Yehuda Sadeh Weinraub yehuda.sa...@dreamhost.com: +  if (buf) { +    dif = ofs-lastofs; +    if (dif > 0) { +      byte *tempbuf = (byte *) malloc(dif); +      memset(tempbuf, 0, dif); +      Hash->Update((const byte *) tempbuf, dif); +      free(tempbuf); +    } + +    

Re: [PATCH 2/2] rbd: add an option for md5 checksumming (v3)

2011-08-26 Thread Christian Brunner
2011/8/26 Yehuda Sadeh Weinraub yehud...@gmail.com: On Fri, Aug 26, 2011 at 11:51 AM, Christian Brunner c...@muc.de wrote: +static int hash_read_cb(uint64_t ofs, size_t len, const char *buf, void *arg) +{ +  ceph::crypto::Digest *Hash = (ceph::crypto::Digest *)arg; +  byte *hashbuf = (byte

Re: [PATCH 2/2] rbd: add an option for md5 checksumming (v3)

2011-08-26 Thread Christian Brunner
2011/8/26 Tommi Virtanen tommi.virta...@dreamhost.com: On Fri, Aug 26, 2011 at 12:25, Yehuda Sadeh Weinraub yehud...@gmail.com wrote: On Fri, Aug 26, 2011 at 12:10 PM, Tommi Virtanen tommi.virta...@dreamhost.com wrote: e.g. 8kB at a time. And at that point you might as well just use read

[PATCH 2/2] rbd: add an option for md5 checksumming (v4)

2011-08-26 Thread Christian Brunner
We needed to get an md5 checksum of an rbd image. Since librbd is using a lot of sparse operations, this was not possible without writing an image to a local disk. With this patch exporting the image is no longer needed. You can do rbd md5 <image> and you will get the same output as you would call
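
The sparse handling is the interesting bit (see the v2 hunk quoted earlier in this list): holes between extents are hashed as runs of zeroes, so the digest matches what md5sum would report on a full export. A condensed sketch of that logic, assuming the ceph::crypto::Digest interface from PATCH 1/2 and the byte type used in the quoted hunks:

    #include <stdlib.h>
    #include <stdint.h>
    #include "common/ceph_crypto.h"   // ceph::crypto::Digest (PATCH 1/2)

    // Hash one extent at offset ofs; lastofs tracks the end of the
    // previous extent so the gap can be fed to the digest as zeroes.
    static void hash_extent(ceph::crypto::Digest *hash, uint64_t ofs,
                            const char *buf, size_t len, uint64_t *lastofs)
    {
        if (ofs > *lastofs) {
            size_t gap = ofs - *lastofs;
            byte *zeroes = (byte *) calloc(1, gap);  // a hole reads back as zeroes
            hash->Update((const byte *) zeroes, gap);
            free(zeroes);
        }
        hash->Update((const byte *) buf, len);
        *lastofs = ofs + len;
    }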

[PATCH 1/2] extended crypto classes

2011-08-25 Thread Christian Brunner
This patch extends the ceph crypto classes: - map CryptoPP::HashTransformation to ceph::crypto::Digest (for cryptopp) - add DigestSize() to ceph::crypto::Digest (for libnss) Thanks, Christian --- src/common/ceph_crypto.h |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff

[PATCH 2/2] rbd: add an option for md5 checksumming (v2)

2011-08-25 Thread Christian Brunner
We needed to get an md5 checksum of an rbd image. Since librbd is using a lot of sparse operations, this was not possible without writing an image to a local disk. With this patch exporting the image is no longer needed. You can do rbd md5 <image> and you will get the same output as you would call

[PATCH] rbd: add an option for md5 checksumming

2011-08-23 Thread Christian Brunner
We needed to get an md5 checksum of an rbd image. Since librbd is using a lot of sparse operations, this was not possible without writing an image to a local disk. With this patch exporting the image is no longer needed. You can do rbd md5 <image> and you will get the same output as you would call

Re: rbd snapshot copy

2011-08-23 Thread Christian Brunner
Hi Sage, I saw your patches in the git repository and have now tested them. Everything is looking good so far. Thanks, Christian 2011/8/19 Tommi Virtanen tommi.virta...@dreamhost.com: On Fri, Aug 19, 2011 at 01:46, Christian Brunner c...@muc.de wrote: I tried using rbd snapshots for the first

Re: Kernel 3.0.0 + ext4 + ceph == ...

2011-08-18 Thread Christian Brunner
also provide an e2image with the metadata and the strace output of the cosd, if this would be helpful. Regards, Christian 2011/8/8 Christian Brunner c...@muc.de: I tried 3.0.1 today, which contains the commit Theodore suggested and was no longer able to reproduce the problem. So I think

Re: Btrfs slowdown

2011-08-09 Thread Christian Brunner
Christian 2011/8/8 Sage Weil s...@newdream.net: Hi Christian, Are you still seeing this slowness? sage On Wed, 27 Jul 2011, Christian Brunner wrote: 2011/7/25 Chris Mason chris.ma...@oracle.com: Excerpts from Christian Brunner's message of 2011-07-25 03:54:47 -0400: Hi, we are running

Re: Kernel 3.0.0 + ext4 + ceph == ...

2011-08-08 Thread Christian Brunner
, Christian Brunner c...@muc.de wrote: ... I tried to reproduce this without ceph, but wasn't able to... In the meantime it seems that I can also see the side effects on the librbd side: I get an "librbd: data error!" when I do an rbd copy. When I look at the librbd code this is related

Re: Kernel 3.0.0 + ext4 + ceph == ...

2011-08-03 Thread Christian Brunner
2011/8/1 Sage Weil s...@newdream.net: On Mon, 1 Aug 2011, Theodore Tso wrote: On Jul 31, 2011, at 4:42 PM, Sage Weil wrote: This is link(2) 2011-07-31 23:06:50.114316 7f23c048c700 filestore(/osd.0) collection_remove temp/1000483.05d6/head = 0 This is unlink(2)

Re: rados callbacks

2011-08-01 Thread Christian Brunner
2011/8/1 Sage Weil s...@newdream.net: On Mon, 1 Aug 2011, Gregory Farnum wrote: On Mon, Aug 1, 2011 at 8:23 AM, Yehuda Sadeh Weinraub yehuda.sa...@dreamhost.com wrote: The osd first sends the ack when it receives the request Actually it sends an ack when the request is applied in-memory to

Re: Kernel 3.0.0 + ext4 + ceph == ...

2011-07-30 Thread Christian Brunner
This is also reproducible with a RHEL6.1 kernel (2.6.32-131.6.1.el6.x86_64). :( Regards, Christian 2011/7/30 Fyodor Ustinov u...@ufm.su: fail. Epic fail. Absolutely reproducible. I have a ceph cluster with this configuration: 8 physical servers 14 osd servers. Each osd server has

Re: Kernel 3.0.0 + ext4 + ceph == ...

2011-07-30 Thread Christian Brunner
2011/7/30 Theodore Tso ty...@mit.edu: P.S. Ext4 in the RHEL 6.x kernel is, I believe, a technology preview, which means it might not get full support from Red Hat. Still, if you do see those sorts of problems with ext4 on a RHEL kernel, and you have a valid Red Hat support contract, I'd

FAILED assert(pg_map.count(pgid))

2011-07-30 Thread Christian Brunner
I shifted all my data to new filesystems. I did this by using the ceph replication. During that process other cosd processes (those that had to send the replication data) died several times. I was able to restart them, but I think this should not happen. Here is one log message... (they all

Re: Btrfs slowdown

2011-07-28 Thread Christian Brunner
2011/7/28 Marcus Sorensen shadow...@gmail.com: Christian, Have you checked up on the disks themselves and hardware? High utilization can mean that the i/o load has increased, but it can also mean that the i/o capacity has decreased.  Your traces seem to indicate that a good portion of the

Filesystems for ceph

2011-07-27 Thread Christian Brunner
We are having quite some problems with the underlying filesystem for the cosd's and I would like to hear about other experiences. Here is what we have gone through so far: btrfs with 2.6.38: - good performance - frequently hitting various BUG_ON conditions btrfs with 2.6.39: - big

Re: peering PGs

2011-07-26 Thread Christian Brunner
, Christian 2011/7/26 Christian Brunner c...@muc.de: Another kernel crash, another invalid ceph state... A memory allocation failure in the kernel (ixgbe) of one OSD-Server led to a domino effect in our ceph cluster with 0 up, 0 in. When I restarted the cluster everything came up again. But I

Btrfs slowdown

2011-07-25 Thread Christian Brunner
Hi, we are running a ceph cluster with btrfs as its base filesystem (kernel 3.0). At the beginning everything worked very well, but after a few days (2-3) things are getting very slow. When I look at the object store servers I see heavy disk-i/o on the btrfs filesystems (disk utilization is

Re: FW: crashed+peering PGs

2011-07-22 Thread Christian Brunner
On 07/19/2011 08:41 AM, Christian Brunner wrote: 2011/7/18 Sage Weils...@newdream.net: On Mon, 18 Jul 2011, Christian Brunner wrote: $ ceph pg dump -o - | grep crashed pg_stat objects mip     degr    unf     kb      bytes   log disklog state   v       reported        up      acting

Re: FW: crashed+peering PGs

2011-07-19 Thread Christian Brunner
2011/7/18 Sage Weil s...@newdream.net: On Mon, 18 Jul 2011, Christian Brunner wrote: $ ceph pg dump -o - | grep crashed pg_stat objects mip     degr    unf     kb      bytes   log disklog state   v       reported        up      acting  last_scrub 1.1ac   0       0       0       0

Re: Delayed inode operations not doing the right thing with enospc

2011-07-14 Thread Christian Brunner
2011/7/13 Josef Bacik jo...@redhat.com: On 07/12/2011 11:20 AM, Christian Brunner wrote: 2011/6/7 Josef Bacik jo...@redhat.com: On 06/06/2011 09:39 PM, Miao Xie wrote: On Fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote: I got a lot of these when running stress.sh on my test box

Re: Delayed inode operations not doing the right thing with enospc

2011-07-12 Thread Christian Brunner
2011/6/7 Josef Bacik jo...@redhat.com: On 06/06/2011 09:39 PM, Miao Xie wrote: On Fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote: I got a lot of these when running stress.sh on my test box. This is because use_block_rsv() is having to do a reserve_metadata_bytes(), which shouldn't

Re: qemu librbd patches

2011-05-25 Thread Christian Brunner
I'm sorry. At the moment I don't have a core file available and my colleagues don't allow me to reproduce it. ;-) We should get a new server in one or two weeks. I will try to get a core then. Christian 2011/5/24 Josh Durgin josh.dur...@dreamhost.com: On 05/09/2011 01:59 AM, Christian Brunner

Re: [Qemu-devel] [PATCH v4 1/4] rbd: use the higher level librbd instead of just librados

2011-05-25 Thread Christian Brunner
Apart from two cosmetic issues (see below), I think this patch is ready to replace the old rbd driver. You can add: Reviewed-by: Christian Brunner c...@muc.de Regards Christian 2011/5/24 Josh Durgin josh.dur...@dreamhost.com: librbd stacks on top of librados to provide access to rbd images

Re: [Qemu-devel] [PATCH v4 2/4] rbd: allow configuration of rados from the rbd filename

2011-05-25 Thread Christian Brunner
Looks good to me: Reviewed-by: Christian Brunner c...@muc.de 2011/5/24 Josh Durgin josh.dur...@dreamhost.com: The new format is rbd:pool/image[@snapshot][:option1=value1[:option2=value2...]] Each option is used to configure rados, and may be any Ceph option, or conf. The conf option
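
A concrete instance of the quoted format, combining a snapshot with the conf option and one rados option (pool, image and path are examples):

    qemu-img info rbd:mypool/myimage@snap1:conf=/etc/ceph/ceph.conf:auth_supported=none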

Fwd: [Qemu-devel] [PATCH] block/rbd: Remove unused local variable

2011-05-23 Thread Christian Brunner
-- From: Kevin Wolf kw...@redhat.com Date: 2011/5/23 Subject: Re: [Qemu-devel] [PATCH] block/rbd: Remove unused local variable To: c...@muc.de Cc: QEMU Developers qemu-de...@nongnu.org On 2011-05-23 11:01, Christian Brunner wrote: 2011/5/22 Stefan Weil w...@mail.berlios.de: On 2011-05-07 22:15

Re: rbd export from corrupted cluster

2011-05-03 Thread Christian Brunner
2011/5/2 Sage Weil s...@newdream.net: On Mon, 2 May 2011, Christian Brunner wrote: after a series of hardware defects, I have a corrupted ceph cluster: 2011-05-02 18:12:31.038446    pg v8171648: 3712 pgs: 26 active, 3663 active+clean, 5 crashed+peering, 18 active+clean+inconsistent; 547 GB

rbd export from corrupted cluster

2011-05-02 Thread Christian Brunner
Hi, after a series of hardware defects, I have a corrupted ceph cluster: 2011-05-02 18:12:31.038446pg v8171648: 3712 pgs: 26 active, 3663 active+clean, 5 crashed+peering, 18 active+clean+inconsistent; 547 GB data, 388 GB used, 51922 GB / 78245 GB avail; 2410/284300 degraded (0.848%) Now I

Re: rbd layering

2011-02-03 Thread Christian Brunner
2011/2/2 Gregory Farnum gr...@hq.newdream.net: On Wed, Feb 2, 2011 at 9:47 AM, Sage Weil s...@newdream.net wrote: When I mentioned allocation bitmap before, I meant simply a bitmap specifying whether the block exists, that would let us avoid looking for an object in the parent image.  In its

[PATCH] ceph/rbd block driver for qemu-kvm (v9)

2010-12-06 Thread Christian Brunner
...@hq.newdream.net Signed-off-by: Christian Brunner c...@muc.de --- Makefile.objs |1 + block/rbd.c | 1059 + block/rbd_types.h | 71 configure | 52 +++ 4 files changed, 1183 insertions(+), 0 deletions(-) create mode 100644
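
Once built in, the driver is reached through the rbd: protocol prefix; typical usage looks like this (pool and image names are examples):

    qemu-img create -f rbd rbd:rbd/test 10G
    qemu-system-x86_64 -drive file=rbd:rbd/test,if=virtio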

Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v8)

2010-12-06 Thread Christian Brunner
2010/12/6 Kevin Wolf kw...@redhat.com: Hi Kevin, This lacks a Signed-off-by. Please merge Yehuda's fix for configure when you resend the patch. I've sent an updated patch. What's the easiest way to try it out? I tried to use vstart.sh and copy the generated ceph.conf to /etc/ceph/ceph.conf

Re: crashed+down+peering

2010-12-02 Thread Christian Brunner
Hi Sage, 2010/12/2 Sage Weil s...@newdream.net: Hi Christian, On Thu, 2 Dec 2010, Christian Brunner wrote: We have simulated the simultaneous crash of multiple osds in our environment. After starting all the cosd again, we have the following situation: 2010-12-02 16:18:33.944436    pg

Re: claims to be ... - wrong node!

2010-11-15 Thread Christian Brunner
: On Fri, 12 Nov 2010, Christian Brunner wrote: Presumably I'm doing something wrong here, but I don't have a clue what to... After restarting one of our osd-servers I get the following messages in the cosd-log: 2010-11-12 10:24:31.965058 7f5bac380710 -- 10.255.0.60:6802/17175 >> 10.255.0.60:6800

Re: [PATCH] ceph/rbd block driver for qemu-kvm (v7)

2010-11-15 Thread Christian Brunner
librados a lot better than me. I'm pretty sure that they will give some feedback about this remaining issue. After that we will send an updated patch. Regards, Christian 2010/11/11 Stefan Hajnoczi stefa...@gmail.com: On Fri, Oct 15, 2010 at 8:54 PM, Christian Brunner c...@muc.de wrote

Re: AW: ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))

2010-11-05 Thread Christian Brunner
in start_thread () from /lib64/libpthread.so.0 #25 0x003c0c0e151d in clone () from /lib64/libc.so.6 I hope this helps. Regards, Christian 2010/11/4 Sage Weil s...@newdream.net: Hi Christian, On Tue, 26 Oct 2010, Christian Brunner wrote: I can't promise this for tomorrow, but I think I

./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))

2010-10-26 Thread Christian Brunner
When accessing multiple RBD-Volumes from one VM in parallel, we are receiving an assertion: ./osd/OSDMap.h: In function 'entity_inst_t OSDMap::get_inst(int)': ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd)) ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279) 1:

AW: ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))

2010-10-26 Thread Christian Brunner
Yes, one OSD crashed approximately an hour before this was happening. Christian -----Original Message----- From: gr...@hq.newdream.net [mailto:ceph-devel-ow...@vger.kernel.org] On behalf of Gregory Farnum Sent: Tuesday, 26 October 2010 20:00 To: Christian Brunner Cc: ceph-devel

AW: ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))

2010-10-26 Thread Christian Brunner
I can't promise this for tomorrow, but I think I can do this on Thursday. Christian -----Original Message----- From: Sage Weil [mailto:s...@newdream.net] Sent: Tuesday, 26 October 2010 21:09 To: Christian Brunner Cc: ceph-devel@vger.kernel.org Subject: Re: ./osd/OSDMap.h:460: FAILED

AW: osd/ReplicatedPG.cc:2403: FAILED assert(!missing.is_missing(soid))

2010-10-26 Thread Christian Brunner
Does someone know which commit this is? I don't want to switch to the unstable branch at the moment. Christian -----Original Message----- From: Smets, Jan (Jan) [mailto:jan.sm...@alcatel-lucent.com] Sent: Tuesday, 26 October 2010 14:37 To: Christian Brunner Subject: RE: osd

Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-21 Thread Christian Brunner
drivers nbd, rbd and sheepdog To: Christian Brunner c...@muc.de, Laurent Vivier laur...@vivier.eu, MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Cc: qemu-de...@nongnu.org Hi all, I'm currently looking into adding a return value to qemu's bdrv_flush function and I noticed that your block drivers

Re: is RBD supported in qemu 0.13?

2010-10-20 Thread Christian Brunner
The patch has not yet been accepted by the qemu-kvm maintainers, so the driver is not included in qemu 0.13. We have submitted an updated patch to the qemu list, but there was no feedback. Regards, Christian On Wed, Oct 20, 2010 at 04:22:58PM +0800, Xiaoguang Liu wrote: just saw qemu 0.13
