OSD crash on 0.48.2argonaut

2012-11-14 Thread Eric_YH_Chen
Dear All: I met this issue on one of the OSD nodes. Is this a known issue? Thanks!
ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)
 1: /usr/bin/ceph-osd() [0x6edaba]
 2: (()+0xfcb0) [0x7f08b112dcb0]
 3: (gsignal()+0x35) [0x7f08afd09445]
 4: (abort()+0x17b) [0x7f08afd0cbab

Re: changed rbd cp behavior in 0.53

2012-11-14 Thread Andrey Korolyov
On Thu, Nov 15, 2012 at 4:56 AM, Dan Mick wrote: > > > On 11/12/2012 02:47 PM, Josh Durgin wrote: >> >> On 11/12/2012 08:30 AM, Andrey Korolyov wrote: >>> >>> Hi, >>> >>> For this version, rbd cp assumes that the destination pool is the same as >>> the source, not 'rbd', if the pool in the destination path is

Re: Small feature request for v0.55 release

2012-11-14 Thread Nick Bartos
My personal preference would be for ${name}-${version}.tar.bz2 as well, but 2nd place would be ${name}-stable-${version}.tar.bz2. On Wed, Nov 14, 2012 at 3:47 PM, Tren Blackburn wrote: > On Wed, Nov 14, 2012 at 3:40 PM, Jimmy Tang wrote: >> >> On 14 Nov 2012, at 16:14, Sage Weil wrote: >> >>> >

Re: problem with ceph and btrfs patch: set journal_info in async trans commit worker

2012-11-14 Thread Miao Xie
Hi, Stefan. On Wed, 14 Nov 2012 14:42:07 +0100, Stefan Priebe - Profihost AG wrote: > Hello list, > > I wanted to try out ceph with the latest vanilla kernel 3.7-rc5. I was seeing a > massive performance degradation. I see around 22 btrfs-endio-write processes > every 10-20 seconds and they run a lon

Re: Authorization issues in the 0.54

2012-11-14 Thread Yehuda Sadeh
On Wed, Nov 14, 2012 at 4:20 AM, Andrey Korolyov wrote: > Hi, > In 0.54, cephx is probably broken somehow: > > $ ceph auth add client.qemukvm osd 'allow *' mon 'allow *' mds 'allow > *' -i qemukvm.key > 2012-11-14 15:51:23.153910 7ff06441f780 -1 read 65 bytes from qemukvm.key > added key for cl

Re: changed rbd cp behavior in 0.53

2012-11-14 Thread Dan Mick
On 11/12/2012 02:47 PM, Josh Durgin wrote: On 11/12/2012 08:30 AM, Andrey Korolyov wrote: Hi, For this version, rbd cp assumes that the destination pool is the same as the source, not 'rbd', if the pool in the destination path is omitted. rbd cp install/img testimg rbd ls install img testimg Is this c

Re: [PATCH] make mkcephfs and init-ceph osd filesystem handling more flexible

2012-11-14 Thread Sage Weil
Hi Danny, Have you had a chance to work on this? I'd like to include this in bobtail. If you don't have time we can go ahead and implement it, but I'd like to avoid duplicating effort. Thanks! sage On Fri, 2 Nov 2012, Danny Al-Gaaf wrote: > Hi Sage, > > sorry for the late reply, was absent som

Re: Small feature request for v0.55 release

2012-11-14 Thread Tren Blackburn
On Wed, Nov 14, 2012 at 3:40 PM, Jimmy Tang wrote: > > On 14 Nov 2012, at 16:14, Sage Weil wrote: > >> >> Appending the codename to the version string is something we did with >> argonaut (0.48argonaut) just to make it obvious to users which stable >> version they are on. >> >> How do people feel

Re: Small feature request for v0.55 release

2012-11-14 Thread Tren Blackburn
On Wed, Nov 14, 2012 at 1:53 PM, Nick Bartos wrote: > I see that v0.55 will be the next stable release. Would it be > possible to use standard tarball naming conventions for this release? > > If I download http://ceph.com/download/ceph-0.48.2.tar.bz2, the top > level directory is actually ceph-0.

Re: Small feature request for v0.55 release

2012-11-14 Thread Jimmy Tang
On 14 Nov 2012, at 16:14, Sage Weil wrote: > > Appending the codename to the version string is something we did with > argonaut (0.48argonaut) just to make it obvious to users which stable > version they are on. > > How do people feel about that? Is it worthwhile? Useless? Ugly? > > We ca

Re: Small feature request for v0.55 release

2012-11-14 Thread Sage Weil
On Wed, 14 Nov 2012, Nick Bartos wrote: > I see that v0.55 will be the next stable release. Would it be > possible to use standard tarball naming conventions for this release? > > If I download http://ceph.com/download/ceph-0.48.2.tar.bz2, the top > level directory is actually ceph-0.48.2argonaut

Small feature request for v0.55 release

2012-11-14 Thread Nick Bartos
I see that v0.55 will be the next stable release. Would it be possible to use standard tarball naming conventions for this release? If I download http://ceph.com/download/ceph-0.48.2.tar.bz2, the top level directory is actually ceph-0.48.2argonaut, not ceph-0.48.2 as expected. Downloading http:/

[PATCH 4/4] rbd: use a common layout for each device

2012-11-14 Thread Alex Elder
Each osd message includes a layout structure, and for rbd it is always the same (at least for osds in a given pool). Initialize a layout structure when an rbd_dev gets created and just copy that into osd requests for the rbd image. Replace an assertion that was done when initializing the layout
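
The pattern here is compute-once, copy-per-use: since the layout never changes for a given device, build it at creation time and do a plain struct copy into each request. A minimal self-contained C sketch of that idea, with toy types standing in for the real rbd_dev and osd request structures (all names here are invented, not the kernel's):

struct layout { int pool_id; int object_size; };

/* Toy device: the layout is filled in once, at creation time. */
struct rbd_dev_toy {
    struct layout layout;
};

static void dev_create(struct rbd_dev_toy *dev, int pool_id)
{
    dev->layout.pool_id = pool_id;
    dev->layout.object_size = 1 << 22;   /* e.g. 4 MiB objects */
}

/* Toy osd request: carries its own copy of the layout. */
struct osd_request_toy { struct layout layout; };

static void request_init(struct osd_request_toy *req,
                         const struct rbd_dev_toy *dev)
{
    /* Per-request setup is now just a struct copy of the
     * precomputed, device-wide layout. */
    req->layout = dev->layout;
}

int main(void)
{
    struct rbd_dev_toy dev;
    struct osd_request_toy req;

    dev_create(&dev, 2);
    request_init(&req, &dev);
    return req.layout.pool_id == 2 ? 0 : 1;
}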

[PATCH 3/4] rbd: don't bother calculating file mapping

2012-11-14 Thread Alex Elder
When rbd_do_request() has a request to process it initializes a ceph file layout structure and uses it to compute offsets and limits for the range of the request using ceph_calc_file_object_mapping(). The layout used is fixed, and is based on RBD_MAX_OBJ_ORDER (30). It sets the layout's object siz

[PATCH 2/4] rbd: open code rbd_calc_raw_layout()

2012-11-14 Thread Alex Elder
This patch gets rid of rbd_calc_raw_layout() by simply open coding it in its one caller. Signed-off-by: Alex Elder --- drivers/block/rbd.c | 55 +-- 1 file changed, 18 insertions(+), 37 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/blo

[PATCH 1/4] rbd: pull in ceph_calc_raw_layout()

2012-11-14 Thread Alex Elder
This is the first in a series of patches aimed at eliminating the use of ceph_calc_raw_layout() by rbd. It simply pulls in a copy of that function and renames it rbd_calc_raw_layout(). Signed-off-by: Alex Elder --- drivers/block/rbd.c | 36 +++- 1 file changed,

[PATCH 0/4] rbd: stop using ceph_calc_raw_layout()

2012-11-14 Thread Alex Elder
This series makes rbd no longer call ceph_calc_raw_layout(), and in doing so, also stops calling ceph_calc_file_object_mapping() for its requests. Apparently the call to the former was for the *other* side-effects it had (unrelated to the layout). -Alex [PA

[PATCH] rbd: combine rbd sync watch/unwatch functions

2012-11-14 Thread Alex Elder
The rbd_req_sync_watch() and rbd_req_sync_unwatch() functions are nearly identical. Combine them into a single function with a flag indicating whether a watch is to be initiated or torn down. Signed-off-by: Alex Elder --- drivers/block/rbd.c | 81 +-
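
The refactor is the classic collapse of two near-duplicate functions into one parameterized by a flag. A hedged sketch of the shape of the change in plain C; sync_watch_op() and its body are invented stand-ins, not the actual rbd code:

#include <stdbool.h>
#include <stdio.h>

/* Two nearly identical functions collapsed into one; the flag selects
 * the only behavior that differed between the originals. */
static int sync_watch_op(const char *image, bool start)
{
    /* ...setup that both original functions duplicated... */
    printf("%s watch on %s\n",
           start ? "initiating" : "tearing down", image);
    /* ...request submission and cleanup, also formerly duplicated... */
    return 0;
}

int main(void)
{
    sync_watch_op("myimage", true);    /* was the watch function   */
    sync_watch_op("myimage", false);   /* was the unwatch function */
    return 0;
}

One boolean flag is reasonable here because only a single behavior differs; had the two originals diverged in several places, keeping them separate would arguably stay clearer.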

[PATCH] rbd: kill ceph_osd_req_op->flags

2012-11-14 Thread Alex Elder
The flags field of struct ceph_osd_req_op is never used, so just get rid of it. Signed-off-by: Alex Elder --- include/linux/ceph/osd_client.h |1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 2b04d05..69287cc 100644 ---

[PATCH 4/4] rbd: assume single op in a request

2012-11-14 Thread Alex Elder
We now know that every caller of rbd_req_sync_op() passes an array of exactly one operation, as evidenced by all callers passing 1 as its num_op argument. So get rid of that argument, assuming a single op. Similarly, we now know that all callers of rbd_do_request() pass 1 as the num_op value, so that pa

[PATCH 3/4] rbd: there is really only one op

2012-11-14 Thread Alex Elder
Throughout the rbd code there are spots where it appears we can handle an osd request containing more than one osd request op. But that is only the way it appears. In fact, currently only one operation at a time can be supported, and supporting more than one will require much more than fleshing o

[PATCH 2/4] libceph: pass num_op with ops

2012-11-14 Thread Alex Elder
Both ceph_osdc_alloc_request() and ceph_osdc_build_request() are provided an array of ceph osd request operations. Rather than just passing the number of operations in the array, the caller is required to append an additional zeroed operation structure to signal the end of the array. All callers kno
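
The convention being removed is a sentinel-terminated array: callers had to append a zeroed entry solely so the callee could find the end. A self-contained C illustration of the before/after calling styles (the struct and function names are simplified stand-ins, not the real libceph signatures):

#include <stdio.h>

/* Simplified stand-in for an osd request op; the real structure and
 * its fields differ. Opcode 0 acts as the terminator in the old style. */
struct op {
    int opcode;
};

/* Old convention: walk the array until the zeroed sentinel entry. */
static int count_ops_sentinel(const struct op *ops)
{
    int n = 0;
    while (ops[n].opcode != 0)
        n++;
    return n;
}

/* New convention: the caller already knows the count, so pass it. */
static void build_request(const struct op *ops, int num_op)
{
    for (int i = 0; i < num_op; i++)
        printf("op %d: opcode %d\n", i, ops[i].opcode);
}

int main(void)
{
    /* Old style: one extra zeroed entry exists only to mark the end. */
    struct op with_sentinel[] = { { 1 }, { 2 }, { 0 } };
    printf("sentinel count: %d\n", count_ops_sentinel(with_sentinel));

    /* New style: no sentinel, explicit length. */
    struct op no_sentinel[] = { { 1 }, { 2 } };
    build_request(no_sentinel, 2);
    return 0;
}

Passing the count explicitly drops the extra zeroed element and the scan to find it, at no cost, since every caller already knows the length.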

[PATCH 1/4] rbd: pass num_op with ops array

2012-11-14 Thread Alex Elder
Add a num_op parameter to rbd_do_request() and rbd_req_sync_op() to indicate the number of entries in the array. The callers of these functions always know how many entries are in the array, so just pass that information down. This is in anticipation of eliminating the extra zero-filled entry in

[PATCH 0/4] rbd: disavow any support for multiple osd ops

2012-11-14 Thread Alex Elder
The rbd code is rife with places where it seems that an osd request could support multiple osd ops. But the reality is that there are spots in rbd as well as libceph and the messenger that make such support impossible without some (upcoming, planned) additional work. This series starts by getting

[PATCH 2/2] libceph: don't set pages or bio in ceph_osdc_alloc_request()

2012-11-14 Thread Alex Elder
Only one of the two callers of ceph_osdc_alloc_request() provides page or bio data for its payload. And essentially all that function was doing with those arguments was assigning them to fields in the osd request structure. Simplify ceph_osdc_alloc_request() by having the caller take care of maki
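
The simplification pattern: when an allocator accepts arguments only to copy them into fields of the object it returns, the assignments can move to the caller and the allocator's signature shrinks. A hedged sketch of the idea in plain C, with invented names (req_alloc and the toy request struct are not the real libceph types):

#include <stdlib.h>

/* Toy request; the real ceph_osd_request has many more fields. */
struct request {
    void *pages;   /* payload supplied as pages, if any */
    void *bio;     /* payload supplied as a bio, if any */
};

/* After the change: allocation takes no payload arguments... */
static struct request *req_alloc(void)
{
    return calloc(1, sizeof(struct request));
}

int main(void)
{
    char payload[] = "data";            /* stand-in payload */

    /* ...and the one caller that actually has payload data
     * assigns the fields itself after allocating. */
    struct request *req = req_alloc();
    if (!req)
        return 1;
    req->pages = payload;

    free(req);
    return 0;
}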

[PATCH 1/2] libceph: don't set flags in ceph_osdc_alloc_request()

2012-11-14 Thread Alex Elder
The only thing ceph_osdc_alloc_request() really does with the flags value it is passed is assign it to the newly-created osd request structure. Do that in the caller instead. Both callers subsequently call ceph_osdc_build_request(), so have that function (instead of ceph_osdc_alloc_request()) iss

[PATCH 0/2] libceph: simplify ceph_osdc_alloc_request()

2012-11-14 Thread Alex Elder
These two patches just move a couple of things that ceph_osdc_alloc_request() does out and into the caller. It simplifies the function slightly, and makes it possible for some callers to not have to supply irrelevant arguments. -Alex [PATCH 1/2] libceph: do

[PATCH 4/4] libceph: drop osdc from ceph_calc_raw_layout()

2012-11-14 Thread Alex Elder
The osdc parameter to ceph_calc_raw_layout() is not used, so get rid of it. Consequently, the corresponding parameter in calc_layout() becomes unused, so get rid of that as well. Signed-off-by: Alex Elder --- drivers/block/rbd.c |2 +- include/linux/ceph/osd_client.h |3 +--

[PATCH 3/4] libceph: drop snapid in ceph_calc_raw_layout()

2012-11-14 Thread Alex Elder
A snapshot id must be provided to ceph_calc_raw_layout() even though it is not needed at all for calculating the layout. Where the snapshot id *is* needed is when building the request message for an osd operation. Drop the snapid parameter from ceph_calc_raw_layout() and pass that value instead i

[PATCH 2/4] libceph: pass length to ceph_calc_file_object_mapping()

2012-11-14 Thread Alex Elder
ceph_calc_file_object_mapping() takes (among other things) a "file" offset and length, and based on the layout, determines the object number ("bno") backing the affected portion of the file's data and the offset into that object where the desired range begins. It also computes the size that should
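
For the simple case where the stripe unit equals the object size (no striping), the mapping described here reduces to division, a remainder, and a clip at the object boundary. A self-contained C sketch of that arithmetic; this illustrates the concept only and is not the real ceph_calc_file_object_mapping(), which also handles striped layouts:

#include <stdint.h>
#include <stdio.h>

/* Map a file-relative range onto its backing object, assuming the
 * simple layout where the stripe unit equals the object size.
 * Returns the object number ("bno"); *xoff is the offset inside that
 * object, *xlen is how much of the range fits before its boundary. */
static uint64_t map_range(uint64_t off, uint64_t len,
                          uint64_t object_size,
                          uint64_t *xoff, uint64_t *xlen)
{
    uint64_t bno = off / object_size;     /* backing object number  */
    uint64_t room;

    *xoff = off % object_size;            /* offset within object   */
    room = object_size - *xoff;           /* bytes left in object   */
    *xlen = len < room ? len : room;      /* clip at the boundary   */
    return bno;
}

int main(void)
{
    uint64_t xoff, xlen;

    /* With 4 MiB objects, a 2 MiB read starting at offset 7 MiB
     * lands in object 1 at offset 3 MiB and is clipped to 1 MiB. */
    uint64_t bno = map_range(7 << 20, 2 << 20, 4 << 20, &xoff, &xlen);
    printf("bno=%llu xoff=%llu xlen=%llu\n",
           (unsigned long long)bno,
           (unsigned long long)xoff,
           (unsigned long long)xlen);
    return 0;
}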

[PATCH 1/4] libceph: pass length to ceph_osdc_build_request()

2012-11-14 Thread Alex Elder
The len argument to ceph_osdc_build_request() is set up to be passed by address, but that function never updates its value so there's no need to do this. Tighten up the interface by passing the length directly. Signed-off-by: Alex Elder --- drivers/block/rbd.c |2 +- include/lin

[PATCH 0/4] libceph: tighten up some interfaces

2012-11-14 Thread Alex Elder
While investigating exactly how and why rbd uses ceph_calc_raw_layout() I implemented some small changes to some functions to make it obvious to the caller that certain functions won't cause side-effects, or that certain functions do or don't need certain parameters.

[PATCH 2/2] libceph: kill op_needs_trail()

2012-11-14 Thread Alex Elder
Since every osd message is now prepared to include trailing data, there's no need to check ahead of time whether any operations will make use of the trail portion of the message. We can drop the second argument to get_num_ops(), and as a result we can also get rid of op_needs_trail() which is no l

[PATCH 1/2] libceph: always allow trail in osd request

2012-11-14 Thread Alex Elder
An osd request structure contains an optional trail portion, which if present will contain data to be passed in the payload portion of the message containing the request. The trail field is a ceph_pagelist pointer, and if null it indicates there is no trail. A ceph_pagelist structure contains a l
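
The series' change is from an optional member held by pointer (NULL meaning absent) to an embedded member that is always initialized, with absence represented by emptiness. A minimal C sketch of the two shapes under invented names; the real ceph_pagelist tracks an actual list of pages:

#include <stddef.h>

/* Toy pagelist; the real ceph_pagelist also tracks a list of pages. */
struct pagelist {
    size_t length;              /* bytes queued; 0 means "empty" */
};

static void pagelist_init(struct pagelist *pl) { pl->length = 0; }

/* Before: an optional pointer, so every user must NULL-check. */
struct request_old {
    struct pagelist *trail;     /* NULL if there is no trail */
};

/* After: an embedded member, always initialized; "no trail" is
 * simply an empty pagelist. */
struct request_new {
    struct pagelist trail;
};

static void request_init(struct request_new *req)
{
    pagelist_init(&req->trail);
}

static size_t trail_len_old(const struct request_old *req)
{
    return req->trail ? req->trail->length : 0;   /* branch needed */
}

static size_t trail_len_new(const struct request_new *req)
{
    return req->trail.length;                     /* no branch */
}

int main(void)
{
    struct request_old old_req = { .trail = NULL };
    struct request_new new_req;

    request_init(&new_req);
    return (int)(trail_len_old(&old_req) + trail_len_new(&new_req));
}

The embedded form trades a few bytes of always-present storage for the removal of NULL checks at every use site, which is what makes the op_needs_trail() cleanup in patch 2/2 possible.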

[PATCH 0/2] libceph: always init trail for osd requests

2012-11-14 Thread Alex Elder
This series makes ceph_osd_request->r_trail a structure that's always initialized rather than a pointer. The result is equivalent to before, but it makes things simpler. -Alex [PATCH 1/2] libceph: always allow trail in osd request [PATCH 2/2] libce

Re: ceph cluster hangs when rebooting one node

2012-11-14 Thread Sage Weil
On Wed, 14 Nov 2012, Aleksey Samarin wrote: > Hello! > > I have the same problem. After switching off the second node, the > cluster hangs. Is there some solution? > > All the best, Alex! I suspect this is min_size; the latest master has a few changes and also will print it out so you can tell

Re: endless flying slow requests

2012-11-14 Thread Sage Weil
Hi Stefan, It would be nice to confirm that no clients are waiting on replies for these requests; currently we suspect that the OSD request tracking is the buggy part. If you query the OSD admin socket you should be able to dump requests and see the client IP, and then query the client. Is i

problem with ceph and btrfs patch: set journal_info in async trans commit worker

2012-11-14 Thread Stefan Priebe - Profihost AG
Hello list, I wanted to try out ceph with the latest vanilla kernel 3.7-rc5. I was seeing a massive performance degradation: around 22 btrfs-endio-write processes every 10-20 seconds, and they run a long time while consuming a massive amount of CPU. So my performance of 23,000 iops drops to

Authorization issues in the 0.54

2012-11-14 Thread Andrey Korolyov
Hi, In 0.54, cephx is probably broken somehow:
$ ceph auth add client.qemukvm osd 'allow *' mon 'allow *' mds 'allow *' -i qemukvm.key
2012-11-14 15:51:23.153910 7ff06441f780 -1 read 65 bytes from qemukvm.key
added key for client.qemukvm
$ ceph auth list
...
client.admin key: [xx]

Re: ceph cluster hangs when rebooting one node

2012-11-14 Thread Aleksey Samarin
Hello! I have the same problem. After switching off the second node, the cluster hangs. Is there some solution? All the best, Alex! 2012/11/12 Stefan Priebe - Profihost AG : > On 12.11.2012 16:11, Sage Weil wrote: > >> On Mon, 12 Nov 2012, Stefan Priebe - Profihost AG wrote: >>> >>> Hello list

Re: [Help] Use Ceph RBD as primary storage in CloudStack 4.0

2012-11-14 Thread Alex Jiang
Hi, Dan. Thank you for your reply. After installing ceph, I can compile Qemu with RBD enabled and have added the host to CloudStack successfully. 2012/11/14 Dan Mick : > Hi Alex: > > Did you install the ceph packages before trying to build qemu? It sounds > like qemu is looking for the Ceph librarie

endless flying slow requests

2012-11-14 Thread Stefan Priebe - Profihost AG
Hello list, I see this several times: endless flying slow requests that never stop until I restart the mentioned osd.
2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31789.858457 secs
2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 317