Re: Delivery Status Notification (Failure)

2012-10-29 Thread hemant surale
Hi folks, I have executed all the required steps to build Ceph from source, but now when I run service ceph start it shows an error: root@atish-virtual-machine:/ceph_data# service ceph start /etc/init.d/ceph: 37: .:

Ceph performance

2012-10-29 Thread Roman Alekseev
Hi, Please advise how to improve performance on a cluster that consists of 5 dedicated servers: - ceph.conf: http://pastebin.com/hT3qEhUF - file system on all drives is ext4 - mount options user_xattr - each server has: CPU: Intel® Xeon® Processor E5335 (8M Cache, 2.00 GHz, 1333 MHz FSB)

Monitor issue

2012-10-29 Thread Roman Alekseev
Hello, I have 3 monitors on different nodes, and when 'mon.a' was stopped the whole cluster stopped working too. My conf: http://pastebin.com/hT3qEhUF Could someone explain how to fix this kind of failure? -- Kind regards, R. Alekseev

Re: Monitor issue

2012-10-29 Thread Wido den Hollander
On 10/29/2012 03:48 PM, Roman Alekseev wrote: Hello, I have 3 monitors on different nodes, and when 'mon.a' was stopped the whole cluster stopped working too. My conf: http://pastebin.com/hT3qEhUF Could someone explain how to fix this kind of failure? Could you explain a bit more about the setup?
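For context on the question above: Ceph monitors need a strict majority of the monitors in the monmap to form a quorum, so a healthy 3-monitor cluster would normally be expected to survive the loss of one monitor. The sketch below only illustrates that majority arithmetic; the function names are hypothetical, it is not part of Ceph, and it does not diagnose the configuration in this thread.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest strict majority of num_monitors (hypothetical helper)."""
    if num_monitors < 1:
        raise ValueError("a cluster needs at least one monitor")
    return num_monitors // 2 + 1


def has_quorum(num_monitors: int, num_up: int) -> bool:
    """True if the monitors still running form a strict majority."""
    return num_up >= quorum_size(num_monitors)


def tolerated_failures(num_monitors: int) -> int:
    """How many monitors can stop before the majority is lost."""
    return num_monitors - quorum_size(num_monitors)


if __name__ == "__main__":
    # Three monitors, one stopped: two remain, which is still a majority.
    print(has_quorum(3, 2))
    # A single-monitor cluster cannot lose its only monitor.
    print(tolerated_failures(1))
```

If a cluster with three configured monitors stops when one goes down, one thing worth checking is whether all three were actually in the quorum beforehand.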

Re: Slow ceph fs performance

2012-10-29 Thread Bryan K. Wright
g...@inktank.com said: Eeek, I was going through my email backlog and came across this thread again. Everything here does look good; the data distribution etc is pretty reasonable. If you're still testing, we can at least get a rough idea of the sorts of IO the OSD is doing by looking at the

[GIT PULL] Ceph fixes for -rc4

2012-10-29 Thread Sage Weil
Hi Linus, Please pull the following fixes from git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus There are two fixes in the messenger code, one for a bug that can trigger a NULL dereference, and one for an error in refcounting (extra put). There is also a trivial fix that in

Stuck Request

2012-10-29 Thread Ian Pye
Guys, I'm running a three node cluster (version 0.53), and after a while of running under constant write load generated by two daemons, I am seeing that 1 request is totally blocked: [WRN] 1 slow requests, 1 included below; oldest blocked for 7550.891933 secs 2012-10-29 10:33:54.689563 osd.0
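When watching for warnings like the one quoted above, it can help to pull the numbers out of the log line. This is a minimal sketch that assumes the warning always has the shape shown in this message; the regular expression and the function name are illustrative, not part of any Ceph tool, and other Ceph versions may word the warning differently.

```python
import re

# Pattern modelled on the single warning line quoted in this thread.
SLOW_REQUEST_RE = re.compile(
    r"\[WRN\] (?P<count>\d+) slow requests, "
    r"(?P<included>\d+) included below; "
    r"oldest blocked for (?P<age>\d+(?:\.\d+)?) secs"
)


def parse_slow_request_warning(line):
    """Return the counts and age from a slow-request warning, or None."""
    match = SLOW_REQUEST_RE.search(line)
    if match is None:
        return None
    return {
        "count": int(match.group("count")),
        "included": int(match.group("included")),
        "oldest_blocked_secs": float(match.group("age")),
    }


if __name__ == "__main__":
    sample = ("[WRN] 1 slow requests, 1 included below; "
              "oldest blocked for 7550.891933 secs")
    print(parse_slow_request_warning(sample))
```

A result with a large oldest_blocked_secs value, as here, means a request has been reported as outstanding for a long time, which matches the report in this thread.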

Re: Stuck Request

2012-10-29 Thread Samuel Just
Interesting, I don't think the request is stalled. I think we completed the request, but leaked a reference to the request structure. Do you see IO from the clients stall? What is the output of ceph -s? What version are you running (ceph-osd --version)? -Sam On Mon, Oct 29, 2012 at 10:53 AM,

Re: Stuck Request

2012-10-29 Thread Ian Pye
The client's IO held up fine, and I don't see any signs of them blocking. The writes are done inside of an aio_operate() rados call. In the client logs too, I don't see any record of a failed write. ceph -s health HEALTH_OK monmap e1: 1 mons at {a=10.25.36.11:6789/0}, election epoch 2,

Re: Delivery Status Notification (Failure)

2012-10-29 Thread Dan Mick
Last time you asked this, I responded (on 18 Oct): So, clearly something went wrong with the installation step, and this file was not created. How did you install after building? On 10/29/2012 02:27 AM, hemant surale wrote: Hi Folks , I have executed all required steps to build Ceph from

Re: production ready?

2012-10-29 Thread Dan Mick
On 10/26/2012 02:52 PM, Gandalf Corvotempesta wrote: Hi all, I'm new to Ceph. Are RBD and REST API production ready? There are sites using them in production now. Do you have any use case to share? We are looking for a distributed block storage for an HP C7000 blade with 16 dual processor

Re: [PATCH, resend] rbd: simplify rbd_rq_fn()

2012-10-29 Thread Josh Durgin
This is much easier to read now. It might be useful to add messages for the different failure cases in bio_chain_clone_range later. Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:44 PM, Alex Elder wrote: When processing a request, rbd_rq_fn() makes clones of the bio's in the

Re: [PATCH] rbd: remove snapshots on error in rbd_add()

2012-10-29 Thread Josh Durgin
On 10/26/2012 03:45 PM, Alex Elder wrote: If rbd_dev_snaps_update() has ever been called for an rbd device structure there could be snapshot structures on its snaps list. In rbd_add(), this function is called but a subsequent error path neglected to clean up any of these snapshots. Add a call

Re: [PATCH] rbd: make pool_id a 64 bit value

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:45 PM, Alex Elder wrote: If a format 2 image has a parent, its pool id will be specified using a 64-bit value. Change the pool id we save for an image to match that. Signed-off-by: Alex Elder el...@inktank.com ---

Re: [PATCH 1/2] rbd: move snap info out of rbd_mapping struct

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:51 PM, Alex Elder wrote: Moving the snap_id and snap_name fields into the separate rbd_mapping structure was misguided. (And in time, perhaps we'll do away with that structure altogether...) Move these fields back into struct

Re: [PATCH 2/2] rbd: rename snap_exists field

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:51 PM, Alex Elder wrote: A Boolean field snap_exists in an rbd mapping is used to indicate whether a mapped snapshot has been removed from an image's snapshot context, to stop sending requests for that snapshot as soon as we know

Re: [PATCH 1/2] rbd: move ceph_parse_options() call up

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:55 PM, Alex Elder wrote: Move option parsing out of rbd_get_client() and into its caller. Signed-off-by: Alex Elder el...@inktank.com --- drivers/block/rbd.c | 48 +++- 1 file

Re: [PATCH 2/2] rbd: do all argument parsing in one place

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 03:55 PM, Alex Elder wrote: This patch makes rbd_add_parse_args() be the single place all argument parsing occurs for an image map request: - Move the ceph_parse_options() call into that function - Use local variables

Re: [PATCH 1/8] rbd: get rid of snap_name_len

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:00 PM, Alex Elder wrote: The value returned in the snap_name_len argument to rbd_add_parse_args() is never actually used, so get rid of it. The snap_name_len recorded in *rbd_dev_v2_snap_name() is not useful either, so get rid

Re: [PATCH 2/8] rbd: remove options args from rbd_add_parse_args()

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:00 PM, Alex Elder wrote: The options argument to rbd_add_parse_args() (and its partner options_size) is now only needed within the function, so there's no need to have the caller allocate and pass the options buffer. Just

Re: [PATCH 3/8] rbd: remove snap_name arg from rbd_add_parse_args()

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:01 PM, Alex Elder wrote: The snapshot name returned by rbd_add_parse_args() just gets saved in the rbd_dev eventually. So just do that inside that function and do away with the snap_name argument, both in rbd_add_parse_args()

Re: [PATCH 4/8] rbd: pass and populate rbd_options structure

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:02 PM, Alex Elder wrote: Have the caller pass the address of an rbd_options structure to rbd_add_parse_args(), to be initialized with the information gleaned as a result of the parse. I know, this is another near-reversal of a

Re: [PATCH 5/8] rbd: have rbd_add_parse_args() return error

2012-10-29 Thread Josh Durgin
Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:02 PM, Alex Elder wrote: Change the interface to rbd_add_parse_args() so it returns an error code rather than a pointer. Return the ceph_options result via a pointer whose address is passed as an argument. Signed-off-by: Alex

Re: [PATCH 6/8] rbd: define image specification structure

2012-10-29 Thread Josh Durgin
A couple notes below, but looks good. Reviewed-by: Josh Durgin josh.dur...@inktank.com On 10/26/2012 04:03 PM, Alex Elder wrote: Group the fields that uniquely specify an rbd image into a new reference-counted rbd_spec structure. This structure will be used to describe the desired image when

Re: [PATCH 7/8] rbd: add reference counting to rbd_spec

2012-10-29 Thread Josh Durgin
On 10/26/2012 04:03 PM, Alex Elder wrote: With layered images we'll share rbd_spec structures, so add a reference count to it. It neatens up some code also. Could you explain your plan for these data structures? What will the structs and their relationships look like with clones mapped? A

Re: [PATCH 8/8] rbd: fill rbd_spec in rbd_add_parse_args()

2012-10-29 Thread Josh Durgin
On 10/26/2012 04:03 PM, Alex Elder wrote: Pass the address of an rbd_spec structure to rbd_add_parse_args(). Use it to hold the information defining the rbd image to be mapped in an rbd_add() call. Use the result in the caller to initialize the rbd_dev->id field. This means rbd_dev is no longer

Setup for building Java unit tests

2012-10-29 Thread Noah Watkins
This is my proposal for handling Java unit test compilation: 1. Go with Joe's suggestion to backport the unit tests to the oldest version of JUnit shipping with the latest Ubuntu and Fedora. 2. Use --with-debug to enable unit test building: --enable-cephfs-java: no change --with-debug

Re: Help...MDS Continuously Segfaulting

2012-10-29 Thread Nick Couchman
Okay, that patch worked and it seems to be running again. Should I continue to run with that patch, or go back to the original binaries? Gregory Farnum 10/19/12 4:16 PM I've written a small patch on top of v0.48.1argonaut which should avoid this. It's in branch 3369-mds-session-workaround