[PATCH 02/11] libceph: move ceph_file_layout helpers to ceph_fs.h

2014-01-27 Thread Ilya Dryomov
Move ceph_file_layout helper macros and inline functions to ceph_fs.h. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/ceph_fs.h | 23 +++ include/linux/ceph/osdmap.h | 27 --- 2 files changed, 23 insertions(+), 27
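
For illustration, a sketch of the sort of ceph_file_layout accessors being moved; the exact macro names and the fl_* field names are assumptions based on the kernel headers, not taken from the patch itself:

    /* Accessor macros: byte-swap the on-wire little-endian layout fields. */
    #define ceph_file_layout_su(l) \
            ((__s32)le32_to_cpu((l).fl_stripe_unit))
    #define ceph_file_layout_stripe_count(l) \
            ((__s32)le32_to_cpu((l).fl_stripe_count))

    /* Derived quantity: bytes covered by one full stripe across the set. */
    static inline unsigned ceph_file_layout_stripe_width(struct ceph_file_layout *l)
    {
            return le32_to_cpu(l->fl_stripe_unit) *
                   le32_to_cpu(l->fl_stripe_count);
    }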

[PATCH 03/11] libceph: rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN

2014-01-27 Thread Ilya Dryomov
In preparation for adding oid abstraction, rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- drivers/block/rbd.c |6 +++--- include/linux/ceph/osd_client.h |4 ++-- net/ceph/osd_client.c |2 +- 3 files
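
A minimal sketch of the rename; only the name changes, and the numeric value shown (100) is an assumption:

    /* formerly MAX_OBJ_NAME_SIZE; value assumed */
    #define CEPH_MAX_OID_NAME_LEN 100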

[PATCH 00/11] tiering support

2014-01-27 Thread Ilya Dryomov
Hello, This series adds (limited, see 10/11 for more explanation) tiering support to the kernel client. Patches 01-05/11 prepare the ground, 06-08/11 implement basic tiering, 09-10/11 handle redirect replies, 11/11 announces tiering support to the server side. Thanks, Ilya

[PATCH 06/11] libceph: CEPH_OSD_FLAG_* enum update

2014-01-27 Thread Ilya Dryomov
Update CEPH_OSD_FLAG_* enum. (We need CEPH_OSD_FLAG_IGNORE_OVERLAY to support tiering). Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/rados.h |4 1 file changed, 4 insertions(+) diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h index
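
A sketch of how the addition might look in rados.h. Only CEPH_OSD_FLAG_IGNORE_OVERLAY is named in the patch description; the companion flags and all bit values here are assumptions:

    enum {
            /* ... existing CEPH_OSD_FLAG_* values ... */
            CEPH_OSD_FLAG_IGNORE_CACHE   = 0x8000,   /* ignore cache logic */
            CEPH_OSD_FLAG_SKIPRWLOCKS    = 0x10000,  /* skip rw locks */
            CEPH_OSD_FLAG_IGNORE_OVERLAY = 0x20000,  /* ignore pool overlay (tiering) */
            CEPH_OSD_FLAG_FLUSH          = 0x40000,  /* this is part of flush */
    };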

[PATCH 09/11] libceph: rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid}

2014-01-27 Thread Ilya Dryomov
Rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid} before introducing r_target_{oloc,oid} needed for redirects. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- drivers/block/rbd.c |8 include/linux/ceph/osd_client.h |4 ++-- net/ceph/debugfs.c
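
An abbreviated sketch of the renamed fields in struct ceph_osd_request (surrounding members elided):

    struct ceph_osd_request {
            /* ... */
            struct ceph_object_locator r_base_oloc;   /* was r_oloc */
            struct ceph_object_id      r_base_oid;    /* was r_oid  */
            /* r_target_{oloc,oid} arrive with the redirect patches */
            /* ... */
    };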

[PATCH 07/11] libceph: add ceph_pg_pool_by_id()

2014-01-27 Thread Ilya Dryomov
The function that looks up pool info by ID is hidden in osdmap.c. Expose it to the rest of libceph. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/osdmap.h |3 +++ net/ceph/osdmap.c |5 + 2 files changed, 8 insertions(+) diff --git
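
The exported prototype presumably looks something like the following, mirroring the other osdmap lookup helpers; the exact signature is an assumption:

    extern struct ceph_pg_pool_info *ceph_pg_pool_by_id(struct ceph_osdmap *map,
                                                        u64 id);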

[PATCH 05/11] libceph: replace ceph_calc_ceph_pg() with ceph_oloc_oid_to_pg()

2014-01-27 Thread Ilya Dryomov
Switch ceph_calc_ceph_pg() to new oloc and oid abstractions and rename it to ceph_oloc_oid_to_pg() to make its purpose more clear. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- fs/ceph/ioctl.c |8 ++-- include/linux/ceph/osdmap.h |7 +--
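
A sketch of the renamed interface next to the old one for comparison; parameter names and ordering are assumptions:

    /* Old: took a bare pool id and object name string. */
    int ceph_calc_ceph_pg(struct ceph_pg *pg, const char *oid,
                          struct ceph_osdmap *osdmap, uint64_t pool);

    /* New: takes the oloc/oid abstractions introduced earlier in the series. */
    int ceph_oloc_oid_to_pg(struct ceph_osdmap *osdmap,
                            struct ceph_object_locator *oloc,
                            struct ceph_object_id *oid,
                            struct ceph_pg *pg_out);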

[PATCH 11/11] libceph: support CEPH_FEATURE_OSD_CACHEPOOL feature

2014-01-27 Thread Ilya Dryomov
Announce our (limited, see previous commit) support for the CACHEPOOL feature. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/ceph_features.h |1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
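
A sketch of what announcing the feature in ceph_features.h might amount to; the bit number (35) is an assumption, and the define itself may already exist with only the supported-set line being the new insertion:

    #define CEPH_FEATURE_OSD_CACHEPOOL   (1ULL << 35)   /* bit number assumed */

    /* ...and OR it into the CEPH_FEATURES_SUPPORTED_DEFAULT mask:           */
    /*        ... | CEPH_FEATURE_OSD_CACHEPOOL | ...                         */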

[PATCH 08/11] libceph: follow {read,write}_tier fields on osd request submission

2014-01-27 Thread Ilya Dryomov
Overwrite ceph_osd_request::r_oloc.pool with read_tier for read ops and write_tier for write and read+write ops (aka basic tiering support). {read,write}_tier are part of pg_pool_t since v9. This commit bumps our pg_pool_t decode compat version from v7 to v9, all new fields except for
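
The core of the override might look roughly like this. The helper name is hypothetical and the pg_pool_info field names are assumed to match what the v9 decode adds; only the read_tier/write_tier rule comes from the description above:

    /* Hypothetical helper: pick the pool an op should actually be sent to. */
    static void apply_tier_overlay(struct ceph_osd_request *req,
                                   struct ceph_pg_pool_info *pi)
    {
            /* read_tier/write_tier are -1 when the pool has no tier configured */
            if (req->r_flags & CEPH_OSD_FLAG_IGNORE_OVERLAY)
                    return;
            if ((req->r_flags & CEPH_OSD_FLAG_READ) && pi->read_tier >= 0)
                    req->r_oloc.pool = pi->read_tier;
            if ((req->r_flags & CEPH_OSD_FLAG_WRITE) && pi->write_tier >= 0)
                    req->r_oloc.pool = pi->write_tier;
    }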

[PATCH 10/11] libceph: follow redirect replies from osds

2014-01-27 Thread Ilya Dryomov
Follow redirect replies from osds, for details see ceph.git commit fbbe3ad1220799b7bb00ea30fce581c5eadaf034. v1 (current) version of redirect reply consists of oloc and oid, which expands to pool, key, nspace, hash and oid. However, server-side code that would populate anything other than pool
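
Given that only the pool is expected to be populated, the kernel-side representation of a redirect can stay small; a sketch, with the struct name assumed:

    struct ceph_request_redirect {
            struct ceph_object_locator oloc;   /* redirect target; only .pool used */
    };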

ceph branch status

2014-01-27 Thread ceph branch robot
-- All Branches -- Alfredo Deza alfr...@deza.pe 2013-09-27 10:33:52 -0400 wip-5900 2014-01-21 16:29:22 -0500 wip-6465 Dan Mick dan.m...@inktank.com 2013-07-16 23:00:06 -0700 wip-5634 David Zafman david.zaf...@inktank.com 2013-01-28 20:26:34 -0800

RADOS + deep scrubbing performance issues in production environment

2014-01-27 Thread Filippos Giannakos
Hello all, We have been running RADOS in a large-scale, production, public cloud environment for a few months now, and we are generally happy with it. However, we experience performance problems when deep scrubbing is active. We managed to reproduce them in our testing cluster running emperor,

[PATCH 01/11] libceph: start using oloc abstraction

2014-01-27 Thread Ilya Dryomov
Instead of relying on pool fields in ceph_file_layout (for mapping) and ceph_pg (for encoding), start using the ceph_object_locator (oloc) abstraction. Note that userspace oloc currently consists of pool, key, nspace and hash fields, while this one contains only a pool. This is OK, because at this
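
A sketch of the oloc as described above, carrying only a pool for now; the field type (s64) is an assumption:

    struct ceph_object_locator {
            s64 pool;   /* userspace oloc also carries key, nspace and hash */
    };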

Re: [PATCH 07/11] libceph: add ceph_pg_pool_by_id()

2014-01-27 Thread Sage Weil
Would it make more sense to just rename and export the existing function? I'm not sure __ is particularly meaningful in the context of osdmap.c... sage On Mon, 27 Jan 2014, Ilya Dryomov wrote: Lookup pool info by ID function is hidden in osdmap.c. Expose it to the rest of libceph.

Re: [PATCH 07/11] libceph: add ceph_pg_pool_by_id()

2014-01-27 Thread Ilya Dryomov
On Mon, Jan 27, 2014 at 6:38 PM, Sage Weil s...@inktank.com wrote: Would it make more sense to just rename and export the existing function? I'm not sure __ is particularly meaningful in the context of osdmap.c... I added a new one because __lookup_pg_pool() takes rb_root, whereas all existing
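
So the new exported function is presumably a thin wrapper that hides the rb_root detail; a sketch under that assumption (return type and __lookup_pg_pool() signature assumed):

    struct ceph_pg_pool_info *ceph_pg_pool_by_id(struct ceph_osdmap *map, u64 id)
    {
            return __lookup_pg_pool(&map->pg_pools, id);
    }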

Re: [PATCH 07/11] libceph: add ceph_pg_pool_by_id()

2014-01-27 Thread Sage Weil
On Mon, 27 Jan 2014, Ilya Dryomov wrote: On Mon, Jan 27, 2014 at 6:38 PM, Sage Weil s...@inktank.com wrote: Would it make more sense to just rename and export the existing function? I'm not sure __ is particularly meaningful in the context of osdmap.c... I added a new one because

Re: [ceph-users] many meta files in osd

2014-01-27 Thread Gregory Farnum
Looks like you got lost over the Christmas holidays; sorry! I'm not an expert on running rgw but it sounds like garbage collection isn't running or something. What version are you on, and have you done anything to set it up? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On

Re: RADOS + deep scrubbing performance issues in production environment

2014-01-27 Thread Kyle Bader
Are there any tools we are not aware of for controlling, possibly pausing, deep-scrub and/or getting some progress information about the procedure? Also, since I believe it would be bad practice to disable deep-scrubbing, do you have any recommendations on how to work around (or even solve) this issue

Re: [PATCH 10/11] libceph: follow redirect replies from osds

2014-01-27 Thread Sage Weil
On Mon, 27 Jan 2014, Ilya Dryomov wrote: Follow redirect replies from osds, for details see ceph.git commit fbbe3ad1220799b7bb00ea30fce581c5eadaf034. v1 (current) version of redirect reply consists of oloc and oid, which expands to pool, key, nspace, hash and oid. However, server-side code

Re: RADOS + deep scrubbing performance issues in production environment

2014-01-27 Thread Sage Weil
There is also 'ceph osd set noscrub' and then later 'ceph osd unset noscrub'. I forget whether this pauses an in-progress PG scrub or just makes it stop when it gets to the next PG boundary. sage On Mon, 27 Jan 2014, Kyle Bader wrote: Are there any tools we are not aware of for

Re: [PATCH 10/11] libceph: follow redirect replies from osds

2014-01-27 Thread Ilya Dryomov
On Mon, Jan 27, 2014 at 8:32 PM, Sage Weil s...@inktank.com wrote: On Mon, 27 Jan 2014, Ilya Dryomov wrote: Follow redirect replies from osds, for details see ceph.git commit fbbe3ad1220799b7bb00ea30fce581c5eadaf034. v1 (current) version of redirect reply consists of oloc and oid, which

patch queue for linus

2014-01-27 Thread Sage Weil
Here is what I am planning on sending to Linus for 3.14, probably tomorrow. This includes Ilya's latest series (which isn't in testing just yet). Please let me know if anything seems to be missing. Thanks! sage Alex Elder (1):

Re: RADOS + deep scrubbing performance issues in production environment

2014-01-27 Thread Mike Dawson
On 1/27/2014 1:45 PM, Sage Weil wrote: There is also 'ceph osd set noscrub' and then later 'ceph osd unset noscrub'. In my experience scrub isn't nearly as much of a problem as deep-scrub. On an IOPS-constrained cluster with writes approaching the available aggregate spindle performance