Re: [ceph-users] Data still in OSD directories after removing

2014-05-22 Thread Olivier Bonvalet

On Wednesday, May 21st 2014 at 18:20 -0700, Josh Durgin wrote:
 On 05/21/2014 03:03 PM, Olivier Bonvalet wrote:
  On Wednesday, May 21st 2014 at 08:20 -0700, Sage Weil wrote:
  You're certain that that is the correct prefix for the rbd image you
  removed?  Do you see the objects listed when you do 'rados -p rbd ls - |
  grep <prefix>'?
 
  I'm pretty sure, yes: since I didn't see much space freed by the
  'rbd snap purge' command, I looked at the RBD prefix before doing the
  'rbd rm' (it's not the first time I've seen this problem, but the
  previous time I hadn't noted the RBD prefix, so I couldn't check).
 
  So :
  - 'rados -p sas3copies ls - | grep rb.0.14bfb5a.238e1f29' returns nothing
  at all
  - # rados stat -p sas3copies rb.0.14bfb5a.238e1f29.0002f026
error stat-ing sas3copies/rb.0.14bfb5a.238e1f29.0002f026: No such
  file or directory
  - # rados stat -p sas3copies rb.0.14bfb5a.238e1f29.
error stat-ing sas3copies/rb.0.14bfb5a.238e1f29.: No such
  file or directory
  - # ls -al 
  /var/lib/ceph/osd/ceph-67/current/9.1fe_head/DIR_E/DIR_F/DIR_1/DIR_7/rb.0.14bfb5a.238e1f29.0002f026__a252_E68871FE__9
  -rw-r--r-- 1 root root 4194304 oct.   8  2013 
  /var/lib/ceph/osd/ceph-67/current/9.1fe_head/DIR_E/DIR_F/DIR_1/DIR_7/rb.0.14bfb5a.238e1f29.0002f026__a252_E68871FE__9
 
 
  If the objects really are orphaned, the way to clean them up is via 'rados
  -p rbd rm objectname'.  I'd like to get to the bottom of how they ended
  up that way first, though!
 
  I suppose the problem came from me, by hitting CTRL+C during 'rbd snap
  purge $IMG'.
  'rados rm -p sas3copies rb.0.14bfb5a.238e1f29.0002f026' doesn't remove
  those files, and just answers with a 'No such file or directory'.
 
 Those files are all for snapshots, which are removed by the osds
 asynchronously in a process called 'snap trimming'. There's no
 way to directly remove them via rados.
 
 Since you stopped 'rbd snap purge' partway through, it may
 have removed the reference to the snapshot before removing
 the snapshot itself.
 
 You can get a list of snapshot ids for the remaining objects
 via the 'rados listsnaps' command, and use
 rados_ioctx_selfmanaged_snap_remove() (no convenient wrapper
 unfortunately) on each of those snapshot ids to be sure they are all
 scheduled for asynchronous deletion.
 
 Josh
 

Great: 'rados listsnaps' sees it:
# rados listsnaps -p sas3copies rb.0.14bfb5a.238e1f29.0002f026
rb.0.14bfb5a.238e1f29.0002f026:
cloneid  snaps   size     overlap
41554    35746   4194304  []

So, I have to write and compile a wrapper around
rados_ioctx_selfmanaged_snap_remove(), and find a way to obtain a list
of all orphaned objects?
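
For reference, a minimal sketch of such a wrapper against the librados C
API could look like the following. The pool name, the ceph.conf path and
the snap id are taken from the command line and are placeholders; error
handling is kept deliberately small:

/*
 * snap_rm.c -- sketch of a wrapper around
 * rados_ioctx_selfmanaged_snap_remove().  Build with something like:
 *   gcc -o snap_rm snap_rm.c -lrados
 * and run as: ./snap_rm <pool> <snapid>
 */
#include <stdio.h>
#include <stdlib.h>
#include <rados/librados.h>

int main(int argc, char **argv)
{
	rados_t cluster;
	rados_ioctx_t io;
	int ret;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <pool> <snapid>\n", argv[0]);
		return 1;
	}

	/* connect as client.admin using the default ceph.conf */
	if (rados_create(&cluster, NULL) < 0)
		return 1;
	if (rados_conf_read_file(cluster, "/etc/ceph/ceph.conf") < 0 ||
	    rados_connect(cluster) < 0 ||
	    rados_ioctx_create(cluster, argv[1], &io) < 0) {
		rados_shutdown(cluster);
		return 1;
	}

	/* ask the OSDs to schedule this snap id for asynchronous deletion */
	ret = rados_ioctx_selfmanaged_snap_remove(io,
			(rados_snap_t)strtoull(argv[2], NULL, 10));
	if (ret < 0)
		fprintf(stderr, "snap remove failed: %d\n", ret);

	rados_ioctx_destroy(io);
	rados_shutdown(cluster);
	return ret < 0 ? 1 : 0;
}

The snap ids to feed it are the ones reported in the 'snaps' column of the
'rados listsnaps' output above.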

I also tried recreating the object (rados put) and then removing it
(rados rm), but the snapshots are still there.

Olivier



Re: SMART monitoring

2014-05-22 Thread Andrey Korolyov
On Fri, Dec 27, 2013 at 9:09 PM, Andrey Korolyov and...@xdel.ru wrote:
 On 12/27/2013 08:15 PM, Justin Erenkrantz wrote:
 On Thu, Dec 26, 2013 at 9:17 PM, Sage Weil s...@inktank.com wrote:
 I think the question comes down to whether Ceph should take some internal
 action based on the information, or whether that is better handled by some
 external monitoring agent.  For example, an external agent might collect
 SMART info into graphite, and every so often do some predictive analysis
 and mark out disks that are expected to fail soon.

 I'd love to see some consensus form around what this should look like...

 My $.02 from the peanut gallery: at a minimum, set the HEALTH_WARN flag if
 there is a SMART failure on a physical drive that contains an OSD.  Yes,
 you could build the monitoring into a separate system, but I think it'd be
 really useful to combine it into the cluster health assessment.  -- justin


 Hi,

 Judging from my personal experience, SMART failures can be dangerous when
 they are not bad enough to completely tear down an OSD: the OSD does not
 flap and is not marked down in time, but cluster performance is greatly
 affected. I don't think SMART monitoring really belongs in Ceph, because
 separate monitoring of the predictive failure counters does the job well,
 and in the case of sudden errors a SMART query may not work at all, since
 the system has issued a lot of bus resets and the disk may be entirely
 inaccessible. So I propose two strategies: run regular, scattered
 background checks, and monitor OSD responsiveness to work around cases of
 performance degradation due to read/write errors.

Some necromancy to revive this thread...

Based on a year of experience with Hitachi 4T disks, there are a lot of
failures that SMART cannot handle completely: speed degradation and
sudden disk death. The second case sorts itself out by kicking out the
stuck OSD, but it is not easy to tell which disks are about to die
without thorough dmesg monitoring for bus errors and periodic speed
calibration. Introducing something like an idle-priority speed
measurement for OSDs, without dramatically increasing overall wearout,
could be useful to implement alongside an additional OSD perf metric
(something like seek_time in SMART, though SMART may still report a good
value for it when performance has already slowed to a crawl). It would
also catch many of the things that hurt performance but may not be
exposed to the host OS at all, such as correctable bus errors. By the
way, although the 1T Seagates have a much higher failure rate, they
always die with an 'appropriate' set of SMART attributes, whereas the
Hitachis tend to die without warning :) Hope this is helpful for
someone.
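
As an illustration only, a rough sketch of the kind of periodic
read-speed probe described above might look like this in C. The device
path and probe size are placeholders, and a real probe would also drop
its I/O priority to the idle class (e.g. via ioprio_set() on Linux) and
randomize the offset:

/*
 * probe_read.c -- rough sketch of the "idle-priority speed measurement"
 * idea above: time a single O_DIRECT read from an OSD's data disk.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define PROBE_BYTES (4 * 1024 * 1024)

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "/dev/sdX";	/* placeholder */
	struct timespec t0, t1;
	void *buf;
	ssize_t n;
	int fd;

	fd = open(dev, O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (posix_memalign(&buf, 4096, PROBE_BYTES)) {	/* O_DIRECT needs alignment */
		close(fd);
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	n = pread(fd, buf, PROBE_BYTES, 0);	/* bypasses the page cache */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (n > 0) {
		double secs = (t1.tv_sec - t0.tv_sec) +
			      (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("%s: %.1f MB/s\n", dev, n / secs / (1024.0 * 1024.0));
	} else {
		perror("pread");
	}

	free(buf);
	close(fd);
	return n > 0 ? 0 : 1;
}

Feeding the resulting MB/s figure into the same external monitoring as the
SMART counters would cover the slow-death case that SMART misses.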


[PATCH] rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync()

2014-05-22 Thread Ilya Dryomov
The osd_request, along with the r_request and r_reply messages attached
to it, is leaked in __rbd_dev_header_watch_sync() if the requested image
doesn't exist.  This is because lingering requests are special and get
an extra ref in the reply path.  Fix it by unregistering the linger
request on the error path, and split __rbd_dev_header_watch_sync() into
two functions to make it maintainable.

Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com
---
 drivers/block/rbd.c |  123 +++
 1 file changed, 85 insertions(+), 38 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 552a2edcaa74..55c34b70842a 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2872,56 +2872,55 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data)
 }
 
 /*
- * Request sync osd watch/unwatch.  The value of start determines
- * whether a watch request is being initiated or torn down.
+ * Initiate a watch request, synchronously.
  */
-static int __rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, bool start)
+static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
 {
struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
struct rbd_obj_request *obj_request;
int ret;
 
-   rbd_assert(start ^ !!rbd_dev->watch_event);
-   rbd_assert(start ^ !!rbd_dev->watch_request);
+   rbd_assert(!rbd_dev->watch_event);
+   rbd_assert(!rbd_dev->watch_request);
 
-   if (start) {
-   ret = ceph_osdc_create_event(osdc, rbd_watch_cb, rbd_dev,
-   &rbd_dev->watch_event);
-   if (ret < 0)
-   return ret;
-   rbd_assert(rbd_dev->watch_event != NULL);
-   }
+   ret = ceph_osdc_create_event(osdc, rbd_watch_cb, rbd_dev,
+                                &rbd_dev->watch_event);
+   if (ret < 0)
+   return ret;
+
+   rbd_assert(rbd_dev->watch_event);
 
-   ret = -ENOMEM;
obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
-   OBJ_REQUEST_NODATA);
-   if (!obj_request)
+OBJ_REQUEST_NODATA);
+   if (!obj_request) {
+   ret = -ENOMEM;
goto out_cancel;
+   }
 
obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
  obj_request);
-   if (!obj_request->osd_req)
-   goto out_cancel;
+   if (!obj_request->osd_req) {
+   ret = -ENOMEM;
+   goto out_put;
+   }
 
-   if (start)
-   ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
-   else
-   ceph_osdc_unregister_linger_request(osdc,
-   rbd_dev->watch_request->osd_req);
+   ceph_osdc_set_request_linger(osdc, obj_request->osd_req);

osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
-   rbd_dev->watch_event->cookie, 0, start ? 1 : 0);
+ rbd_dev->watch_event->cookie, 0, 1);
rbd_osd_req_format_write(obj_request);
 
ret = rbd_obj_request_submit(osdc, obj_request);
if (ret)
-   goto out_cancel;
+   goto out_linger;
+
ret = rbd_obj_request_wait(obj_request);
if (ret)
-   goto out_cancel;
+   goto out_linger;
+
ret = obj_request->result;
if (ret)
-   goto out_cancel;
+   goto out_linger;
 
/*
 * A watch request is set to linger, so the underlying osd
@@ -2931,36 +2930,84 @@ static int __rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, bool start)
 * it.  We'll drop that reference (below) after we've
 * unregistered it.
 */
-   if (start) {
-   rbd_dev->watch_request = obj_request;
+   rbd_dev->watch_request = obj_request;
 
-   return 0;
+   return 0;
+
+out_linger:
+   ceph_osdc_unregister_linger_request(osdc, obj_request->osd_req);
+out_put:
+   rbd_obj_request_put(obj_request);
+out_cancel:
+   ceph_osdc_cancel_event(rbd_dev->watch_event);
+   rbd_dev->watch_event = NULL;
+
+   return ret;
+}
+
+/*
+ * Tear down a watch request, synchronously.
+ */
+static int __rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
+{
+   struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
+   struct rbd_obj_request *obj_request;
+   int ret;
+
+   rbd_assert(rbd_dev->watch_event);
+   rbd_assert(rbd_dev->watch_request);
+
+   obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
+OBJ_REQUEST_NODATA);
+   if (!obj_request) {
+   ret = -ENOMEM;
+   goto out_cancel;
}
 
obj_request->osd_req = rbd_osd_req_create(rbd_dev, 

[PATCH v3 0/4] rbd: make sure we have latest osdmap on 'rbd map'

2014-05-22 Thread Ilya Dryomov
Hello,

This is a fix for #8184 that makes use of the updated
MMonGetVersionReply userspace code, which will now populate its tid
with the tid of the original MMonGetVersion request.

Thanks,

Ilya


Ilya Dryomov (4):
  libceph: recognize poolop requests in debugfs
  libceph: mon_get_version request infrastructure
  libceph: add ceph_monc_wait_osdmap()
  rbd: make sure we have latest osdmap on 'rbd map'

 drivers/block/rbd.c |   36 +-
 include/linux/ceph/mon_client.h |   11 ++-
 net/ceph/ceph_common.c  |2 +
 net/ceph/debugfs.c  |8 ++-
 net/ceph/mon_client.c   |  150 +--
 5 files changed, 194 insertions(+), 13 deletions(-)

-- 
1.7.10.4



[PATCH v3 1/4] libceph: recognize poolop requests in debugfs

2014-05-22 Thread Ilya Dryomov
Recognize poolop requests in the debugfs monc dump, and fix the printk
format specifiers - tid is unsigned.

Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com
---
 net/ceph/debugfs.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c
index 10421a4b76f8..8903dcee8d8e 100644
--- a/net/ceph/debugfs.c
+++ b/net/ceph/debugfs.c
@@ -126,9 +126,11 @@ static int monc_show(struct seq_file *s, void *p)
req = rb_entry(rp, struct ceph_mon_generic_request, node);
op = le16_to_cpu(req->request->hdr.type);
if (op == CEPH_MSG_STATFS)
-   seq_printf(s, "%lld statfs\n", req->tid);
+   seq_printf(s, "%llu statfs\n", req->tid);
+   else if (op == CEPH_MSG_POOLOP)
+   seq_printf(s, "%llu poolop\n", req->tid);
else
-   seq_printf(s, "%lld unknown\n", req->tid);
+   seq_printf(s, "%llu unknown\n", req->tid);
}
 
mutex_unlock(&monc->mutex);
-- 
1.7.10.4



[PATCH v3 3/4] libceph: add ceph_monc_wait_osdmap()

2014-05-22 Thread Ilya Dryomov
Add ceph_monc_wait_osdmap(), which will block until the osdmap with the
specified epoch is received or timeout occurs.

Export both of these as they are going to be needed by rbd.

Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com
---
 include/linux/ceph/mon_client.h |2 ++
 net/ceph/mon_client.c   |   27 +++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
index 585ef9450e9d..deb47e45ac7c 100644
--- a/include/linux/ceph/mon_client.h
+++ b/include/linux/ceph/mon_client.h
@@ -104,6 +104,8 @@ extern int ceph_monc_got_mdsmap(struct ceph_mon_client *monc, u32 have);
 extern int ceph_monc_got_osdmap(struct ceph_mon_client *monc, u32 have);
 
 extern void ceph_monc_request_next_osdmap(struct ceph_mon_client *monc);
+extern int ceph_monc_wait_osdmap(struct ceph_mon_client *monc, u32 epoch,
+unsigned long timeout);
 
 extern int ceph_monc_do_statfs(struct ceph_mon_client *monc,
   struct ceph_statfs *buf);
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 6b46f1205ceb..ecfd65c05f49 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -296,6 +296,33 @@ void ceph_monc_request_next_osdmap(struct ceph_mon_client *monc)
__send_subscribe(monc);
mutex_unlock(&monc->mutex);
 }
+EXPORT_SYMBOL(ceph_monc_request_next_osdmap);
+
+int ceph_monc_wait_osdmap(struct ceph_mon_client *monc, u32 epoch,
+ unsigned long timeout)
+{
+   unsigned long started = jiffies;
+   int ret;
+
+   mutex_lock(&monc->mutex);
+   while (monc->have_osdmap < epoch) {
+   mutex_unlock(&monc->mutex);
+
+   if (timeout != 0 && time_after_eq(jiffies, started + timeout))
+   return -ETIMEDOUT;
+
+   ret = wait_event_interruptible_timeout(monc->client->auth_wq,
+                                monc->have_osdmap >= epoch, timeout);
+   if (ret < 0)
+   return ret;
+
+   mutex_lock(&monc->mutex);
+   }
+
+   mutex_unlock(&monc->mutex);
+   return 0;
+}
+EXPORT_SYMBOL(ceph_monc_wait_osdmap);
 
 /*
  *
-- 
1.7.10.4



[PATCH v3 4/4] rbd: make sure we have latest osdmap on 'rbd map'

2014-05-22 Thread Ilya Dryomov
Given an existing idle mapping (img1), mapping an image (img2) in
a newly created pool (pool2) fails:

$ ceph osd pool create pool1 8 8
$ rbd create --size 1000 pool1/img1
$ sudo rbd map pool1/img1
$ ceph osd pool create pool2 8 8
$ rbd create --size 1000 pool2/img2
$ sudo rbd map pool2/img2
rbd: sysfs write failed
rbd: map failed: (2) No such file or directory

This is because client instances are shared by default and we don't
request an osdmap update when bumping a ref on an existing client.  The
fix is to use the mon_get_version request to see if the osdmap we have
is the latest, and block until the requested update is received if it's
not.

Fixes: http://tracker.ceph.com/issues/8184

Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com
---
v2:
- send mon_get_version request and wait for a reply only if we were
  unable to locate the pool (i.e. don't hurt the common case)

v3:
- make use of the updated MMonGetVersionReply userspace code, which
  will now populate MMonGetVersionReply tid with the tid of the
  original MMonGetVersion request

 drivers/block/rbd.c |   36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 552a2edcaa74..daf7b4659b4a 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4683,6 +4683,38 @@ out_err:
 }
 
 /*
+ * Return pool id (>= 0) or a negative error code.
+ */
+static int rbd_add_get_pool_id(struct rbd_client *rbdc, const char *pool_name)
+{
+   u64 newest_epoch;
+   unsigned long timeout = rbdc->client->options->mount_timeout * HZ;
+   int tries = 0;
+   int ret;
+
+again:
+   ret = ceph_pg_poolid_by_name(rbdc->client->osdc.osdmap, pool_name);
+   if (ret == -ENOENT && tries++ < 1) {
+   ret = ceph_monc_do_get_version(&rbdc->client->monc, "osdmap",
+                                  &newest_epoch);
+   if (ret < 0)
+   return ret;
+
+   if (rbdc->client->osdc.osdmap->epoch < newest_epoch) {
+   ceph_monc_request_next_osdmap(&rbdc->client->monc);
+   (void) ceph_monc_wait_osdmap(&rbdc->client->monc,
+                                newest_epoch, timeout);
+   goto again;
+   } else {
+   /* the osdmap we have is new enough */
+   return -ENOENT;
+   }
+   }
+
+   return ret;
+}
+
+/*
  * An rbd format 2 image has a unique identifier, distinct from the
  * name given to it by the user.  Internally, that identifier is
  * what's used to specify the names of objects related to the image.
@@ -5053,7 +5085,6 @@ static ssize_t do_rbd_add(struct bus_type *bus,
struct rbd_options *rbd_opts = NULL;
struct rbd_spec *spec = NULL;
struct rbd_client *rbdc;
-   struct ceph_osd_client *osdc;
bool read_only;
int rc = -ENOMEM;
 
@@ -5075,8 +5106,7 @@ static ssize_t do_rbd_add(struct bus_type *bus,
}
 
/* pick the pool */
-   osdc = &rbdc->client->osdc;
-   rc = ceph_pg_poolid_by_name(osdc->osdmap, spec->pool_name);
+   rc = rbd_add_get_pool_id(rbdc, spec->pool_name);
if (rc < 0)
goto err_out_client;
spec->pool_id = (u64)rc;
-- 
1.7.10.4



[PATCH v3 2/4] libceph: mon_get_version request infrastructure

2014-05-22 Thread Ilya Dryomov
Add support for mon_get_version requests to libceph.  This reuses much
of the ceph_mon_generic_request infrastructure, with one exception.
Older OSDs don't set mon_get_version reply hdr->tid even if the
original request had a non-zero tid, which makes it impossible to
lookup ceph_mon_generic_request contexts by tid in get_generic_reply()
for such replies.  As a workaround, we allocate a reply message on the
reply path.  This can probably interfere with revoke, but I don't see
a better way.

Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com
---
 include/linux/ceph/mon_client.h |9 ++-
 net/ceph/ceph_common.c  |2 +
 net/ceph/debugfs.c  |2 +
 net/ceph/mon_client.c   |  123 +--
 4 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
index a486f390dfbe..585ef9450e9d 100644
--- a/include/linux/ceph/mon_client.h
+++ b/include/linux/ceph/mon_client.h
@@ -40,9 +40,9 @@ struct ceph_mon_request {
 };
 
 /*
- * ceph_mon_generic_request is being used for the statfs and poolop requests
- * which are bening done a bit differently because we need to get data back
- * to the caller
+ * ceph_mon_generic_request is being used for the statfs, poolop and
+ * mon_get_version requests which are being done a bit differently
+ * because we need to get data back to the caller
  */
 struct ceph_mon_generic_request {
struct kref kref;
@@ -108,6 +108,9 @@ extern void ceph_monc_request_next_osdmap(struct ceph_mon_client *monc);
 extern int ceph_monc_do_statfs(struct ceph_mon_client *monc,
   struct ceph_statfs *buf);
 
+extern int ceph_monc_do_get_version(struct ceph_mon_client *monc,
+   const char *what, u64 *newest);
+
 extern int ceph_monc_open_session(struct ceph_mon_client *monc);
 
 extern int ceph_monc_validate_auth(struct ceph_mon_client *monc);
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index 67d7721d237e..1675021d8c12 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -72,6 +72,8 @@ const char *ceph_msg_type_name(int type)
case CEPH_MSG_MON_SUBSCRIBE_ACK: return "mon_subscribe_ack";
case CEPH_MSG_STATFS: return "statfs";
case CEPH_MSG_STATFS_REPLY: return "statfs_reply";
+   case CEPH_MSG_MON_GET_VERSION: return "mon_get_version";
+   case CEPH_MSG_MON_GET_VERSION_REPLY: return "mon_get_version_reply";
case CEPH_MSG_MDS_MAP: return "mds_map";
case CEPH_MSG_CLIENT_SESSION: return "client_session";
case CEPH_MSG_CLIENT_RECONNECT: return "client_reconnect";
diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c
index 8903dcee8d8e..d1a62c69a9f4 100644
--- a/net/ceph/debugfs.c
+++ b/net/ceph/debugfs.c
@@ -129,6 +129,8 @@ static int monc_show(struct seq_file *s, void *p)
seq_printf(s, "%llu statfs\n", req->tid);
else if (op == CEPH_MSG_POOLOP)
seq_printf(s, "%llu poolop\n", req->tid);
+   else if (op == CEPH_MSG_MON_GET_VERSION)
+   seq_printf(s, "%llu mon_get_version", req->tid);
else
seq_printf(s, "%llu unknown\n", req->tid);
}
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 2ac9ef35110b..6b46f1205ceb 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -477,14 +477,13 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con,
return m;
 }
 
-static int do_generic_request(struct ceph_mon_client *monc,
- struct ceph_mon_generic_request *req)
+static int __do_generic_request(struct ceph_mon_client *monc, u64 tid,
+   struct ceph_mon_generic_request *req)
 {
int err;
 
/* register request */
-   mutex_lock(&monc->mutex);
-   req->tid = ++monc->last_tid;
+   req->tid = tid != 0 ? tid : ++monc->last_tid;
req->request->hdr.tid = cpu_to_le64(req->tid);
__insert_generic_request(monc, req);
monc->num_generic_requests++;
@@ -496,13 +495,24 @@ static int do_generic_request(struct ceph_mon_client *monc,
mutex_lock(&monc->mutex);
rb_erase(&req->node, &monc->generic_request_tree);
monc->num_generic_requests--;
-   mutex_unlock(&monc->mutex);
 
if (!err)
err = req->result;
return err;
 }
 
+static int do_generic_request(struct ceph_mon_client *monc,
+ struct ceph_mon_generic_request *req)
+{
+   int err;
+
+   mutex_lock(&monc->mutex);
+   err = __do_generic_request(monc, 0, req);
+   mutex_unlock(&monc->mutex);
+
+   return err;
+}
+
 /*
  * statfs
  */
@@ -579,6 +589,96 @@ out:
 }
 EXPORT_SYMBOL(ceph_monc_do_statfs);
 
+static void handle_get_version_reply(struct ceph_mon_client *monc,
+struct ceph_msg *msg)
+{
+   struct 

collectd / graphite / grafana .. calamari?

2014-05-22 Thread Ricardo Rocha
Hi.

I saw the thread a couple of days ago on ceph-users regarding collectd...
and yes, I've been working on something similar for the last few days
:)

https://github.com/rochaporto/collectd-ceph

It has a set of collectd plugins pushing metrics which mostly map what
the ceph commands return. In our setup it pushes them to Graphite, and
the dashboards rely on Grafana (there's a screenshot at the link above).

As it relies on common building blocks, it's easily extensible, and
we'll come up with new dashboards soon - things like plotting OSD data
against the metrics from the collectd disk plugin, which we also
deploy.
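
For anyone wiring something similar up by hand: the Carbon plaintext
protocol that Graphite accepts is just "metric.path value timestamp\n"
over TCP (port 2003 by default). A minimal, self-contained sketch in C -
the host and metric name below are placeholders, not anything taken from
collectd-ceph:

/*
 * graphite_push.c -- push one metric to Graphite/Carbon over the
 * plaintext protocol ("metric.path value timestamp\n", TCP port 2003).
 */
#include <netdb.h>
#include <stdio.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const char *host = "graphite.example.com";	/* placeholder */
	const char *port = "2003";
	struct addrinfo hints = { .ai_socktype = SOCK_STREAM };
	struct addrinfo *res;
	char line[128];
	int fd, n;

	if (getaddrinfo(host, port, &hints, &res) != 0)
		return 1;
	fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
	if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
		perror("connect");
		freeaddrinfo(res);
		return 1;
	}
	freeaddrinfo(res);

	/* e.g. "ceph.cluster.osd.up 42 1400745600" */
	n = snprintf(line, sizeof(line), "ceph.cluster.osd.up %d %ld\n",
		     42, (long)time(NULL));
	write(fd, line, n);
	close(fd);
	return 0;
}

collectd's write_graphite plugin speaks the same protocol, with batching
and reconnects handled for you.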

This email is mostly to share the work, but also to ask about Calamari.
I asked Patrick after the Red Hat/Inktank news and have no idea what it
provides, but I'm sure it comes with lots of extra sauce - he suggested
asking on the list.

What's the timeline for having it open sourced? It would be great to have
a look at it, and as there's work from different people in this area,
maybe we could start working together on some fancier monitoring tools.

Regards,
  Ricardo