Re: ceph perf command?

2015-01-10 Thread Sage Weil
On Sat, 10 Jan 2015, Sage Weil wrote:
> Ceph is collecting all kinds of metrics internally that are exposed 
> through the admin socket or via the 'ceph perf dump' command.  There's no 
> convenient way to watch these values from the command line, though.
> 
> A while back I wrote script/perf-watch.py that tries to provide something 
> that spits out a line every second (a la vmstat, iostat, etc.) of whatever 
> metrics you specify.  It's a bit kludgey but looks something like this:
> 
> $ script/perf-watch.py -s out/osd.0.asok filestore.bytes osd.wr filestore.commitcycle osd.op osd.op_rw

One note: this currently only works on a vstart cluster (it assumes 
./ceph) and I had to turn off the dev warning:

-DEVMODEMSG = '*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***'
+#DEVMODEMSG = '*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***'
+DEVMODEMSG = ''

(Incidentally, perhaps we should just turn that off anyway?)

sage


ceph perf command?

2015-01-10 Thread Sage Weil
Ceph is collecting all kinds of metrics internally that are exposed 
through the admin socket or via the 'ceph perf dump' command.  There's no 
convenient way to watch these values from the command line, though.

A while back I wrote script/perf-watch.py that tries to provide something 
that spits out a line every second (a la vmstat, iostat, etc.) of whatever 
metrics you specify.  It's a bit kludgey but looks something like this:

$ script/perf-watch.py -s out/osd.0.asok filestore.bytes osd.wr filestore.commitcycle osd.op osd.op_rw
# filestore.bytes filestore.commitcycle   osd.op osd.op_rw
                0                     0        0         0
                0                     0        0         0
                0                     0        0         0
         29375672                     0        1         0
         37768744                     0        6         0
         33572224                     0        4         0
                0                     0        0         0
         46161808                     1        3         0
                0                     0        0         0
                0                     0        0         0
# filestore.bytes filestore.commitcycle   osd.op osd.op_rw
                0                     0        0         0
                0                     0        0         0
         25179152                     1        2         0
         37768736                     0        6         0
                0                     1        0         0
                0                     0        0         0
...

You can specify either individual metrics ('osd.op') or a full category 
(just 'osd').  The problem is usually that there are so many metrics that 
doing the full set requires a *really* wide monitor to be useful.
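(To make the mechanics concrete: the counters in 'perf dump' are
cumulative, so a watcher only has to sample them once a second and print
the deltas.  A minimal sketch of that loop in C -- not the actual
perf-watch.py, and with fetch_metric() as a fake stand-in for querying
the admin socket and parsing the JSON:)

#include <stdio.h>
#include <unistd.h>

/* stand-in for the admin-socket query; fakes a growing counter */
static long long fetch_metric(const char *name)
{
    static long long fake;
    (void)name;
    fake += 4096;
    return fake;
}

int main(void)
{
    const char *metrics[] = { "filestore.bytes", "osd.op" };
    enum { N = sizeof(metrics) / sizeof(metrics[0]) };
    long long prev[N] = { 0 };

    printf("# %s %s\n", metrics[0], metrics[1]);
    for (int tick = 0; ; tick++) {
        for (int i = 0; i < N; i++) {
            long long cur = fetch_metric(metrics[i]);
            /* cumulative counter -> per-interval delta */
            printf(" %12lld", tick ? cur - prev[i] : 0LL);
            prev[i] = cur;
        }
        printf("\n");
        fflush(stdout);
        sleep(1);
    }
    return 0;
}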

Anyway, two thoughts:

1) We could incorporate this into the normal 'ceph' cli tool.  Maybe a 
'ceph perf <daemon> <metrics>', which works similarly to 
'ceph daemon ...'.

2) We could mark certain metrics as the 'interesting' ones and make it 
default to showing those when no others are specified.  And/or possibly do 
the same for each group (osd, filestore, etc.).  That would steer 
admins towards the ones that are actually helpful in telling what the 
cluster is doing.

Thoughts?
sage



Re: [PATCH] rbd: convert to blk-mq

2015-01-10 Thread Alex Elder
On 01/10/2015 12:31 PM, Christoph Hellwig wrote:
> This converts the rbd driver to use the blk-mq infrastructure.  Except
> for switching to a per-request work item this is almost mechanical.
> 
> This was tested by Alexandre DERUMIER in November, and found to give
> him 12 iops, although the only comparison available was an old
> 3.10 kernel which gave 8iops.

I'm coming up to speed with the blk-mq stuff only now.  It looks
like requests are sent to the driver via ->queue_rq() rather than
the driver taking them via blk_fetch_request(q).

Previously we would pull as many requests as were available, put
them on the device's request queue, and then activate the rbd
workqueue to handle them one-by-one using rbd_handle_request().

Now, the rbd queue_rq method, rbd_queue_rq(), adds the request
to the rbd workqueue directly.  The work_struct implicitly follows
the request structure (which is set up by the blk-mq code).  We
have to do the REQ_TYPE_FS check at the time it's queued now,
rather than when it's fetched from the queue.  And finally we now
have to tell the blk-mq subsystem when we've started and ended a
request.
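
(For anyone else catching up on blk-mq, a sketch of the pdu trick being
described here -- illustrative values, not copied from this exact patch
revision: blk-mq allocates tag_set.cmd_size extra bytes directly behind
every struct request, so the driver gets a per-request work item for
free.)

/* at setup: reserve a work_struct behind each request */
tag_set.ops = &rbd_mq_ops;                     /* .queue_rq = rbd_queue_rq */
tag_set.cmd_size = sizeof(struct work_struct);

/* in rbd_queue_workfn(): map the work item back to its request */
struct request *rq = blk_mq_rq_from_pdu(work); /* pdu sits right after rq */
struct rbd_device *rbd_dev = rq->q->queuedata;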

I didn't follow up on all the tag_set initialization values
so I assume you got that right (it looks reasonable to me).

Given the above, it looks like everything else should work
about the same as before; we're just handed requests rather
than asking for them.

With this patch applied, rbd_device->rq_queue is no longer
needed, so you should delete it.  I got two warnings about
end-of-line whitespace in your patch.  And I have one other
very small suggestion below.

Other than those things, this looks great to me.

Reviewed-by: Alex Elder 

> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/block/rbd.c | 118 +---
>  1 file changed, 67 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 3ec85df..52cd677 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c

. . .

(The following is in the new rbd_queue_rq().)

> + queue_work(rbd_wq, work);
> + return 0;

return BLK_MQ_RQ_QUEUE_OK;

(Because the symbolic values are explicitly checked
by the caller.)
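
(With that, the whole function stays tiny; a sketch of how it might
read with the symbolic value folded in:)

static int rbd_queue_rq(struct blk_mq_hw_ctx *hctx,
			const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;
	struct work_struct *work = blk_mq_rq_to_pdu(rq);

	queue_work(rbd_wq, work);
	return BLK_MQ_RQ_QUEUE_OK;	/* checked symbolically by the caller */
}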

>  }
>  
>  /*

. . .



Re: ceph osd df

2015-01-10 Thread Sage Weil
On Sat, 10 Jan 2015, Mykola Golub wrote:
> On Mon, Jan 05, 2015 at 11:03:40AM -0800, Sage Weil wrote:
> > We see a fair number of issues and confusion with OSD utilization and 
> > unfortunately there is no easy way to see a summary of the current OSD 
> > utilization state.  'ceph pg dump' includes raw data but it is not very 
> > friendly.  'ceph osd tree' shows weights but not actual utilization.  
> > 'ceph health detail' tells you the nearfull osds but only when they reach 
> > the warning threshold.
> > 
> > Opened a ticket for a new command that summarizes just the relevant info:
> > 
> > http://tracker.ceph.com/issues/10452
> > 
> > Suggestions welcome.  It's a pretty simple implementation (the mon has 
> > all the info; just need to add the command to present it) so I'm hoping it 
> > can get into hammer.  If anyone is interested in doing the 
> > implementation that would be great too!
> 
> I am interested in implementing this.
> 
> Here is my approach, for preliminary review and discussion.
>
> https://github.com/ceph/ceph/pull/3347

Awesome!  I made a few comments on the pull request.

> Only plain text format is available currently. As both "osd only" and
> "tree" outputs look useful I implemented both and added a "tree" option
> to tell which one to choose.

This sounds fine to me.  We will want to include the formatted output 
before merging, though!

> In http://tracker.ceph.com/issues/10452#note-2 Travis Rhoden suggested
> extending the 'ceph osd tree' command to provide this data instead, but
> I prefer to have many small specialized commands instead of one with
> large output. But if other people also think that it is better to add
> a '--detail' option to osd tree instead of a new command, I will change this.

Works for me.
 
> Also, I am not sure I understood how the standard deviation should be
> calculated. Sage's note in 10452:
> 
>  - standard deviation (of normalized
>    actual_osd_utilization/crush_weight/reweight value)
>
> I don't see why utilization should be normalized by the
> reweight/crush_weight ratio. As I understand it, the goal is to have
> utilization be the same for all devices (and thus the deviation as small
> as possible), no matter what reweight values we have?

Yeah, I think you're right.  If I'm reading the code correctly you're still 
including reweight in there, but I think it can be safely dropped.
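
(For the record, one plausible reading of the unnormalized version, with
u_i the %UTIL of OSD i out of n -- the exact scaling in the PR may
differ:

    \bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i \qquad
    \mathrm{VAR}_i = \frac{u_i}{\bar{u}} \qquad
    \mathrm{DEV} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (u_i - \bar{u})^2}

so in the first sample below, VAR for osd.0 is 18.12/18.13 ~= 1.00.)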

> Some examples of command output for my dev environments:
> 
>  % ceph osd df
>  ID WEIGHT REWEIGHT %UTIL VAR
>   0   1.00     1.00 18.12 1.00
>   1   1.00     1.00 18.14 1.00
>   2   1.00     1.00 18.13 1.00

I wonder if we should try to standardize the table formats.  'ceph osd 
tree' currently looks like

# id    weight  type name       up/down reweight
-1      3       root default
-2      3               host maetl
0       1                       osd.0   up      1
1       1                       osd.1   up      1
2       1                       osd.2   up      1

That is, lowercase headers (with a # header prefix).  It's also not using 
TableFormatter (which it predates).

It's also pretty sloppy with the precision and formatting:

$ ./ceph osd crush reweight osd.1 .0001
reweighted item id 1 name 'osd.1' to 0.0001 in crush map
$ ./ceph osd tree
# id    weight      type name       up/down reweight
-1      2           root default
-2      2                   host maetl
0       1                           osd.0   up      1
1       9.155e-05                   osd.1   up      1
2       1                           osd.2   up      1
$ ./ceph osd crush reweight osd.1 .001
reweighted item id 1 name 'osd.1' to 0.001 in crush map
$ ./ceph osd tree
# id    weight      type name       up/down reweight
-1      2.001       root default
-2      2.001               host maetl
0       1                           osd.0   up      1
1       0.0009918                   osd.1   up      1
2       1                           osd.2   up      1

Given that the *actual* precision of these weights is 16.16 bit 
fixed-point, that's a lower bound of about .00002 (one part in 2^16).  
I'm not sure we want to print 1.00000 all the time, though?  Although 
I suppose it's better than

       1
       2
  .00002

In a perfect world I suppose TableFormatter (or whatever) would adjust the 
precision of all printed values to the highest precision needed by any 
item in the list, but maybe just sticking to 5 digits for 
everything is best for simplicity.

Anyway, any interest in making a single stringify_weight() helper and 
fixing up 'ceph osd tree' to also use it and TableFormatter too?  :)
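
(Say, something like this hypothetical sketch: format at the full
5-decimal precision of a 16.16 weight, then trim -- whether to trim at
all being exactly the open question above:)

#include <stdio.h>
#include <string.h>

/* hypothetical helper: "%.5f", then strip trailing zeros and a bare dot */
static const char *stringify_weight(double w, char *buf, size_t len)
{
    snprintf(buf, len, "%.5f", w);
    char *p = buf + strlen(buf) - 1;
    while (p > buf && *p == '0')
        *p-- = '\0';
    if (p > buf && *p == '.')
        *p = '\0';
    return buf;    /* 1.0 -> "1", 9.155e-05 -> "0.00009" */
}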

sage


>  --
>  AVG %UTIL: 18.13  MIN/MAX VAR: 1.00/1.00  DEV: 0
>  
>  % ceph osd df tree
>  ID WEIGHT REWEIGHT %UTIL VAR  NAME
>  -1   3.00        - 18.13 1.00 root default
>  -2   3.00        - 18.13 1.00     host zhuzha
>   0   1.00     1.00 18.12 1.00         osd.0
>   1   1.00     1.00 18.14 1.00         osd.1
>   2   1.00     1.00 18.13 1.00         osd.2
>  --
>  AVG %UTIL: 18.13  MIN/MAX VAR: 1.00/1

[PATCH] rbd: convert to blk-mq

2015-01-10 Thread Christoph Hellwig
This converts the rbd driver to use the blk-mq infrastructure.  Except
for switching to a per-request work item this is almost mechanical.

This was tested by Alexandre DERUMIER in November, and found to give
him 12 iops, although the only comparison available was an old
3.10 kernel which gave 8iops.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/rbd.c | 118 +---
 1 file changed, 67 insertions(+), 51 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 3ec85df..52cd677 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <linux/blk-mq.h>
 #include 
 #include 
 #include 
@@ -342,7 +343,6 @@ struct rbd_device {
 
struct list_headrq_queue;   /* incoming rq queue */
spinlock_t  lock;   /* queue, flags, open_count */
-   struct work_struct  rq_work;
 
struct rbd_image_header header;
unsigned long   flags;  /* possibly lock protected */
@@ -360,6 +360,9 @@ struct rbd_device {
atomic_tparent_ref;
struct rbd_device   *parent;
 
+   /* Block layer tags. */
+   struct blk_mq_tag_set   tag_set;
+
/* protects updating the header */
struct rw_semaphore header_rwsem;
 
@@ -1817,7 +1820,8 @@ static void rbd_osd_req_callback(struct ceph_osd_request *osd_req,
 
/*
 * We support a 64-bit length, but ultimately it has to be
-* passed to blk_end_request(), which takes an unsigned int.
+* passed to the block layer, which just supports a 32-bit
+* length field.
 */
obj_request->xferred = osd_req->r_reply_op_len[0];
rbd_assert(obj_request->xferred < (u64)UINT_MAX);
@@ -2281,7 +2285,10 @@ static bool rbd_img_obj_end_request(struct rbd_obj_request *obj_request)
more = obj_request->which < img_request->obj_request_count - 1;
} else {
rbd_assert(img_request->rq != NULL);
-   more = blk_end_request(img_request->rq, result, xferred);
+   
+   more = blk_update_request(img_request->rq, result, xferred);
+   if (!more)
+   __blk_mq_end_request(img_request->rq, result);
}
 
return more;
@@ -3310,8 +3317,10 @@ out:
return ret;
 }
 
-static void rbd_handle_request(struct rbd_device *rbd_dev, struct request *rq)
+static void rbd_queue_workfn(struct work_struct *work)
 {
+   struct request *rq = blk_mq_rq_from_pdu(work);
+   struct rbd_device *rbd_dev = rq->q->queuedata;
struct rbd_img_request *img_request;
struct ceph_snap_context *snapc = NULL;
u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
@@ -3319,6 +3328,13 @@ static void rbd_handle_request(struct rbd_device *rbd_dev, struct request *rq)
enum obj_operation_type op_type;
u64 mapping_size;
int result;
+   
+   if (rq->cmd_type != REQ_TYPE_FS) {
+   dout("%s: non-fs request type %d\n", __func__,
+   (int) rq->cmd_type);
+   result = -EIO;
+   goto err;
+   }
 
if (rq->cmd_flags & REQ_DISCARD)
op_type = OBJ_OP_DISCARD;
@@ -3358,6 +3374,8 @@ static void rbd_handle_request(struct rbd_device *rbd_dev, struct request *rq)
goto err_rq;
}
 
+   blk_mq_start_request(rq);
+
if (offset && length > U64_MAX - offset + 1) {
rbd_warn(rbd_dev, "bad request range (%llu~%llu)", offset,
 length);
@@ -3411,52 +3429,18 @@ err_rq:
 obj_op_name(op_type), length, offset, result);
ceph_put_snap_context(snapc);
blk_end_request_all(rq, result);
+err:
+   blk_mq_end_request(rq, result);
 }
 
-static void rbd_request_workfn(struct work_struct *work)
+static int rbd_queue_rq(struct blk_mq_hw_ctx *hctx,
+   const struct blk_mq_queue_data *bd)
 {
-   struct rbd_device *rbd_dev =
-   container_of(work, struct rbd_device, rq_work);
-   struct request *rq, *next;
-   LIST_HEAD(requests);
-
-   spin_lock_irq(&rbd_dev->lock); /* rq->q->queue_lock */
-   list_splice_init(&rbd_dev->rq_queue, &requests);
-   spin_unlock_irq(&rbd_dev->lock);
-
-   list_for_each_entry_safe(rq, next, &requests, queuelist) {
-   list_del_init(&rq->queuelist);
-   rbd_handle_request(rbd_dev, rq);
-   }
-}
+   struct request *rq = bd->rq;
+   struct work_struct *work = blk_mq_rq_to_pdu(rq);
 
-/*
- * Called with q->queue_lock held and interrupts disabled, possibly on
- * the way to schedule().  Do not sleep here!
- */
-static void rbd_request_fn(struct request_queue *q)
-{
-   struct rbd_device *rbd_dev = q->queuedata;
-   struct request *rq;
-   int queued = 0;
-
-   rbd_assert(rbd_dev);
-
-   while ((rq

Re: gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - failure

2015-01-10 Thread Loic Dachary
I should add that this failure only happens on ARMv7 Ubuntu with gcc version 
4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu7).

On 10/01/2015 18:54, Loic Dachary wrote:
> Hi Kevin & Janne,
> 
> The test gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - fails on the current 
> gf-complete master. The first commit where it fails is
> 
> commit 474010a91d35fef5ca7dea77205b6a5c7e68c3e9
> Author: Janne Grunau 
> Date:   Wed Sep 17 16:10:25 2014 +0200
> 
> arm: NEON optimisations for gf_w16
> 
> Optimisations for the 4,16 split table region multiplications.
> 
> Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
> Region Best (MB/s):   532.14   W-Method: 16 -m SPLIT 16 4 -r SIMD -
> Region Best (MB/s):   212.34   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
> Region Best (MB/s):   801.36   W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
> Region Best (MB/s):    93.20   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
> Region Best (MB/s):   273.99   W-Method: 16 -m SPLIT 16 8 -
> Region Best (MB/s):   270.81   W-Method: 16 -m SPLIT 8 8 -
> Region Best (MB/s):    70.42   W-Method: 16 -m COMPOSITE 2 - -
> Region Best (MB/s):   393.54   W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
> 
> but the test did exit(0) on error instead of exit(1) and we failed to notice.
> 
> gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP -
> Args: 16 A -1 -m SPLIT 16 4 -r ALTMAP - / size (bytes): 524428
> Problem with region multiply (all values in hex):
>Target address base: 0x8fd08e.  Word 0x1 of 0x1fee.  Xor: 0
>Value: 2
>Original source word: d00a
>Product word: a000
>It should be: b01f
> 
> Do you have an idea why this happens?  For the record, here is the bisect 
> command I used:
> 
> git bisect start 6fdd8bc3d32cb2f7fa55d2de9dc7cc5bb2f885aa 
> 36e75c3efec08b1e9bdb9c1f69a5b0018abd8ac7
> git bisect run try.sh
> 
> #!/bin/bash
> log=$(git rev-parse HEAD)
> echo $log.log
> make distclean > $log.log 2>&1
> ./autogen.sh >> $log.log 2>&1
> ./configure >> $log.log 2>&1
> make -j4 >> $log.log 2>&1
> ! test/gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - 2>&1 | grep 'It should be'
> 
> Note that b87c41f282dccc1b3649e3ea3fb80d19f820310 fails the test for 
> different reasons:
> 
> Args: 16 A -1 -m SPLIT 16 4 -r ALTMAP - / size (bytes): 524428
> *** Error in `/home/ubuntu/f/gf-complete/test/.libs/lt-gf_unit': free(): 
> invalid pointer: 0x00ce7070 ***
> try.sh: line 8: 12193 Aborted test/gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP -
> 
> Cheers
> 

-- 
Loïc Dachary, Artisan Logiciel Libre





gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - failure

2015-01-10 Thread Loic Dachary
Hi Kevin & Janne,

The test gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - fails on the current 
gf-complete master. The first commit where it fails is

commit 474010a91d35fef5ca7dea77205b6a5c7e68c3e9
Author: Janne Grunau 
Date:   Wed Sep 17 16:10:25 2014 +0200

arm: NEON optimisations for gf_w16

Optimisations for the 4,16 split table region multiplications.

Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
Region Best (MB/s):   532.14   W-Method: 16 -m SPLIT 16 4 -r SIMD -
Region Best (MB/s):   212.34   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
Region Best (MB/s):   801.36   W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
Region Best (MB/s):    93.20   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
Region Best (MB/s):   273.99   W-Method: 16 -m SPLIT 16 8 -
Region Best (MB/s):   270.81   W-Method: 16 -m SPLIT 8 8 -
Region Best (MB/s):    70.42   W-Method: 16 -m COMPOSITE 2 - -
Region Best (MB/s):   393.54   W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -

but the test did exit(0) on error instead of exit(1) and we failed to notice.

gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP -
Args: 16 A -1 -m SPLIT 16 4 -r ALTMAP - / size (bytes): 524428
Problem with region multiply (all values in hex):
   Target address base: 0x8fd08e.  Word 0x1 of 0x1fee.  Xor: 0
   Value: 2
   Original source word: d00a
   Product word: a000
   It should be: b01f

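(A sanity check on the expected value in that report: multiplying by 2
in GF(2^16) is a left shift reduced by the field polynomial on overflow,
and with gf-complete's default polynomial 0x1100b for w=16 that indeed
gives b01f -- so it is the NEON ALTMAP product a000 that is wrong.  A
quick standalone check, assuming the default polynomial:)

#include <stdint.h>
#include <stdio.h>

/* multiply by 2 in GF(2^16) over x^16 + x^12 + x^3 + x + 1 (0x1100b) */
static uint16_t gf16_mul2(uint16_t a)
{
    uint32_t r = (uint32_t)a << 1;
    if (r & 0x10000)
        r ^= 0x1100b;
    return (uint16_t)r;
}

int main(void)
{
    printf("%04x\n", gf16_mul2(0xd00a));    /* prints b01f */
    return 0;
}
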
Do you have an idea why this happens?  For the record, here is the bisect 
command I used:

git bisect start 6fdd8bc3d32cb2f7fa55d2de9dc7cc5bb2f885aa 
36e75c3efec08b1e9bdb9c1f69a5b0018abd8ac7
git bisect run try.sh

#!/bin/bash
# build HEAD, run the failing gf_unit case, and grep for the mismatch;
# the leading '!' inverts grep, so 'git bisect run' sees exit 0 (good)
# when no "It should be" failure is printed and non-zero (bad) otherwise
log=$(git rev-parse HEAD)
echo $log.log
make distclean > $log.log 2>&1
./autogen.sh >> $log.log 2>&1
./configure >> $log.log 2>&1
make -j4 >> $log.log 2>&1
! test/gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP - 2>&1 | grep 'It should be'

Note that b87c41f282dccc1b3649e3ea3fb80d19f820310 fails the test for different 
reasons:

Args: 16 A -1 -m SPLIT 16 4 -r ALTMAP - / size (bytes): 524428
*** Error in `/home/ubuntu/f/gf-complete/test/.libs/lt-gf_unit': free(): 
invalid pointer: 0x00ce7070 ***
try.sh: line 8: 12193 Aborted test/gf_unit 16 A -1 -m SPLIT 16 4 -r ALTMAP -

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: New Defects reported by Coverity Scan for ceph

2015-01-10 Thread Haomai Wang
The first exception should be shadowed?

And the second exception seems strange, because other tests follow this same pattern.
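
(For reference, the path Coverity is flagging in the first one: if
ASSERT_TRUE returns early, the MPing allocated a few lines above it is
never freed.  If the warning is to be fixed rather than suppressed, a
minimal reorder might do -- a hedged sketch against the quoted test
body, not a tested patch:)

-	m = new MPing();
 	conn->send_keepalive();
 	CHECK_AND_WAIT_TRUE(conn->is_connected());
 	ASSERT_TRUE(conn->is_connected());
+	m = new MPing();
 	ASSERT_EQ(conn->send_message(m), 0);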

On Sat, Jan 10, 2015 at 10:36 PM,   wrote:
>
> Hi,
>
> Please find the latest report on new defect(s) introduced to ceph found with 
> Coverity Scan.
>
> 2 new defect(s) introduced to ceph found with Coverity Scan.
>
>
> New defect(s) Reported-by: Coverity Scan
> Showing 2 of 2 defect(s)
>
>
> ** CID 1260210:  Resource leak  (RESOURCE_LEAK)
> /test/msgr/test_msgr.cc: 537 in MessengerTest_ClientStandbyTest_Test::TestBody()()
>
> ** CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /test/msgr/test_msgr.cc: 579 in main()
> /test/msgr/test_msgr.cc: 579 in main()
> /test/msgr/test_msgr.cc: 579 in main()
> /test/msgr/test_msgr.cc: 579 in main()
> /test/msgr/test_msgr.cc: 579 in main()
> /test/msgr/test_msgr.cc: 579 in main()
>
>
> 
> *** CID 1260210:  Resource leak  (RESOURCE_LEAK)
> /test/msgr/test_msgr.cc: 537 in MessengerTest_ClientStandbyTest_Test::TestBody()()
> 531   usleep(300*1000);
> 532   // client should be standby, so we use original connection
> 533   {
> 534 m = new MPing();
> 535 conn->send_keepalive();
> 536 CHECK_AND_WAIT_TRUE(conn->is_connected());
> >>> CID 1260210:  Resource leak  (RESOURCE_LEAK)
> >>> Variable "m" going out of scope leaks the storage it points to.
> 537 ASSERT_TRUE(conn->is_connected());
> 538 ASSERT_EQ(conn->send_message(m), 0);
> 539 Mutex::Locker l(cli_dispatcher.lock);
> 540 while (!cli_dispatcher.got_new)
> 541   cli_dispatcher.cond.Wait(cli_dispatcher.lock);
> 542 cli_dispatcher.got_new = false;
>
> 
> *** CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /test/msgr/test_msgr.cc: 579 in main()
> 573 // must be defined). This dummy test keeps gtest_main linked in.
> 574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) 
> {}
> 575
> 576 #endif
> 577
> 578
> >>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main(int, char **)" an exception of type
> >>> "ceph::FailedAssertion" is thrown and never caught.
> 579 int main(int argc, char **argv) {
> 580   vector<const char*> args;
> 581   argv_to_vec(argc, (const char **)argv, args);
> 582
> 583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
> CODE_ENVIRONMENT_UTILITY, 0);
> 584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
> /test/msgr/test_msgr.cc: 579 in main()
> 573 // must be defined). This dummy test keeps gtest_main linked in.
> 574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) 
> {}
> 575
> 576 #endif
> 577
> 578
> >>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main(int, char **)" an exception of type
> >>> "ceph::FailedAssertion" is thrown and never caught.
> 579 int main(int argc, char **argv) {
> 580   vector<const char*> args;
> 581   argv_to_vec(argc, (const char **)argv, args);
> 582
> 583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
> CODE_ENVIRONMENT_UTILITY, 0);
> 584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
> /test/msgr/test_msgr.cc: 579 in main()
> 573 // must be defined). This dummy test keeps gtest_main linked in.
> 574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) 
> {}
> 575
> 576 #endif
> 577
> 578
> >>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main(int, char **)" an exception of type
> >>> "ceph::FailedAssertion" is thrown and never caught.
> 579 int main(int argc, char **argv) {
> 580   vector<const char*> args;
> 581   argv_to_vec(argc, (const char **)argv, args);
> 582
> 583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
> CODE_ENVIRONMENT_UTILITY, 0);
> 584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
> /test/msgr/test_msgr.cc: 579 in main()
> 573 // must be defined). This dummy test keeps gtest_main linked in.
> 574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) 
> {}
> 575
> 576 #endif
> 577
> 578
> >>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main(int, char **)" an exception of type
> >>> "ceph::FailedAssertion" is thrown and never caught.
> 579 int main(int argc, char **argv) {
> 580   vector<const char*> args;
> 581   argv_to_vec(argc, (const char **)argv, args);
> 582
> 583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
> CODE_ENVIRONMENT_UTILITY, 0);
> 584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
> /test/msgr/test_msgr.cc: 579 in main()
> 573 // must be defined). This dummy test keeps gtest_main linked in.
> 574 TEST(DummyTest, ValueParameter

New Defects reported by Coverity Scan for ceph

2015-01-10 Thread scan-admin

Hi,

Please find the latest report on new defect(s) introduced to ceph found with 
Coverity Scan.

2 new defect(s) introduced to ceph found with Coverity Scan.


New defect(s) Reported-by: Coverity Scan
Showing 2 of 2 defect(s)


** CID 1260210:  Resource leak  (RESOURCE_LEAK)
/test/msgr/test_msgr.cc: 537 in MessengerTest_ClientStandbyTest_Test::TestBody()()

** CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
/test/msgr/test_msgr.cc: 579 in main()
/test/msgr/test_msgr.cc: 579 in main()
/test/msgr/test_msgr.cc: 579 in main()
/test/msgr/test_msgr.cc: 579 in main()
/test/msgr/test_msgr.cc: 579 in main()
/test/msgr/test_msgr.cc: 579 in main()



*** CID 1260210:  Resource leak  (RESOURCE_LEAK)
/test/msgr/test_msgr.cc: 537 in MessengerTest_ClientStandbyTest_Test::TestBody()()
531   usleep(300*1000);
532   // client should be standby, so we use original connection
533   {
534 m = new MPing();
535 conn->send_keepalive();
536 CHECK_AND_WAIT_TRUE(conn->is_connected());
>>> CID 1260210:  Resource leak  (RESOURCE_LEAK)
>>> Variable "m" going out of scope leaks the storage it points to.
537 ASSERT_TRUE(conn->is_connected());
538 ASSERT_EQ(conn->send_message(m), 0);
539 Mutex::Locker l(cli_dispatcher.lock);
540 while (!cli_dispatcher.got_new)
541   cli_dispatcher.cond.Wait(cli_dispatcher.lock);
542 cli_dispatcher.got_new = false;


*** CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
/test/msgr/test_msgr.cc: 579 in main()
573 // must be defined). This dummy test keeps gtest_main linked in.
574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) {}
575 
576 #endif
577 
578 
>>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
>>> In function "main(int, char **)" an exception of type 
>>> "ceph::FailedAssertion" is thrown and never caught.
579 int main(int argc, char **argv) {
580   vector<const char*> args;
581   argv_to_vec(argc, (const char **)argv, args);
582 
583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
CODE_ENVIRONMENT_UTILITY, 0);
584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
/test/msgr/test_msgr.cc: 579 in main()
573 // must be defined). This dummy test keeps gtest_main linked in.
574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) {}
575 
576 #endif
577 
578 
>>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
>>> In function "main(int, char **)" an exception of type 
>>> "ceph::FailedAssertion" is thrown and never caught.
579 int main(int argc, char **argv) {
580   vector<const char*> args;
581   argv_to_vec(argc, (const char **)argv, args);
582 
583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
CODE_ENVIRONMENT_UTILITY, 0);
584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
/test/msgr/test_msgr.cc: 579 in main()
573 // must be defined). This dummy test keeps gtest_main linked in.
574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) {}
575 
576 #endif
577 
578 
>>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
>>> In function "main(int, char **)" an exception of type 
>>> "ceph::FailedAssertion" is thrown and never caught.
579 int main(int argc, char **argv) {
580   vector<const char*> args;
581   argv_to_vec(argc, (const char **)argv, args);
582 
583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
CODE_ENVIRONMENT_UTILITY, 0);
584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
/test/msgr/test_msgr.cc: 579 in main()
573 // must be defined). This dummy test keeps gtest_main linked in.
574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) {}
575 
576 #endif
577 
578 
>>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
>>> In function "main(int, char **)" an exception of type 
>>> "ceph::FailedAssertion" is thrown and never caught.
579 int main(int argc, char **argv) {
580   vector<const char*> args;
581   argv_to_vec(argc, (const char **)argv, args);
582 
583   global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, 
CODE_ENVIRONMENT_UTILITY, 0);
584   g_ceph_context->_conf->set_val("auth_cluster_required", "none");
/test/msgr/test_msgr.cc: 579 in main()
573 // must be defined). This dummy test keeps gtest_main linked in.
574 TEST(DummyTest, ValueParameterizedTestsAreNotSupportedOnThisPlatform) {}
575 
576 #endif
577 
578 
>>> CID 1260212:  Uncaught exception  (UNCAUGHT_EXCEPT)
>>> In function "main(int, char **)" an exception of type 
>>> "ceph::FailedAssertion" is thrown and never caught.
579 int main(int argc, char **argv) {
580

Re: ceph osd df

2015-01-10 Thread Mykola Golub
On Mon, Jan 05, 2015 at 11:03:40AM -0800, Sage Weil wrote:
> We see a fair number of issues and confusion with OSD utilization and 
> unfortunately there is no easy way to see a summary of the current OSD 
> utilization state.  'ceph pg dump' includes raw data but it is not very 
> friendly.  'ceph osd tree' shows weights but not actual utilization.  
> 'ceph health detail' tells you the nearfull osds but only when they reach 
> the warning threshold.
> 
> Opened a ticket for a new command that summarizes just the relevant info:
> 
>   http://tracker.ceph.com/issues/10452
> 
> Suggestions welcome.  It's a pretty simple implementation (the mon has 
> all the info; just need to add the command to present it) so I'm hoping it 
> can get into hammer.  If anyone is interested in doing the 
> implementation that would be great too!

I am interested in implementing this.

Here is my approach, for preliminary review and discussion.

https://github.com/ceph/ceph/pull/3347

Only plain text format is available currently. As both "osd only" and
"tree" outputs look useful I implemented both and added a "tree" option
to tell which one to choose.

In http://tracker.ceph.com/issues/10452#note-2 Travis Rhoden suggested
extending the 'ceph osd tree' command to provide this data instead, but
I prefer to have many small specialized commands instead of one with
large output. But if other people also think that it is better to add
a '--detail' option to osd tree instead of a new command, I will change this.

Also, I am not sure I understood how the standard deviation should be
calculated. Sage's note in 10452:

 - standard deviation (of normalized
   actual_osd_utilization/crush_weight/reweight value)

I don't see why utilization should be normalized by the
reweight/crush_weight ratio. As I understand it, the goal is to have
utilization be the same for all devices (and thus the deviation as small
as possible), no matter what reweight values we have?

Some examples of command output for my dev environments:

 % ceph osd df
 ID WEIGHT REWEIGHT %UTIL VAR
  0   1.00     1.00 18.12 1.00
  1   1.00     1.00 18.14 1.00
  2   1.00     1.00 18.13 1.00
 --
 AVG %UTIL: 18.13  MIN/MAX VAR: 1.00/1.00  DEV: 0
 
 % ceph osd df tree
 ID WEIGHT REWEIGHT %UTIL VAR  NAME
 -1   3.00        - 18.13 1.00 root default
 -2   3.00        - 18.13 1.00     host zhuzha
  0   1.00     1.00 18.12 1.00         osd.0
  1   1.00     1.00 18.14 1.00         osd.1
  2   1.00     1.00 18.13 1.00         osd.2
 --
 AVG %UTIL: 18.13  MIN/MAX VAR: 1.00/1.00  DEV: 0
 
 % ceph osd df
 ID WEIGHT REWEIGHT %UTIL VAR
  0   1.00     1.00 38.15 0.91
  1   1.00     1.00 44.15 1.06
  2   1.00     1.00 45.66 1.09
  3   1.00     1.00 44.15 1.06
  4   1.00     0.80 36.82 0.88
 --
 AVG %UTIL: 41.78  MIN/MAX VAR: 0.88/1.09  DEV: 6.19
 
 % ceph osd df tree
 ID WEIGHT REWEIGHT %UTIL VAR  NAME
 -1   5.00        - 41.78 1.00 root default
 -2   1.00        - 38.15 0.91     host osd1
  0   1.00     1.00 38.15 0.91         osd.0
 -3   1.00        - 44.15 1.06     host osd2
  1   1.00     1.00 44.15 1.06         osd.1
 -4   1.00        - 45.66 1.09     host osd3
  2   1.00     1.00 45.66 1.09         osd.2
 -5   1.00        - 44.15 1.06     host osd4
  3   1.00     1.00 44.15 1.06         osd.3
 -6   1.00        - 36.82 0.88     host osd5
  4   1.00     0.80 36.82 0.88         osd.4
 --
 AVG %UTIL: 41.78  MIN/MAX VAR: 0.88/1.09  DEV: 6.19

-- 
Mykola Golub