Re: rbd export speed limit

2013-02-12 Thread Stefan Priebe - Profihost AG
Hi,
On 12.02.2013 21:45, Andrey Korolyov wrote:
> you may be interested in throttle(1) as a workaround, combined with the
> stdout export option.
What is throttle(1)? I've never seen it. Wouldn't it be possible to use tc?

> By the way, on which interconnect have you managed
> to get such speeds,
Bonded Intel 2x 10GbE

> if you mean 'committed' bytes (i.e. not an
> almost-empty allocated image)?
Yes, committed images.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: File exists not handled in 0.48argonaut1

2013-02-12 Thread Mandell Degerness
I tried removing the corrupt log file (and all corrupt_log files) with
no success.

I then tried doing a journal flush and mkjournal.

Here is the log file after trying that (with the enhanced logging
turned on - ~50K lines)

http://dl.dropbox.com/u/766198/ceph-2.log.gz

On Mon, Feb 11, 2013 at 5:20 PM, Samuel Just  wrote:
> The actual problem appears to be a corrupted log file.  You should
> rename out of the way the directory:
> /mnt/osd97/current/corrupt_log_2013-02-08_18:50_2.fa8.  Then, restart
> the osd with debug osd = 20, debug filestore = 20, and debug ms = 1 in
> the [osd] section of the ceph.conf.
> -Sam
>
> On Mon, Feb 11, 2013 at 2:21 PM, Mandell Degerness
>  wrote:
>> Since the attachment apparently didn't work, here is a link to the log:
>>
>> http://dl.dropbox.com/u/766198/error17.log.gz
>>
>> On Mon, Feb 11, 2013 at 1:42 PM, Samuel Just  wrote:
>>> I don't see the more complete log.
>>> -Sam
>>>
>>> On Mon, Feb 11, 2013 at 11:12 AM, Mandell Degerness
>>>  wrote:
 Anyone have any thoughts on this?  It looks like I may have to wipe
 out the affected OSDs and rebuild them, but I'm afraid that may result
 in data loss because of the old OSD-first CRUSH map still being in place :(.

 On Fri, Feb 8, 2013 at 1:36 PM, Mandell Degerness
  wrote:
> We ran into an error which appears very much like a bug fixed in 0.44.
>
> This cluster is running version:
>
> ceph version 0.48.1argonaut 
> (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
>
> The error line is:
>
> Feb  8 18:50:07 192.168.8.14 ceph-osd: 2013-02-08 18:50:07.545682
> 7f40f9f08700  0 filestore(/mnt/osd97)  error (17) File exists not
> handled on operation 20 (11279344.0.0, or op 0, counting from 0)
>
> A more complete log is attached.
>
> First question: is this a known bug fixed in more recent versions?
>
> Second question: is there any hope of recovery?
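The debug settings Sam describes can be written as a ceph.conf fragment like the following (a sketch based on his instructions; adjust to your deployment before restarting the osd):

```ini
[osd]
        debug osd = 20
        debug filestore = 20
        debug ms = 1
```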


Re: Links to various language bindings

2013-02-12 Thread John Wilkins
Also, be sure to open bugs and assign them to me.

On Tue, Feb 12, 2013 at 12:29 PM, Josh Durgin  wrote:
> On 02/08/2013 01:06 AM, Wido den Hollander wrote:
>>
>> Hi,
>>
>> I knew that there were Java bindings for RADOS, but they weren't linked.
>>
>> Well, some searching on GitHub eventually led me to Noah's bindings [0],
>> but it took a while to find them.
>>
>> I expect new users to be less fortunate and end up searching endlessly
>> for them.
>>
>> The docs say this now:
>> http://ceph.com/docs/master/api/#rados-object-store-apis
>>
>> Only the libcephfs bindings are linked to, but they are part of the main
>> repository (which still puzzles me..).
>>
>> As far as I know, bindings exist for Java [0], PHP [1] and Python, but
>> they aren't listed anywhere.
>>
>> Shall I submit a patch to put this in the docs or should this go on
>> ceph.com itself?
>
>
> I think it'd be good to include in the docs.
>
> There are also Erlang bindings written against the current api:
> https://github.com/renzhi/erlrados
>
> There are a couple written against the older api, which wouldn't work
> anymore, but wouldn't be too hard to update:
>
> Ruby: https://github.com/johnl/desperados
> Haskell: https://github.com/athanatos/librados.hsc
>
>
>> I'd go for the docs so we can also include some simple samples for
>> people who are less experienced with Ceph/RADOS and just want to get
>> started, maybe developers whose only task is to work with RADOS.
>>
>> Wido
>>
>> [0]: https://github.com/noahdesu/java-rados
>> [1]: https://github.com/ceph/phprados
>
>



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599


Re: primary pg and replica pg are placed on the same node under "ssd-primary" rule

2013-02-12 Thread Gregory Farnum
Unfortunately this part doesn't fit into the CRUSH language. In order
to do it and segregate properly by node you need to have separate SSD
and HDD nodes, rather than interspersing them. (Or, if you were brave,
you could set up some much more specific rules and pull each replica
from a different rack/row, but that would take a bit of work and
enough machines for that mechanism to be useful.)
-Greg

On Tue, Feb 12, 2013 at 12:40 PM, ymorita...@gmail.com
 wrote:
> Hi,
>
> I am trying to test storage tiering using the following
> "ssd-primary" rule set from the Ceph documentation. However, there
> seems to be a chance that the primary PG and a replica PG are placed
> on the same node.
>
> Is there any way to avoid this at this point in time?
>
>   rule ssd-primary {
>   ruleset 4
>   type replicated
>   min_size 0
>   max_size 10
>   step take ssd
>   step chooseleaf 1 type host
>   step emit
>   step take platter
>   step chooseleaf -1 type host
>   step emit
>   }
>
> Ceph - CRUSH Maps
> http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-different-osds
>
> Thank you.
> Yuji
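To illustrate Greg's point about separate SSD and HDD nodes, the hierarchy below is one way it is usually arranged (a sketch; the bucket names `ssd-host-a`, `hdd-host-a`, etc. are invented for illustration, not from this thread). Because no host appears under both roots, the two `step take` branches of the ssd-primary rule can never resolve to the same node:

```
# Hypothetical CRUSH layout: each host holds only one media type,
# so "step take ssd" and "step take platter" select disjoint hosts.
root ssd {
        id -1
        alg straw
        hash 0
        item ssd-host-a weight 1.000
        item ssd-host-b weight 1.000
}
root platter {
        id -2
        alg straw
        hash 0
        item hdd-host-a weight 1.000
        item hdd-host-b weight 1.000
}
```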


Re: rbd export speed limit

2013-02-12 Thread Andrey Korolyov
Hi Stefan,

you may be interested in throttle(1) as a workaround, combined with the
stdout export option. By the way, on which interconnect have you managed
to get such speeds, if you mean 'committed' bytes (i.e. not an
almost-empty allocated image)?

On Wed, Feb 13, 2013 at 12:22 AM, Stefan Priebe  wrote:
> Hi,
>
> is there a speed limit option for rbd export? Right now I'm able to trigger
> several SLOW requests on important, valid requests just by exporting a
> snapshot which is not really important.
>
> rbd export runs at 2400MB/s and each OSD at 250MB/s, so it seems to block
> valid normal read / write operations.
>
> Greets,
> Stefan
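For illustration, the kind of pipe rate limiting that throttle(1) performs can be sketched in a few lines of Python. This is a hedged sketch of the general technique (pace writes so the average rate stays under a byte budget), not the actual throttle implementation; the function name `throttle_stream` is invented here:

```python
import time


def throttle_stream(reader, writer, limit_bps, chunk=64 * 1024):
    """Copy reader to writer, sleeping as needed so the average
    throughput stays at or below limit_bps bytes per second."""
    start = time.monotonic()
    sent = 0
    while True:
        buf = reader.read(chunk)
        if not buf:
            break
        writer.write(buf)
        sent += len(buf)
        # How long 'sent' bytes should have taken at the limit;
        # sleep off any surplus speed.
        expected = sent / limit_bps
        elapsed = time.monotonic() - start
        if expected > elapsed:
            time.sleep(expected - elapsed)
    return sent
```

With the real tool installed, the same idea for an export would look something like `rbd export pool/image@snap - | throttle -M 50 > image.raw` (the -M megabytes-per-second flag is assumed; check the local man page).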


Re: Links to various language bindings

2013-02-12 Thread Josh Durgin

On 02/08/2013 01:06 AM, Wido den Hollander wrote:

Hi,

I knew that there were Java bindings for RADOS, but they weren't linked.

Well, some searching on GitHub eventually led me to Noah's bindings [0],
but it took a while to find them.

I expect new users to be less fortunate and end up searching endlessly
for them.

The docs say this now:
http://ceph.com/docs/master/api/#rados-object-store-apis

Only the libcephfs bindings are linked to, but they are part of the main
repository (which still puzzles me..).

As far as I know, bindings exist for Java [0], PHP [1] and Python, but
they aren't listed anywhere.

Shall I submit a patch to put this in the docs or should this go on
ceph.com itself?


I think it'd be good to include in the docs.

There are also Erlang bindings written against the current api:
https://github.com/renzhi/erlrados

There are a couple written against the older api, which wouldn't work
anymore, but wouldn't be too hard to update:

Ruby: https://github.com/johnl/desperados
Haskell: https://github.com/athanatos/librados.hsc


I'd go for the docs so we can also include some simple samples for
people who are less experienced with Ceph/RADOS and just want to get
started, maybe developers whose only task is to work with RADOS.

Wido

[0]: https://github.com/noahdesu/java-rados
[1]: https://github.com/ceph/phprados




rbd export speed limit

2013-02-12 Thread Stefan Priebe

Hi,

is there a speed limit option for rbd export? Right now I'm able to
trigger several SLOW requests on important, valid requests just by
exporting a snapshot which is not really important.

rbd export runs at 2400MB/s and each OSD at 250MB/s, so it seems to
block valid normal read / write operations.


Greets,
Stefan


Re: slow requests, hunting for new mon

2013-02-12 Thread Chris Dunlop
On Tue, Feb 12, 2013 at 06:28:15PM +1100, Chris Dunlop wrote:
> Hi,
> 
> What are likely causes for "slow requests" and "monclient: hunting for new
> mon" messages? E.g.:
> 
> 2013-02-12 16:27:07.318943 7f9c0bc16700  0 monclient: hunting for new mon
> ...
> 2013-02-12 16:27:45.892314 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 included below; oldest blocked for > 30.383883 secs
> 2013-02-12 16:27:45.892323 7f9c13c26700  0 log [WRN] : slow request 30.383883 seconds old, received at 2013-02-12 16:27:15.508374: osd_op(client.9821.0:122242 rb.0.209f.74b0dc51.0120 [write 921600~4096] 2.981cf6bc) v4 currently no flag points reached
> 2013-02-12 16:27:45.892328 7f9c13c26700  0 log [WRN] : slow request 30.383782 seconds old, received at 2013-02-12 16:27:15.508475: osd_op(client.9821.0:122243 rb.0.209f.74b0dc51.0120 [write 987136~4096] 2.981cf6bc) v4 currently no flag points reached
> 2013-02-12 16:27:45.892334 7f9c13c26700  0 log [WRN] : slow request 30.383720 seconds old, received at 2013-02-12 16:27:15.508537: osd_op(client.9821.0:122244 rb.0.209f.74b0dc51.0120 [write 1036288~8192] 2.981cf6bc) v4 currently no flag points reached
> 2013-02-12 16:27:45.892338 7f9c13c26700  0 log [WRN] : slow request 30.383684 seconds old, received at 2013-02-12 16:27:15.508573: osd_op(client.9821.0:122245 rb.0.209f.74b0dc51.0122 [write 1454080~4096] 2.fff29a9a) v4 currently no flag points reached
> 2013-02-12 16:27:45.892341 7f9c13c26700  0 log [WRN] : slow request 30.328986 seconds old, received at 2013-02-12 16:27:15.563271: osd_op(client.9821.0:122246 rb.0.209f.74b0dc51.0122 [write 1482752~4096] 2.fff29a9a) v4 currently no flag points reached

Sorry, I forgot to add what led me to look into this in the first place...

The b2 machine is running a number of libvirt / kvm virtuals on rbd via
librados:


  
  

  
  
  
  
  


Around the time the "slow requests" messages pop up, the virtual machines
experience a spike in disk latency, e.g. from inside one of the virtuals:

                    load %user %nice  %sys  %iow  %stl %idle  dev rrqm/s wrqm/s  r/s  w/s rkB/s  wkB/s arq-sz aqu-sz    await rwait    wwait %util
2013-02-12-18:55:23  1.7   0.3   0.0   0.9  42.2   0.0  18.5  vdb    0.0   25.8  0.0  9.6  0.00 136.53  28.44   1.34   139.56  0.00    139.6  70.7
2013-02-12-18:55:38  1.5   0.9   0.0   0.4  35.8   0.0  42.8  vdb    0.0   26.6  0.0 10.8  0.00 143.20  26.52   0.46    41.98  0.00     42.0  40.0
2013-02-12-18:55:53  1.4   0.7   0.0   0.4  38.0   0.0  51.7  vdb    0.0    7.5  0.0  6.8  0.00  53.33  15.69   0.46    68.08  0.00     68.1  38.8
2013-02-12-18:56:10  2.2   0.0   0.0   0.1   8.0   0.0  15.8  vdb    0.0    2.1  0.0  0.2  0.00   9.18  78.00   0.98  4164.00  0.00   4164.0  70.6
2013-02-12-18:56:32  3.7   0.0   0.0   0.1   0.0   0.0   0.0  vdb    0.0    0.1  0.0  0.8  0.00   3.27   8.00  24.12 14519.78  0.00  14519.8 100.0
2013-02-12-18:56:47  5.1   0.5   0.0   0.6   0.0   0.0   0.0  vdb    0.0   11.5  0.0  5.4  0.00  65.87  24.40   1.60  3620.15  0.00   3620.1  88.4
2013-02-12-18:57:03  5.2   0.6   0.0   1.1   3.5   0.0   0.0  vdb    0.0   19.6  0.0  6.9  0.00 101.25  29.19   1.13   162.41  0.00    162.4  87.4
2013-02-12-18:57:20  5.2   0.6   0.0   0.9   9.7   0.0   0.0  vdb    0.0   44.6  0.0  9.7  0.00 214.82  44.27   2.41   248.22  0.00    248.2  85.3
2013-02-12-18:57:36  4.4   0.4   0.0   0.5  28.7   0.0  46.5  vdb    0.0   17.6  0.0  5.2  0.00  87.50  33.33   0.56   107.14  0.00    107.1  48.5

...corresponding to this in the b5 / osd.1 log:

2013-02-12 18:52:08.812880 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 18:55:18.851791 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 18:56:25.414948 7f9c13c26700  0 log [WRN] : 6 slow requests, 6 included below; oldest blocked for > 30.372124 secs
2013-02-12 18:56:25.414958 7f9c13c26700  0 log [WRN] : slow request 30.372124 seconds old, received at 2013-02-12 18:55:55.042767: osd_op(client.9821.0:144779 rb.0.209f.74b0dc51.0023 [write 1593344~4096] 2.1882ddb7) v4 currently no flag points reached
...
2013-02-12 18:57:13.427008 7f9c13c26700  0 log [WRN] : slow request 40.721769 seconds old, received at 2013-02-12 18:56:32.705190: osd_op(client.9821.0:146756 rb.0.209f.74b0dc51.0128 [write 819200~8192] 2.b4390173) v4 currently commit sent
2013-02-12 18:59:43.886517 7f9c0bc16700  0 monclient: hunting for new mon
2013-02-12 19:02:53.911641 7f9c0bc16700  0 monclient: hunting for new mon
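As an aside, pulling the blocked ages out of logs like the above is easy to script. The throwaway sketch below (not an official Ceph tool; the name `slow_request_ages` is invented) matches the "slow request N seconds old" lines shown here:

```python
import re

# Matches e.g. "slow request 30.372124 seconds old"
SLOW_RE = re.compile(r"slow request (\d+\.\d+) seconds old")


def slow_request_ages(lines):
    """Return the ages (in seconds) of all 'slow request' log entries."""
    return [float(m.group(1))
            for line in lines
            for m in SLOW_RE.finditer(line)]
```

Feeding it an osd log gives a quick view of how far behind the oldest request is.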


Cheers,

Chris.