[ceph-users] RGW performance test, put 30 thousand objects to one bucket, average latency 3 seconds

2014-07-02 Thread baijia...@126.com
Hi, everyone

When I used rest-bench to test RGW with the command:

rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write

I found that the RGW call to the "bucket_prepare_op" method is very slow, so I
looked at 'dump_historic_ops' and saw:
{ "description": "osd_op(client.4211.0:265984 .dir.default.4148.1 [call 
rgw.bucket_prepare_op] 3.b168f3d0 e37)",
  "received_at": "2014-07-03 11:07:02.465700",
  "age": "308.315230",
  "duration": "3.401743",
  "type_data": [
"commit sent; apply or cleanup",
{ "client": "client.4211",
  "tid": 265984},
[
{ "time": "2014-07-03 11:07:02.465852",
  "event": "waiting_for_osdmap"},
{ "time": "2014-07-03 11:07:02.465875",
  "event": "queue op_wq"},
{ "time": "2014-07-03 11:07:03.729087",
  "event": "reached_pg"},
{ "time": "2014-07-03 11:07:03.729120",
  "event": "started"},
{ "time": "2014-07-03 11:07:03.729126",
  "event": "started"},
{ "time": "2014-07-03 11:07:03.804366",
  "event": "waiting for subops from [19,9]"},
{ "time": "2014-07-03 11:07:03.804431",
  "event": "commit_queued_for_journal_write"},
{ "time": "2014-07-03 11:07:03.804509",
  "event": "write_thread_in_journal_buffer"},
{ "time": "2014-07-03 11:07:03.934419",
  "event": "journaled_completion_queued"},
{ "time": "2014-07-03 11:07:05.297282",
  "event": "sub_op_commit_rec"},
{ "time": "2014-07-03 11:07:05.297319",
  "event": "sub_op_commit_rec"},
{ "time": "2014-07-03 11:07:05.311217",
  "event": "op_applied"},
{ "time": "2014-07-03 11:07:05.867384",
  "event": "op_commit finish lock"},
{ "time": "2014-07-03 11:07:05.867385",
  "event": "op_commit"},
{ "time": "2014-07-03 11:07:05.867424",
  "event": "commit_sent"},
{ "time": "2014-07-03 11:07:05.867428",
  "event": "op_commit finish"},
{ "time": "2014-07-03 11:07:05.867443",
  "event": "done"}]]}]}

So I see two points of performance degradation: one is from "queue op_wq" to
"reached_pg", the other is from "journaled_completion_queued" to "op_commit".
And I must stress that there are so many ops writing to this one bucket object,
so how can I reduce the latency?





baijia...@126.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Teuthology: Need some input on how to add osd after cluster setup is done using Teuthology

2014-07-02 Thread Shambhu Rajak
Hi Teuthology users,

Can someone help me with how to add OSDs to a cluster that has already been set
up by Teuthology via a yaml file?
Beyond the OSDs that are mentioned in the roles section of the yaml file, I want
to add a few additional OSDs to the cluster as part of my scenario. So far I
haven't seen any task or method available in ceph.py for this.
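For reference, the roles section I am referring to looks something like this
(only a sketch; the host layout below is an example, not my actual job):

roles:
- [mon.a, mon.b, mon.c, osd.0, osd.1]
- [osd.2, osd.3]
- [client.0]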

Thanks & Regards,
Shambhu Rajak





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bypass Cache-Tiering for special reads (Backups)

2014-07-02 Thread Kyle Bader
> I was wondering, having a cache pool in front of an RBD pool is all fine
> and dandy, but imagine you want to pull backups of all your VMs (or one
> of them, or multiple...). Going to the cache for all those reads isn't
> only pointless, it'll also potentially fill up the cache and possibly
> evict actually frequently used data. Which got me thinking... wouldn't
> it be nifty if there was a special way of doing specific backup reads
> where you'd bypass the cache, ensuring the dirty cache contents get
> written to cold pool first? Or at least doing special reads where a
> cache-miss won't actually cache the requested data?
>
> AFAIK the backup routine for an RBD-backed KVM usually involves creating
> a snapshot of the RBD and putting that into a backup storage/tape, all
> done via librbd/API.
>
> Maybe something like that even already exists?

When used in the context of OpenStack Cinder, it does:

http://ceph.com/docs/next/rbd/rbd-openstack/#configuring-cinder-backup

You can have the backup pool use the default CRUSH rules, assuming the
default isn't your hot pool. Another option might be to put backups on
an erasure-coded pool; I'm not sure whether that has been tested, but in
principle it should work, since the objects composing a snapshot should be
immutable.
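For reference, the cinder-backup settings from that page look roughly like the
fragment below (only a sketch; the pool and user names are the documented
examples, so check the linked page for the authoritative values):

[DEFAULT]
backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder-backup
backup_ceph_pool = backups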

-- 

Kyle
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph RBD and Backup.

2014-07-02 Thread Irek Fasikhov
Hi, all.

Dear community, how do you back up Ceph RBD?

Thanks

-- 
Fasihov Irek (aka Kataklysm).
Best regards, Irek Nurgayazovich Fasikhov
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance is really bad when I run from vstart.sh

2014-07-02 Thread David Zafman

By default the vstart.sh setup would put all data below a directory called 
“dev” in the source tree.  In that case you’re using a single spindle.  The 
vstart script isn’t intended for performance testing.
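For example, with a vstart cluster everything lands on whatever filesystem holds
the source tree (a sketch; the environment variables and flags are the usual
vstart knobs, adjust to taste):

cd src
MON=3 OSD=3 MDS=0 ./vstart.sh -n -x -d   # spin up a local dev cluster
ls -d dev/osd*                           # every OSD data dir lives here, on one disk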

David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com

On Jul 2, 2014, at 5:48 PM, Zhe Zhang  wrote:

> Hi folks,
>  
> I run ceph on a single node that contains 25 hard drives, each at 7200 RPM.
> Writing raw data to the array achieves 2 GB/s, so I presumed the performance
> of ceph could go beyond 1 GB/s. But when I compile the ceph code and run it
> in development mode with vstart.sh, the average throughput is only 200 MB/s
> for rados bench write.
> I suspected it was due to debug mode when configuring the source code, so I
> disabled debugging with ./configure CFLAGS='-O3' CXXFLAGS='O3' (to avoid the
> '-g' flag). But it did not help at all.
> I then switched to the repository and installed ceph with ceph-deploy, and
> the performance reached 800 MB/s. Since I did not successfully set up ceph
> with ceph-deploy, and there are still some PGs in "creating+incomplete"
> state, I guess this could impact the performance.
> Anyway, could someone give me some suggestions? Why is it so slow when I run
> from vstart.sh?
>  
> Best,
> Zhe
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Performance is really bad when I run from vstart.sh

2014-07-02 Thread Zhe Zhang
Hi folks,

I run ceph on a single node that contains 25 hard drives, each at 7200 RPM.
Writing raw data to the array achieves 2 GB/s, so I presumed the performance
of ceph could go beyond 1 GB/s. But when I compile the ceph code and run it in
development mode with vstart.sh, the average throughput is only 200 MB/s for
rados bench write.
I suspected it was due to debug mode when configuring the source code, so I
disabled debugging with ./configure CFLAGS='-O3' CXXFLAGS='O3' (to avoid the
'-g' flag). But it did not help at all.
I then switched to the repository and installed ceph with ceph-deploy, and the
performance reached 800 MB/s. Since I did not successfully set up ceph with
ceph-deploy, and there are still some PGs in "creating+incomplete" state, I
guess this could impact the performance.
Anyway, could someone give me some suggestions? Why is it so slow when I run
from vstart.sh?

Best,
Zhe
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Yes, thanks.
-Sam

On Wed, Jul 2, 2014 at 4:21 PM, Pierre BLONDEAU
 wrote:
> Like that ?
>
> # ceph --admin-daemon /var/run/ceph/ceph-mon.william.asok version
> {"version":"0.82"}
> # ceph --admin-daemon /var/run/ceph/ceph-mon.jack.asok version
> {"version":"0.82"}
> # ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
> {"version":"0.82"}
>
> Pierre
>
> Le 03/07/2014 01:17, Samuel Just a écrit :
>
>> Can you confirm from the admin socket that all monitors are running
>> the same version?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 4:15 PM, Pierre BLONDEAU
>>  wrote:
>>>
>>> Le 03/07/2014 00:55, Samuel Just a écrit :
>>>
 Ah,

 ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
 /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
 /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
 ../ceph/src/osdmaptool: osdmap file
 'osd-20_osdmap.13258__0_4E62BB79__none'
 ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
 ../ceph/src/osdmaptool: osdmap file
 'osd-23_osdmap.13258__0_4E62BB79__none'
 ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
 6d5
 < tunable chooseleaf_vary_r 1

 Looks like the chooseleaf_vary_r tunable somehow ended up divergent?

 Pierre: do you recall how and when that got set?
>>>
>>>
>>>
>>> I am not sure to understand, but if I good remember after the update in
>>> firefly, I was in state : HEALTH_WARN crush map has legacy tunables and I
>>> see "feature set mismatch" in log.
>>>
>>> So if I good remeber, i do : ceph osd crush tunables optimal for the
>>> problem
>>> of "crush map" and I update my client and server kernel to 3.16rc.
>>>
>>> It's could be that ?
>>>
>>> Pierre
>>>
>>>
 -Sam

 On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just 
 wrote:
>
>
> Yeah, divergent osdmaps:
> 555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
> 6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none
>
> Joao: thoughts?
> -Sam
>
> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
>  wrote:
>>
>>
>> The files
>>
>> When I upgrade :
>>ceph-deploy install --stable firefly servers...
>>on each servers service ceph restart mon
>>on each servers service ceph restart osd
>>on each servers service ceph restart mds
>>
>> I upgraded from emperor to firefly. After repair, remap, replace, etc
>> ... I
>> have some PG which pass in peering state.
>>
>> I thought why not try the version 0.82, it could solve my problem. (
>> It's my mistake ). So, I upgrade from firefly to 0.83 with :
>>ceph-deploy install --testing servers...
>>..
>>
>> Now, all programs are in version 0.82.
>> I have 3 mons, 36 OSD and 3 mds.
>>
>> Pierre
>>
>> PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
>> directory.
>>
>> Le 03/07/2014 00:10, Samuel Just a écrit :
>>
>>> Also, what version did you upgrade from, and how did you upgrade?
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just 
>>> wrote:



 Ok, in current/meta on osd 20 and osd 23, please attach all files
 matching

 ^osdmap.13258.*

 There should be one such file on each osd. (should look something
 like
 osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
 you'll want to use find).

 What version of ceph is running on your mons?  How many mons do you
 have?
 -Sam

 On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
  wrote:
>
>
>
> Hi,
>
> I do it, the log files are available here :
> https://blondeau.users.greyc.fr/cephlog/debug20/
>
> The OSD's files are really big +/- 80M .
>
> After starting the osd.20 some other osd crash. I pass from 31 osd
> up
> to
> 16.
> I remark that after this the number of down+peering PG decrease
> from
> 367
> to
> 248. It's "normal" ? May be it's temporary, the time that the
> cluster
> verifies all the PG ?
>
> Regards
> Pierre
>
> Le 02/07/2014 19:16, Samuel Just a écrit :
>
>> You should add
>>
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>>
>> to the [osd] section of the ceph.conf and restart the osds.  I'd
>> like
>> all three logs if possible.
>>
>> Thanks
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>>
>>>
>>>
>>> Yes, but how i do that ?
>>>
>>> With a command li

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Pierre BLONDEAU

Like that ?

# ceph --admin-daemon /var/run/ceph/ceph-mon.william.asok version
{"version":"0.82"}
# ceph --admin-daemon /var/run/ceph/ceph-mon.jack.asok version
{"version":"0.82"}
# ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
{"version":"0.82"}

Pierre

Le 03/07/2014 01:17, Samuel Just a écrit :

Can you confirm from the admin socket that all monitors are running
the same version?
-Sam

On Wed, Jul 2, 2014 at 4:15 PM, Pierre BLONDEAU
 wrote:

Le 03/07/2014 00:55, Samuel Just a écrit :


Ah,

~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
/tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
/tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file
'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file
'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1

Looks like the chooseleaf_vary_r tunable somehow ended up divergent?

Pierre: do you recall how and when that got set?



I am not sure to understand, but if I good remember after the update in
firefly, I was in state : HEALTH_WARN crush map has legacy tunables and I
see "feature set mismatch" in log.

So if I good remeber, i do : ceph osd crush tunables optimal for the problem
of "crush map" and I update my client and server kernel to 3.16rc.

It's could be that ?

Pierre



-Sam

On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just  wrote:


Yeah, divergent osdmaps:
555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none

Joao: thoughts?
-Sam

On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
 wrote:


The files

When I upgrade :
   ceph-deploy install --stable firefly servers...
   on each servers service ceph restart mon
   on each servers service ceph restart osd
   on each servers service ceph restart mds

I upgraded from emperor to firefly. After repair, remap, replace, etc
... I
have some PG which pass in peering state.

I thought why not try the version 0.82, it could solve my problem. (
It's my mistake ). So, I upgrade from firefly to 0.83 with :
   ceph-deploy install --testing servers...
   ..

Now, all programs are in version 0.82.
I have 3 mons, 36 OSD and 3 mds.

Pierre

PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
directory.

Le 03/07/2014 00:10, Samuel Just a écrit :


Also, what version did you upgrade from, and how did you upgrade?
-Sam

On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just 
wrote:



Ok, in current/meta on osd 20 and osd 23, please attach all files
matching

^osdmap.13258.*

There should be one such file on each osd. (should look something like
osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
you'll want to use find).

What version of ceph is running on your mons?  How many mons do you
have?
-Sam

On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
 wrote:



Hi,

I do it, the log files are available here :
https://blondeau.users.greyc.fr/cephlog/debug20/

The OSD's files are really big +/- 80M .

After starting the osd.20 some other osd crash. I pass from 31 osd up
to
16.
I remark that after this the number of down+peering PG decrease from
367
to
248. It's "normal" ? May be it's temporary, the time that the cluster
verifies all the PG ?

Regards
Pierre

Le 02/07/2014 19:16, Samuel Just a écrit :


You should add

debug osd = 20
debug filestore = 20
debug ms = 1

to the [osd] section of the ceph.conf and restart the osds.  I'd
like
all three logs if possible.

Thanks
-Sam

On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
 wrote:




Yes, but how i do that ?

With a command like that ?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
--debug-ms
1'

By modify the /etc/ceph/ceph.conf ? This file is really poor
because I
use
udev detection.

When I have made these changes, you want the three log files or
only
osd.20's ?

Thank you so much for the help

Regards
Pierre

Le 01/07/2014 23:51, Samuel Just a écrit :


Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam

On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
 wrote:





Hi,

I join :
  - osd.20 is one of osd that I detect which makes crash other
OSD.
  - osd.23 is one of osd which crash when i start osd.20
  - mds, is one of my MDS

I cut log file because they are to big but. All is here :
https://blondeau.users.greyc.fr/cephlog/

Regards

Le 30/06/2014 17:35, Gregory Farnum a écrit :


What's the backtrace from the crashing OSDs?

Keep in mind that as a dev release, it's generally best not to
upgrade
to unnamed versions like 0.82 (but it's probably too late to go
back
now).





I will remember it the next time ;)


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
 wrote:




Hi,

After the upgrade to firefly, I h

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Can you confirm from the admin socket that all monitors are running
the same version?
-Sam

On Wed, Jul 2, 2014 at 4:15 PM, Pierre BLONDEAU
 wrote:
> Le 03/07/2014 00:55, Samuel Just a écrit :
>
>> Ah,
>>
>> ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
>> /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
>> /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
>> ../ceph/src/osdmaptool: osdmap file
>> 'osd-20_osdmap.13258__0_4E62BB79__none'
>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>> ../ceph/src/osdmaptool: osdmap file
>> 'osd-23_osdmap.13258__0_4E62BB79__none'
>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>> 6d5
>> < tunable chooseleaf_vary_r 1
>>
>> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>>
>> Pierre: do you recall how and when that got set?
>
>
> I am not sure to understand, but if I good remember after the update in
> firefly, I was in state : HEALTH_WARN crush map has legacy tunables and I
> see "feature set mismatch" in log.
>
> So if I good remeber, i do : ceph osd crush tunables optimal for the problem
> of "crush map" and I update my client and server kernel to 3.16rc.
>
> It's could be that ?
>
> Pierre
>
>
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just  wrote:
>>>
>>> Yeah, divergent osdmaps:
>>> 555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
>>> 6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none
>>>
>>> Joao: thoughts?
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
>>>  wrote:

 The files

 When I upgrade :
   ceph-deploy install --stable firefly servers...
   on each servers service ceph restart mon
   on each servers service ceph restart osd
   on each servers service ceph restart mds

 I upgraded from emperor to firefly. After repair, remap, replace, etc
 ... I
 have some PG which pass in peering state.

 I thought why not try the version 0.82, it could solve my problem. (
 It's my mistake ). So, I upgrade from firefly to 0.83 with :
   ceph-deploy install --testing servers...
   ..

 Now, all programs are in version 0.82.
 I have 3 mons, 36 OSD and 3 mds.

 Pierre

 PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
 directory.

 Le 03/07/2014 00:10, Samuel Just a écrit :

> Also, what version did you upgrade from, and how did you upgrade?
> -Sam
>
> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just 
> wrote:
>>
>>
>> Ok, in current/meta on osd 20 and osd 23, please attach all files
>> matching
>>
>> ^osdmap.13258.*
>>
>> There should be one such file on each osd. (should look something like
>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>> you'll want to use find).
>>
>> What version of ceph is running on your mons?  How many mons do you
>> have?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>>  wrote:
>>>
>>>
>>> Hi,
>>>
>>> I do it, the log files are available here :
>>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>>
>>> The OSD's files are really big +/- 80M .
>>>
>>> After starting the osd.20 some other osd crash. I pass from 31 osd up
>>> to
>>> 16.
>>> I remark that after this the number of down+peering PG decrease from
>>> 367
>>> to
>>> 248. It's "normal" ? May be it's temporary, the time that the cluster
>>> verifies all the PG ?
>>>
>>> Regards
>>> Pierre
>>>
>>> Le 02/07/2014 19:16, Samuel Just a écrit :
>>>
 You should add

 debug osd = 20
 debug filestore = 20
 debug ms = 1

 to the [osd] section of the ceph.conf and restart the osds.  I'd
 like
 all three logs if possible.

 Thanks
 -Sam

 On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
  wrote:
>
>
>
> Yes, but how i do that ?
>
> With a command like that ?
>
> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
> --debug-ms
> 1'
>
> By modify the /etc/ceph/ceph.conf ? This file is really poor
> because I
> use
> udev detection.
>
> When I have made these changes, you want the three log files or
> only
> osd.20's ?
>
> Thank you so much for the help
>
> Regards
> Pierre
>
> Le 01/07/2014 23:51, Samuel Just a écrit :
>
>> Can you reproduce with
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>> ?
>> -Sam
>>
>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>  wrote:
>>>
>

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Pierre BLONDEAU

Le 03/07/2014 00:55, Samuel Just a écrit :

Ah,

~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
/tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
/tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1

Looks like the chooseleaf_vary_r tunable somehow ended up divergent?

Pierre: do you recall how and when that got set?


I am not sure I understand, but if I remember correctly, after the update to
firefly I was in the state HEALTH_WARN "crush map has legacy tunables" and
I saw "feature set mismatch" in the logs.


So, if I remember correctly, I ran "ceph osd crush tunables optimal" to address
the "crush map" warning, and I updated my client and server kernels to 3.16rc.


Could that be it?

Pierre


-Sam

On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just  wrote:

Yeah, divergent osdmaps:
555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none

Joao: thoughts?
-Sam

On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
 wrote:

The files

When I upgrade :
  ceph-deploy install --stable firefly servers...
  on each servers service ceph restart mon
  on each servers service ceph restart osd
  on each servers service ceph restart mds

I upgraded from emperor to firefly. After repair, remap, replace, etc ... I
have some PG which pass in peering state.

I thought why not try the version 0.82, it could solve my problem. (
It's my mistake ). So, I upgrade from firefly to 0.83 with :
  ceph-deploy install --testing servers...
  ..

Now, all programs are in version 0.82.
I have 3 mons, 36 OSD and 3 mds.

Pierre

PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
directory.

Le 03/07/2014 00:10, Samuel Just a écrit :


Also, what version did you upgrade from, and how did you upgrade?
-Sam

On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just  wrote:


Ok, in current/meta on osd 20 and osd 23, please attach all files
matching

^osdmap.13258.*

There should be one such file on each osd. (should look something like
osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
you'll want to use find).

What version of ceph is running on your mons?  How many mons do you have?
-Sam

On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
 wrote:


Hi,

I do it, the log files are available here :
https://blondeau.users.greyc.fr/cephlog/debug20/

The OSD's files are really big +/- 80M .

After starting the osd.20 some other osd crash. I pass from 31 osd up to
16.
I remark that after this the number of down+peering PG decrease from 367
to
248. It's "normal" ? May be it's temporary, the time that the cluster
verifies all the PG ?

Regards
Pierre

Le 02/07/2014 19:16, Samuel Just a écrit :


You should add

debug osd = 20
debug filestore = 20
debug ms = 1

to the [osd] section of the ceph.conf and restart the osds.  I'd like
all three logs if possible.

Thanks
-Sam

On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
 wrote:



Yes, but how i do that ?

With a command like that ?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
--debug-ms
1'

By modify the /etc/ceph/ceph.conf ? This file is really poor because I
use
udev detection.

When I have made these changes, you want the three log files or only
osd.20's ?

Thank you so much for the help

Regards
Pierre

Le 01/07/2014 23:51, Samuel Just a écrit :


Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam

On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
 wrote:




Hi,

I join :
 - osd.20 is one of osd that I detect which makes crash other
OSD.
 - osd.23 is one of osd which crash when i start osd.20
 - mds, is one of my MDS

I cut log file because they are to big but. All is here :
https://blondeau.users.greyc.fr/cephlog/

Regards

Le 30/06/2014 17:35, Gregory Farnum a écrit :


What's the backtrace from the crashing OSDs?

Keep in mind that as a dev release, it's generally best not to
upgrade
to unnamed versions like 0.82 (but it's probably too late to go
back
now).




I will remember it the next time ;)


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
 wrote:



Hi,

After the upgrade to firefly, I have some PG in peering state.
I seen the output of 0.82 so I try to upgrade for solved my
problem.

My three MDS crash and some OSD triggers a chain reaction that
kills
other
OSD.
I think my MDS will not start because of the metadata are on the
OSD.

I have 36 OSD on three servers and I identified 5 OSD which makes
crash
others. If i not start their, the cluster passe in reconstructive
state
with
31 OSD but i have 378 in down+peering state.

How can I do ? Would you more in

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Ah,

~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush
/tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
/tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1

Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
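For reference, the tunables the cluster currently advertises can be checked with
something like the following (a sketch, run from a node with an admin keyring):

ceph osd crush show-tunables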

Pierre: do you recall how and when that got set?
-Sam

On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just  wrote:
> Yeah, divergent osdmaps:
> 555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
> 6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none
>
> Joao: thoughts?
> -Sam
>
> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
>  wrote:
>> The files
>>
>> When I upgrade :
>>  ceph-deploy install --stable firefly servers...
>>  on each servers service ceph restart mon
>>  on each servers service ceph restart osd
>>  on each servers service ceph restart mds
>>
>> I upgraded from emperor to firefly. After repair, remap, replace, etc ... I
>> have some PG which pass in peering state.
>>
>> I thought why not try the version 0.82, it could solve my problem. (
>> It's my mistake ). So, I upgrade from firefly to 0.83 with :
>>  ceph-deploy install --testing servers...
>>  ..
>>
>> Now, all programs are in version 0.82.
>> I have 3 mons, 36 OSD and 3 mds.
>>
>> Pierre
>>
>> PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
>> directory.
>>
>> Le 03/07/2014 00:10, Samuel Just a écrit :
>>
>>> Also, what version did you upgrade from, and how did you upgrade?
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just  wrote:

 Ok, in current/meta on osd 20 and osd 23, please attach all files
 matching

 ^osdmap.13258.*

 There should be one such file on each osd. (should look something like
 osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
 you'll want to use find).

 What version of ceph is running on your mons?  How many mons do you have?
 -Sam

 On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
  wrote:
>
> Hi,
>
> I do it, the log files are available here :
> https://blondeau.users.greyc.fr/cephlog/debug20/
>
> The OSD's files are really big +/- 80M .
>
> After starting the osd.20 some other osd crash. I pass from 31 osd up to
> 16.
> I remark that after this the number of down+peering PG decrease from 367
> to
> 248. It's "normal" ? May be it's temporary, the time that the cluster
> verifies all the PG ?
>
> Regards
> Pierre
>
> Le 02/07/2014 19:16, Samuel Just a écrit :
>
>> You should add
>>
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>>
>> to the [osd] section of the ceph.conf and restart the osds.  I'd like
>> all three logs if possible.
>>
>> Thanks
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>>
>>> Yes, but how i do that ?
>>>
>>> With a command like that ?
>>>
>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
>>> --debug-ms
>>> 1'
>>>
>>> By modify the /etc/ceph/ceph.conf ? This file is really poor because I
>>> use
>>> udev detection.
>>>
>>> When I have made these changes, you want the three log files or only
>>> osd.20's ?
>>>
>>> Thank you so much for the help
>>>
>>> Regards
>>> Pierre
>>>
>>> Le 01/07/2014 23:51, Samuel Just a écrit :
>>>
 Can you reproduce with
 debug osd = 20
 debug filestore = 20
 debug ms = 1
 ?
 -Sam

 On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
  wrote:
>
>
>
> Hi,
>
> I join :
> - osd.20 is one of osd that I detect which makes crash other
> OSD.
> - osd.23 is one of osd which crash when i start osd.20
> - mds, is one of my MDS
>
> I cut log file because they are to big but. All is here :
> https://blondeau.users.greyc.fr/cephlog/
>
> Regards
>
> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>
>> What's the backtrace from the crashing OSDs?
>>
>> Keep in mind that as a dev release, it's generally best not to
>> upgrade
>> to unnamed versions like 0.82 (but it's probably too late to go
>> back
>> now).
>
>
>
> I will remember it the next time ;)
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> On Mon, Jun 30, 2014 at 8:06 A

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Yeah, divergent osdmaps:
555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none

Joao: thoughts?
-Sam

On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
 wrote:
> The files
>
> When I upgrade :
>  ceph-deploy install --stable firefly servers...
>  on each servers service ceph restart mon
>  on each servers service ceph restart osd
>  on each servers service ceph restart mds
>
> I upgraded from emperor to firefly. After repair, remap, replace, etc ... I
> have some PG which pass in peering state.
>
> I thought why not try the version 0.82, it could solve my problem. (
> It's my mistake ). So, I upgrade from firefly to 0.83 with :
>  ceph-deploy install --testing servers...
>  ..
>
> Now, all programs are in version 0.82.
> I have 3 mons, 36 OSD and 3 mds.
>
> Pierre
>
> PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta
> directory.
>
> Le 03/07/2014 00:10, Samuel Just a écrit :
>
>> Also, what version did you upgrade from, and how did you upgrade?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just  wrote:
>>>
>>> Ok, in current/meta on osd 20 and osd 23, please attach all files
>>> matching
>>>
>>> ^osdmap.13258.*
>>>
>>> There should be one such file on each osd. (should look something like
>>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>>> you'll want to use find).
>>>
>>> What version of ceph is running on your mons?  How many mons do you have?
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>>>  wrote:

 Hi,

 I do it, the log files are available here :
 https://blondeau.users.greyc.fr/cephlog/debug20/

 The OSD's files are really big +/- 80M .

 After starting the osd.20 some other osd crash. I pass from 31 osd up to
 16.
 I remark that after this the number of down+peering PG decrease from 367
 to
 248. It's "normal" ? May be it's temporary, the time that the cluster
 verifies all the PG ?

 Regards
 Pierre

 Le 02/07/2014 19:16, Samuel Just a écrit :

> You should add
>
> debug osd = 20
> debug filestore = 20
> debug ms = 1
>
> to the [osd] section of the ceph.conf and restart the osds.  I'd like
> all three logs if possible.
>
> Thanks
> -Sam
>
> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>  wrote:
>>
>>
>> Yes, but how i do that ?
>>
>> With a command like that ?
>>
>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
>> --debug-ms
>> 1'
>>
>> By modify the /etc/ceph/ceph.conf ? This file is really poor because I
>> use
>> udev detection.
>>
>> When I have made these changes, you want the three log files or only
>> osd.20's ?
>>
>> Thank you so much for the help
>>
>> Regards
>> Pierre
>>
>> Le 01/07/2014 23:51, Samuel Just a écrit :
>>
>>> Can you reproduce with
>>> debug osd = 20
>>> debug filestore = 20
>>> debug ms = 1
>>> ?
>>> -Sam
>>>
>>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>>  wrote:



 Hi,

 I join :
 - osd.20 is one of osd that I detect which makes crash other
 OSD.
 - osd.23 is one of osd which crash when i start osd.20
 - mds, is one of my MDS

 I cut log file because they are to big but. All is here :
 https://blondeau.users.greyc.fr/cephlog/

 Regards

 Le 30/06/2014 17:35, Gregory Farnum a écrit :

> What's the backtrace from the crashing OSDs?
>
> Keep in mind that as a dev release, it's generally best not to
> upgrade
> to unnamed versions like 0.82 (but it's probably too late to go
> back
> now).



 I will remember it the next time ;)

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>  wrote:
>>
>>
>> Hi,
>>
>> After the upgrade to firefly, I have some PG in peering state.
>> I seen the output of 0.82 so I try to upgrade for solved my
>> problem.
>>
>> My three MDS crash and some OSD triggers a chain reaction that
>> kills
>> other
>> OSD.
>> I think my MDS will not start because of the metadata are on the
>> OSD.
>>
>> I have 36 OSD on three servers and I identified 5 OSD which makes
>> crash
>> others. If i not start their, the cluster passe in reconstructive
>> state
>> with
>> 31 OSD but i have 378 in down+peering state.
>>
>> How can I do ? Would you more information ( os, crash log, etc .

Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Joao: this looks like divergent osdmaps, osd 20 and osd 23 have
differing ideas of the acting set for pg 2.11.  Did we add hashes to
the incremental maps?  What would you want to know from the mons?
-Sam

On Wed, Jul 2, 2014 at 3:10 PM, Samuel Just  wrote:
> Also, what version did you upgrade from, and how did you upgrade?
> -Sam
>
> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just  wrote:
>> Ok, in current/meta on osd 20 and osd 23, please attach all files matching
>>
>> ^osdmap.13258.*
>>
>> There should be one such file on each osd. (should look something like
>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>> you'll want to use find).
>>
>> What version of ceph is running on your mons?  How many mons do you have?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>>  wrote:
>>> Hi,
>>>
>>> I do it, the log files are available here :
>>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>>
>>> The OSD's files are really big +/- 80M .
>>>
>>> After starting the osd.20 some other osd crash. I pass from 31 osd up to 16.
>>> I remark that after this the number of down+peering PG decrease from 367 to
>>> 248. It's "normal" ? May be it's temporary, the time that the cluster
>>> verifies all the PG ?
>>>
>>> Regards
>>> Pierre
>>>
>>> Le 02/07/2014 19:16, Samuel Just a écrit :
>>>
 You should add

 debug osd = 20
 debug filestore = 20
 debug ms = 1

 to the [osd] section of the ceph.conf and restart the osds.  I'd like
 all three logs if possible.

 Thanks
 -Sam

 On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
  wrote:
>
> Yes, but how i do that ?
>
> With a command like that ?
>
> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
> --debug-ms
> 1'
>
> By modify the /etc/ceph/ceph.conf ? This file is really poor because I
> use
> udev detection.
>
> When I have made these changes, you want the three log files or only
> osd.20's ?
>
> Thank you so much for the help
>
> Regards
> Pierre
>
> Le 01/07/2014 23:51, Samuel Just a écrit :
>
>> Can you reproduce with
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>> ?
>> -Sam
>>
>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>>
>>> Hi,
>>>
>>> I join :
>>>- osd.20 is one of osd that I detect which makes crash other OSD.
>>>- osd.23 is one of osd which crash when i start osd.20
>>>- mds, is one of my MDS
>>>
>>> I cut log file because they are to big but. All is here :
>>> https://blondeau.users.greyc.fr/cephlog/
>>>
>>> Regards
>>>
>>> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>>>
 What's the backtrace from the crashing OSDs?

 Keep in mind that as a dev release, it's generally best not to upgrade
 to unnamed versions like 0.82 (but it's probably too late to go back
 now).
>>>
>>>
>>> I will remember it the next time ;)
>>>
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com

 On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
  wrote:
>
> Hi,
>
> After the upgrade to firefly, I have some PG in peering state.
> I seen the output of 0.82 so I try to upgrade for solved my problem.
>
> My three MDS crash and some OSD triggers a chain reaction that kills
> other
> OSD.
> I think my MDS will not start because of the metadata are on the OSD.
>
> I have 36 OSD on three servers and I identified 5 OSD which makes
> crash
> others. If i not start their, the cluster passe in reconstructive
> state
> with
> 31 OSD but i have 378 in down+peering state.
>
> How can I do ? Would you more information ( os, crash log, etc ... )
> ?
>
> Regards
>>>
>>>
>>> --
>>> --
>>> Pierre BLONDEAU
>>> Administrateur Systèmes & réseaux
>>> Université de Caen
>>> Laboratoire GREYC, Département d'informatique
>>>
>>> tel : 02 31 56 75 42
>>> bureau  : Campus 2, Science 3, 406
>>> --
>>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Also, what version did you upgrade from, and how did you upgrade?
-Sam

On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just  wrote:
> Ok, in current/meta on osd 20 and osd 23, please attach all files matching
>
> ^osdmap.13258.*
>
> There should be one such file on each osd. (should look something like
> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
> you'll want to use find).
>
> What version of ceph is running on your mons?  How many mons do you have?
> -Sam
>
> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>  wrote:
>> Hi,
>>
>> I do it, the log files are available here :
>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>
>> The OSD's files are really big +/- 80M .
>>
>> After starting the osd.20 some other osd crash. I pass from 31 osd up to 16.
>> I remark that after this the number of down+peering PG decrease from 367 to
>> 248. It's "normal" ? May be it's temporary, the time that the cluster
>> verifies all the PG ?
>>
>> Regards
>> Pierre
>>
>> Le 02/07/2014 19:16, Samuel Just a écrit :
>>
>>> You should add
>>>
>>> debug osd = 20
>>> debug filestore = 20
>>> debug ms = 1
>>>
>>> to the [osd] section of the ceph.conf and restart the osds.  I'd like
>>> all three logs if possible.
>>>
>>> Thanks
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>>  wrote:

 Yes, but how i do that ?

 With a command like that ?

 ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
 --debug-ms
 1'

 By modify the /etc/ceph/ceph.conf ? This file is really poor because I
 use
 udev detection.

 When I have made these changes, you want the three log files or only
 osd.20's ?

 Thank you so much for the help

 Regards
 Pierre

 Le 01/07/2014 23:51, Samuel Just a écrit :

> Can you reproduce with
> debug osd = 20
> debug filestore = 20
> debug ms = 1
> ?
> -Sam
>
> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>  wrote:
>>
>>
>> Hi,
>>
>> I join :
>>- osd.20 is one of osd that I detect which makes crash other OSD.
>>- osd.23 is one of osd which crash when i start osd.20
>>- mds, is one of my MDS
>>
>> I cut log file because they are to big but. All is here :
>> https://blondeau.users.greyc.fr/cephlog/
>>
>> Regards
>>
>> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>>
>>> What's the backtrace from the crashing OSDs?
>>>
>>> Keep in mind that as a dev release, it's generally best not to upgrade
>>> to unnamed versions like 0.82 (but it's probably too late to go back
>>> now).
>>
>>
>> I will remember it the next time ;)
>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>>  wrote:

 Hi,

 After the upgrade to firefly, I have some PG in peering state.
 I seen the output of 0.82 so I try to upgrade for solved my problem.

 My three MDS crash and some OSD triggers a chain reaction that kills
 other
 OSD.
 I think my MDS will not start because of the metadata are on the OSD.

 I have 36 OSD on three servers and I identified 5 OSD which makes
 crash
 others. If i not start their, the cluster passe in reconstructive
 state
 with
 31 OSD but i have 378 in down+peering state.

 How can I do ? Would you more information ( os, crash log, etc ... )
 ?

 Regards
>>
>>
>> --
>> --
>> Pierre BLONDEAU
>> Administrateur Systèmes & réseaux
>> Université de Caen
>> Laboratoire GREYC, Département d'informatique
>>
>> tel : 02 31 56 75 42
>> bureau  : Campus 2, Science 3, 406
>> --
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
Ok, in current/meta on osd 20 and osd 23, please attach all files matching

^osdmap.13258.*

There should be one such file on each osd. (should look something like
osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
you'll want to use find).
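For example, something like this should locate them (a sketch; the path assumes
the default /var/lib/ceph layout and uses osd.20 as the illustration):

find /var/lib/ceph/osd/ceph-20/current/meta -name 'osdmap.13258*'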

What version of ceph is running on your mons?  How many mons do you have?
-Sam

On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
 wrote:
> Hi,
>
> I do it, the log files are available here :
> https://blondeau.users.greyc.fr/cephlog/debug20/
>
> The OSD's files are really big +/- 80M .
>
> After starting the osd.20 some other osd crash. I pass from 31 osd up to 16.
> I remark that after this the number of down+peering PG decrease from 367 to
> 248. It's "normal" ? May be it's temporary, the time that the cluster
> verifies all the PG ?
>
> Regards
> Pierre
>
> Le 02/07/2014 19:16, Samuel Just a écrit :
>
>> You should add
>>
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>>
>> to the [osd] section of the ceph.conf and restart the osds.  I'd like
>> all three logs if possible.
>>
>> Thanks
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>> Yes, but how i do that ?
>>>
>>> With a command like that ?
>>>
>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
>>> --debug-ms
>>> 1'
>>>
>>> By modify the /etc/ceph/ceph.conf ? This file is really poor because I
>>> use
>>> udev detection.
>>>
>>> When I have made these changes, you want the three log files or only
>>> osd.20's ?
>>>
>>> Thank you so much for the help
>>>
>>> Regards
>>> Pierre
>>>
>>> Le 01/07/2014 23:51, Samuel Just a écrit :
>>>
 Can you reproduce with
 debug osd = 20
 debug filestore = 20
 debug ms = 1
 ?
 -Sam

 On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
  wrote:
>
>
> Hi,
>
> I join :
>- osd.20 is one of osd that I detect which makes crash other OSD.
>- osd.23 is one of osd which crash when i start osd.20
>- mds, is one of my MDS
>
> I cut log file because they are to big but. All is here :
> https://blondeau.users.greyc.fr/cephlog/
>
> Regards
>
> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>
>> What's the backtrace from the crashing OSDs?
>>
>> Keep in mind that as a dev release, it's generally best not to upgrade
>> to unnamed versions like 0.82 (but it's probably too late to go back
>> now).
>
>
> I will remember it the next time ;)
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>> Hi,
>>>
>>> After the upgrade to firefly, I have some PG in peering state.
>>> I seen the output of 0.82 so I try to upgrade for solved my problem.
>>>
>>> My three MDS crash and some OSD triggers a chain reaction that kills
>>> other
>>> OSD.
>>> I think my MDS will not start because of the metadata are on the OSD.
>>>
>>> I have 36 OSD on three servers and I identified 5 OSD which makes
>>> crash
>>> others. If i not start their, the cluster passe in reconstructive
>>> state
>>> with
>>> 31 OSD but i have 378 in down+peering state.
>>>
>>> How can I do ? Would you more information ( os, crash log, etc ... )
>>> ?
>>>
>>> Regards
>
>
> --
> --
> Pierre BLONDEAU
> Administrateur Systèmes & réseaux
> Université de Caen
> Laboratoire GREYC, Département d'informatique
>
> tel : 02 31 56 75 42
> bureau  : Campus 2, Science 3, 406
> --
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bypass Cache-Tiering for special reads (Backups)

2014-07-02 Thread Marc
Hi,

I was wondering, having a cache pool in front of an RBD pool is all fine
and dandy, but imagine you want to pull backups of all your VMs (or one
of them, or multiple...). Going to the cache for all those reads isn't
only pointless, it'll also potentially fill up the cache and possibly
evict actually frequently used data. Which got me thinking... wouldn't
it be nifty if there was a special way of doing specific backup reads
where you'd bypass the cache, ensuring the dirty cache contents get
written to cold pool first? Or at least doing special reads where a
cache-miss won't actually cache the requested data?

AFAIK the backup routine for an RBD-backed KVM usually involves creating
a snapshot of the RBD and putting that into a backup storage/tape, all
done via librbd/API.

Maybe something like that even already exists?
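As a concrete illustration of the routine I described above (only a sketch; the
pool, image and snapshot names are placeholders, and it uses the rbd CLI rather
than librbd directly):

rbd snap create rbd/vm-disk-1@backup-20140702
rbd export rbd/vm-disk-1@backup-20140702 /backup/vm-disk-1-20140702.img
rbd snap rm rbd/vm-disk-1@backup-20140702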


KR,
Marc
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Pierre BLONDEAU

Hi,

I did it; the log files are available here:
https://blondeau.users.greyc.fr/cephlog/debug20/

The OSDs' log files are really big, around 80 MB each.

After starting osd.20, some other OSDs crash. I went from 31 OSDs down to 16.
I noticed that after this the number of down+peering PGs decreased from 367
to 248. Is that "normal"? Maybe it's temporary, just the time the cluster
needs to verify all the PGs?


Regards
Pierre

Le 02/07/2014 19:16, Samuel Just a écrit :

You should add

debug osd = 20
debug filestore = 20
debug ms = 1

to the [osd] section of the ceph.conf and restart the osds.  I'd like
all three logs if possible.

Thanks
-Sam

On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
 wrote:

Yes, but how i do that ?

With a command like that ?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms
1'

By modify the /etc/ceph/ceph.conf ? This file is really poor because I use
udev detection.

When I have made these changes, you want the three log files or only
osd.20's ?

Thank you so much for the help

Regards
Pierre

Le 01/07/2014 23:51, Samuel Just a écrit :


Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam

On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
 wrote:


Hi,

I join :
   - osd.20 is one of osd that I detect which makes crash other OSD.
   - osd.23 is one of osd which crash when i start osd.20
   - mds, is one of my MDS

I cut log file because they are to big but. All is here :
https://blondeau.users.greyc.fr/cephlog/

Regards

Le 30/06/2014 17:35, Gregory Farnum a écrit :


What's the backtrace from the crashing OSDs?

Keep in mind that as a dev release, it's generally best not to upgrade
to unnamed versions like 0.82 (but it's probably too late to go back
now).


I will remember it the next time ;)


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
 wrote:

Hi,

After the upgrade to firefly, I have some PG in peering state.
I seen the output of 0.82 so I try to upgrade for solved my problem.

My three MDS crash and some OSD triggers a chain reaction that kills
other
OSD.
I think my MDS will not start because of the metadata are on the OSD.

I have 36 OSD on three servers and I identified 5 OSD which makes crash
others. If i not start their, the cluster passe in reconstructive state
with
31 OSD but i have 378 in down+peering state.

How can I do ? Would you more information ( os, crash log, etc ... ) ?

Regards


--
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--



smime.p7s
Description: Signature cryptographique S/MIME
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5

2014-07-02 Thread Brian Lovett

Alright, I was finally able to get this resolved without adding another node.
As pointed out, even though I had a config variable that defined the default
replicated size as 2, ceph for some reason created the default pools (data
and metadata) with a value of 3. After digging through the documentation I found:

ceph osd dump | grep 'replicated size'

Which shows the replicated size for each pool. My newly created pools ssd and 
sata were correctly configured, but the default pools in ceph were not.

I was then able to set: ceph osd pool set metadata size 2

and

ceph osd pool set data size 2

Finally, my cluster is healthy!
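For the record, the config variable I mean is the global pool-size default; a
sketch of the ceph.conf fragment (assuming it sits in [global], with min size
optional):

[global]
    osd pool default size = 2
    osd pool default min size = 1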

Not exactly a straightforward installation and troubleshooting experience, but
it works. Thanks for the help and tips along the way; the advice definitely led
me in the right direction.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
On Wed, Jul 2, 2014 at 12:44 PM, Stefan Priebe  wrote:
> Hi Greg,
>
> Am 02.07.2014 21:36, schrieb Gregory Farnum:
>>
>> On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe 
>> wrote:
>>>
>>>
>>> Am 02.07.2014 16:00, schrieb Gregory Farnum:
>>>
 Yeah, it's fighting for attention with a lot of other urgent stuff. :(

 Anyway, even if you can't look up any details or reproduce at this
 time, I'm sure you know what shape the cluster was (number of OSDs,
 running on SSDs or hard drives, etc), and that would be useful
 guidance. :)
>>>
>>>
>>>
>>> Sure
>>>
>>> Number of OSDs: 24
>>> Each OSD has an SSD capable of doing tested with fio before installing
>>> ceph
>>> (70.000 iop/s 4k write, 580MB/s seq. write 1MB blocks)
>>>
>>> Single Xeon E5-1620 v2 @ 3.70GHz
>>>
>>> 48GB RAM
>>
>>
>> Awesome, thanks.
>>
>> I went through the changelogs on the librados/, osdc/, and msg/
>> directories to see if I could find any likely change candidates
>> between Dumpling and Firefly and couldn't see any issues. :( But I
>> suspect that the sharding changes coming will more than make up the
>> difference, so you might want to plan on checking that out when it
>> arrives, even if you don't want to deploy it to production.
>
>
> To which changes do you refer? Will they be part of firefly, or backported
> to it?

Yehuda's got a pretty big patchset that is sharding up the "big
Objecter lock" into many smaller mutexes and RWLocks that will make it
much more parallel. He's on vacation just now but I understand it's
almost ready to merge; I don't think it'll be suitable for backport to
firefly, though (it's big).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Stefan Priebe

Hi Greg,

Am 02.07.2014 21:36, schrieb Gregory Farnum:

On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe  wrote:


Am 02.07.2014 16:00, schrieb Gregory Farnum:


Yeah, it's fighting for attention with a lot of other urgent stuff. :(

Anyway, even if you can't look up any details or reproduce at this
time, I'm sure you know what shape the cluster was (number of OSDs,
running on SSDs or hard drives, etc), and that would be useful
guidance. :)



Sure

Number of OSDs: 24
Each OSD has an SSD capable of doing tested with fio before installing ceph
(70.000 iop/s 4k write, 580MB/s seq. write 1MB blocks)

Single Xeon E5-1620 v2 @ 3.70GHz

48GB RAM


Awesome, thanks.

I went through the changelogs on the librados/, osdc/, and msg/
directories to see if I could find any likely change candidates
between Dumpling and Firefly and couldn't see any issues. :( But I
suspect that the sharding changes coming will more than make up the
difference, so you might want to plan on checking that out when it
arrives, even if you don't want to deploy it to production.


To which changes do you refer? Will they be part of firefly, or backported
to it?



-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe  wrote:
>
> Am 02.07.2014 16:00, schrieb Gregory Farnum:
>
>> Yeah, it's fighting for attention with a lot of other urgent stuff. :(
>>
>> Anyway, even if you can't look up any details or reproduce at this
>> time, I'm sure you know what shape the cluster was (number of OSDs,
>> running on SSDs or hard drives, etc), and that would be useful
>> guidance. :)
>
>
> Sure
>
> Number of OSDs: 24
> Each OSD has an SSD capable of doing tested with fio before installing ceph
> (70.000 iop/s 4k write, 580MB/s seq. write 1MB blocks)
>
> Single Xeon E5-1620 v2 @ 3.70GHz
>
> 48GB RAM

Awesome, thanks.

I went through the changelogs on the librados/, osdc/, and msg/
directories to see if I could find any likely change candidates
between Dumpling and Firefly and couldn't see any issues. :( But I
suspect that the sharding changes coming will more than make up the
difference, so you might want to plan on checking that out when it
arrives, even if you don't want to deploy it to production.
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Stefan Priebe


Am 02.07.2014 16:00, schrieb Gregory Farnum:

Yeah, it's fighting for attention with a lot of other urgent stuff. :(

Anyway, even if you can't look up any details or reproduce at this
time, I'm sure you know what shape the cluster was (number of OSDs,
running on SSDs or hard drives, etc), and that would be useful
guidance. :)


Sure

Number of OSDs: 24
Each OSD has an SSD, tested with fio before installing ceph
(70,000 IOPS 4k write, 580MB/s seq. write with 1MB blocks)


Single Xeon E5-1620 v2 @ 3.70GHz

48GB RAM

Stefan


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wed, Jul 2, 2014 at 6:12 AM, Stefan Priebe - Profihost AG
 wrote:


Am 02.07.2014 15:07, schrieb Haomai Wang:

Could you give some perf counter from rbd client side? Such as op latency?


Sorry, I don't have any counters. As this mail went unseen for some days, I
thought nobody had an idea or could help.

Stefan


On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG
 wrote:

Am 02.07.2014 00:51, schrieb Gregory Farnum:

On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG
 wrote:

Hi Greg,

Am 26.06.2014 02:17, schrieb Gregory Farnum:

Sorry we let this drop; we've all been busy traveling and things.

There have been a lot of changes to librados between Dumpling and
Firefly, but we have no idea what would have made it slower. Can you
provide more details about how you were running these tests?


it's just a normal fio run:
fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
--readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
--runtime=90 --numjobs=32 --direct=1 --group

Running one time with firefly libs and one time with dumpling libs.
Target is always the same pool on a firefly ceph cluster.


What's the backing cluster you're running against? What kind of CPU
usage do you see with both? 25k IOPS is definitely getting up there,
but I'd like some guidance about whether we're looking for a reduction
in parallelism, or an increase in per-op costs, or something else.


Hi Greg,

i don't have that test cluster anymore. It had to go into production
with dumpling.

So i can't tell you.

Sorry.

Stefan


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Samuel Just
You should add

debug osd = 20
debug filestore = 20
debug ms = 1

to the [osd] section of the ceph.conf and restart the osds.  I'd like
all three logs if possible.
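
For example, the ceph.conf fragment would look something like this (the
restart command is a sketch assuming sysvinit packages; adjust to your init
system):

[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1

# then on each OSD host:
service ceph restart osd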

Thanks
-Sam

On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
 wrote:
> Yes, but how do I do that ?
>
> With a command like that ?
>
> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms
> 1'
>
> By modifying the /etc/ceph/ceph.conf ? This file is really sparse because I
> use udev detection.
>
> Once I have made these changes, do you want the three log files or only
> osd.20's ?
>
> Thank you so much for the help
>
> Regards
> Pierre
>
> Le 01/07/2014 23:51, Samuel Just a écrit :
>
>> Can you reproduce with
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>> ?
>> -Sam
>>
>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>  wrote:
>>>
>>> Hi,
>>>
>>> I join :
>>>   - osd.20 is one of osd that I detect which makes crash other OSD.
>>>   - osd.23 is one of osd which crash when i start osd.20
>>>   - mds, is one of my MDS
>>>
>>> I cut the log files because they are too big. Everything is here :
>>> https://blondeau.users.greyc.fr/cephlog/
>>>
>>> Regards
>>>
>>> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>>>
 What's the backtrace from the crashing OSDs?

 Keep in mind that as a dev release, it's generally best not to upgrade
 to unnamed versions like 0.82 (but it's probably too late to go back
 now).
>>>
>>>
>>>
>>> I will remember it the next time ;)
>>>
>>>
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com


 On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
  wrote:
>
>
> Hi,
>
> After the upgrade to firefly, I have some PG in peering state.
> I saw that 0.82 was out, so I tried to upgrade to solve my problem.
>
> My three MDS crash, and some OSDs trigger a chain reaction that kills other
> OSDs.
> I think my MDS will not start because the metadata are on the OSDs.
>
> I have 36 OSDs on three servers and I identified 5 OSDs which make the
> others crash. If I do not start them, the cluster goes into a recovering
> state with 31 OSDs, but I have 378 PGs in down+peering state.
>
> What can I do ? Would you like more information ( OS, crash logs, etc ... ) ?
>
> Regards
>
> --
> --
> Pierre BLONDEAU
> Administrateur Systèmes & réseaux
> Université de Caen
> Laboratoire GREYC, Département d'informatique
>
> tel : 02 31 56 75 42
> bureau  : Campus 2, Science 3, 406
> --
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>>>
>>>
>>> --
>>> --
>>> Pierre BLONDEAU
>>> Administrateur Systèmes & réseaux
>>> Université de Caen
>>> Laboratoire GREYC, Département d'informatique
>>>
>>> tel : 02 31 56 75 42
>>> bureau  : Campus 2, Science 3, 406
>>> --
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>
>
> --
> --
> Pierre BLONDEAU
> Administrateur Systèmes & réseaux
> Université de Caen
> Laboratoire GREYC, Département d'informatique
>
> tel : 02 31 56 75 42
> bureau  : Campus 2, Science 3, 406
> --
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [ANN] ceph-deploy 1.5.7 released

2014-07-02 Thread Alfredo Deza
Hi All,

There is a new bug-fix release of ceph-deploy, the easy deployment tool
for Ceph.

The full list of fixes for this release can be found in the changelog:

http://ceph.com/ceph-deploy/docs/changelog.html#id1


Make sure you update!
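
Depending on how you installed ceph-deploy, upgrading is usually one of the
following (examples only; pick the one that matches your install method):

pip install --upgrade ceph-deploy                          # pip-based installs
sudo apt-get update && sudo apt-get install ceph-deploy    # ceph.com deb repo
sudo yum update ceph-deploy                                # ceph.com rpm repo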


-Alfredo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD layering

2014-07-02 Thread NEVEU Stephane
Ok thanks :
mon 'allow r' osd 'allow  class-read object_prefix rbd_children, allow rwx 
pool=kvm1, allow rx  pool=templates'
seems to be enough.

One more question about RBD layering :
I've made a clone (child) in my pool 'kvm' from my protected snapshot in my 
pool 'template' and after launching my vm, the whole fs is read-only.
Am I wrong in thinking that the protected snapshot acts like the base image and
additional data will be stored in the clone ?
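
For reference, the parent/child relationship can be inspected with rbd info
and rbd children (a sketch using the image names above):

rbd info kvm1/Ubuntu1404-snap-protected-children
# the child should report its parent, e.g. templates/Ubuntu1404@Ubuntu1404-snap-protected
rbd children templates/Ubuntu1404@Ubuntu1404-snap-protected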

>Objet : Re: [ceph-users] RBD layering

On 07/02/2014 10:08 AM, NEVEU Stephane wrote:
>> Hi all,
>>
>> I'm messing around with "rbd layering" to store some ready-to-use
>> templates (format 2) in a template pool :
>>
>> /Rbd -p templates ls/
>>
>> /Ubuntu1404/
>>
>> /Centos6/
>>
>> /./
>>
>> //
>>
>> /Rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected/
>>
>> /Rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected/
>>
>> /Rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected
>> kvm1/Ubuntu1404-snap-protected-children/
>>
>> My libvirt key is created with :
>>
>> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow 
>>class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r  
>>pool=templates'/
>>
>> //
>>
>> But read permission for the pool 'templates' seems to be not enough, 
>> libvirt is complaining "RBD cannot access the rbd disk 
>> kvm1/Ubuntu1404-snap-protected-children" so :
>>
>> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow 
>> class-read object_prefix rbd_children, allow rwx pool=kvm1, allow
>> *rwx* pool=templates'/
>>

>I think that rx should be enough instead of rwx. Could you try that?

>Wido

Hi Wido, thank you:
I'm trying this :
Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow  class-read 
object_prefix rbd_children, allow rwx pool=kvm1, allow rx  pool=templates'
Error EINVAL: key for client.kvm1 exists but cap osd does not match

Is there another way to directly modify the caps ? Or do I need to delete the
key and re-create it ?
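
From what I can see, 'ceph auth caps' should let me update the caps in place
instead of deleting and re-creating the key; a sketch reusing the caps above
(untested on my side):

ceph auth caps client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates'
ceph auth list    # check that the new caps took effect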

> //
>
> It's actually working but it's probably a bit too much, because I 
> don't want people to be able to modify the parent template so do I 
> have a better choice ?
>
> Libvirt seems to be happier but this clone is read-only and I want now 
> people to use this OS image as a base file and write differences in a 
> backing file (like with qemu . -b .).
>
> How can I do such a thing ? or maybe I'm doing it in a wrong way. any help ?

Am I clear enough here ?

>
> Thanks
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5

2014-07-02 Thread Brian Lovett
Christian Balzer  writes:

> Read EVERYTHING you can find about crushmap rules.
> 
> The quickstart (I think) talks about 3 storage nodes, not OSDs.
> 
> Ceph is quite good when it comes to defining failure domains, the default
> is to segregate at the storage node level.
> What good is a replication of 3 when all 3 OSDs are on the same host?


Agreed, which is why I had defined the default as 2 replicas. I had hoped that
this would work, but I will be adding a third host today or tomorrow;
hopefully that takes care of the issue. I'll try another fresh install and see
if I can get things going.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5

2014-07-02 Thread Christian Balzer
On Wed, 2 Jul 2014 14:25:49 + (UTC) Brian Lovett wrote:

> Christian Balzer  writes:
> 
> 
> > So either make sure these pools really have a replication of 2 by
> > deleting and re-creating them or add a third storage node.
> 
> 
> 
> I just executed "ceph osd pool set {POOL} size 2" for both pools.
> Anything else I need to do? I still don't see any changes to the status
> of the cluster. We're adding a 3rd storage node, but why is it that
> this is an issue? I don't see anything anywhere that says you have to
> have a minimum number of OSDs for ceph to function. Even the quick
> start only has 3, so I assumed 8 would be fine as well.
> 

Read EVERYTHING you can find about crushmap rules.

The quickstart (I think) talks about 3 storage nodes, not OSDs.

Ceph is quite good when it comes to defining failure domains, the default
is to segregate at the storage node level.
What good is a replication of 3 when all 3 OSDs are on the same host?
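
One way to see which failure domain your rules actually enforce is to
decompile the crushmap and look at the chooseleaf step (a sketch; the file
paths are placeholders):

ceph osd getcrushmap -o /tmp/crushmap
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
grep chooseleaf /tmp/crushmap.txt
# "step chooseleaf firstn 0 type host" means replicas must land on different hosts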

Christian
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)

2014-07-02 Thread Sylvain Munaut
Hi,


> I can't help you with packaging issues, but i can tell you that the
> rbdmap executable got moved to a different package at some point, but
> I believe the official ones handle it properly.

I'll see tonight when doing the other nodes. Maybe it's a result of
using dist-upgrade rather than just "upgrade" + "install ceph".


> And I'm just guessing here (like I said, can't help with packaging),
> but I think the deleted /etc/ceph is a result of the force-overwrite
> option you used.

Nope, that happened before I used that option :p


>> Now it might be "normal", but being the production cluster, I can't
>> risk and upgrading more than half the mons if I'm not sure this is
>> indeed normal and not a symptom that the install/update failed and
>> that the mon is not actually working.
>
> That's not normal. A first guess is that you didn't give the new
> monitor the same keyring as the old ones, but I couldn't say for sure
> without more info. Turn up logging and post it somewhere?

jao on IRC just debugged this and it turns out that you have to
upgrade the leader monitor first because of a message (MForward)
incompatibility between the two versions (due to
b4fbe4f81348be74c654f3dae1c20a961b99c895 I think).
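
In case someone else hits this: to check which monitor is currently the
leader before upgrading, something like this should work (the exact field
name may differ between versions):

ceph quorum_status | python -mjson.tool | grep -i leader
# or: ceph mon_status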


Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-02 Thread Wido den Hollander

On 07/02/2014 04:08 PM, Andrija Panic wrote:

Hi,

I have existing CEPH cluster of 3 nodes, versions 0.72.2

I'm in the process of installing CEPH on a 4th node, but now the CEPH version is
0.80.1

Will running mixed CEPH versions cause problems ?



No, but the recommendation is not to have this running for a very long 
period. Try to upgrade all nodes to the same version within a reasonable 
amount of time.




I intend to upgrade CEPH on the existing 3 nodes anyway.
Recommended steps ?



Always upgrade the monitors first! Then to the OSDs one by one.
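
As a rough sketch of that order (package and restart commands depend on your
distro and init system, so treat these as placeholders):

# on each monitor host, one at a time:
apt-get update && apt-get install ceph
service ceph restart mon

# once all mons run the new version, on each OSD host, one OSD at a time:
service ceph restart osd.N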


Thanks

--

Andrija Panić


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5

2014-07-02 Thread Brian Lovett
Gregory Farnum  writes:

> 
> On Tue, Jul 1, 2014 at 1:26 PM, Brian Lovett
>  wrote:
> >   "profile": "bobtail",
> 
> Okay. That's unusual. What's the oldest client you need to support,
> and what Ceph version are you using? You probably want to set the
> crush tunables to "optimal"; the "bobtail" ones are going to have all
> kinds of issues with a small map like this. (Specifically, a map where
> the number of buckets/items at each level is similar to the number of
> requested replicas.)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> 


Ok, I issued: ceph osd crush tunables optimal

Shouldn't that be the default, though, since I just did this as a fresh
install? The cluster status hasn't changed.
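
For reference, the tunables currently in effect can be checked with the
following (assuming a Firefly-era CLI):

ceph osd crush show-tunables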

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5

2014-07-02 Thread Brian Lovett
Christian Balzer  writes:


> So either make sure these pools really have a replication of 2 by deleting
> and re-creating them or add a third storage node.



I just executed "ceph osd pool set {POOL} size 2" for both pools. Anything 
else I need to do? I still don't see any changes to the status of the cluster. 
We're adding a 3rd storage node, but why is it that this is an issue? I
don't see anything anywhere that says you have to have a minimum number of
OSDs for ceph to function. Even the quick start only has 3, so I assumed 8
would be fine as well.
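
For reference, the per-pool replication size can be double-checked with
something like:

ceph osd pool get <pool> size
ceph osd dump | grep 'replicated size'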


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)

2014-07-02 Thread Gregory Farnum
On Wed, Jul 2, 2014 at 6:18 AM, Sylvain Munaut
 wrote:
> Hi,
>
>
> I'm having a couple of issues during this update. On the test cluster
> it went fine, but when running it on production I have a few issues.
> (I guess there is some subtle difference I missed, I updated the test
> one back when emperor came out).
>
> For reference, I'm on ubuntu precise, I use self-built packages
> (because I'm hitting bugs that are not fixed in the latest official
> ones, but there is no change whatsoever to the debian/ directory
> except the changelog and they're built with the dpkg-buildpackage). I
> did a 'apt-get dist-upgrade' to upgrade everything despite the new
> requirements.
>
>
> * The first one is essentially the same as
> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/19632
>
> dpkg: error processing
> /var/cache/apt/archives/ceph-common_0.80.1-1we3_amd64.deb (--unpack):
>  trying to overwrite '/etc/ceph/rbdmap', which is also in package ceph
> 0.80.1-1we3
>
> apt complained about /etc/ceph/rbdmap being in two package and refused
> to go further. I ended up using -o Dpkg::Options::="--force-overwrite"
>  to force it to go on (because it just left some weird inconsistent
> state and I needed to clean up the mess), but this seems wrong.
>
>
> * The second one is that apparently it ran a "rm /etc/ceph" somehow
> ... on my setup this is not a directory, but a symlink to the real
> place the config is stored (the root partition is considered
> 'expendable', so machine specific config is elsewhere). It also tried
> to erase the /var/log/ceph but failed:

I can't help you with packaging issues, but i can tell you that the
rbdmap executable got moved to a different package at some point, but
I believe the official ones handle it properly.
And I'm just guessing here (like I said, can't help with packaging),
but I think the deleted /etc/ceph is a result of the force-overwrite
option you used.

>
> ---
> Replacing files in old package ceph-common ...
> dpkg: warning: unable to delete old directory '/var/log/ceph':
> Directory not empty
> ---
>
>
> * And finally the upgraded monitor can't join the existing quorum.
> Nowhere in the firefly update notes does it say that the new mon can't
> join an old quorum. When this was the case back in dumpling, there was
> a very explicit explanation but here it just doesn't join and spits
> out "pipe fault" in the logs continuously.
>
> Now it might be "normal", but being the production cluster, I can't
> risk and upgrading more than half the mons if I'm not sure this is
> indeed normal and not a symptom that the install/update failed and
> that the mon is not actually working.

That's not normal. A first guess is that you didn't give the new
monitor the same keyring as the old ones, but I couldn't say for sure
without more info. Turn up logging and post it somewhere?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
>
>
> Cheers,
>
> Sylvain
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-02 Thread Andrija Panic
Hi,

I have existing CEPH cluster of 3 nodes, versions 0.72.2

I'm in the process of installing CEPH on a 4th node, but now the CEPH version is
0.80.1

Will running mixed CEPH versions cause problems ?

I intend to upgrade CEPH on the existing 3 nodes anyway.
Recommended steps ?

Thanks

-- 

Andrija Panić
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
Yeah, it's fighting for attention with a lot of other urgent stuff. :(

Anyway, even if you can't look up any details or reproduce at this
time, I'm sure you know what shape the cluster was (number of OSDs,
running on SSDs or hard drives, etc), and that would be useful
guidance. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wed, Jul 2, 2014 at 6:12 AM, Stefan Priebe - Profihost AG
 wrote:
>
> Am 02.07.2014 15:07, schrieb Haomai Wang:
>> Could you give some perf counter from rbd client side? Such as op latency?
>
> Sorry, I don't have any counters. As this mail went unseen for some days, I
> thought nobody had an idea or could help.
>
> Stefan
>
>> On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Am 02.07.2014 00:51, schrieb Gregory Farnum:
 On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG
  wrote:
> Hi Greg,
>
> Am 26.06.2014 02:17, schrieb Gregory Farnum:
>> Sorry we let this drop; we've all been busy traveling and things.
>>
>> There have been a lot of changes to librados between Dumpling and
>> Firefly, but we have no idea what would have made it slower. Can you
>> provide more details about how you were running these tests?
>
> it's just a normal fio run:
> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
> --runtime=90 --numjobs=32 --direct=1 --group
>
> Running one time with firefly libs and one time with dumpling libs.
> Target is always the same pool on a firefly ceph cluster.

 What's the backing cluster you're running against? What kind of CPU
 usage do you see with both? 25k IOPS is definitely getting up there,
 but I'd like some guidance about whether we're looking for a reduction
 in parallelism, or an increase in per-op costs, or something else.
>>>
>>> Hi Greg,
>>>
>>> i don't have that test cluster anymore. It had to go into production
>>> with dumpling.
>>>
>>> So i can't tell you.
>>>
>>> Sorry.
>>>
>>> Stefan
>>>
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 --
 To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi,


> Did you also recreate the journal?!

It was a journal file and got re-created automatically.

Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Smart Weblications GmbH
Am 01.07.2014 17:48, schrieb Sylvain Munaut:
> Hi,
> 
> 
> As an exercise, I killed an OSD today, just killed the process and
> removed its data directory.
> 
> To recreate it, I recreated an empty data dir, then
> 
> ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs
> 
> (I tried with and without giving the monmap).
> 
> I then restored the keyring file (from a backup) in the
> /var/lib/osd/ceph-3/keyring
> 
> And then I start the process, and it starts fine. http://pastebin.com/TPzNth6P
> I even see one active tcp connection to a mon from that process.
> 
> But the osd never becomes "up" or do anything ...
> 

Did you also recreate the journal?!


-- 

Mit freundlichen Grüßen,


Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Sitz der Gesellschaft: Naila
Geschäftsführer: Florian Wiessner
HRB-Nr.: HRB 3840 Amtsgericht Hof
*aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)

2014-07-02 Thread Sylvain Munaut
Hi,


I'm having a couple of issues during this update. On the test cluster
it went fine, but when running it on production I have a few issues.
(I guess there is some subtle difference I missed, I updated the test
one back when emperor came out).

For reference, I'm on ubuntu precise, I use self-built packages
(because I'm hitting bugs that are not fixed in the latest official
ones, but there is no change whatsoever to the debian/ directory
except the changelog and they're built with the dpkg-buildpackage). I
did a 'apt-get dist-upgrade' to upgrade everything despite the new
requirements.


* The first one is essentially the same as
http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/19632

dpkg: error processing
/var/cache/apt/archives/ceph-common_0.80.1-1we3_amd64.deb (--unpack):
 trying to overwrite '/etc/ceph/rbdmap', which is also in package ceph
0.80.1-1we3

apt complained about /etc/ceph/rbdmap being in two package and refused
to go further. I ended up using -o Dpkg::Options::="--force-overwrite"
 to force it to go on (because it just left some weird inconsistent
state and I needed to clean up the mess), but this seems wrong.


* The second one is that apparently it ran a "rm /etc/ceph" somehow
... on my setup this is not a directory, but a symlink to the real
place the config is stored (the root partition is considered
'expendable', so machine specific config is elsewhere). It also tried
to erase the /var/log/ceph but failed:

---
Replacing files in old package ceph-common ...
dpkg: warning: unable to delete old directory '/var/log/ceph':
Directory not empty
---


* And finally the upgraded monitor can't join the existing quorum.
Nowhere in the firefly update notes does it say that the new mon can't
join an old quorum. When this was the case back in dumpling, there was
a very explicit explanation but here it just doesn't join and spits
out "pipe fault" in the logs continuously.

Now it might be "normal", but being the production cluster, I can't
risk and upgrading more than half the mons if I'm not sure this is
indeed normal and not a symptom that the install/update failed and
that the mon is not actually working.



Cheers,

Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Stefan Priebe - Profihost AG

Am 02.07.2014 15:07, schrieb Haomai Wang:
> Could you give some perf counter from rbd client side? Such as op latency?

Sorry, I don't have any counters. As this mail went unseen for some days, I
thought nobody had an idea or could help.

Stefan

> On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Am 02.07.2014 00:51, schrieb Gregory Farnum:
>>> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hi Greg,

 Am 26.06.2014 02:17, schrieb Gregory Farnum:
> Sorry we let this drop; we've all been busy traveling and things.
>
> There have been a lot of changes to librados between Dumpling and
> Firefly, but we have no idea what would have made it slower. Can you
> provide more details about how you were running these tests?

 it's just a normal fio run:
 fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
 --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
 --runtime=90 --numjobs=32 --direct=1 --group

 Running one time with firefly libs and one time with dumpling libs.
 Target is always the same pool on a firefly ceph cluster.
>>>
>>> What's the backing cluster you're running against? What kind of CPU
>>> usage do you see with both? 25k IOPS is definitely getting up there,
>>> but I'd like some guidance about whether we're looking for a reduction
>>> in parallelism, or an increase in per-op costs, or something else.
>>
>> Hi Greg,
>>
>> i don't have that test cluster anymore. It had to go into production
>> with dumpling.
>>
>> So i can't tell you.
>>
>> Sorry.
>>
>> Stefan
>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Haomai Wang
Could you give some perf counter from rbd client side? Such as op latency?

On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG
 wrote:
> Am 02.07.2014 00:51, schrieb Gregory Farnum:
>> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hi Greg,
>>>
>>> Am 26.06.2014 02:17, schrieb Gregory Farnum:
 Sorry we let this drop; we've all been busy traveling and things.

 There have been a lot of changes to librados between Dumpling and
 Firefly, but we have no idea what would have made it slower. Can you
 provide more details about how you were running these tests?
>>>
>>> it's just a normal fio run:
>>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>>> --runtime=90 --numjobs=32 --direct=1 --group
>>>
>>> Running one time with firefly libs and one time with dumpling libs.
>>> Target is always the same pool on a firefly ceph cluster.
>>
>> What's the backing cluster you're running against? What kind of CPU
>> usage do you see with both? 25k IOPS is definitely getting up there,
>> but I'd like some guidance about whether we're looking for a reduction
>> in parallelism, or an increase in per-op costs, or something else.
>
> Hi Greg,
>
> i don't have that test cluster anymore. It had to go into production
> with dumpling.
>
> So i can't tell you.
>
> Sorry.
>
> Stefan
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Best Regards,

Wheat
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Stefan Priebe - Profihost AG
Am 02.07.2014 00:51, schrieb Gregory Farnum:
> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hi Greg,
>>
>> Am 26.06.2014 02:17, schrieb Gregory Farnum:
>>> Sorry we let this drop; we've all been busy traveling and things.
>>>
>>> There have been a lot of changes to librados between Dumpling and
>>> Firefly, but we have no idea what would have made it slower. Can you
>>> provide more details about how you were running these tests?
>>
>> it's just a normal fio run:
>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>> --runtime=90 --numjobs=32 --direct=1 --group
>>
>> Running one time with firefly libs and one time with dumpling libs.
>> Target is always the same pool on a firefly ceph cluster.
> 
> What's the backing cluster you're running against? What kind of CPU
> usage do you see with both? 25k IOPS is definitely getting up there,
> but I'd like some guidance about whether we're looking for a reduction
> in parallelism, or an increase in per-op costs, or something else.

Hi Greg,

i don't have that test cluster anymore. It had to go into production
with dumpling.

So i can't tell you.

Sorry.

Stefan

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSD and MDS crash

2014-07-02 Thread Pierre BLONDEAU

Yes, but how do I do that ?

With a command like that ?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 
--debug-ms 1'


By modifying the /etc/ceph/ceph.conf ? This file is really sparse because I
use udev detection.


Once I have made these changes, do you want the three log files or only
osd.20's ?


Thank you so much for the help

Regards
Pierre

Le 01/07/2014 23:51, Samuel Just a écrit :

Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam

On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
 wrote:

Hi,

I join :
  - osd.20 is one of osd that I detect which makes crash other OSD.
  - osd.23 is one of osd which crash when i start osd.20
  - mds, is one of my MDS

I cut the log files because they are too big. Everything is here :
https://blondeau.users.greyc.fr/cephlog/

Regards

Le 30/06/2014 17:35, Gregory Farnum a écrit :


What's the backtrace from the crashing OSDs?

Keep in mind that as a dev release, it's generally best not to upgrade
to unnamed versions like 0.82 (but it's probably too late to go back
now).



I will remember it the next time ;)



-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
 wrote:


Hi,

After the upgrade to firefly, I have some PG in peering state.
I saw that 0.82 was out, so I tried to upgrade to solve my problem.

My three MDS crash, and some OSDs trigger a chain reaction that kills other
OSDs.
I think my MDS will not start because the metadata are on the OSDs.

I have 36 OSDs on three servers and I identified 5 OSDs which make the
others crash. If I do not start them, the cluster goes into a recovering
state with 31 OSDs, but I have 378 PGs in down+peering state.

What can I do ? Would you like more information ( OS, crash logs, etc ... ) ?

Regards

--
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD layering

2014-07-02 Thread NEVEU Stephane
>Objet : Re: [ceph-users] RBD layering

On 07/02/2014 10:08 AM, NEVEU Stephane wrote:
>> Hi all,
>>
>> I'm messing around with "rbd layering" to store some ready-to-use
>> templates (format 2) in a template pool :
>>
>> /Rbd -p templates ls/
>>
>> /Ubuntu1404/
>>
>> /Centos6/
>>
>> /./
>>
>> //
>>
>> /Rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected/
>>
>> /Rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected/
>>
>> /Rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected
>> kvm1/Ubuntu1404-snap-protected-children/
>>
>> My libvirt key is created with :
>>
>> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow 
>>class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r 
>> pool=templates'/
>>
>> //
>>
>> But read permission for the pool 'templates' seems to be not enough, 
>> libvirt is complaining "RBD cannot access the rbd disk 
>> kvm1/Ubuntu1404-snap-protected-children" so :
>>
>> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow 
>> class-read object_prefix rbd_children, allow rwx pool=kvm1, allow 
>> *rwx* pool=templates'/
>>

>I think that rx should be enough instead of rwx. Could you try that?

>Wido

Hi Wido, thank you:
I'm trying this :
Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow  class-read 
object_prefix rbd_children, allow rwx pool=kvm1, allow rx  pool=templates'
Error EINVAL: key for client.kvm1 exists but cap osd does not match

Is there another way to directly modify the caps ? Or do I need to delete the
key and re-create it ?

> //
>
> It's actually working but it's probably a bit too much, because I 
> don't want people to be able to modify the parent template so do I 
> have a better choice ?
>
> Libvirt seems to be happier but this clone is read-only and I want now 
> people to use this OS image as a base file and write differences in a 
> backing file (like with qemu . -b .).
>
> How can I do such a thing ? or maybe I'm doing it in a wrong way. any help ?

Am I clear enough here ?

>
> Thanks
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi Loic,


> By restoring the fsid file from the backup, presumably. I did not think of that
> when you showed the ceph-osd mkfs line, but it makes sense. This is not the 
> ceph fsid.

Yeah, I thought about that and I saw fsid and ceph_fsid, but I wasn't
sure that just replacing the file would be enough or if the fsid was
used somewhere else and this could yield some weird state ...


> root@bm0015:/var/lib/ceph/osd/ceph-1# grep fsid /etc/ceph/ceph.conf
> fsid = 571bb920-6d85-44d7-9eca-1bc114d1cd75

Weird, I don't have a fsid in my ceph.conf ...


Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Loic Dachary
Hi Sylvain,

On 02/07/2014 11:13, Sylvain Munaut wrote:
> Ah, I finally fond something that looks like an error message :
> 
> 2014-07-02 11:07:57.817269 7f0692e3a700  7 mon.a@0(leader).osd e1147
> preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing
> osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ;
> theirs: c1cfff2f-4f2e-4c1d-a947-24bbc6f122ca)
> 
> Not really sure how to fix it though.

By restoring the fsid file from the backup, presumably. I did not think of that 
when you showed the ceph-osd mkfs line, but it makes sense. This is not the 
ceph fsid.

root@bm0015:/var/lib/ceph/osd/ceph-1# cat ceph_fsid
571bb920-6d85-44d7-9eca-1bc114d1cd75
root@bm0015:/var/lib/ceph/osd/ceph-1# cat fsid
085a821e-b487-41ef-87ed-dfc6af097a44
root@bm0015:/var/lib/ceph/osd/ceph-1# grep fsid /etc/ceph/ceph.conf
fsid = 571bb920-6d85-44d7-9eca-1bc114d1cd75
root@bm0015:/var/lib/ceph/osd/ceph-1# 

Cheers

> 
> 
> Cheers,
> 
>Sylvain
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Just for future reference, you actually do need to remove the OSD even
if you're going to re-add it like 10 sec later ...

$ ceph osd rm 3
removed osd.3
$ ceph osd create
3

Then it works fine.

No need to remove it from the crushmap or remove the auth key (you can re-use
both), but you need to remove/add it from the cluster for it to
properly boot.


Cheers,

Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD layering

2014-07-02 Thread Wido den Hollander

On 07/02/2014 10:08 AM, NEVEU Stephane wrote:

Hi all,

I’m messing around with “rbd layering” to store some ready-to-use
templates (format 2) in a template pool :

/Rbd –p templates ls/

/Ubuntu1404/

/Centos6/

/…/

//

/Rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected/

/Rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected/

/Rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected
kvm1/Ubuntu1404-snap-protected-children/

My libvirt key is created with :

/Ceph auth get-or-create client.kvm1 mon ‘allow r’ osd ‘allow class-read
object_prefix rbd_children, allow rwx pool=kvm1, allow r pool=templates’/

//

But read permission for the pool ‘templates’ seems to be not enough,
libvirt is complaining “RBD cannot access the rbd disk
kvm1/Ubuntu1404-snap-protected-children” so :

/Ceph auth get-or-create client.kvm1 mon ‘allow r’ osd ‘allow class-read
object_prefix rbd_children, allow rwx pool=kvm1, allow *rwx*
pool=templates’/



I think that rx should be enough instead of rwx. Could you try that?

Wido


//

It’s actually working but it’s probably a bit too much, because I don’t
want people to be able to modify the parent template so do I have a
better choice ?

Libvirt seems to be happier but this clone is read-only and I want now
people to use this OS image as a base file and write differences in a
backing file (like with qemu … -b …).

How can I do such a thing ? or maybe I’m doing it in a wrong way… any help ?

Thanks



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Ah, I finally fond something that looks like an error message :

2014-07-02 11:07:57.817269 7f0692e3a700  7 mon.a@0(leader).osd e1147
preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing
osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ;
theirs: c1cfff2f-4f2e-4c1d-a947-24bbc6f122ca)

Not really sure how to fix it though.


Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Solved] Init scripts in Debian not working

2014-07-02 Thread Dieter Scholz

Hello,


I tried the ceph packages from jessie, too. After spending some time on
Google I think I found the solution. This will probably work for all package
sources.

You have to create an empty marker file named 'sysvinit' in the directories
below /var/lib/ceph/XXX. Then everything works fine.


ceph-deploy will create those files for you depending on the init
system for the monitors:

$ ls /var/lib/ceph/mon/ceph-node1/
done  keyring  store.db  upstart
$ cat /etc/issue
Ubuntu 12.04 LTS \n \l

And the same would happen for OSDs (upstart file is there):
$ ls /var/lib/ceph/osd/ceph-0/
activate.monmap  active  ceph_fsid  current  fsid  journal  keyring
magic  ready  store_version  superblock  upstart  whoami

The output in ceph-deploy that tells you what init system will be used is this:

[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 12.04 precise
[ceph_deploy.osd][DEBUG ] activating host node1 disk /home/vagrant/foo
[ceph_deploy.osd][DEBUG ] will use init type: upstart

Can you share your ceph-deploy output as you are deploying your OSDs?


Sorry for the late reply.

I deleted the ceph setup I created with ceph-deploy, so I cannot give you
the info you requested. Perhaps I will create another setup using some
virtual machines ...


I found some time to work on my 'manually' created setup (Wheezy, Ceph 
Firefly repositories). Everything works fine and after manually creating 
the empty 'sysvinit' files in the data dirs of the daemons, I am able to 
start/stop the daemons on the same host. In my opinion the sysvinit 
files should be mentioned in the manual install section of the 
documentation.
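
For anyone following along, the marker files are literally just empty files
in each daemon's data directory, e.g. (the IDs here are placeholders):

touch /var/lib/ceph/mon/ceph-a/sysvinit
touch /var/lib/ceph/osd/ceph-0/sysvinit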


One problem remains:

At the moment my ceph.conf file does not contain any configuration for
the individual daemons, just a [global] section. The docs mention the
allhosts switch (-a), which doesn't work for me. I tried to add
daemon-specific sections to the config file (host entries), but that made no
difference. Is the -a switch still supported for sysvinit? What entries
do I have to put into the config to remotely start/stop services?


It would be nice to start/stop the ceph cluster from a single host.
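
For clarity, the daemon-specific sections I tried look roughly like this
(hostnames here are placeholders):

[mon.a]
    host = node1
[osd.0]
    host = node2
[osd.1]
    host = node3

# what I would like to be able to run from a single host:
service ceph -a start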

Thanks in advance.

Dieter

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RBD layering

2014-07-02 Thread NEVEU Stephane
Hi all,

I'm messing around with "rbd layering" to store some ready-to-use templates
(format 2) in a template pool :

Rbd -p templates ls
Ubuntu1404
Centos6
...

Rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected
Rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected
Rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected 
kvm1/Ubuntu1404-snap-protected-children

My libvirt key is created with :
Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read 
object_prefix rbd_children, allow rwx pool=kvm1, allow r pool=templates'

But read permission for the pool 'templates' seems not to be enough; libvirt is
complaining "RBD cannot access the rbd disk
kvm1/Ubuntu1404-snap-protected-children" so :
Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read 
object_prefix rbd_children, allow rwx pool=kvm1, allow rwx pool=templates'

It's actually working but it's probably a bit too much, because I don't want
people to be able to modify the parent template, so is there a better choice ?

Libvirt seems to be happier, but this clone is read-only and I now want people
to use this OS image as a base file and write differences in a backing file
(like with qemu ... -b ...).
How can I do such a thing ? Or maybe I'm doing it the wrong way... any help ?

Thanks

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi,

> Does OSD 3 show when you ceph pg dump ? If so I would look in the logs of an 
> OSD which is participating in the same PG.

It appears at the end but not in any PG; it's now been marked out and
everything was redistributed.

osdstat  kbused    kbavail   kb        hb in   hb out
0        15602352  15844688  31447040  [1,2]   []
1        15602352  15844688  31447040  [0,2]   []
2        15602352  15844688  31447040  [0,1]   []
3        0         0         0         []      []
 sum     46807056  47534064  94341120


Cheers,

  Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com