[ceph-users] RGW performance test: putting 30 thousand objects into one bucket, average latency 3 seconds
Hi everyone,

When I use rest-bench to test RGW with the command

  rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write

I found that the RGW call to the method "bucket_prepare_op" is very slow. So I looked at the output of 'dump_historic_ops' and saw:

{ "description": "osd_op(client.4211.0:265984 .dir.default.4148.1 [call rgw.bucket_prepare_op] 3.b168f3d0 e37)",
  "received_at": "2014-07-03 11:07:02.465700",
  "age": "308.315230",
  "duration": "3.401743",
  "type_data": [
        "commit sent; apply or cleanup",
        { "client": "client.4211",
          "tid": 265984},
        [
            { "time": "2014-07-03 11:07:02.465852", "event": "waiting_for_osdmap"},
            { "time": "2014-07-03 11:07:02.465875", "event": "queue op_wq"},
            { "time": "2014-07-03 11:07:03.729087", "event": "reached_pg"},
            { "time": "2014-07-03 11:07:03.729120", "event": "started"},
            { "time": "2014-07-03 11:07:03.729126", "event": "started"},
            { "time": "2014-07-03 11:07:03.804366", "event": "waiting for subops from [19,9]"},
            { "time": "2014-07-03 11:07:03.804431", "event": "commit_queued_for_journal_write"},
            { "time": "2014-07-03 11:07:03.804509", "event": "write_thread_in_journal_buffer"},
            { "time": "2014-07-03 11:07:03.934419", "event": "journaled_completion_queued"},
            { "time": "2014-07-03 11:07:05.297282", "event": "sub_op_commit_rec"},
            { "time": "2014-07-03 11:07:05.297319", "event": "sub_op_commit_rec"},
            { "time": "2014-07-03 11:07:05.311217", "event": "op_applied"},
            { "time": "2014-07-03 11:07:05.867384", "event": "op_commit finish lock"},
            { "time": "2014-07-03 11:07:05.867385", "event": "op_commit"},
            { "time": "2014-07-03 11:07:05.867424", "event": "commit_sent"},
            { "time": "2014-07-03 11:07:05.867428", "event": "op_commit finish"},
            { "time": "2014-07-03 11:07:05.867443", "event": "done"}]]}]}

So I see two places where performance degrades: one is from "queue op_wq" to "reached_pg", the other is from "journaled_completion_queued" to "op_commit".

I must stress that a very large number of ops write to one bucket object. How can I reduce the latency?

baijia...@126.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
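To make the two slow stages in the trace above easier to see, here is a minimal Python sketch that computes the time spent between consecutive events. The (time, event) pairs are copied from the dump_historic_ops output in the message; the names FMT, EVENTS and stage_gaps are only illustrative and are not part of any Ceph tool.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (time, event) pairs copied from the dump_historic_ops output above.
EVENTS = [
    ("2014-07-03 11:07:02.465852", "waiting_for_osdmap"),
    ("2014-07-03 11:07:02.465875", "queue op_wq"),
    ("2014-07-03 11:07:03.729087", "reached_pg"),
    ("2014-07-03 11:07:03.729120", "started"),
    ("2014-07-03 11:07:03.729126", "started"),
    ("2014-07-03 11:07:03.804366", "waiting for subops from [19,9]"),
    ("2014-07-03 11:07:03.804431", "commit_queued_for_journal_write"),
    ("2014-07-03 11:07:03.804509", "write_thread_in_journal_buffer"),
    ("2014-07-03 11:07:03.934419", "journaled_completion_queued"),
    ("2014-07-03 11:07:05.297282", "sub_op_commit_rec"),
    ("2014-07-03 11:07:05.297319", "sub_op_commit_rec"),
    ("2014-07-03 11:07:05.311217", "op_applied"),
    ("2014-07-03 11:07:05.867384", "op_commit finish lock"),
    ("2014-07-03 11:07:05.867385", "op_commit"),
    ("2014-07-03 11:07:05.867424", "commit_sent"),
    ("2014-07-03 11:07:05.867428", "op_commit finish"),
    ("2014-07-03 11:07:05.867443", "done"),
]


def stage_gaps(events):
    """Return (seconds, from_event, to_event) for each consecutive event pair."""
    parsed = [(datetime.strptime(t, FMT), name) for t, name in events]
    return [((t1 - t0).total_seconds(), n0, n1)
            for (t0, n0), (t1, n1) in zip(parsed, parsed[1:])]


# The three largest gaps, biggest first.
slowest = sorted(stage_gaps(EVENTS), reverse=True)[:3]
for seconds, src, dst in slowest:
    print(f"{seconds:.6f}s  {src} -> {dst}")
```

For this op the three largest gaps are about 1.36 s before the first "sub_op_commit_rec", about 1.26 s between "queue op_wq" and "reached_pg", and about 0.56 s between "op_applied" and "op_commit finish lock". Reading only the timestamps, the first of these looks like time spent waiting for the replica OSDs [19,9] to commit, but that interpretation is an assumption rather than something the trace states.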
[ceph-users] Teuthology: Need some input on how to add osd after cluster setup is done using Teuthology
Hi Teuthology Users,

Can someone help me with how to add OSDs to a cluster that has already been set up by Teuthology via a yaml file? Besides the OSDs that are listed in the roles of the yaml file, I want to add a few more OSDs to the cluster as part of my scenario. So far I haven't seen any task or method for this in ceph.py.

Thanks & Regards,
Shambhu Rajak
Re: [ceph-users] Bypass Cache-Tiering for special reads (Backups)
> I was wondering, having a cache pool in front of an RBD pool is all fine
> and dandy, but imagine you want to pull backups of all your VMs (or one
> of them, or multiple...). Going to the cache for all those reads isn't
> only pointless, it'll also potentially fill up the cache and possibly
> evict actually frequently used data. Which got me thinking... wouldn't
> it be nifty if there was a special way of doing specific backup reads
> where you'd bypass the cache, ensuring the dirty cache contents get
> written to cold pool first? Or at least doing special reads where a
> cache-miss won't actually cache the requested data?
>
> AFAIK the backup routine for an RBD-backed KVM usually involves creating
> a snapshot of the RBD and putting that into a backup storage/tape, all
> done via librbd/API.
>
> Maybe something like that even already exists?

When used in the context of OpenStack Cinder, it does:

http://ceph.com/docs/next/rbd/rbd-openstack/#configuring-cinder-backup

You can have the backup pool use the default crush rules, assuming the default isn't your hot pool. Another option might be to put backups on an erasure-coded pool. I'm not sure whether that has been tested, but in principle it should work, since the objects composing a snapshot should be immutable.

--
Kyle
[ceph-users] Ceph RBD and Backup.
Hi all,

Dear community, how do you make backups of Ceph RBD?

Thanks

--
Fasihov Irek (aka Kataklysm).
Best regards, Fasihov Irek Nurgayazovich
Mob.: +79229045757
Re: [ceph-users] Performance is really bad when I run from vstart.sh
By default, the vstart.sh setup puts all data below a directory called "dev" in the source tree. In that case you're using a single spindle. The vstart script isn't intended for performance testing.

David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com

On Jul 2, 2014, at 5:48 PM, Zhe Zhang wrote:

> Hi folks,
>
> I run Ceph on a single node that contains 25 hard drives, each at 7200 RPM.
> Writing raw data to the array achieved 2 GB/s, so I presumed that Ceph's
> performance could exceed 1 GB/s. But when I compile the Ceph code and run
> it in development mode with vstart.sh, the average throughput is only
> 200 MB/s for a rados bench write.
> I suspected this was due to the debug mode used when I configured the
> source code, so I disabled gdb support with
> ./configure CFLAGS='-O3' CXXFLAGS='-O3' (avoiding the '-g' flag).
> But it did not help at all.
> I then switched to the repository and installed Ceph with ceph-deploy,
> and the performance reached 800 MB/s. However, I did not set up Ceph
> successfully with ceph-deploy: some PGs are still in the
> "creating+incomplete" state, and I guess this could affect the performance.
> Anyway, could someone give me some suggestions? Why is it so slow when I
> run from vstart.sh?
>
> Best,
> Zhe
[ceph-users] Performance is really bad when I run from vstart.sh
Hi folks,

I run Ceph on a single node that contains 25 hard drives, each at 7200 RPM. Writing raw data to the array achieved 2 GB/s, so I presumed that Ceph's performance could exceed 1 GB/s. But when I compile the Ceph code and run it in development mode with vstart.sh, the average throughput is only 200 MB/s for a rados bench write.

I suspected this was due to the debug mode used when I configured the source code, so I disabled gdb support with ./configure CFLAGS='-O3' CXXFLAGS='-O3' (avoiding the '-g' flag). But it did not help at all.

I then switched to the repository and installed Ceph with ceph-deploy, and the performance reached 800 MB/s. However, I did not set up Ceph successfully with ceph-deploy: some PGs are still in the "creating+incomplete" state, and I guess this could affect the performance.

Anyway, could someone give me some suggestions? Why is it so slow when I run from vstart.sh?

Best,
Zhe
Re: [ceph-users] Some OSD and MDS crash
Yes, thanks.
-Sam

On Wed, Jul 2, 2014 at 4:21 PM, Pierre BLONDEAU wrote:
> Like that?
>
> # ceph --admin-daemon /var/run/ceph/ceph-mon.william.asok version
> {"version":"0.82"}
> # ceph --admin-daemon /var/run/ceph/ceph-mon.jack.asok version
> {"version":"0.82"}
> # ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
> {"version":"0.82"}
>
> Pierre
>
> On 03/07/2014 01:17, Samuel Just wrote:
>> Can you confirm from the admin socket that all monitors are running
>> the same version?
>> -Sam
Re: [ceph-users] Some OSD and MDS crash
Like that?

# ceph --admin-daemon /var/run/ceph/ceph-mon.william.asok version
{"version":"0.82"}
# ceph --admin-daemon /var/run/ceph/ceph-mon.jack.asok version
{"version":"0.82"}
# ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
{"version":"0.82"}

Pierre

On 03/07/2014 01:17, Samuel Just wrote:
> Can you confirm from the admin socket that all monitors are running
> the same version?
> -Sam
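The admin-socket check in this message can be scripted when there are many daemons. The following is a rough Python sketch under stated assumptions: query_version simply shells out to the same "ceph --admin-daemon <socket> version" command shown above, and versions_agree compares the JSON replies. Both function names are made up for illustration, and the sample replies are the three copied from the message.

```python
import json
import subprocess


def query_version(asok_path):
    """Run the admin-socket command from the message and return its raw reply.

    Needs a host with the ceph CLI and the given socket; not called below.
    """
    return subprocess.check_output(
        ["ceph", "--admin-daemon", asok_path, "version"], text=True)


def versions_agree(raw_replies):
    """Parse each {"version": ...} reply; return (all_same, sorted versions)."""
    versions = {json.loads(raw)["version"] for raw in raw_replies.values()}
    return len(versions) == 1, sorted(versions)


# Replies copied from the message above, keyed by monitor name.
replies = {
    "william": '{"version":"0.82"}',
    "jack": '{"version":"0.82"}',
    "joe": '{"version":"0.82"}',
}

same, seen = versions_agree(replies)
print(same, seen)
```

On a live cluster the dictionary would instead be filled by calling query_version on each monitor's .asok path; any result other than a single version means the daemons are not all running the same release.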
Re: [ceph-users] Some OSD and MDS crash
Can you confirm from the admin socket that all monitors are running the same version?
-Sam

On Wed, Jul 2, 2014 at 4:15 PM, Pierre BLONDEAU wrote:
> On 03/07/2014 00:55, Samuel Just wrote:
>> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>>
>> Pierre: do you recall how and when that got set?
>
> I am not sure I understand, but if I remember correctly, after the update
> to firefly I was in the state HEALTH_WARN "crush map has legacy tunables"
> and I saw "feature set mismatch" in the log.
>
> So, if I remember correctly, I ran
>
>   ceph osd crush tunables optimal
>
> for the "crush map" problem, and I updated my client and server kernels
> to 3.16rc.
>
> Could that be it?
>
> Pierre
Re: [ceph-users] Some OSD and MDS crash
On 03/07/2014 00:55, Samuel Just wrote:
> Ah,
>
> ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
> ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
> ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
> 6d5
> < tunable chooseleaf_vary_r 1
>
> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>
> Pierre: do you recall how and when that got set?

I am not sure I understand, but if I remember correctly, after the update to firefly I was in the state HEALTH_WARN "crush map has legacy tunables" and I saw "feature set mismatch" in the log.

So, if I remember correctly, I ran

  ceph osd crush tunables optimal

for the "crush map" problem, and I updated my client and server kernels to 3.16rc.

Could that be it?

Pierre
Re: [ceph-users] Some OSD and MDS crash
Ah,

~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1

Looks like the chooseleaf_vary_r tunable somehow ended up divergent?

Pierre: do you recall how and when that got set?
-Sam

On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just wrote:
> Yeah, divergent osdmaps:
> 555ed048e73024687fc8b106a570db4f osd-20_osdmap.13258__0_4E62BB79__none
> 6037911f31dc3c18b05499d24dcdbe5c osd-23_osdmap.13258__0_4E62BB79__none
>
> Joao: thoughts?
> -Sam
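The comparison in this message (export the CRUSH map from each OSD's copy of the osdmap, decompile it, then diff) boils down to finding tunable lines present in one decompiled map but not the other. A small Python sketch of that last step follows. It assumes the two maps have already been decompiled to text with crushtool -d as in the shell loop above; the sample map contents are illustrative placeholders apart from the "tunable chooseleaf_vary_r 1" line, which is the one reported in the diff, and the function names are invented for this example.

```python
def tunable_lines(crush_text):
    """Return the set of 'tunable ...' lines in a decompiled CRUSH map."""
    return {line.strip() for line in crush_text.splitlines()
            if line.strip().startswith("tunable ")}


def tunable_diff(text_a, text_b):
    """Return (only_in_a, only_in_b) as sorted lists of tunable lines."""
    a, b = tunable_lines(text_a), tunable_lines(text_b)
    return sorted(a - b), sorted(b - a)


# Placeholder map headers; only the chooseleaf_vary_r line comes from the thread.
crush20 = """# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
"""
crush23 = """# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
"""

only_20, only_23 = tunable_diff(crush20, crush23)
print("only in osd.20 map:", only_20)
print("only in osd.23 map:", only_23)
```

Comparing sets rather than running a line diff ignores ordering, which is enough to spot a tunable that exists in only one copy of the map.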
Re: [ceph-users] Some OSD and MDS crash
Yeah, divergent osdmaps: 555ed048e73024687fc8b106a570db4f osd-20_osdmap.13258__0_4E62BB79__none 6037911f31dc3c18b05499d24dcdbe5c osd-23_osdmap.13258__0_4E62BB79__none Joao: thoughts? -Sam On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU wrote: > The files > > When I upgrade : > ceph-deploy install --stable firefly servers... > on each servers service ceph restart mon > on each servers service ceph restart osd > on each servers service ceph restart mds > > I upgraded from emperor to firefly. After repair, remap, replace, etc ... I > have some PG which pass in peering state. > > I thought why not try the version 0.82, it could solve my problem. ( > It's my mistake ). So, I upgrade from firefly to 0.83 with : > ceph-deploy install --testing servers... > .. > > Now, all programs are in version 0.82. > I have 3 mons, 36 OSD and 3 mds. > > Pierre > > PS : I find also "inc\uosdmap.13258__0_469271DE__none" on each meta > directory. > > Le 03/07/2014 00:10, Samuel Just a écrit : > >> Also, what version did you upgrade from, and how did you upgrade? >> -Sam >> >> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just wrote: >>> >>> Ok, in current/meta on osd 20 and osd 23, please attach all files >>> matching >>> >>> ^osdmap.13258.* >>> >>> There should be one such file on each osd. (should look something like >>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory, >>> you'll want to use find). >>> >>> What version of ceph is running on your mons? How many mons do you have? >>> -Sam >>> >>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU >>> wrote: Hi, I do it, the log files are available here : https://blondeau.users.greyc.fr/cephlog/debug20/ The OSD's files are really big +/- 80M . After starting the osd.20 some other osd crash. I pass from 31 osd up to 16. I remark that after this the number of down+peering PG decrease from 367 to 248. It's "normal" ? May be it's temporary, the time that the cluster verifies all the PG ? 
Regards Pierre Le 02/07/2014 19:16, Samuel Just a écrit : > You should add > > debug osd = 20 > debug filestore = 20 > debug ms = 1 > > to the [osd] section of the ceph.conf and restart the osds. I'd like > all three logs if possible. > > Thanks > -Sam > > On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU > wrote: >> >> >> Yes, but how i do that ? >> >> With a command like that ? >> >> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 >> --debug-ms >> 1' >> >> By modify the /etc/ceph/ceph.conf ? This file is really poor because I >> use >> udev detection. >> >> When I have made these changes, you want the three log files or only >> osd.20's ? >> >> Thank you so much for the help >> >> Regards >> Pierre >> >> Le 01/07/2014 23:51, Samuel Just a écrit : >> >>> Can you reproduce with >>> debug osd = 20 >>> debug filestore = 20 >>> debug ms = 1 >>> ? >>> -Sam >>> >>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU >>> wrote: Hi, I join : - osd.20 is one of osd that I detect which makes crash other OSD. - osd.23 is one of osd which crash when i start osd.20 - mds, is one of my MDS I cut log file because they are to big but. All is here : https://blondeau.users.greyc.fr/cephlog/ Regards Le 30/06/2014 17:35, Gregory Farnum a écrit : > What's the backtrace from the crashing OSDs? > > Keep in mind that as a dev release, it's generally best not to > upgrade > to unnamed versions like 0.82 (but it's probably too late to go > back > now). I will remember it the next time ;) > -Greg > Software Engineer #42 @ http://inktank.com | http://ceph.com > > On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU > wrote: >> >> >> Hi, >> >> After the upgrade to firefly, I have some PG in peering state. >> I seen the output of 0.82 so I try to upgrade for solved my >> problem. >> >> My three MDS crash and some OSD triggers a chain reaction that >> kills >> other >> OSD. >> I think my MDS will not start because of the metadata are on the >> OSD. 
>> >> I have 36 OSD on three servers and I identified 5 OSD which makes >> crash >> others. If i not start their, the cluster passe in reconstructive >> state >> with >> 31 OSD but i have 378 in down+peering state. >> >> How can I do ? Would you more information ( os, crash log, etc .
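The comparison at the top of this message (two different MD5 sums for the same osdmap epoch on osd 20 and osd 23) can be repeated with a few lines of Python. This is only a minimal sketch: it assumes the osdmap files have already been copied out of each OSD's current/meta directory, and the function names are hypothetical helpers, not part of any Ceph tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_md5(paths):
    """Map each MD5 hex digest to the list of files that produce it."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.md5(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return dict(groups)


def is_divergent(paths):
    """True when copies of the same osdmap epoch do not all hash alike."""
    return len(group_by_md5(paths)) > 1
```

Passing the two files named in the message to is_divergent() would report the same thing the md5 listing shows: more than one digest for a single epoch means the OSDs hold different maps.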
Re: [ceph-users] Some OSD and MDS crash
Joao: this looks like divergent osdmaps, osd 20 and osd 23 have differing ideas of the acting set for pg 2.11. Did we add hashes to the incremental maps? What would you want to know from the mons? -Sam On Wed, Jul 2, 2014 at 3:10 PM, Samuel Just wrote: > Also, what version did you upgrade from, and how did you upgrade? > -Sam > > On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just wrote: >> Ok, in current/meta on osd 20 and osd 23, please attach all files matching >> >> ^osdmap.13258.* >> >> There should be one such file on each osd. (should look something like >> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory, >> you'll want to use find). >> >> What version of ceph is running on your mons? How many mons do you have? >> -Sam >> >> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU >> wrote: >>> Hi, >>> >>> I do it, the log files are available here : >>> https://blondeau.users.greyc.fr/cephlog/debug20/ >>> >>> The OSD's files are really big +/- 80M . >>> >>> After starting the osd.20 some other osd crash. I pass from 31 osd up to 16. >>> I remark that after this the number of down+peering PG decrease from 367 to >>> 248. It's "normal" ? May be it's temporary, the time that the cluster >>> verifies all the PG ? >>> >>> Regards >>> Pierre >>> >>> Le 02/07/2014 19:16, Samuel Just a écrit : >>> You should add debug osd = 20 debug filestore = 20 debug ms = 1 to the [osd] section of the ceph.conf and restart the osds. I'd like all three logs if possible. Thanks -Sam On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU wrote: > > Yes, but how i do that ? > > With a command like that ? > > ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 > --debug-ms > 1' > > By modify the /etc/ceph/ceph.conf ? This file is really poor because I > use > udev detection. > > When I have made these changes, you want the three log files or only > osd.20's ? 
> > Thank you so much for the help > > Regards > Pierre > > Le 01/07/2014 23:51, Samuel Just a écrit : > >> Can you reproduce with >> debug osd = 20 >> debug filestore = 20 >> debug ms = 1 >> ? >> -Sam >> >> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU >> wrote: >>> >>> >>> Hi, >>> >>> I join : >>>- osd.20 is one of osd that I detect which makes crash other OSD. >>>- osd.23 is one of osd which crash when i start osd.20 >>>- mds, is one of my MDS >>> >>> I cut log file because they are to big but. All is here : >>> https://blondeau.users.greyc.fr/cephlog/ >>> >>> Regards >>> >>> Le 30/06/2014 17:35, Gregory Farnum a écrit : >>> What's the backtrace from the crashing OSDs? Keep in mind that as a dev release, it's generally best not to upgrade to unnamed versions like 0.82 (but it's probably too late to go back now). >>> >>> >>> I will remember it the next time ;) >>> -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU wrote: > > Hi, > > After the upgrade to firefly, I have some PG in peering state. > I seen the output of 0.82 so I try to upgrade for solved my problem. > > My three MDS crash and some OSD triggers a chain reaction that kills > other > OSD. > I think my MDS will not start because of the metadata are on the OSD. > > I have 36 OSD on three servers and I identified 5 OSD which makes > crash > others. If i not start their, the cluster passe in reconstructive > state > with > 31 OSD but i have 378 in down+peering state. > > How can I do ? Would you more information ( os, crash log, etc ... ) > ? > > Regards >>> >>> >>> -- >>> -- >>> Pierre BLONDEAU >>> Administrateur Systèmes & réseaux >>> Université de Caen >>> Laboratoire GREYC, Département d'informatique >>> >>> tel : 02 31 56 75 42 >>> bureau : Campus 2, Science 3, 406 >>> -- >>> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Some OSD and MDS crash
Also, what version did you upgrade from, and how did you upgrade? -Sam On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just wrote: > Ok, in current/meta on osd 20 and osd 23, please attach all files matching > > ^osdmap.13258.* > > There should be one such file on each osd. (should look something like > osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory, > you'll want to use find). > > What version of ceph is running on your mons? How many mons do you have? > -Sam > > On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU > wrote: >> Hi, >> >> I do it, the log files are available here : >> https://blondeau.users.greyc.fr/cephlog/debug20/ >> >> The OSD's files are really big +/- 80M . >> >> After starting the osd.20 some other osd crash. I pass from 31 osd up to 16. >> I remark that after this the number of down+peering PG decrease from 367 to >> 248. It's "normal" ? May be it's temporary, the time that the cluster >> verifies all the PG ? >> >> Regards >> Pierre >> >> Le 02/07/2014 19:16, Samuel Just a écrit : >> >>> You should add >>> >>> debug osd = 20 >>> debug filestore = 20 >>> debug ms = 1 >>> >>> to the [osd] section of the ceph.conf and restart the osds. I'd like >>> all three logs if possible. >>> >>> Thanks >>> -Sam >>> >>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU >>> wrote: Yes, but how i do that ? With a command like that ? ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1' By modify the /etc/ceph/ceph.conf ? This file is really poor because I use udev detection. When I have made these changes, you want the three log files or only osd.20's ? Thank you so much for the help Regards Pierre Le 01/07/2014 23:51, Samuel Just a écrit : > Can you reproduce with > debug osd = 20 > debug filestore = 20 > debug ms = 1 > ? > -Sam > > On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU > wrote: >> >> >> Hi, >> >> I join : >>- osd.20 is one of osd that I detect which makes crash other OSD. 
>>- osd.23 is one of osd which crash when i start osd.20 >>- mds, is one of my MDS >> >> I cut log file because they are to big but. All is here : >> https://blondeau.users.greyc.fr/cephlog/ >> >> Regards >> >> Le 30/06/2014 17:35, Gregory Farnum a écrit : >> >>> What's the backtrace from the crashing OSDs? >>> >>> Keep in mind that as a dev release, it's generally best not to upgrade >>> to unnamed versions like 0.82 (but it's probably too late to go back >>> now). >> >> >> I will remember it the next time ;) >> >>> -Greg >>> Software Engineer #42 @ http://inktank.com | http://ceph.com >>> >>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU >>> wrote: Hi, After the upgrade to firefly, I have some PG in peering state. I seen the output of 0.82 so I try to upgrade for solved my problem. My three MDS crash and some OSD triggers a chain reaction that kills other OSD. I think my MDS will not start because of the metadata are on the OSD. I have 36 OSD on three servers and I identified 5 OSD which makes crash others. If i not start their, the cluster passe in reconstructive state with 31 OSD but i have 378 in down+peering state. How can I do ? Would you more information ( os, crash log, etc ... ) ? Regards >> >> >> -- >> -- >> Pierre BLONDEAU >> Administrateur Systèmes & réseaux >> Université de Caen >> Laboratoire GREYC, Département d'informatique >> >> tel : 02 31 56 75 42 >> bureau : Campus 2, Science 3, 406 >> -- >> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Some OSD and MDS crash
Ok, in current/meta on osd 20 and osd 23, please attach all files matching ^osdmap.13258.* There should be one such file on each osd. (should look something like osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory, you'll want to use find). What version of ceph is running on your mons? How many mons do you have? -Sam On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU wrote: > Hi, > > I do it, the log files are available here : > https://blondeau.users.greyc.fr/cephlog/debug20/ > > The OSD's files are really big +/- 80M . > > After starting the osd.20 some other osd crash. I pass from 31 osd up to 16. > I remark that after this the number of down+peering PG decrease from 367 to > 248. It's "normal" ? May be it's temporary, the time that the cluster > verifies all the PG ? > > Regards > Pierre > > Le 02/07/2014 19:16, Samuel Just a écrit : > >> You should add >> >> debug osd = 20 >> debug filestore = 20 >> debug ms = 1 >> >> to the [osd] section of the ceph.conf and restart the osds. I'd like >> all three logs if possible. >> >> Thanks >> -Sam >> >> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU >> wrote: >>> >>> Yes, but how i do that ? >>> >>> With a command like that ? >>> >>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 >>> --debug-ms >>> 1' >>> >>> By modify the /etc/ceph/ceph.conf ? This file is really poor because I >>> use >>> udev detection. >>> >>> When I have made these changes, you want the three log files or only >>> osd.20's ? >>> >>> Thank you so much for the help >>> >>> Regards >>> Pierre >>> >>> Le 01/07/2014 23:51, Samuel Just a écrit : >>> Can you reproduce with debug osd = 20 debug filestore = 20 debug ms = 1 ? -Sam On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU wrote: > > > Hi, > > I join : >- osd.20 is one of osd that I detect which makes crash other OSD. >- osd.23 is one of osd which crash when i start osd.20 >- mds, is one of my MDS > > I cut log file because they are to big but. 
All is here : > https://blondeau.users.greyc.fr/cephlog/ > > Regards > > Le 30/06/2014 17:35, Gregory Farnum a écrit : > >> What's the backtrace from the crashing OSDs? >> >> Keep in mind that as a dev release, it's generally best not to upgrade >> to unnamed versions like 0.82 (but it's probably too late to go back >> now). > > > I will remember it the next time ;) > >> -Greg >> Software Engineer #42 @ http://inktank.com | http://ceph.com >> >> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU >> wrote: >>> >>> Hi, >>> >>> After the upgrade to firefly, I have some PG in peering state. >>> I seen the output of 0.82 so I try to upgrade for solved my problem. >>> >>> My three MDS crash and some OSD triggers a chain reaction that kills >>> other >>> OSD. >>> I think my MDS will not start because of the metadata are on the OSD. >>> >>> I have 36 OSD on three servers and I identified 5 OSD which makes >>> crash >>> others. If i not start their, the cluster passe in reconstructive >>> state >>> with >>> 31 OSD but i have 378 in down+peering state. >>> >>> How can I do ? Would you more information ( os, crash log, etc ... ) >>> ? >>> >>> Regards > > > -- > -- > Pierre BLONDEAU > Administrateur Systèmes & réseaux > Université de Caen > Laboratoire GREYC, Département d'informatique > > tel : 02 31 56 75 42 > bureau : Campus 2, Science 3, 406 > -- > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
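Sam's request at the top of this message (collect every file under current/meta whose name matches ^osdmap.13258.*, bearing in mind that the files are hashed into subdirectories) can be sketched as a recursive walk. This is an illustration only: the function name is hypothetical and the directory passed in is whatever path your OSD data actually lives under.

```python
import os
import re

# Pattern from the thread; the leading ^ anchors the match to the file name start.
OSDMAP_EPOCH = re.compile(r"^osdmap\.13258.*")


def find_osdmap_files(meta_dir, pattern=OSDMAP_EPOCH):
    """Return sorted paths of files below meta_dir whose name matches pattern."""
    matches = []
    for root, _dirs, files in os.walk(meta_dir):
        for name in files:
            if pattern.match(name):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Because the pattern is anchored, names that merely contain "osdmap.13258" after some other prefix are skipped, so only the full maps are returned.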
[ceph-users] Bypass Cache-Tiering for special reads (Backups)
Hi,

I was wondering: having a cache pool in front of an RBD pool is all fine and dandy, but imagine you want to pull backups of all your VMs (or one of them, or several). Going through the cache for all those reads is not only pointless, it will also potentially fill up the cache and possibly evict data that is actually used frequently.

Which got me thinking: wouldn't it be nifty if there were a special way of doing backup reads that bypasses the cache, while ensuring the dirty cache contents get written to the cold pool first? Or at least a special kind of read where a cache miss doesn't actually cache the requested data?

AFAIK the backup routine for an RBD-backed KVM guest usually involves creating a snapshot of the RBD image and putting that onto backup storage or tape, all done via librbd/API.

Maybe something like that already exists?

KR,
Marc
Re: [ceph-users] Some OSD and MDS crash
Hi, I do it, the log files are available here : https://blondeau.users.greyc.fr/cephlog/debug20/ The OSD's files are really big +/- 80M . After starting the osd.20 some other osd crash. I pass from 31 osd up to 16. I remark that after this the number of down+peering PG decrease from 367 to 248. It's "normal" ? May be it's temporary, the time that the cluster verifies all the PG ? Regards Pierre Le 02/07/2014 19:16, Samuel Just a écrit : You should add debug osd = 20 debug filestore = 20 debug ms = 1 to the [osd] section of the ceph.conf and restart the osds. I'd like all three logs if possible. Thanks -Sam On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU wrote: Yes, but how i do that ? With a command like that ? ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1' By modify the /etc/ceph/ceph.conf ? This file is really poor because I use udev detection. When I have made these changes, you want the three log files or only osd.20's ? Thank you so much for the help Regards Pierre Le 01/07/2014 23:51, Samuel Just a écrit : Can you reproduce with debug osd = 20 debug filestore = 20 debug ms = 1 ? -Sam On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU wrote: Hi, I join : - osd.20 is one of osd that I detect which makes crash other OSD. - osd.23 is one of osd which crash when i start osd.20 - mds, is one of my MDS I cut log file because they are to big but. All is here : https://blondeau.users.greyc.fr/cephlog/ Regards Le 30/06/2014 17:35, Gregory Farnum a écrit : What's the backtrace from the crashing OSDs? Keep in mind that as a dev release, it's generally best not to upgrade to unnamed versions like 0.82 (but it's probably too late to go back now). I will remember it the next time ;) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU wrote: Hi, After the upgrade to firefly, I have some PG in peering state. I seen the output of 0.82 so I try to upgrade for solved my problem. 
My three MDS crash and some OSD triggers a chain reaction that kills other OSD. I think my MDS will not start because of the metadata are on the OSD. I have 36 OSD on three servers and I identified 5 OSD which makes crash others. If i not start their, the cluster passe in reconstructive state with 31 OSD but i have 378 in down+peering state. How can I do ? Would you more information ( os, crash log, etc ... ) ? Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- smime.p7s Description: Signature cryptographique S/MIME ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5
Alright, I was finally able to get this resolved without adding another node. As pointed out, even though I had a config variable that set the default replicated size to 2, Ceph for some reason created the default pools (data and metadata) with a size of 3. After digging through the documentation I found:

    ceph osd dump | grep 'replicated size'

which shows the replicated size of each pool. My newly created pools, ssd and sata, were correctly configured, but the default pools were not. I was then able to set:

    ceph osd pool set metadata size 2
    ceph osd pool set data size 2

Finally, my cluster is healthy! Not exactly a straightforward installation and troubleshooting experience, but it works. Thanks for the help and tips along the way. The advice definitely led me in the right direction.
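The check described above (list each pool's replicated size, then spot the ones that differ from the intended value) could also be scripted. A minimal sketch, assuming each pool line of the dump has the layout pool <id> '<name>' replicated size <n> ...; that layout is inferred from the grep pattern in the message and may differ between Ceph versions, and the function names are hypothetical.

```python
import re

# Assumed layout of a pool line; adjust if your Ceph version prints it differently.
POOL_LINE = re.compile(r"^pool (\d+) '([^']+)' replicated size (\d+)")


def pool_sizes(osd_dump_text):
    """Return {pool_name: replicated_size} for every pool line in the dump text."""
    sizes = {}
    for line in osd_dump_text.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            sizes[match.group(2)] = int(match.group(3))
    return sizes


def pools_not_matching(sizes, wanted):
    """Names of pools whose replicated size differs from the wanted value."""
    return sorted(name for name, size in sizes.items() if size != wanted)
```

Fed the saved text of the dump command, pools_not_matching(pool_sizes(text), 2) would list the pools that still need their size changed.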
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
On Wed, Jul 2, 2014 at 12:44 PM, Stefan Priebe wrote:
> Hi Greg,
>
> On 02.07.2014 at 21:36, Gregory Farnum wrote:
>> On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe wrote:
>>> On 02.07.2014 at 16:00, Gregory Farnum wrote:
>>>> Yeah, it's fighting for attention with a lot of other urgent stuff. :(
>>>>
>>>> Anyway, even if you can't look up any details or reproduce at this
>>>> time, I'm sure you know what shape the cluster was (number of OSDs,
>>>> running on SSDs or hard drives, etc), and that would be useful
>>>> guidance. :)
>>>
>>> Sure
>>>
>>> Number of OSDs: 24
>>> Each OSD has an SSD; tested with fio before installing ceph, each one
>>> does 70.000 iop/s 4k write and 580MB/s seq. write (1MB blocks)
>>>
>>> Single Xeon E5-1620 v2 @ 3.70GHz
>>>
>>> 48GB RAM
>>
>> Awesome, thanks.
>>
>> I went through the changelogs on the librados/, osdc/, and msg/
>> directories to see if I could find any likely change candidates
>> between Dumpling and Firefly and couldn't see any issues. :( But I
>> suspect that the sharding changes coming will more than make up the
>> difference, so you might want to plan on checking that out when it
>> arrives, even if you don't want to deploy it to production.
>
> Which changes are you referring to? Will they be part of, or backported
> to, firefly?

Yehuda's got a pretty big patchset that is sharding up the "big Objecter lock" into many smaller mutexes and RWLocks that will make it much more parallel. He's on vacation just now but I understand it's almost ready to merge; I don't think it'll be suitable for backport to firefly, though (it's big).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
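The idea Greg describes above, splitting one big lock into many smaller ones so that unrelated operations stop contending, is commonly called lock sharding or lock striping. The following is a generic Python illustration of that technique and not a rendering of the Objecter patchset: the class name, the shard count, and the per-key counter workload are all hypothetical, and plain mutexes stand in for the mix of mutexes and RWLocks mentioned in the message.

```python
import threading


class ShardedCounter:
    """Per-key counters guarded by one lock per shard instead of one global lock."""

    def __init__(self, num_shards=8):
        self._locks = [threading.Lock() for _ in range(num_shards)]
        self._shards = [{} for _ in range(num_shards)]

    def _index(self, key):
        # Keys that hash to different shards never wait on each other.
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = shard.get(key, 0) + 1
            return shard[key]

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)
```

Two threads touching keys in different shards take different locks, which is where the extra parallelism comes from; threads touching the same key still serialize, so correctness is preserved.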
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
Hi Greg,

On 02.07.2014 at 21:36, Gregory Farnum wrote:
> On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe wrote:
>> On 02.07.2014 at 16:00, Gregory Farnum wrote:
>>> Yeah, it's fighting for attention with a lot of other urgent stuff. :(
>>>
>>> Anyway, even if you can't look up any details or reproduce at this
>>> time, I'm sure you know what shape the cluster was (number of OSDs,
>>> running on SSDs or hard drives, etc), and that would be useful
>>> guidance. :)
>>
>> Sure
>>
>> Number of OSDs: 24
>> Each OSD has an SSD; tested with fio before installing ceph, each one
>> does 70.000 iop/s 4k write and 580MB/s seq. write (1MB blocks)
>>
>> Single Xeon E5-1620 v2 @ 3.70GHz
>>
>> 48GB RAM
>
> Awesome, thanks.
>
> I went through the changelogs on the librados/, osdc/, and msg/
> directories to see if I could find any likely change candidates
> between Dumpling and Firefly and couldn't see any issues. :( But I
> suspect that the sharding changes coming will more than make up the
> difference, so you might want to plan on checking that out when it
> arrives, even if you don't want to deploy it to production.

Which changes are you referring to? Will they be part of, or backported to, firefly?

> -Greg

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe wrote:
> On 02.07.2014 at 16:00, Gregory Farnum wrote:
>> Yeah, it's fighting for attention with a lot of other urgent stuff. :(
>>
>> Anyway, even if you can't look up any details or reproduce at this
>> time, I'm sure you know what shape the cluster was (number of OSDs,
>> running on SSDs or hard drives, etc), and that would be useful
>> guidance. :)
>
> Sure
>
> Number of OSDs: 24
> Each OSD has an SSD; tested with fio before installing ceph, each one
> does 70.000 iop/s 4k write and 580MB/s seq. write (1MB blocks)
>
> Single Xeon E5-1620 v2 @ 3.70GHz
>
> 48GB RAM

Awesome, thanks.

I went through the changelogs on the librados/, osdc/, and msg/ directories to see if I could find any likely change candidates between Dumpling and Firefly and couldn't see any issues. :( But I suspect that the sharding changes coming will more than make up the difference, so you might want to plan on checking that out when it arrives, even if you don't want to deploy it to production.
-Greg
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
Am 02.07.2014 16:00, schrieb Gregory Farnum: Yeah, it's fighting for attention with a lot of other urgent stuff. :( Anyway, even if you can't look up any details or reproduce at this time, I'm sure you know what shape the cluster was (number of OSDs, running on SSDs or hard drives, etc), and that would be useful guidance. :) Sure Number of OSDs: 24 Each OSD has an SSD capable of doing tested with fio before installing ceph (70.000 iop/s 4k write, 580MB/s seq. write 1MB blocks) Single Xeon E5-1620 v2 @ 3.70GHz 48GB RAM Stefan -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Jul 2, 2014 at 6:12 AM, Stefan Priebe - Profihost AG wrote: Am 02.07.2014 15:07, schrieb Haomai Wang: Could you give some perf counter from rbd client side? Such as op latency? Sorry haven't any counters. As this mail was some days unseen - i thought nobody has an idea or could help. Stefan On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG wrote: Am 02.07.2014 00:51, schrieb Gregory Farnum: On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG wrote: Hi Greg, Am 26.06.2014 02:17, schrieb Gregory Farnum: Sorry we let this drop; we've all been busy traveling and things. There have been a lot of changes to librados between Dumpling and Firefly, but we have no idea what would have made it slower. Can you provide more details about how you were running these tests? it's just a normal fio run: fio --ioengine=rbd --bs=4k --name=foo --invalidate=0 --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor --runtime=90 --numjobs=32 --direct=1 --group Running one time with firefly libs and one time with dumpling libs. Traget is always the same pool on a firefly ceph storage. What's the backing cluster you're running against? What kind of CPU usage do you see with both? 25k IOPS is definitely getting up there, but I'd like some guidance about whether we're looking for a reduction in parallelism, or an increase in per-op costs, or something else. 
Hi Greg, i don't have that test cluster anymore. It had to go into production with dumpling. So i can't tell you. Sorry. Stefan -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Some OSD and MDS crash
You should add debug osd = 20 debug filestore = 20 debug ms = 1 to the [osd] section of the ceph.conf and restart the osds. I'd like all three logs if possible. Thanks -Sam On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU wrote: > Yes, but how i do that ? > > With a command like that ? > > ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms > 1' > > By modify the /etc/ceph/ceph.conf ? This file is really poor because I use > udev detection. > > When I have made these changes, you want the three log files or only > osd.20's ? > > Thank you so much for the help > > Regards > Pierre > > Le 01/07/2014 23:51, Samuel Just a écrit : > >> Can you reproduce with >> debug osd = 20 >> debug filestore = 20 >> debug ms = 1 >> ? >> -Sam >> >> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU >> wrote: >>> >>> Hi, >>> >>> I join : >>> - osd.20 is one of osd that I detect which makes crash other OSD. >>> - osd.23 is one of osd which crash when i start osd.20 >>> - mds, is one of my MDS >>> >>> I cut log file because they are to big but. All is here : >>> https://blondeau.users.greyc.fr/cephlog/ >>> >>> Regards >>> >>> Le 30/06/2014 17:35, Gregory Farnum a écrit : >>> What's the backtrace from the crashing OSDs? Keep in mind that as a dev release, it's generally best not to upgrade to unnamed versions like 0.82 (but it's probably too late to go back now). >>> >>> >>> >>> I will remember it the next time ;) >>> >>> -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU wrote: > > > Hi, > > After the upgrade to firefly, I have some PG in peering state. > I seen the output of 0.82 so I try to upgrade for solved my problem. > > My three MDS crash and some OSD triggers a chain reaction that kills > other > OSD. > I think my MDS will not start because of the metadata are on the OSD. > > I have 36 OSD on three servers and I identified 5 OSD which makes crash > others. 
If i not start their, the cluster passe in reconstructive state > with > 31 OSD but i have 378 in down+peering state. > > How can I do ? Would you more information ( os, crash log, etc ... ) ? > > Regards > > -- > -- > Pierre BLONDEAU > Administrateur Systèmes & réseaux > Université de Caen > Laboratoire GREYC, Département d'informatique > > tel : 02 31 56 75 42 > bureau : Campus 2, Science 3, 406 > -- > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >>> >>> >>> -- >>> -- >>> Pierre BLONDEAU >>> Administrateur Systèmes & réseaux >>> Université de Caen >>> Laboratoire GREYC, Département d'informatique >>> >>> tel : 02 31 56 75 42 >>> bureau : Campus 2, Science 3, 406 >>> -- >>> >>> ___ >>> ceph-users mailing list >>> ceph-users@lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> > > > -- > -- > Pierre BLONDEAU > Administrateur Systèmes & réseaux > Université de Caen > Laboratoire GREYC, Département d'informatique > > tel : 02 31 56 75 42 > bureau : Campus 2, Science 3, 406 > -- > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] [ANN] ceph-deploy 1.5.7 released
Hi All,

There is a new bug-fix release of ceph-deploy, the easy deployment tool for Ceph.

The full list of fixes for this release can be found in the changelog:
http://ceph.com/ceph-deploy/docs/changelog.html#id1

Make sure you update!

-Alfredo
Re: [ceph-users] RBD layering
Ok thanks : mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates' seems to be enough. One more question about RBD layering : I've made a clone (child) in my pool 'kvm' from my protected snapshot in my pool 'template' and after launching my vm, the whole fs is read-only. Am I wrong thinking the protected snapshot acts like the base image and additional data will be store in the clone ? >Objet : Re: [ceph-users] RBD layering On 07/02/2014 10:08 AM, NEVEU Stephane wrote: >> Hi all, >> >> I'm missing around with "rbd layering" to store some ready-to-use >> templates (format 2) in a template pool : >> >> /Rbd -p templates ls/ >> >> /Ubuntu1404/ >> >> /Centos6/ >> >> /./ >> >> // >> >> /Rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected/ >> >> /Rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected/ >> >> /Rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected >> kvm1/Ubuntu1404-snap-protected-children/ >> >> My libvirt key is created with : >> >> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow >>class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r >>pool=templates'/ >> >> // >> >> But read permission for the pool 'templates' seems to be not enough, >> libvirt is complaining "RBD cannot access the rbd disk >> kvm1/Ubuntu1404-snap-protected-children" so : >> >> /Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow >> class-read object_prefix rbd_children, allow rwx pool=kvm1, allow >> *rwx* pool=templates'/ >> >I think that rx should be enough instead of rwx. Could you try that? >Wido Hi Wido, thank you: I'm trying this : Ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates' Error EIVAL: key for client.kvm1 exists but cap osd does not match Is there another way to directly modify caps ? or do I need to suppress the key and re-create it ? 
> // > > It's actually working but it's probably a bit too much, because I > don't want people to be able to modify the parent template so do I > have a better choice ? > > Libvirt seems to be happier but this clone is read-only and I want now > people to use this OS image as a base file and write differences in a > backing file (like with qemu . -b .). > > How can I do such a thing ? or maybe I'm doing it in a wrong way. any help ? Am I clear enough here ? > > Thanks > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5
Christian Balzer writes:

> Read EVERYTHING you can find about crushmap rules.
>
> The quickstart (I think) talks about 3 storage nodes, not OSDs.
>
> Ceph is quite good when it comes to defining failure domains, the default
> is to segregate at the storage node level.
> What good is a replication of 3 when all 3 OSDs are on the same host?

Agreed, which is why I had defined the default as 2 replicas. I had hoped that this would work, but I will be adding a third host, hopefully today or tomorrow. Hopefully that takes care of the issue. I'll try another fresh install and see if I can get things going.
Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5
On Wed, 2 Jul 2014 14:25:49 +0000 (UTC) Brian Lovett wrote:

> Christian Balzer writes:
>
> > So either make sure these pools really have a replication of 2 by
> > deleting and re-creating them or add a third storage node.
>
> I just executed "ceph osd pool set {POOL} size 2" for both pools.
> Anything else I need to do? I still don't see any changes to the status
> of the cluster. We're adding a 3rd storage cluster, but why is it that
> this is an issue? I don't see anything anywhere that says you have to
> have a minimum number of osd's for ceph to function. Even the quick
> start only has 3, so I assumed 8 would be fine as well.

Read EVERYTHING you can find about crushmap rules.

The quickstart (I think) talks about 3 storage nodes, not OSDs.

Ceph is quite good when it comes to defining failure domains; the default is to segregate at the storage node level.
What good is a replication of 3 when all 3 OSDs are on the same host?

Christian

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
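The point about replicas versus hosts can be checked from the command line. What follows is only a sketch, assuming the stock ceph CLI is installed; the function name check_replication and the CEPH_BIN variable are hypothetical (the override exists only so the calls can be dry-run with echo), while the three subcommands are standard ones.

```shell
# Print a pool's replica settings and the CRUSH hierarchy so the replica
# count can be compared with the number of hosts that hold OSDs.
check_replication() {
    pool="$1"
    ceph_bin="${CEPH_BIN:-ceph}"        # hypothetical override for dry runs
    "$ceph_bin" osd pool get "$pool" size
    "$ceph_bin" osd pool get "$pool" min_size
    "$ceph_bin" osd tree                # shows which OSDs sit under which host
}
```

Called as "check_replication rbd" (the pool name is an example), it prints the pool's size and min_size followed by the OSD tree. If size is larger than the number of hosts in the tree and the rule separates replicas by host, the placement groups cannot become fully clean.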
Re: [ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)
Hi,

> I can't help you with packaging issues, but i can tell you that the
> rbdmap executable got moved to a different package at some point, but
> I believe the official ones handle it properly.

I'll see tonight when doing the other nodes. Maybe it's a result of using dist-upgrade rather than just "upgrade" + "install ceph".

> And I'm just guessing here (like I said, can't help with packaging),
> but I think the deleted /etc/ceph is a result of the force-overwrite
> option you used.

Nope, that happened before I used that option :p

>> Now it might be "normal", but being the production cluster, I can't
>> risk and upgrading more than half the mons if I'm not sure this is
>> indeed normal and not a symptom that the install/update failed and
>> that the mon is not actually working.
>
> That's not normal. A first guess is that you didn't give the new
> monitor the same keyring as the old ones, but I couldn't say for sure
> without more info. Turn up logging and post it somewhere?

jao on IRC just debugged this, and it turns out that you have to upgrade the leader monitor first because of a message (MForward) incompatibility between the two versions (due to b4fbe4f81348be74c654f3dae1c20a961b99c895, I think).

Cheers,

    Sylvain
Re: [ceph-users] Mixing CEPH versions on new ceph nodes...
On 07/02/2014 04:08 PM, Andrija Panic wrote:
> Hi,
>
> I have existing CEPH cluster of 3 nodes, versions 0.72.2
>
> I'm in a process of installing CEPH on 4th node, but now CEPH version
> is 0.80.1
>
> Will this make problems running mixed CEPH versions ?

No, but the recommendation is not to run like this for a very long period. Try to upgrade all nodes to the same version within a reasonable amount of time.

> I intend to upgrade CEPH on existing 3 nodes anyway ?
> Recommended steps ?

Always upgrade the monitors first! Then do the OSDs one by one.

> Thanks
>
> --
> Andrija Panić

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5
Gregory Farnum writes:

> On Tue, Jul 1, 2014 at 1:26 PM, Brian Lovett wrote:
> > "profile": "bobtail",
>
> Okay. That's unusual. What's the oldest client you need to support,
> and what Ceph version are you using? You probably want to set the
> crush tunables to "optimal"; the "bobtail" ones are going to have all
> kinds of issues with a small map like this. (Specifically, a map where
> the number of buckets/items at each level is similar to the number of
> requested replicas.)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com

Ok, I issued:

ceph osd crush tunables optimal

Shouldn't that be the default, though, since I just did this as a fresh install? The cluster status hasn't changed.
Re: [ceph-users] HEALTH_WARN active+degraded on fresh install CENTOS 6.5
Christian Balzer writes:

> So either make sure these pools really have a replication of 2 by deleting
> and re-creating them or add a third storage node.

I just executed "ceph osd pool set {POOL} size 2" for both pools. Anything else I need to do? I still don't see any changes to the status of the cluster. We're adding a 3rd storage cluster, but why is it that this is an issue? I don't see anything anywhere that says you have to have a minimum number of OSDs for ceph to function. Even the quick start only has 3, so I assumed 8 would be fine as well.
Re: [ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)
On Wed, Jul 2, 2014 at 6:18 AM, Sylvain Munaut wrote:
> Hi,
>
> I'm having a couple of issues during this update. On the test cluster
> it went fine, but when running it on production I have a few issues.
> (I guess there is some subtle difference I missed, I updated the test
> one back when emperor came out).
>
> For reference, I'm on ubuntu precise, I use self-built packages
> (because I'm hitting bugs that are not fixed in the latest official
> ones, but there is no change whatsoever to the debian/ directory
> except the changelog and they're built with the dpkg-buildpackage). I
> did a 'apt-get dist-upgrade' to upgrade everything despite the new
> requirements.
>
> * The first one is essentially the same as
> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/19632
>
> dpkg: error processing
> /var/cache/apt/archives/ceph-common_0.80.1-1we3_amd64.deb (--unpack):
> trying to overwrite '/etc/ceph/rbdmap', which is also in package ceph
> 0.80.1-1we3
>
> apt complained about /etc/ceph/rbdmap being in two package and refused
> to go further. I ended up using -o Dpkg::Options::="--force-overwrite"
> to force it to go on (because it just left some weird inconsistent
> state and I needed to clean up the mess), but this seems wrong.
>
> * The second one is that apparently it ran a "rm /etc/ceph" somehow
> ... on my setup this is not a directory, but a symlink to the real
> place the config is stored (the root partition is considered
> 'expendable', so machine specific config is elsewhere). It also tried
> to erase the /var/log/ceph but failed:

I can't help you with packaging issues, but I can tell you that the rbdmap executable got moved to a different package at some point, and I believe the official packages handle it properly.
And I'm just guessing here (like I said, can't help with packaging), but I think the deleted /etc/ceph is a result of the force-overwrite option you used.

> ---
> Replacing files in old package ceph-common ...
> dpkg: warning: unable to delete old directory '/var/log/ceph':
> Directory not empty
> ---
>
> * And finally the upgraded monitor can't join the existing quorum.
> Nowhere in the firefly update notes does it say that the new mon can't
> join an old quorum. When this was the case back in dumpling, there was
> a very explicit explanation but here it just doesn't join and spits
> out "pipe fault" in the logs continuously.
>
> Now it might be "normal", but being the production cluster, I can't
> risk and upgrading more than half the mons if I'm not sure this is
> indeed normal and not a symptom that the install/update failed and
> that the mon is not actually working.

That's not normal. A first guess is that you didn't give the new monitor the same keyring as the old ones, but I couldn't say for sure without more info. Turn up logging and post it somewhere?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

> Cheers,
>
>     Sylvain
[ceph-users] Mixing CEPH versions on new ceph nodes...
Hi,

I have an existing CEPH cluster of 3 nodes, version 0.72.2.

I'm in the process of installing CEPH on a 4th node, but now the CEPH version is 0.80.1.

Will running mixed CEPH versions cause problems?

I intend to upgrade CEPH on the existing 3 nodes anyway. Recommended steps?

Thanks

--
Andrija Panić
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
Yeah, it's fighting for attention with a lot of other urgent stuff. :(

Anyway, even if you can't look up any details or reproduce at this time, I'm sure you know what shape the cluster was (number of OSDs, running on SSDs or hard drives, etc), and that would be useful guidance. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Wed, Jul 2, 2014 at 6:12 AM, Stefan Priebe - Profihost AG wrote:
> On 02.07.2014 15:07, Haomai Wang wrote:
>> Could you give some perf counter from rbd client side? Such as op latency?
>
> Sorry haven't any counters. As this mail was some days unseen - i
> thought nobody has an idea or could help.
>
> Stefan
>
>> On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG wrote:
>>> On 02.07.2014 00:51, Gregory Farnum wrote:
>>>> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG wrote:
>>>>> Hi Greg,
>>>>>
>>>>> On 26.06.2014 02:17, Gregory Farnum wrote:
>>>>>> Sorry we let this drop; we've all been busy traveling and things.
>>>>>>
>>>>>> There have been a lot of changes to librados between Dumpling and
>>>>>> Firefly, but we have no idea what would have made it slower. Can you
>>>>>> provide more details about how you were running these tests?
>>>>>
>>>>> it's just a normal fio run:
>>>>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>>>>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>>>>> --runtime=90 --numjobs=32 --direct=1 --group
>>>>>
>>>>> Running one time with firefly libs and one time with dumpling libs.
>>>>> Target is always the same pool on a firefly ceph storage.
>>>>
>>>> What's the backing cluster you're running against? What kind of CPU
>>>> usage do you see with both? 25k IOPS is definitely getting up there,
>>>> but I'd like some guidance about whether we're looking for a reduction
>>>> in parallelism, or an increase in per-op costs, or something else.
>>>
>>> Hi Greg,
>>>
>>> i don't have that test cluster anymore. It had to go into production
>>> with dumpling.
>>>
>>> So i can't tell you.
>>>
>>> Sorry.
>>>
>>> Stefan
>>>
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ceph-users] Replacing an OSD
Hi,

> Did you also recreate the journal?!

It was a journal file and got re-created automatically.

Cheers,

    Sylvain
Re: [ceph-users] Replacing an OSD
On 01.07.2014 17:48, Sylvain Munaut wrote:
> Hi,
>
> As an exercise, I killed an OSD today, just killed the process and
> removed its data directory.
>
> To recreate it, I recreated an empty data dir, then
>
> ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs
>
> (I tried with and without giving the monmap).
>
> I then restored the keyring file (from a backup) in the
> /var/lib/osd/ceph-3/keyring
>
> And then I start the process, and it starts fine.
> http://pastebin.com/TPzNth6P
> I even see one active tcp connection to a mon from that process.
>
> But the osd never becomes "up" or do anything ...

Did you also recreate the journal?!

--
Kind regards,

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
HRB-Nr.: HRB 3840 Amtsgericht Hof
*from German landlines; prices from mobile networks may differ
[ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)
Hi,

I'm having a couple of issues during this update. On the test cluster it went fine, but when running it on production I have a few issues. (I guess there is some subtle difference I missed; I updated the test one back when emperor came out.)

For reference, I'm on Ubuntu precise and I use self-built packages (because I'm hitting bugs that are not fixed in the latest official ones, but there is no change whatsoever to the debian/ directory except the changelog, and they're built with dpkg-buildpackage). I did an 'apt-get dist-upgrade' to upgrade everything despite the new requirements.

* The first one is essentially the same as
http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/19632

dpkg: error processing /var/cache/apt/archives/ceph-common_0.80.1-1we3_amd64.deb (--unpack):
 trying to overwrite '/etc/ceph/rbdmap', which is also in package ceph 0.80.1-1we3

apt complained about /etc/ceph/rbdmap being in two packages and refused to go further. I ended up using -o Dpkg::Options::="--force-overwrite" to force it to go on (because it just left some weird inconsistent state and I needed to clean up the mess), but this seems wrong.

* The second one is that apparently it ran a "rm /etc/ceph" somehow ... On my setup this is not a directory but a symlink to the real place where the config is stored (the root partition is considered 'expendable', so machine-specific config is elsewhere). It also tried to erase /var/log/ceph but failed:

---
Replacing files in old package ceph-common ...
dpkg: warning: unable to delete old directory '/var/log/ceph': Directory not empty
---

* And finally, the upgraded monitor can't join the existing quorum. Nowhere in the firefly update notes does it say that the new mon can't join an old quorum. When this was the case back in dumpling, there was a very explicit explanation, but here it just doesn't join and spits out "pipe fault" in the logs continuously.

Now it might be "normal", but this being the production cluster, I can't risk upgrading more than half the mons if I'm not sure this is indeed normal and not a symptom that the install/update failed and that the mon is not actually working.

Cheers,

    Sylvain
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
On 02.07.2014 15:07, Haomai Wang wrote:
> Could you give some perf counter from rbd client side? Such as op latency?

Sorry, I haven't any counters. As this mail went unseen for some days, I thought nobody had an idea or could help.

Stefan

> On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG wrote:
>> On 02.07.2014 00:51, Gregory Farnum wrote:
>>> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG wrote:
>>>> Hi Greg,
>>>>
>>>> On 26.06.2014 02:17, Gregory Farnum wrote:
>>>>> Sorry we let this drop; we've all been busy traveling and things.
>>>>>
>>>>> There have been a lot of changes to librados between Dumpling and
>>>>> Firefly, but we have no idea what would have made it slower. Can you
>>>>> provide more details about how you were running these tests?
>>>>
>>>> it's just a normal fio run:
>>>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>>>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>>>> --runtime=90 --numjobs=32 --direct=1 --group
>>>>
>>>> Running one time with firefly libs and one time with dumpling libs.
>>>> Target is always the same pool on a firefly ceph storage.
>>>
>>> What's the backing cluster you're running against? What kind of CPU
>>> usage do you see with both? 25k IOPS is definitely getting up there,
>>> but I'd like some guidance about whether we're looking for a reduction
>>> in parallelism, or an increase in per-op costs, or something else.
>>
>> Hi Greg,
>>
>> i don't have that test cluster anymore. It had to go into production
>> with dumpling.
>>
>> So i can't tell you.
>>
>> Sorry.
>>
>> Stefan
>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
Could you give some perf counters from the rbd client side, such as op latency?

On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG wrote:
> On 02.07.2014 00:51, Gregory Farnum wrote:
>> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG wrote:
>>> Hi Greg,
>>>
>>> On 26.06.2014 02:17, Gregory Farnum wrote:
>>>> Sorry we let this drop; we've all been busy traveling and things.
>>>>
>>>> There have been a lot of changes to librados between Dumpling and
>>>> Firefly, but we have no idea what would have made it slower. Can you
>>>> provide more details about how you were running these tests?
>>>
>>> it's just a normal fio run:
>>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>>> --runtime=90 --numjobs=32 --direct=1 --group
>>>
>>> Running one time with firefly libs and one time with dumpling libs.
>>> Target is always the same pool on a firefly ceph storage.
>>
>> What's the backing cluster you're running against? What kind of CPU
>> usage do you see with both? 25k IOPS is definitely getting up there,
>> but I'd like some guidance about whether we're looking for a reduction
>> in parallelism, or an increase in per-op costs, or something else.
>
> Hi Greg,
>
> i don't have that test cluster anymore. It had to go into production
> with dumpling.
>
> So i can't tell you.
>
> Sorry.
>
> Stefan
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com

--
Best Regards,

Wheat
Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?
On 02.07.2014 00:51, Gregory Farnum wrote:
> On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Greg,
>>
>> On 26.06.2014 02:17, Gregory Farnum wrote:
>>> Sorry we let this drop; we've all been busy traveling and things.
>>>
>>> There have been a lot of changes to librados between Dumpling and
>>> Firefly, but we have no idea what would have made it slower. Can you
>>> provide more details about how you were running these tests?
>>
>> it's just a normal fio run:
>> fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
>> --readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
>> --runtime=90 --numjobs=32 --direct=1 --group
>>
>> Running one time with firefly libs and one time with dumpling libs.
>> Target is always the same pool on a firefly ceph storage.
>
> What's the backing cluster you're running against? What kind of CPU
> usage do you see with both? 25k IOPS is definitely getting up there,
> but I'd like some guidance about whether we're looking for a reduction
> in parallelism, or an increase in per-op costs, or something else.

Hi Greg,

I don't have that test cluster anymore. It had to go into production with dumpling.

So I can't tell you.

Sorry.

Stefan

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
Re: [ceph-users] Some OSD and MDS crash
Yes, but how do I do that? With a command like this?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'

Or by modifying /etc/ceph/ceph.conf? That file is nearly empty because I use udev detection.

Once I have made these changes, do you want the three log files or only osd.20's?

Thank you so much for the help.

Regards
Pierre

On 01/07/2014 23:51, Samuel Just wrote:
> Can you reproduce with
> debug osd = 20
> debug filestore = 20
> debug ms = 1
> ?
> -Sam
>
> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU wrote:
>> Hi,
>>
>> I join :
>> - osd.20 is one of osd that I detect which makes crash other OSD.
>> - osd.23 is one of osd which crash when i start osd.20
>> - mds, is one of my MDS
>>
>> I cut log file because they are to big but. All is here :
>> https://blondeau.users.greyc.fr/cephlog/
>>
>> Regards
>>
>> On 30/06/2014 17:35, Gregory Farnum wrote:
>>> What's the backtrace from the crashing OSDs?
>>>
>>> Keep in mind that as a dev release, it's generally best not to upgrade
>>> to unnamed versions like 0.82 (but it's probably too late to go back
>>> now).
>>
>> I will remember it the next time ;)
>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU wrote:
>>>> Hi,
>>>>
>>>> After the upgrade to firefly, I have some PG in peering state.
>>>> I seen the output of 0.82 so I try to upgrade for solved my problem.
>>>>
>>>> My three MDS crash and some OSD triggers a chain reaction that kills
>>>> other OSD.
>>>> I think my MDS will not start because of the metadata are on the OSD.
>>>>
>>>> I have 36 OSD on three servers and I identified 5 OSD which makes crash
>>>> others. If i not start their, the cluster passe in reconstructive state
>>>> with 31 OSD but i have 378 in down+peering state.
>>>>
>>>> How can I do ? Would you more information ( os, crash log, etc ... ) ?
>>>>
>>>> Regards

--
Pierre BLONDEAU
Systems & Network Administrator
Université de Caen
Laboratoire GREYC, Département d'informatique

tel: 02 31 56 75 42
office: Campus 2, Science 3, 406
Re: [ceph-users] RBD layering
> Subject: Re: [ceph-users] RBD layering
>
> On 07/02/2014 10:08 AM, NEVEU Stephane wrote:
>> Hi all,
>>
>> I'm messing around with "rbd layering" to store some ready-to-use
>> templates (format 2) in a template pool:
>>
>> rbd -p templates ls
>> Ubuntu1404
>> Centos6
>> ...
>>
>> rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected
>> rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected
>> rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected kvm1/Ubuntu1404-snap-protected-children
>>
>> My libvirt key is created with:
>>
>> ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r pool=templates'
>>
>> But read permission for the pool 'templates' seems to be not enough,
>> libvirt is complaining "RBD cannot access the rbd disk
>> kvm1/Ubuntu1404-snap-protected-children" so:
>>
>> ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rwx pool=templates'
>
> I think that rx should be enough instead of rwx. Could you try that?
>
> Wido

Hi Wido, thank you. I'm trying this:

ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates'
Error EINVAL: key for client.kvm1 exists but cap osd does not match

Is there another way to modify the caps directly, or do I need to delete the key and re-create it?

>> It's actually working but it's probably a bit too much, because I
>> don't want people to be able to modify the parent template so do I
>> have a better choice ?
>>
>> Libvirt seems to be happier but this clone is read-only and I want now
>> people to use this OS image as a base file and write differences in a
>> backing file (like with qemu ... -b ...).
>>
>> How can I do such a thing ? or maybe I'm doing it in a wrong way. any help ?

Am I clear enough here?

>> Thanks
>
> --
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
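On the question of changing caps without re-creating the key: the ceph CLI has an "auth caps" subcommand that replaces the capabilities of an existing entity in place. The following is a sketch only; the function name set_kvm_caps and the CEPH_BIN variable are hypothetical (the override is there so the call can be dry-run with echo), and the cap strings are the ones from this thread.

```shell
# Replace all caps of the existing client.kvm1 key in one call.
# Note: 'auth caps' overwrites the full cap set, so every cap that should
# remain has to be listed again.
set_kvm_caps() {
    "${CEPH_BIN:-ceph}" auth caps client.kvm1 \
        mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates'
}
```

Running "CEPH_BIN=echo set_kvm_caps" prints the arguments without touching the cluster; "ceph auth get client.kvm1" should show the resulting caps afterwards.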
Re: [ceph-users] Replacing an OSD
Hi Loic,

> By restoring the fsid file from the back, presumably. I did not think of that
> when you showed the ceph-osd mkfs line, but it makes sense. This is not the
> ceph fsid.

Yeah, I thought about that and I saw fsid and ceph_fsid, but I wasn't sure whether just replacing the file would be enough, or whether the fsid was used somewhere else and this could yield some weird state ...

> root@bm0015:/var/lib/ceph/osd/ceph-1# grep fsid /etc/ceph/ceph.conf
> fsid = 571bb920-6d85-44d7-9eca-1bc114d1cd75

Weird, I don't have an fsid in my ceph.conf ...

Cheers,

    Sylvain
Re: [ceph-users] Replacing an OSD
Hi Sylvain,

On 02/07/2014 11:13, Sylvain Munaut wrote:
> Ah, I finally fond something that looks like an error message :
>
> 2014-07-02 11:07:57.817269 7f0692e3a700 7 mon.a@0(leader).osd e1147
> preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing
> osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ;
> theirs: c1cfff2f-4f2e-4c1d-a947-24bbc6f122ca)
>
> Not really sure how to fix it though.

By restoring the fsid file from the backup, presumably. I did not think of that when you showed the ceph-osd mkfs line, but it makes sense. This is not the ceph fsid.

root@bm0015:/var/lib/ceph/osd/ceph-1# cat ceph_fsid
571bb920-6d85-44d7-9eca-1bc114d1cd75
root@bm0015:/var/lib/ceph/osd/ceph-1# cat fsid
085a821e-b487-41ef-87ed-dfc6af097a44
root@bm0015:/var/lib/ceph/osd/ceph-1# grep fsid /etc/ceph/ceph.conf
fsid = 571bb920-6d85-44d7-9eca-1bc114d1cd75
root@bm0015:/var/lib/ceph/osd/ceph-1#

Cheers

> Cheers,
>
>    Sylvain

--
Loïc Dachary, Artisan Logiciel Libre
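The distinction between the two files can be made visible with a small helper. This is a sketch under the assumption that the OSD data directory has the layout shown in the listing above; the function name show_osd_ids is hypothetical.

```shell
# Print the two identifiers kept in an OSD data directory.
# 'ceph_fsid' is the cluster-wide id (it matches 'fsid' in ceph.conf);
# 'fsid' is the id of this one OSD, which is what the monitor compares
# when an OSD boots.
show_osd_ids() {
    dir="$1"
    echo "cluster fsid: $(cat "$dir/ceph_fsid")"
    echo "osd fsid:     $(cat "$dir/fsid")"
}
```

For example, "show_osd_ids /var/lib/ceph/osd/ceph-1" on the host above would print the two values from the listing. A freshly re-created data directory gets a new OSD fsid, which is consistent with the "different fsid" clash in the monitor log.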
Re: [ceph-users] Replacing an OSD
Just for future reference, you actually do need to remove the OSD even if you're going to re-add it like 10 sec later ...

$ ceph osd rm 3
removed osd.3
$ ceph osd create
3

Then it works fine.

No need to remove it from the crushmap or remove the auth key (you can re-use both), but you need to remove it from and re-add it to the cluster for it to boot properly.

Cheers,

    Sylvain
Re: [ceph-users] RBD layering
On 07/02/2014 10:08 AM, NEVEU Stephane wrote:
> Hi all,
>
> I'm messing around with "rbd layering" to store some ready-to-use
> templates (format 2) in a template pool:
>
> rbd -p templates ls
> Ubuntu1404
> Centos6
> ...
>
> rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected
> rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected
> rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected kvm1/Ubuntu1404-snap-protected-children
>
> My libvirt key is created with:
>
> ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r pool=templates'
>
> But read permission for the pool 'templates' seems to be not enough,
> libvirt is complaining "RBD cannot access the rbd disk
> kvm1/Ubuntu1404-snap-protected-children" so:
>
> ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rwx pool=templates'

I think that rx should be enough instead of rwx. Could you try that?

Wido

> It's actually working but it's probably a bit too much, because I
> don't want people to be able to modify the parent template so do I
> have a better choice ?
>
> Libvirt seems to be happier but this clone is read-only and I want now
> people to use this OS image as a base file and write differences in a
> backing file (like with qemu ... -b ...).
>
> How can I do such a thing ? or maybe I'm doing it in a wrong way... any help ?
>
> Thanks

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] Replacing an OSD
Ah, I finally found something that looks like an error message:

2014-07-02 11:07:57.817269 7f0692e3a700 7 mon.a@0(leader).osd e1147 preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ; theirs: c1cfff2f-4f2e-4c1d-a947-24bbc6f122ca)

Not really sure how to fix it, though.

Cheers,

    Sylvain
Re: [ceph-users] [Solved] Init scripts in Debian not working
Hello, I tried the ceph packages from jessie, too. After some time penetrating Google I think I found the solution. This will probably work for all package sources. You have to create an empty marker file named 'sysvinit' in the directories below /var/lib/ceph/XXX. Then everythink works fine. ceph-deploy will create those files for you depending on the init system for the monitors: $ ls /var/lib/ceph/mon/ceph-node1/ done keyring store.db upstart $ cat /etc/issue Ubuntu 12.04 LTS \n \l And the same would happen for OSDs (upstart file is there): $ ls /var/lib/ceph/osd/ceph-0/ activate.monmap active ceph_fsid current fsid journal keyring magic ready store_version superblock upstart whoami The output in ceph-deploy that tells you what init system will be used is this: [ceph_deploy.osd][INFO ] Distro info: Ubuntu 12.04 precise [ceph_deploy.osd][DEBUG ] activating host node1 disk /home/vagrant/foo [ceph_deploy.osd][DEBUG ] will use init type: upstart Can you share your ceph-deploy output as you are deploying your OSDs? Sorry for the late reply. I deleted my ceph setup I created with ceph-deploy. So I cannot give you the info you requested. Perhaps I will create another setup using some virtual maschines ... I found some time to work on my 'manually' created setup (Wheezy, Ceph Firefly repositories). Everything works fine and after manually creating the empty 'sysvinit' files in the data dirs of the daemons, I am able to start/stop the daemons on the same host. In my opinion the sysvinit files should be mentioned in the manual install section of the documentation. One problem remains: At the moment my ceph.conf file does not contain any configuration for the single daemons, just a [global] section. The docs mention the allhosts switch -a that doesn't work for me. I tried to add daemon specific sections in the config file (host entries), but that made no difference. Is the -a switch still supported for sysvinit? 
What entries do I have to put into the config to remotely start/stop services? It would be nice to start/stop the ceph cluster from a single host.

Thanks in advance.
Dieter
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
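For readers following the thread, the marker-file workaround described in this message could be scripted roughly as below. This is only a sketch: the function name mark_sysvinit and its optional base-directory argument are hypothetical, and it covers only the mon and osd data directories, because those are the only ones shown in the thread (the "XXX" in /var/lib/ceph/XXX may include other daemon types).

```shell
#!/bin/sh
# Sketch: create the empty 'sysvinit' marker file in each daemon data directory.
# mark_sysvinit and its optional argument are hypothetical helpers, not part of Ceph.
mark_sysvinit() {
    base="${1:-/var/lib/ceph}"
    # Only mon and osd appear in the thread; other daemon types are not handled here.
    for dir in "$base"/mon/* "$base"/osd/*; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        touch "$dir/sysvinit"
        echo "marked $dir"
    done
}
```

Run as root, `mark_sysvinit` with no argument would touch the markers under /var/lib/ceph; passing a scratch directory instead allows a dry run before touching a live node.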
[ceph-users] RBD layering
Hi all,

I'm messing around with "rbd layering" to store some ready-to-use templates (format 2) in a template pool:

rbd -p templates ls
Ubuntu1404
Centos6
...

rbd snap create templates/Ubuntu1404@Ubuntu1404-snap-protected
rbd snap protect templates/Ubuntu1404@Ubuntu1404-snap-protected
rbd clone templates/Ubuntu1404@Ubuntu1404-snap-protected kvm1/Ubuntu1404-snap-protected-children

My libvirt key is created with:

ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow r pool=templates'

But read permission on the pool 'templates' does not seem to be enough; libvirt complains "RBD cannot access the rbd disk kvm1/Ubuntu1404-snap-protected-children", so:

ceph auth get-or-create client.kvm1 mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rwx pool=templates'

This actually works, but it's probably a bit too much, because I don't want people to be able to modify the parent template. Do I have a better choice?

Libvirt seems to be happier, but this clone is read-only, and I now want people to use this OS image as a base file and write differences to a backing file (like with qemu ... -b ...). How can I do such a thing? Or maybe I'm doing it the wrong way... Any help?

Thanks
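One middle ground between 'allow r' and 'allow rwx' on the parent pool is 'allow rx', which is the pattern the Ceph documentation uses for image pools in its OpenStack integration guide. The fragment below is a sketch under that assumption and has not been checked against this poster's cluster; it uses `ceph auth caps` on the assumption that the client.kvm1 key already exists and only its capabilities need changing.

```shell
# Sketch (assumption): read + execute, but no write, on the parent pool 'templates'.
# 'x' lets the client call the rbd object classes on the parent; 'w' is withheld
# so the template itself cannot be modified through this key.
ceph auth caps client.kvm1 \
    mon 'allow r' \
    osd 'allow class-read object_prefix rbd_children, allow rwx pool=kvm1, allow rx pool=templates'
```

Writes to the clone would then land in pool kvm1 only, while the protected snapshot in 'templates' stays read-only for this client.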
Re: [ceph-users] Replacing an OSD
Hi,

> Does OSD 3 show when you ceph pg dump ? If so I would look in the logs of an
> OSD which is participating in the same PG.

It appears at the end but not in any PG, it's now been marked out and all was redistributed.

osdstat  kbused    kbavail   kb        hb in  hb out
0        15602352  15844688  31447040  [1,2]  []
1        15602352  15844688  31447040  [0,2]  []
2        15602352  15844688  31447040  [0,1]  []
3        0         0         0         []     []
sum      46807056  47534064  94341120

Cheers,
Sylvain