Re: [ceph-users] dmcrypt osd startup problem
Hello,

I have a similar problem. I am not sure it is the same one, but in case it helps: I am on Jewel (upgraded from Firefly) on Jessie. The temporary workaround I found to start the OSDs is to force udev to replay its triggers:

  udevadm trigger --action=add

Regards

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen Normandie, Laboratoire GREYC
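On a host with many devices, replaying "add" events for everything can be noisy; the trigger can usually be narrowed with udevadm match options (the device name below is an example, not from the original message):

  # replay "add" events for block devices only
  udevadm trigger --action=add --subsystem-match=block

  # or only for one specific disk (hypothetical device)
  udevadm trigger --action=add --subsystem-match=block --sysname-match="sdb*"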
Re: [ceph-users] Reusing journal partitions when using ceph-deploy/ceph-disk --dmcrypt
On 05/12/2016 05:14, Alex Gorbachev wrote:
> Referencing http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-July/003293.html
>
> When using --dmcrypt with ceph-deploy/ceph-disk, the journal device is not allowed to be an existing partition. You have to specify the entire block device, on which the tools create a partition equal to the "osd journal size" setting.
>
> However, when an HDD fails and the OSD is deleted and then replaced with another HDD, I have not been able to find a way to reuse the earlier journal partition. Ceph-deploy creates a new one, which can lead to unpleasant situations on the SSD used for journaling.

Hello,

Remove the old journal partition (e.g. parted /dev/sdc rm 2). Ceph-deploy should then reuse the freed space for the new journal.

Regards

> Is there a way anyone knows of to continue using a specific partition as a journal with ceph-deploy?
>
> Thanks in advance,
> Alex

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen Normandie, Laboratoire GREYC
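A minimal sketch of that replacement procedure, assuming the failed OSD's journal was partition 2 on SSD /dev/sdc and the new data disk is /dev/sdd (device names and host name are examples):

  parted /dev/sdc print      # identify the stale journal partition
  parted /dev/sdc rm 2       # free its space on the SSD
  ceph-deploy osd prepare node1:/dev/sdd:/dev/sdc --dmcrypt
  # ceph-disk should carve the new journal partition out of the freed space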
[ceph-users] ceph-disk dmcrypt : encryption key placement problem
Hello,

We have a Jewel cluster upgraded from Firefly. The cluster is encrypted with dmcrypt. Yesterday I added some new OSDs, for the first time since the upgrade. While looking for the new keys in order to back them up, I noticed that the creation of new OSDs with the dmcrypt option has changed.

To make it possible to retrieve the key if the server's filesystem crashes (http://tracker.ceph.com/issues/14669) or if the OSD is moved, a ceph user is created and its keyring file is used as the LUKS encryption key. Good idea.

The problem is: there is a small partition named "ceph lockbox" at the beginning of the disk, and the keyring can be found among the files on this partition. Why is the encryption key stored on the same disk, and in the clear? Someone who got hold of the disk would be able to read it; there is no point encrypting the disk in that case.

It seems urgent to move the keyring file elsewhere (to /etc/ceph/dmcrypt-keys?).

Regards
Pierre

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen Normandie, Laboratoire GREYC
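For what it is worth, one way to check whether the actual LUKS secret is held by the monitors rather than only on the disk — this assumes a Jewel ceph-disk lockbox deployment, and the key path / OSD uuid below are placeholders:

  ceph config-key list | grep dm-crypt
  ceph config-key get dm-crypt/osd/<osd-uuid>/luks   # hypothetical path; present only if the key lives in the mon config-key store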
Re: [ceph-users] journal encryption with dmcrypt
Hi,

We had a problem starting the OSDs because the startup script does not know where to find the key. The default directory is /etc/ceph/dmcrypt-keys; we left it at the default and it worked. I have not tried it, but maybe it can also be solved with /etc/crypttab.

Regards

On 22/01/2016 21:35, Reno Rainz wrote:
> Hi guys,
>
> I'm trying to set up a cluster with encryption on OSD data and journal. To do that I use ceph-deploy with the two options --dmcrypt and --dmcrypt-key-dir on the /dev/sdc disk.
>
> Disk state before the ceph-deploy prepare command:
>
> root@ceph-osd-1:~$ lsblk
> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda      8:0    0  40G  0 disk
> └─sda1   8:1    0  40G  0 part /
> sdb      8:16   0  16G  0 disk
> sdc      8:32   0  16G  0 disk
> sdd      8:48   0  16G  0 disk
>
> The prepare command runs fine:
>
> ceph-deploy osd prepare ceph-osd-1:/dev/sdc --dmcrypt --dmcrypt-key-dir /root/keydir
>
> [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephuser/.cephdeploy.conf
> [ceph_deploy.cli][INFO ] Invoked (1.5.31): ../ceph-deploy osd prepare ceph-osd-1:/dev/sdc --dmcrypt --dmcrypt-key-dir /root/keydir
> .
> .
> .
> [ceph-osd-1][INFO ] Running command: sudo ceph --cluster=ceph osd stat --format=json
> [ceph_deploy.osd][DEBUG ] Host ceph-osd-1 is now ready for osd use.
>
> So far, so good.
>
> root@ceph-osd-1:~# lsblk
> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda      8:0    0  40G  0 disk
> └─sda1   8:1    0  40G  0 part /
> sdb      8:16   0  16G  0 disk
> sdc      8:32   0  16G  0 disk
> ├─sdc1   8:33   0  11G  0 part
> └─sdc2   8:34   0   5G  0 part
> sdd      8:48   0  16G  0 disk
>
> Unfortunately, when I try to activate this OSD it does not work:
>
> ceph-deploy osd activate ceph-osd-1:/dev/sdc1:/dev/sdc2
>
> [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephuser/.cephdeploy.conf
> [ceph_deploy.cli][INFO ] Invoked (1.5.31): ../ceph-deploy osd activate ceph-osd-1:/dev/sdc1:/dev/sdc2
> .
> .
> .
> [ceph-osd-1][WARNIN] INFO:ceph-disk:Running command: /bin/mount -t crypto_LUKS -o -- /dev/sdc1 /var/lib/ceph/tmp/mnt.C0wSgD
> [ceph-osd-1][WARNIN] mount: unknown filesystem type 'crypto_LUKS'
> [ceph-osd-1][WARNIN] ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', 'crypto_LUKS', '-o', '', '--', '/dev/sdc1', '/var/lib/ceph/tmp/mnt.C0wSgD']' returned non-zero exit status 32
> [ceph-osd-1][ERROR ] RuntimeError: command returned non-zero exit status: 1
> [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph-disk -v activate --mark-init upstart --mount /dev/sdc1
>
> I can provide all the logs.
>
> Do you guys have any idea?
>
> Thanks.

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen Normandie, Laboratoire GREYC
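If one wanted to go the /etc/crypttab route mentioned above, an entry per encrypted partition would look roughly like the sketch below; the mapping name, device and key path are examples (ceph-disk normally names keys and mappings after the partition uuid):

  # /etc/crypttab   <target name>      <source device>  <key file>                               <options>
  osd-journal-sdc2  /dev/sdc2  /etc/ceph/dmcrypt-keys/<partition-uuid>  luks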
Re: [ceph-users] How repair 2 invalids pgs
On 14/08/2015 15:48, Pierre BLONDEAU wrote:
> Hi,
>
> Yesterday I removed 5 OSDs out of 15 from my cluster (machine migration). When I stopped the processes, I did not verify that all the placement groups were in an active state.
> [...]

Hello,

The names of the PGs start with "7.", so they belong to the pool with id 7? For me that is cephfs_meta (the cephfs metadata pool). I get no response when I run "rados -p cephfs_meta ls". Since it is only a small share, it is not serious; I can restore it easily.

So I added the new OSDs of the new machine, and that solved the problem, but I do not understand why. Does someone have an idea?

Regards

PS: I use 0.80.10 on Wheezy

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
[ceph-users] How repair 2 invalids pgs
Hi,

Yesterday I removed 5 OSDs out of 15 from my cluster (machine migration). When I stopped the processes, I did not verify that all the placement groups were in an active state. I removed the 5 OSDs from the cluster (ceph osd out osd.9 ; ceph osd crush rm osd.9 ; ceph auth del osd.9 ; ceph osd rm osd.9), and only checked afterwards... and I had two inactive PGs. I have not formatted the filesystems of the removed OSDs.

The health:

  pg 7.b is stuck inactive for 86083.236722, current state inactive, last acting [1,2]
  pg 7.136 is stuck inactive for 86098.214967, current state inactive, last acting [4,7]

The recovery state:

  recovery_state: [
    { name: Started/Primary/Peering/WaitActingChange,
      enter_time: 2015-08-13 15:19:49.559965,
      comment: waiting for pg acting set to change },
    { name: Started,
      enter_time: 2015-08-13 15:19:46.492625 }],

How can I solve my problem? Can I re-add the OSDs from their filesystems?

My cluster is used for RBD images and a small cephfs share. I can read all the files in cephfs, and I tried to check whether these PGs are used by an RBD image, but I did not find anything (I am not sure of my script). How do you know whether a PG is used?

Regards

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC

[Attachment: output of "ceph pg 7.136 query" (plus the start of the query for pg 7.b, truncated): state "inactive", epoch 15291, up/acting [4,7], empty log (last_update 0'0), last_epoch_started 14474, recovery_state Started/Primary/Peering/WaitActingChange entered 2015-08-13 15:19:43, "waiting for pg acting set to change".]
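Regarding "how do you know whether a PG is used": one way is to list the objects in the suspect pool and map each one to its PG. The pool name and PG ids below are the ones from this thread; the loop itself is only a sketch:

  for obj in $(rados -p cephfs_meta ls); do
      ceph osd map cephfs_meta "$obj"
  done | grep -E '\(7\.(136|b)\)'

"ceph osd map <pool> <object>" prints the PG and the up/acting set for that object, so any hit means the PG really does hold data.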
[ceph-users] 0.80.10 released ?
Hi,

I can update my ceph packages to 0.80.10, but I cannot find any information about this version (website, mailing list). Does someone know where I can find it?

Regards

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
[ceph-users] ceph-deploy option --dmcrypt-key-dir unusable
Hi,

The --dmcrypt-key-dir option, when you want to activate or create a new OSD, is unusable by default, because the default path /etc/ceph/dmcrypt-keys/ is hard-coded in the udev rules.

I have found and tested two simple workarounds:
- change the key path in /lib/udev/rules.d/95-ceph-osd.rules, or
- make a symlink from your key directory to /etc/ceph/dmcrypt-keys.

Maybe the second way would be a simple way to patch ceph-deploy.

Regards

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
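A sketch of the second workaround, assuming the keys were created with --dmcrypt-key-dir /root/keydir (the path is an example):

  # point the hard-coded udev path at the real key directory
  [ -d /etc/ceph/dmcrypt-keys ] && mv /etc/ceph/dmcrypt-keys /etc/ceph/dmcrypt-keys.orig
  ln -s /root/keydir /etc/ceph/dmcrypt-keys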
Re: [ceph-users] Can't export cephfs via nfs
Hi,

The NFS "crossmnt" export option may help you.

Regards

On 11/08/2014 16:34, Micha Krause wrote:
> Hi,
>
> I'm trying to build a cephfs-to-NFS gateway, but somehow I can't mount the share if it is backed by cephfs:
>
> mount ngw01.ceph:/srv/micha /mnt/tmp/
> mount.nfs: Connection timed out
>
> cephfs mount on the gateway:
> 10.210.32.11:6789:/ngw on /srv type ceph (rw,relatime,name=cephfs-ngw,secret=hidden,nodcache,nofsc,acl)
>
> /etc/exports:
> /srv/micha 10.6.6.137(rw,no_root_squash,async)
> /etc 10.6.6.137(rw,no_root_squash,async)
>
> I can mount the /etc export with no problem.
>
> uname -a
> Linux ngw01 3.14-0.bpo.1-amd64 #1 SMP Debian 3.14.12-1~bpo70+1 (2014-07-13) x86_64 GNU/Linux
>
> I'm using the nfs-kernel-server.
>
> Micha Krause

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
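For illustration, the export line might then look like the following; crossmnt is the option suggested above, and fsid= is another option that is often needed when knfsd cannot derive a stable filesystem id for a network filesystem (the value 101 is arbitrary):

  /srv/micha  10.6.6.137(rw,no_root_squash,async,crossmnt,fsid=101)

followed by reloading the export table:

  exportfs -ra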
Re: [ceph-users] Some OSD and MDS crash
Hi,

After the repair process, I have:

  1926 active+clean
     2 active+clean+inconsistent

These two PGs seem to be on the same OSD (#34):

# ceph pg dump | grep inconsistent
dumped all in format plain
0.2e4 0 0 0 8388660 4 4 active+clean+inconsistent 2014-07-16 11:39:43.819631 9463'4 438411:133968 [34,4] 34 [34,4] 34 9463'4 2014-07-16 04:52:54.417333 9463'4 2014-07-11 09:29:22.041717
0.1ed 5 0 0 0 8388623 10 10 active+clean+inconsistent 2014-07-16 11:39:45.820142 9712'10 438411:144792 [34,2] 34 [34,2] 34 9712'10 2014-07-16 09:12:44.742488 9712'10 2014-07-10 21:57:11.345241

Could this explain why my MDS won't start? If I remove (or shut down) this OSD, could that solve my problem?

Regards.

On 10/07/2014 11:51, Pierre BLONDEAU wrote:
> Hi,
>
> Great. All my OSDs restarted: osdmap e438044: 36 osds: 36 up, 36 in
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
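For readers finding this thread: the usual first step for active+clean+inconsistent PGs is to deep-scrub them and then ask the primary to repair, e.g. with the PG ids from the dump above. Whether repair is safe depends on which replica is the corrupt one, so checking the OSD logs for the scrub errors first is prudent:

  ceph pg deep-scrub 0.2e4
  ceph pg repair 0.2e4
  ceph pg deep-scrub 0.1ed
  ceph pg repair 0.1ed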
Re: [ceph-users] Some OSD and MDS crash
Hi,

Great. All my OSDs restarted: osdmap e438044: 36 osds: 36 up, 36 in

All PGs are active and some are in recovery:

  1604040/49575206 objects degraded (3.236%)
  1780 active+clean
    17 active+degraded+remapped+backfilling
    61 active+degraded+remapped+wait_backfill
    11 active+clean+scrubbing+deep
    34 active+remapped+backfilling
    21 active+remapped+wait_backfill
     4 active+clean+replay

But all the MDSes crash. The logs are here: https://blondeau.users.greyc.fr/cephlog/legacy/

In any case, thank you very much for your help.

Pierre

On 09/07/2014 19:34, Joao Eduardo Luis wrote:
> On 07/09/2014 02:22 PM, Pierre BLONDEAU wrote:
>> Hi,
>> Is there any chance to restore my data?
>
> Okay, I talked to Sam and here's what you could try before anything else:
>
> - Make sure you have everything running on the same version.
> - Unset the chooseleaf_vary_r flag -- this can be accomplished by setting tunables to legacy.
> - Have the osds join in the cluster.
> - You should then either upgrade to firefly (if you haven't done so by now) or wait for the point-release before you move on to setting tunables to optimal again.
>
> Let us know how it goes.
>
> -Joao
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
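The "set tunables to legacy" step above maps to a single command each way; a sketch of checking and changing it (note that changing tunables triggers data movement on a large cluster):

  ceph osd crush show-tunables     # inspect current values, including chooseleaf_vary_r
  ceph osd crush tunables legacy   # drop back to legacy tunables
  # later, once everything runs the same released version:
  ceph osd crush tunables optimal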
Re: [ceph-users] Some OSD and MDS crash
Hi,

Is there any chance to restore my data?

Regards
Pierre

On 07/07/2014 15:42, Pierre BLONDEAU wrote:
> No chance to have those logs, and even less in debug mode: I made this change 3 weeks ago.
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
Re: [ceph-users] Some OSD and MDS crash
No chance to have those logs, and even less in debug mode: I made this change 3 weeks ago.

I put all my logs here in case they can help: https://blondeau.users.greyc.fr/cephlog/all/

Do I have a chance to recover my +/- 20 TB of data?

Regards

On 03/07/2014 21:48, Joao Luis wrote:
> Do those logs have a higher debugging level than the default? If not, never mind, as they will not have enough information. If they do, however, we'd be interested in the portion around the moment you set the tunables. Say, before the upgrade and a bit after you set the tunable. If you want to be finer grained, then ideally it would be the moment where those maps were created, but you'd have to grep the logs for that.
>
> Or drop the logs somewhere and I'll take a look.
>
> -Joao
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
Re: [ceph-users] Some OSD and MDS crash
On 03/07/2014 13:49, Joao Eduardo Luis wrote:
> On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
>> On 03/07/2014 00:55, Samuel Just wrote:
>>> Ah,
>>> ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
>>> ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>>> ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>>> 6d5
>>> < tunable chooseleaf_vary_r 1
>>>
>>> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>
> The only thing that comes to mind that could cause this is if we changed the leader's in-memory map, proposed it, it failed, and only the leader got to write the map to disk somehow. This happened once on a totally different issue (although I can't pinpoint right now which). In such a scenario, the leader would serve the incorrect osdmap to whoever asked osdmaps from it, while the remaining quorum would serve the correct osdmaps to all the others. This could cause this divergence. Or it could be something else.
>
> Are there logs for the monitors for the timeframe this may have happened in?

Which timeframe exactly do you want? I have 7 days of logs, so I should have information about the upgrade from firefly to 0.82. Which mon's logs do you want? All three?

Regards

> -Joao
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
Re: [ceph-users] Some OSD and MDS crash
Yes, but how do I do that? With a command like this?

  ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'

Or by modifying /etc/ceph/ceph.conf? This file is really minimal because I use udev detection.

When I have made these changes, do you want the three log files or only osd.20's?

Thank you so much for the help.

Regards
Pierre

On 01/07/2014 23:51, Samuel Just wrote:
> Can you reproduce with
> debug osd = 20
> debug filestore = 20
> debug ms = 1
> ?
> -Sam
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
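Both approaches work; a sketch of each (osd.20 is the example daemon from this thread). injectargs changes the running daemons only, while the ceph.conf entries persist across restarts:

  # at runtime, for one OSD or for all of them
  ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
  ceph tell osd.* injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'

  # or persistently, in /etc/ceph/ceph.conf, then restart the OSDs
  [osd]
      debug osd = 20
      debug filestore = 20
      debug ms = 1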
Re: [ceph-users] Some OSD and MDS crash
Hi,

I did it; the log files are available here: https://blondeau.users.greyc.fr/cephlog/debug20/

The OSDs' files are really big, +/- 80M.

After starting osd.20, some other OSDs crash: I go from 31 OSDs up down to 16. I notice that after this, the number of down+peering PGs decreases from 367 to 248. Is that normal? Maybe it is temporary, the time the cluster needs to verify all the PGs?

Regards
Pierre

On 02/07/2014 19:16, Samuel Just wrote:
> You should add
>
> debug osd = 20
> debug filestore = 20
> debug ms = 1
>
> to the [osd] section of the ceph.conf and restart the osds. I'd like all three logs if possible.
>
> Thanks
> -Sam
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
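To follow how many PGs remain stuck while OSDs come and go, something like the following is usually enough (a sketch; these commands exist in Firefly-era releases):

  ceph -s                        # overall PG state summary
  ceph pg dump_stuck inactive    # list PGs stuck inactive/peering
  ceph health detail | grep down+peering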
Re: [ceph-users] Some OSD and MDS crash
On 03/07/2014 00:55, Samuel Just wrote:
> Ah,
> ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
> ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
> ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
> 6d5
> < tunable chooseleaf_vary_r 1
>
> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>
> Pierre: do you recall how and when that got set?

I am not sure I understand, but if I remember correctly, after the update to firefly I was in the state "HEALTH_WARN crush map has legacy tunables" and I saw "feature set mismatch" in the logs.

So, if I remember correctly, I ran "ceph osd crush tunables optimal" for the crush map problem, and I updated my client and server kernels to 3.16rc. Could it be that?

Pierre

> -Sam
>
> On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just sam.j...@inktank.com wrote:
>> Yeah, divergent osdmaps:
>> 555ed048e73024687fc8b106a570db4f osd-20_osdmap.13258__0_4E62BB79__none
>> 6037911f31dc3c18b05499d24dcdbe5c osd-23_osdmap.13258__0_4E62BB79__none
>>
>> Joao: thoughts?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU pierre.blond...@unicaen.fr wrote:
>>> The files
>>>
>>> When I upgrade:
>>> ceph-deploy install --stable firefly servers...
>>> on each server: service ceph restart mon
>>> on each server: service ceph restart osd
>>> on each server: service ceph restart mds
>>>
>>> I upgraded from emperor to firefly. After repair, remap, replace, etc., I had some PGs stuck in the peering state. I thought: why not try version 0.82, it could solve my problem (that was my mistake). So I upgraded from firefly to 0.83 with:
>>> ceph-deploy install --testing servers...
>>>
>>> Now all programs are version 0.82. I have 3 mons, 36 OSDs and 3 MDSes.
>>>
>>> Pierre
>>>
>>> PS: I also find inc\uosdmap.13258__0_469271DE__none in each meta directory.
>>>
>>> On 03/07/2014 00:10, Samuel Just wrote:
>>>> Also, what version did you upgrade from, and how did you upgrade?
>>>> -Sam
>>>>
>>>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just sam.j...@inktank.com wrote:
>>>>> Ok, in current/meta on osd 20 and osd 23, please attach all files matching ^osdmap.13258.* There should be one such file on each osd. (It should look something like osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory; you'll want to use find.)
>>>>>
>>>>> What version of ceph is running on your mons? How many mons do you have?
>>>>> -Sam
>>>>> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
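For anyone hitting a similar divergence, the same comparison can also be run against the cluster's current crush map rather than per-OSD copies of the osdmap; a sketch (output file names are arbitrary):

  ceph osd getcrushmap -o /tmp/crush.bin
  crushtool -d /tmp/crush.bin -o /tmp/crush.txt
  grep chooseleaf_vary_r /tmp/crush.txt    # present only if the tunable is set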
Re: [ceph-users] Some OSD and MDS crash
Like that?

  # ceph --admin-daemon /var/run/ceph/ceph-mon.william.asok version
  {"version":"0.82"}
  # ceph --admin-daemon /var/run/ceph/ceph-mon.jack.asok version
  {"version":"0.82"}
  # ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
  {"version":"0.82"}

Pierre

On 03/07/2014 01:17, Samuel Just wrote:
> Can you confirm from the admin socket that all monitors are running the same version?
> -Sam
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
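A related check that covers the OSDs as well, without logging in to each host (a sketch; the monitor ids are the ones from the message above, and the admin-socket command still has to run on each monitor's host):

  ceph tell osd.* version      # every OSD reports its running version
  for m in william jack joe; do
      ceph --admin-daemon /var/run/ceph/ceph-mon.$m.asok version
  done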
Re: [ceph-users] Some OSD and MDS crash
Hi,

Attached:
- osd.20 is one of the OSDs that I identified as making other OSDs crash.
- osd.23 is one of the OSDs that crashes when I start osd.20.
- mds is one of my MDSes.

I cut the log files because they are too big, but everything is here: https://blondeau.users.greyc.fr/cephlog/

Regards

On 30/06/2014 17:35, Gregory Farnum wrote:
> What's the backtrace from the crashing OSDs?
>
> Keep in mind that as a dev release, it's generally best not to upgrade to unnamed versions like 0.82 (but it's probably too late to go back now).

I will remember it the next time ;)

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> [...]

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC

[Attached backtraces, trimmed]

MDS crash:

ceph version 0.82 (14085f42ddd0fef4e7e1dc99402d07a8df82c04e)
 1: (MDLog::_reformat_journal(JournalPointer const&, Journaler*, Context*)+0x1356) [0x855826]
 2: (MDLog::_recovery_thread(Context*)+0x7dc) [0x85606c]
 3: (MDLog::RecoveryThread::entry()+0x11) [0x664651]
 4: (()+0x6b50) [0x7f3bb5bc5b50]
 5: (clone()+0x6d) [0x7f3bb49ee0ed]

ceph version 0.82 (14085f42ddd0fef4e7e1dc99402d07a8df82c04e)
 1: /usr/bin/ceph-mds() [0x8d81f2]
 2: (()+0xf030) [0x7f3bb5bce030]
 3: (gsignal()+0x35) [0x7f3bb4944475]
 4: (abort()+0x180) [0x7f3bb49476f0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f3bb519a89d]
 6: (()+0x63996) [0x7f3bb5198996]
 7: (()+0x639c3) [0x7f3bb51989c3]
 8: (()+0x63bee) [0x7f3bb5198bee]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x40a) [0x9ab5da]
 10: (MDLog::_reformat_journal(JournalPointer const&, Journaler*, Context*)+0x1356) [0x855826]
 11: (MDLog::_recovery_thread(Context*)+0x7dc) [0x85606c]
 12: (MDLog::RecoveryThread::entry()+0x11) [0x664651]
 13: (()+0x6b50) [0x7f3bb5bc5b50]
 14: (clone()+0x6d) [0x7f3bb49ee0ed]

OSD crash:

ceph version 0.82 (14085f42ddd0fef4e7e1dc99402d07a8df82c04e)
 1: (PG::fulfill_info(pg_shard_t, pg_query_t const&, std::pair<pg_shard_t, pg_info_t>&)+0x5a) [0x879efa]
 2: (PG::RecoveryState::Stray::react(PG::MQuery const&)+0xef) [0x88be5f]
 3+: boost::statechart dispatch frames for PG::RecoveryState::Stray (the remainder of the backtrace is mangled template output and is truncated here)
Re: [ceph-users] external monitoring tools for ceph
Hi,

Maybe you can use this: https://github.com/thelan/ceph-zabbix. But I would also be interested in seeing Craig's script and template.

Regards

On 01/07/2014 10:16, Georgios Dimitrakakis wrote:
> Hi Craig,
>
> I am also interested in the Zabbix templates and scripts if you can publish them.
>
> Regards,
> G.
>
> On Mon, 30 Jun 2014 18:15:12 -0700, Craig Lewis wrote:
>> You should check out Calamari (https://github.com/ceph/calamari), Inktank's monitoring and administration tool.
>>
>> I started before Calamari was announced, so I rolled my own using Zabbix. It handles all the monitoring, graphing, and alerting in one tool. It's kind of a pain to set up, but works ok now that it's going.
>>
>> I don't know how to handle the cluster view though. I'm monitoring individual machines. Whenever something happens, like an OSD stops responding, I get an alert from every monitor. Otherwise it's not a big deal.
>>
>> I'm in the middle of re-factoring the data gathering from poll to push. If you're interested, I can publish my templates and scripts when I'm done.
>>
>> On Sun, Jun 29, 2014 at 1:17 AM, pragya jain wrote:
>>> Hello all,
>>>
>>> I am working on a ceph storage cluster with a rados gateway for object storage. I am looking for external monitoring tools that can be used to monitor the ceph storage cluster and the rados gateway interface. I found various monitoring tools, such as nagios, collectd, ganglia, diamond, sensu, and logstash, but I cannot find details about which ceph features each of these tools actually monitors.
>>>
>>> Has somebody implemented any of these tools? Can somebody help me identify the features provided by them? Is there any other tool which can also be used to monitor ceph, especially for object storage?
>>>
>>> Regards
>>> Pragya Jain

--
Pierre BLONDEAU - Systems & Network Administrator, Université de Caen, Laboratoire GREYC
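For anyone rolling their own like Craig, the simplest building block is a script the agent calls that reduces "ceph health" to a number; a rough sketch for zabbix_agentd (the item key name and file path are arbitrary):

  # /etc/zabbix/zabbix_agentd.conf.d/ceph.conf
  # 0 = HEALTH_OK, 1 = HEALTH_WARN, 2 = anything else (e.g. HEALTH_ERR)
  UserParameter=ceph.health,/usr/bin/ceph health | awk '{print ($1=="HEALTH_OK")?0:(($1=="HEALTH_WARN")?1:2)}'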
[ceph-users] Some OSD and MDS crash
Hi, After the upgrade to Firefly, I have some PGs in the peering state. I saw that 0.82 was out, so I tried to upgrade to solve my problem. My three MDSs crash, and some OSDs trigger a chain reaction that kills other OSDs. I think my MDSs will not start because their metadata are on the OSDs. I have 36 OSDs on three servers, and I have identified 5 OSDs that make the others crash. If I do not start them, the cluster goes into a recovering state with 31 OSDs, but I still have 378 PGs in the down+peering state. What can I do? Would you like more information (OS, crash logs, etc.)? Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- smime.p7s Description: Signature cryptographique S/MIME ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
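Not the thread's answer, but the usual first-look commands for a situation like this; a minimal sketch using the standard CLI:

ceph health detail           # lists the down/peering PGs and the OSDs they are waiting for
ceph osd tree                # shows which OSDs are up/down/out and on which host
ceph pg dump_stuck inactive  # PGs that are not active, e.g. down+peering
ceph osd set noout           # avoid rebalancing while the crashing OSDs are investigated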
Re: [ceph-users] Some PG remains peering after upgrade to Firefly 0.80.1
Le 26/06/2014 20:17, Pierre BLONDEAU a écrit : Hi, Some PGs remain in the peering state after the upgrade to Firefly. All of them seem to be on the same OSD (16). You can see the attached file. I tried to stop this OSD, but when I did, some PGs became inactive. What can I do? Regards.
Now it's the MDS which remains in the up:rejoin state:
mdsmap e205: 1/1/1 up {0=william=up:rejoin}, 2 up:standby
I find in the MDS log:
...
2014-06-27 11:30:56.099392 7f6b14688700 1 mds.0.41 reconnect_done
2014-06-27 11:30:56.219144 7f6b14688700 1 mds.0.41 handle_mds_map i am now mds.0.41
2014-06-27 11:30:56.219152 7f6b14688700 1 mds.0.41 handle_mds_map state change up:reconnect --> up:rejoin
2014-06-27 11:30:56.219155 7f6b14688700 1 mds.0.41 rejoin_start
2014-06-27 11:30:59.807482 7f6b14688700 1 mds.0.41 rejoin_joint_start
And nothing after that. How can I restore my Ceph cluster? -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- smime.p7s Description: Signature cryptographique S/MIME ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Some PG remains peering after upgrade to Firefly 0.80.1
Le 27/06/2014 11:46, Pierre BLONDEAU a écrit : Le 26/06/2014 20:17, Pierre BLONDEAU a écrit : Hi, Some PGs remain in the peering state after the upgrade to Firefly. All of them seem to be on the same OSD (16). You can see the attached file. I tried to stop this OSD, but when I did, some PGs became inactive. What can I do? Regards.
Now it's the MDS which remains in the up:rejoin state:
mdsmap e205: 1/1/1 up {0=william=up:rejoin}, 2 up:standby
I find in the MDS log:
...
2014-06-27 11:30:56.099392 7f6b14688700 1 mds.0.41 reconnect_done
2014-06-27 11:30:56.219144 7f6b14688700 1 mds.0.41 handle_mds_map i am now mds.0.41
2014-06-27 11:30:56.219152 7f6b14688700 1 mds.0.41 handle_mds_map state change up:reconnect --> up:rejoin
2014-06-27 11:30:56.219155 7f6b14688700 1 mds.0.41 rejoin_start
2014-06-27 11:30:59.807482 7f6b14688700 1 mds.0.41 rejoin_joint_start
And nothing after that. How can I restore my Ceph cluster?
I have rebooted all machines and started the MDSs one by one, and now the MDSs are OK. How can I check the MDS status across all the servers? Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- smime.p7s Description: Signature cryptographique S/MIME ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
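For the question about checking the MDS state across the servers, a minimal sketch using the standard CLI from any node with a client keyring (the mdsmap line quoted above comes from the same source):

ceph mds stat   # one-line mdsmap summary, e.g. "e205: 1/1/1 up {0=william=up:rejoin}, 2 up:standby"
ceph mds dump   # full MDS map: which daemon holds rank 0 and which are standby
ceph -s         # overall cluster status, including the mdsmap line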
[ceph-users] Some PG remains peering after upgrade to Firefly 0.80.1
Hi, Some PGs remain in the peering state after the upgrade to Firefly. All of them seem to be on the same OSD (16). You can see the attached file. I tried to stop this OSD, but when I did, some PGs became inactive. What can I do? Regards. -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 --
# ceph pg dump | grep peering
dumped all in format plain
[Attachment, truncated: 17 PGs stuck in the peering state (0.183, 0.17d, 0.a4, 0.4c, 0.6e8, 0.6cb, 0.605, 0.5d7, 0.52d, 0.4b2, 0.48a, 0.374, 0.373, 0.36e, 0.331, 0.2eb and 0.2a9), every one with OSD 16 as primary and acting sets such as [16,21], [16,1], [16,7], [16,25] and [16,26].]
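To dig into PGs stuck like the ones in the attachment, a minimal sketch (0.183 is taken from the dump above):

ceph pg dump_stuck inactive   # list PGs that never reached active (peering, down, ...)
ceph pg 0.183 query           # detailed peering state and blocking OSDs for one stuck PG
ceph osd tree                 # confirm the state of osd.16, the common primary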
[ceph-users] ceph-deploy with partition, lvm or dm-crypt
Hello, I read in the documentation that it is recommended to use ceph-deploy rather than writing the configuration files by hand. But I cannot:
- use a partition as an OSD (and not a full hard drive)
- give a logical volume (LVM) as journal (SSD, hardware RAID 1)
- use dm-crypt
My version of ceph-deploy is 1.0-1 from http://ceph.com/debian-cuttlefish/. Thank you in advance for your help. Regards Pierre -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
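For reference, later ceph-deploy releases grew a --dmcrypt option; a minimal sketch, assuming a node named osd-node, a blank data disk /dev/sdb and a journal device /dev/sdc (the host and device names are illustrative, and whether a pre-existing partition or an LVM logical volume can be passed instead is exactly the open question of this message):

# Whole devices: ceph-deploy partitions them and stores the LUKS keys in the key dir.
ceph-deploy osd prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys \
    osd-node:/dev/sdb:/dev/sdc
ceph-deploy osd activate osd-node:/dev/sdb1:/dev/sdc1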
Re: [ceph-users] Problem with data distribution
Le 04/07/2013 01:07, Vladislav Gorbunov a écrit : ceph osd pool set data pg_num 1800 And I do not understand why OSDs 16 and 19 are hardly used Actually you need to change the pgp_num for real data rebalancing: ceph osd pool set data pgp_num 1800 Check it with the command: ceph osd dump | grep 'pgp_num'
Vladislav Gorbunov, Michael Lowe, thank you. I think that's it. I changed the value and the placement is really better, but the operation will take a long time ;) :
2013-07-04 18:39:03.022694 mon.0 [INF] pgmap v210884: 1928 pgs: 412 active+clean, 1115 active+remapped+wait_backfill, 303 active+remapped+wait_backfill+backfill_toofull, 1 active+degraded+wait_backfill+backfill_toofull, 86 active+remapped+backfilling, 6 active+remapped+backfill_toofull, 3 active+recovery_wait+remapped, 2 active+degraded+remapped+wait_backfill+backfill_toofull; 345 TB data, 49504 GB used, 17499 GB / 67004 GB avail; 92830230/261366383 degraded (35.517%); recovering 94 o/s, 375MB/s
jack 67 - 77% /var/lib/ceph/osd/ceph-6 86 - 84% /var/lib/ceph/osd/ceph-8 77 - 86% /var/lib/ceph/osd/ceph-11 66 - 80% /var/lib/ceph/osd/ceph-7 47 - 55% /var/lib/ceph/osd/ceph-10 29 - 52% /var/lib/ceph/osd/ceph-9
joe 86 - 86% /var/lib/ceph/osd/ceph-15 67 - 70% /var/lib/ceph/osd/ceph-13 96 - 88% /var/lib/ceph/osd/ceph-14 85 - 86% /var/lib/ceph/osd/ceph-17 87 - 84% /var/lib/ceph/osd/ceph-12 20 - 38% /var/lib/ceph/osd/ceph-16
william 86 - 86% /var/lib/ceph/osd/ceph-0 86 - 86% /var/lib/ceph/osd/ceph-3 61 - 68% /var/lib/ceph/osd/ceph-4 71 - 83% /var/lib/ceph/osd/ceph-1 58 - 68% /var/lib/ceph/osd/ceph-18 50 - 63% /var/lib/ceph/osd/ceph-2
Thanks a lot to Alex Bligh and Gregory Farnum. For the other question: can we change the 95% ratio? With 4 TB hard disks it means at least 200 GB lost per OSD, and with 18 OSDs that makes 3.5 TB. Regards
2013/7/3 Pierre BLONDEAU pierre.blond...@unicaen.fr: Le 01/07/2013 19:17, Gregory Farnum a écrit : On Mon, Jul 1, 2013 at 10:13 AM, Alex Bligh a...@alex.org.uk wrote: On 1 Jul 2013, at 17:37, Gregory Farnum wrote: Oh, that's out of date! PG splitting is supported in Cuttlefish: ceph osd pool set foo pg_num number http://ceph.com/docs/master/rados/operations/control/#osd-subsystem Ah, so: "pg_num: The placement group number." means "pg_num: The number of placement groups." Perhaps worth demystifying for those hard of understanding such as myself. I'm still not quite sure how that relates to pgp_num. Pools are sharded into placement groups. That's the pg_num. Those placement groups can be placed all independently, or as if there were a smaller number of placement groups (this is so you can double the number of PGs but not move any data until the splitting is done). -Greg
Hi, Thank you very much for your answer. Sorry for the late reply, but modifying a 67 TB cluster takes a long time ;) Actually my PG number was very insufficient: ceph osd pool get data pg_num pg_num: 48 As I'm not sure what replication level I will set, I changed the number of PGs to 1800: ceph osd pool set data pg_num 1800 But the placement is still heterogeneous, especially on the machine where I had a full OSD. I now have two OSDs on this machine at the limit and I cannot write to the cluster:
jack 67 - 67% /var/lib/ceph/osd/ceph-6 86 - 86% /var/lib/ceph/osd/ceph-8 85 - 77% /var/lib/ceph/osd/ceph-11 ? - 66% /var/lib/ceph/osd/ceph-7 47 - 47% /var/lib/ceph/osd/ceph-10 29 - 29% /var/lib/ceph/osd/ceph-9
joe 86 - 77% /var/lib/ceph/osd/ceph-15 67 - 67% /var/lib/ceph/osd/ceph-13 95 - 96% /var/lib/ceph/osd/ceph-14 92 - 95% /var/lib/ceph/osd/ceph-17 86 - 87% /var/lib/ceph/osd/ceph-12 20 - 20% /var/lib/ceph/osd/ceph-16
william 68 - 86% /var/lib/ceph/osd/ceph-0 86 - 86% /var/lib/ceph/osd/ceph-3 67 - 61% /var/lib/ceph/osd/ceph-4 79 - 71% /var/lib/ceph/osd/ceph-1 58 - 58% /var/lib/ceph/osd/ceph-18 64 - 50% /var/lib/ceph/osd/ceph-2
ceph -w : 2013-07-03 10:56:06.610928 mon.0 [INF] pgmap v174071: 1928 pgs: 1816 active+clean, 84 active+remapped+backfill_toofull, 9 active+degraded+backfill_toofull, 19 active+degraded+remapped+backfill_toofull; 300 TB data, 45284 GB used, 21719 GB / 67004 GB avail; 15EB/s rd, 15EB/s wr, 15Eop/s; 9975324/165229620 degraded (6.037%); recovering 15E o/s, 15EB/s 2013-07-03 10:56:08.404701 osd.14 [WRN] OSD near full (95%) 2013-07-03 10:56:29.729297 osd.17 [WRN] OSD near full (94%) And I do not understand why OSDs 16 and 19 are hardly used Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de
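For the open question about the 95% ratio, the thresholds are tunable; a minimal sketch for pre-Luminous releases (the values shown are illustrative):

# Runtime change via the monitors:
ceph pg set_nearfull_ratio 0.90
ceph pg set_full_ratio 0.97
# Or persistently in ceph.conf, in the [global] section:
#   mon osd nearfull ratio = .90
#   mon osd full ratio = .97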
Re: [ceph-users] Problem with data distribution
Le 01/07/2013 19:17, Gregory Farnum a écrit : On Mon, Jul 1, 2013 at 10:13 AM, Alex Bligh a...@alex.org.uk wrote: On 1 Jul 2013, at 17:37, Gregory Farnum wrote: Oh, that's out of date! PG splitting is supported in Cuttlefish: ceph osd pool set foo pg_num number http://ceph.com/docs/master/rados/operations/control/#osd-subsystem Ah, so: "pg_num: The placement group number." means "pg_num: The number of placement groups." Perhaps worth demystifying for those hard of understanding such as myself. I'm still not quite sure how that relates to pgp_num. Pools are sharded into placement groups. That's the pg_num. Those placement groups can be placed all independently, or as if there were a smaller number of placement groups (this is so you can double the number of PGs but not move any data until the splitting is done). -Greg
Hi, Thank you very much for your answer. Sorry for the late reply, but modifying a 67 TB cluster takes a long time ;) Actually my PG number was very insufficient: ceph osd pool get data pg_num pg_num: 48 As I'm not sure what replication level I will set, I changed the number of PGs to 1800: ceph osd pool set data pg_num 1800 But the placement is still heterogeneous, especially on the machine where I had a full OSD. I now have two OSDs on this machine at the limit and I cannot write to the cluster:
jack 67 - 67% /var/lib/ceph/osd/ceph-6 86 - 86% /var/lib/ceph/osd/ceph-8 85 - 77% /var/lib/ceph/osd/ceph-11 ? - 66% /var/lib/ceph/osd/ceph-7 47 - 47% /var/lib/ceph/osd/ceph-10 29 - 29% /var/lib/ceph/osd/ceph-9
joe 86 - 77% /var/lib/ceph/osd/ceph-15 67 - 67% /var/lib/ceph/osd/ceph-13 95 - 96% /var/lib/ceph/osd/ceph-14 92 - 95% /var/lib/ceph/osd/ceph-17 86 - 87% /var/lib/ceph/osd/ceph-12 20 - 20% /var/lib/ceph/osd/ceph-16
william 68 - 86% /var/lib/ceph/osd/ceph-0 86 - 86% /var/lib/ceph/osd/ceph-3 67 - 61% /var/lib/ceph/osd/ceph-4 79 - 71% /var/lib/ceph/osd/ceph-1 58 - 58% /var/lib/ceph/osd/ceph-18 64 - 50% /var/lib/ceph/osd/ceph-2
ceph -w : 2013-07-03 10:56:06.610928 mon.0 [INF] pgmap v174071: 1928 pgs: 1816 active+clean, 84 active+remapped+backfill_toofull, 9 active+degraded+backfill_toofull, 19 active+degraded+remapped+backfill_toofull; 300 TB data, 45284 GB used, 21719 GB / 67004 GB avail; 15EB/s rd, 15EB/s wr, 15Eop/s; 9975324/165229620 degraded (6.037%); recovering 15E o/s, 15EB/s 2013-07-03 10:56:08.404701 osd.14 [WRN] OSD near full (95%) 2013-07-03 10:56:29.729297 osd.17 [WRN] OSD near full (94%) And I do not understand why OSDs 16 and 19 are hardly used Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
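For convenience, the pg_num / pgp_num commands from this thread, gathered in one place:

ceph osd pool get data pg_num         # how many PGs the pool is split into
ceph osd dump | grep 'pgp_num'        # pgp_num is the count actually used for placement
ceph osd pool set data pg_num 1800    # split the PGs...
ceph osd pool set data pgp_num 1800   # ...then place them independently so the data rebalances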
[ceph-users] Problem with data distribution
Hi, I use Ceph 0.64.1 on Debian Wheezy for the servers and Ubuntu Precise (with the raring kernel 3.8.0-25) as client. My problem is the distribution of data on the cluster. I have 3 servers, each with 6 OSDs, but the distribution is very heterogeneous:
86% /var/lib/ceph/osd/ceph-15 67% /var/lib/ceph/osd/ceph-13 95% /var/lib/ceph/osd/ceph-14 92% /var/lib/ceph/osd/ceph-17 86% /var/lib/ceph/osd/ceph-12 20% /var/lib/ceph/osd/ceph-16
47% /var/lib/ceph/osd/ceph-10 85% /var/lib/ceph/osd/ceph-11 67% /var/lib/ceph/osd/ceph-6 86% /var/lib/ceph/osd/ceph-8 29% /var/lib/ceph/osd/ceph-9
64% /var/lib/ceph/osd/ceph-2 68% /var/lib/ceph/osd/ceph-0 86% /var/lib/ceph/osd/ceph-3 67% /var/lib/ceph/osd/ceph-4 79% /var/lib/ceph/osd/ceph-1 58% /var/lib/ceph/osd/ceph-18
On Friday, one of my OSDs was at 95% and I could not write. I was advised on IRC to change the weight of the OSD, and I put it at 0.8. The data were migrated to another OSD, but the distribution is still very heterogeneous. A few hours later, I had the same problem with another OSD. I tried to change the weights according to the space used, but it is not much better. For example, OSD 16 is nearly empty. Do you have any idea how to solve my problem? Regards -- -- Pierre BLONDEAU Administrateur Systèmes & réseaux Université de Caen Laboratoire GREYC, Département d'informatique tel : 02 31 56 75 42 bureau : Campus 2, Science 3, 406 -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
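The weight change mentioned above is usually done with one of two commands; a minimal sketch (the OSD id and values are illustrative, 0.8 being the value from the message):

ceph osd reweight 14 0.8              # temporary override weight (0.0-1.0) for an over-full OSD
ceph osd crush reweight osd.14 3.64   # permanent CRUSH weight, usually the disk size in TB
ceph osd tree                         # verify the resulting weights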