Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)
Hi, you need to import the foreign config from the OpenManage web UI, somewhere under the storage controller section. BTW, I'm currently testing a new Dell R630 with a PERC H330 (LSI 3008). With this controller it's possible to do hardware RAID for some disks and passthrough for the other disks, so it's perfect for Ceph :)

----- Original Message -----
From: pixelfairy pixelfa...@gmail.com
To: ceph-users ceph-us...@ceph.com
Sent: Tuesday, 10 February 2015 11:38:48
Subject: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

I'm stuck with these servers with Dell PERC H710P RAID cards: 8 bays, looking at a pair of 256 GB SSDs in RAID 1 for / and the journals, the rest as the 4 TB SAS drives we already have. Since that card refuses JBOD, we made them all single-disk RAID 0, then pulled one as a test. Putting it back, its state is "foreign" and there doesn't seem to be anything that can change this from omconfig, the OM web UI, or the iDRAC web UI. Is there any way to restore this? Does the machine need to be rebooted to fix it? lspci reports these to be LSI MegaRAID 2208 "Thunderbolt". Has anyone here used the PERC H710P with the MegaRAID utilities under Ubuntu 14.04? If it really is only fixable from the BIOS (needing a reboot), we're looking at putting in LSI 9211 cards.
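For what it's worth, OMSA's CLI may be able to do the same import without the web UI. This is a sketch only; the action name and controller=0 should be verified against your OMSA version:

  omreport storage controller                                   # find the controller id
  omconfig storage controller action=importforeignconfig controller=0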
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Thanks, everyone!! After applying the re-weighting command (ceph osd crush reweight osd.0 0.0095), my cluster is getting healthy now :)) But I have one question: if I have hundreds of OSDs, shall I do the re-weighting on each device, or is there some way to make this happen automatically? In other words, why would I need to do the weighting in the first place??

On Feb 10, 2015, at 4:00 PM, Vikhyat Umrao vum...@redhat.com wrote: Oh, I have swapped the positions of the osd name and the weight: ceph osd crush reweight osd.0 0.0095 and so on .. Regards, Vikhyat
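For many OSDs, a shell loop is the usual shortcut. A minimal sketch, using the 0.0095 weight from this thread (it is specific to these tiny test disks, not a general value):

  for i in $(seq 0 5); do
      ceph osd crush reweight osd.$i 0.0095
  done

As for why weighting is needed at all: the CRUSH weight encodes each disk's capacity so data is spread in proportion to disk size, and it is normally set automatically when the OSD is created; setting it by hand should only be necessary when, as here, the OSDs ended up with weight 0.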
[ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow
Hello! We use Ceph + OpenStack in our private cloud. Recently we upgraded our CentOS 6.5 based cluster from Ceph Emperor to Ceph Firefly. At first we used the Red Hat EPEL yum repo to upgrade; that Ceph version is 0.80.5. We upgraded the monitors first, then the OSDs, and last the clients. When we completed this upgrade, we booted a VM on the cluster and used fio to test the IO performance. The IO performance was as good as before. Everything was OK!

Then we upgraded the cluster from 0.80.5 to 0.80.8. When that completed, we rebooted the VM to load the newest librbd, and after that we again used fio to test the IO performance. We found that randwrite and write are as good as before, but randread and read became worse: randread iops dropped from 4000-5000 to 300-400 with worse latency, and read bandwidth dropped from 430 MB/s to 115 MB/s. I then downgraded the ceph client version from 0.80.8 to 0.80.5 and the results became normal again. So I think it may be something in librbd. I compared the 0.80.8 release notes with 0.80.5 (http://ceph.com/docs/master/release-notes/#v0-80-8-firefly), and the only read-related change I found in 0.80.8 is: "librbd: cap memory utilization for read requests (Jason Dillaman)". Who can explain this?

My ceph cluster is 400 OSDs, 5 mons:

ceph -s
 health HEALTH_OK
 monmap e11: 5 mons at {BJ-M1-Cloud71=172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0}, election epoch 198, quorum 0,1,2,3,4 BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85
 osdmap e120157: 400 osds: 400 up, 400 in
 pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects
 41084 GB used, 323 TB / 363 TB avail
 29288 active+clean
 client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s

The following is my ceph client conf:

[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73
mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55, ZR-F8-Cloud58, ZR-F9-Cloud73
fsid = c01c8e28-304e-47a4-b876-cb93acc2e980
mon osd full ratio = .85
mon osd nearfull ratio = .75
public network = 172.29.204.0/24
mon warn on legacy crush tunables = false

[osd]
osd op threads = 12
filestore journal writeahead = true
filestore merge threshold = 40
filestore split multiple = 8

[client]
rbd cache = true
rbd cache writethrough until flush = false
rbd cache size = 67108864
rbd cache max dirty = 50331648
rbd cache target dirty = 33554432

[client.cinder]
admin socket = /var/run/ceph/rbd-$pid.asok

My VM is 8 core / 16 GB; the fio scripts we use are:

fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=write -size=60G -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200

The following are the IO test results:

ceph client version 0.80.5:
 read: bw=430MB write: bw=420MB randread: iops=4875 latency=65ms randwrite: iops=6844 latency=46ms
ceph client version 0.80.8:
 read: bw=115MB write: bw=480MB randread: iops=381 latency=83ms randwrite: iops=4843 latency=68ms
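One way to confirm which librbd the VM's client process actually loaded, and to check the rbd cache settings in effect, is the admin socket already configured in [client.cinder] above. A sketch, with <pid> to be substituted with the real process id:

  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok version
  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd_cache

That should at least verify that the reboot really switched the VM between 0.80.8 and 0.80.5, and that the cache settings are identical across the two runs.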
Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)
Turns out you can do some of this with omconfig as long as you enable auto-import in the card's BIOS utility. We still need the web UI to turn the new disk into a usable block device. Have you been able to automate the whole recovery process? I'd like to just put the new disk in and have the system notice it and automatically set it up.

On Tue, Feb 10, 2015 at 7:28 AM, Don Doerner don.doer...@quantum.com wrote: I've been involved in projects over many years that use the MegaRAID 2208, in any of many forms, including the Dell H710. Without resorting to a BIOS-time utility, I only know of one way to manage them: MegaCLI. Look around on the web a bit: you can download it from LSI, and there is full documentation online also. I am presently doing some prototyping work with Ceph using these RAID controllers (H710, not H710P, but that's a negligible difference; and H810), but I haven't started investigating failure scenarios yet... Don Doerner, Technical Director, Advanced Projects, Quantum Corporation
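If MegaCLI does drive the H710P, the whole replace-a-disk recovery might be scriptable along these lines (untested on PERC; the [32:4] enclosure:slot ID is a placeholder you would read from -PDList):

  megacli -CfgForeign -Scan -aALL      # does the new disk carry a foreign config?
  megacli -CfgForeign -Clear -aALL     # drop the foreign state so the disk is usable
  megacli -CfgLdAdd -r0 [32:4] -a0     # recreate the single-disk RAID 0
  megacli -PDList -a0                  # confirm the drive state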
Re: [ceph-users] ceph Performance vs PG counts
Hi, just a heads up in case you are not aware of this tool: http://ceph.com/pgcalc/ Regards, Vikhyat

On 02/11/2015 09:11 AM, Sumit Gaur wrote: Hi, I am not sure why PG numbers have not been given that much importance in the ceph documents; I am seeing huge variation in performance numbers just by changing the PG count. [...]
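The rule of thumb that pgcalc encodes can also be checked by hand for the configurations in this thread. A sketch, assuming 3-way replication (the thread does not state the replica count):

  osds=36; replicas=3
  echo $(( osds * 100 / replicas ))    # -> 1200; round up to a power of two -> 2048

So 2048 PGs is in the expected range for the 36-OSD case, while for the 24-OSD case the same arithmetic gives 800 -> 1024, which sits between the 512 and 2048 values being compared.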
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi, the weight reflects the space, i.e. the capacity, of the disk. For example, the weight of a 100 GB OSD disk is 0.100 (100 G / 1 T). Best wishes, Vickie

2015-02-10 22:25 GMT+08:00 B L super.itera...@gmail.com: Thanks, everyone!! After applying the re-weighting command (ceph osd crush reweight osd.0 0.0095), my cluster is getting healthy now :)) [...]
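Following that convention, a weight is just the disk size divided by 1 TB. A sketch, where the 4 TB disk is an invented example:

  echo "scale=3; 100/1000" | bc        # 100 GB disk -> 0.100
  ceph osd crush reweight osd.0 4.000  # e.g. a 4 TB disk gets weight 4.000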
[ceph-users] wider rados namespace support?
Just came across this in the docs: "Currently (i.e., firefly), namespaces are only useful for applications written on top of librados. Ceph clients such as block device, object storage and file system do not currently support this feature." Then found: https://wiki.ceph.com/Planning/Sideboard/rbd%3A_namespace_support Is there any progress or plans to address this (particularly for rbd clients but also cephfs)? -- Cheers, ~Blairo
Re: [ceph-users] ceph Performance vs PG counts
Hi, I am not sure why PG numbers have not been given that much importance in the ceph documents; I am seeing huge variation in performance numbers just by changing the PG count. An example:

without SSD:
36 OSD HDD, PG count 2048: random write (1024K bs) performance of 550 MBps

with SSD:
6 SSD for journals + 24 OSD HDD, PG count 2048: random write (1024K bs) performance of 250 MBps
6 SSD for journals + 24 OSD HDD, PG count 512: random write (1024K bs) performance of 700 MBps

Varying the PG count makes the SSD setup look bad. I am a bit confused by this behaviour. Thanks, sumit

On Mon, Feb 9, 2015 at 11:36 AM, Gregory Farnum g...@gregs42.com wrote: On Sun, Feb 8, 2015 at 6:00 PM, Sumit Gaur sumitkg...@gmail.com wrote: Hi, I have installed a 6 node ceph cluster and am doing a performance benchmark for it using Nova VMs. What I have observed is that FIO random write reports around 250 MBps for 1M block size and 4096 PGs, and 650 MBps for 1M block size and 2048 PGs. Can somebody let me know if I am missing any Ceph architecture point here? As per my understanding PG numbers are mainly involved in calculating the hash and should not affect performance so much.

PGs are also serialization points within the codebase, so depending on how you're testing you can run into contention if you have multiple objects within a single PG that you're trying to write to at once. This isn't normally a problem, but for a single benchmark run the random collisions can become noticeable. -Greg
[ceph-users] combined ceph roles
Hello, I'm giving thought to a minimal-footprint scenario with full redundancy. I realize it isn't ideal -- and may impact overall performance -- but I'm wondering if the below example would work, be supported, or is known to cause issues?

Example, 3x hosts, each running:
-- OSDs
-- Mon
-- Client

I thought I read a post a while back about Client+OSD on the same host possibly being an issue, but I am having difficulty finding that reference. I would appreciate it if anyone has insight into such a setup, thanks!
Re: [ceph-users] Ceph vs Hardware RAID: No battery backed cache
On 10/02/15 20:40, Thomas Güttler wrote: Hi, does the lack of a battery-backed cache in Ceph introduce any disadvantages? We use PostgreSQL and our servers have a UPS. But I want to survive a power outage, although it is unlikely. But hope is not an option ...

You can certainly make use of adapter cards that have a battery-backed cache with Ceph, either using RAID as usual or creating single-disk RAID 0 arrays, which let you use the nice battery-backed cache + writeback options on the card and still have a 1-OSD-to-1-disk topology. Without such cards it is still quite possible to have a power-loss-safe setup. These days (with reasonably modern 3.* kernels), using SATA or SAS plus mount options that do *not* disable write barriers will leave you with a consistent, safe state in the event of power loss. You might want to test your SATA disk of choice to be sure, but SAS should be safe! Cheers, Mark
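To make the write-barrier point concrete, here is a sketch (device and mount point are assumptions): barriers are on by default for ext4 and XFS on modern kernels, so the thing to avoid is disabling them without a battery- or flash-backed cache:

  mount -t xfs /dev/sdb1 /var/lib/ceph/osd/ceph-0               # safe: barriers on by default
  mount -t xfs -o nobarrier /dev/sdb1 /var/lib/ceph/osd/ceph-0  # NOT power-loss safe without a BBU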
[ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Having a problem with my fresh non-healthy cluster; my cluster status summary shows this:

ceph@ceph-node1:~$ ceph -s
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs stuck unclean; pool data pg_num 128 pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
 osdmap e25: 6 osds: 6 up, 6 in
 pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
 192 incomplete
 64 creating+incomplete

Where shall I start troubleshooting this? P.S. I'm new to Ceph. Thanks! Beanos
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote: Hi Vickie, thanks for your reply! You can find the dump at this link: https://gist.github.com/anonymous/706d4a1ec81c93fd1eca Thanks! B.

On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote: Hi Beanos: Would you post the result of $ceph osd dump? Best wishes, Vickie [...]
Re: [ceph-users] ISCSI LIO hang after 2-3 days of working
Hi Mike, I can also reproduce this behaviour, it seems. If I shut down a Ceph node, the delay while Ceph works out that the OSDs are down seems to trigger similar error messages. It seems fairly reliable that if an OSD is down for more than 10 seconds, LIO will have this problem. Below is an excerpt from the kernel log, showing the OSDs going down, timing out and then coming back up. Even when the OSDs are back up, ESXi never seems to be able to resume IO to the LUN (at least not within 24 hours). However, if you remember, I am working on the Active/Standby ALUA solution: when I promote the other LIO node to active for the LUN, access resumes immediately. This is a test/dev cluster for my ALUA resource agents, so please let me know if you need me to run any commands. Ideally it would be best to make LIO recover cleanly, but extending the timeout, if possible, would probably help in the majority of day-to-day cases.

Feb 9 16:33:34 ceph-iscsi1 kernel: [ 3656.412071] libceph: osd1 10.3.2.31:6800 socket closed (con state OPEN)
Feb 9 16:33:34 ceph-iscsi1 kernel: [ 3656.418621] libceph: osd0 10.3.2.31:6805 socket closed (con state OPEN)
Feb 9 16:33:45 ceph-iscsi1 kernel: [ 3666.927352] ABORT_TASK: Found referenced iSCSI task_tag: 26731
Feb 9 16:33:45 ceph-iscsi1 kernel: [ 3666.927356] ABORT_TASK: ref_tag: 26731 already complete, skipping
Feb 9 16:33:45 ceph-iscsi1 kernel: [ 3666.927369] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26731
Feb 9 16:33:56 ceph-iscsi1 kernel: [ 3678.230684] libceph: osd0 down
Feb 9 16:33:56 ceph-iscsi1 kernel: [ 3678.230689] libceph: osd1 down
Feb 9 16:33:57 ceph-iscsi1 kernel: [ 3679.279227] ABORT_TASK: Found referenced iSCSI task_tag: 26733
Feb 9 16:33:57 ceph-iscsi1 kernel: [ 3679.279855] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 26733
Feb 9 16:33:57 ceph-iscsi1 kernel: [ 3679.279861] ABORT_TASK: Found referenced iSCSI task_tag: 26735
Feb 9 16:33:57 ceph-iscsi1 kernel: [ 3679.279862] ABORT_TASK: ref_tag: 26735 already complete, skipping
Feb 9 16:33:57 ceph-iscsi1 kernel: [ 3679.279863] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26735
Feb 9 16:34:14 ceph-iscsi1 kernel: [ 3696.143515] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26739
Feb 9 16:34:28 ceph-iscsi1 kernel: [ 3709.731706] ABORT_TASK: Found referenced iSCSI task_tag: 26745
Feb 9 16:34:28 ceph-iscsi1 kernel: [ 3709.731711] ABORT_TASK: ref_tag: 26745 already complete, skipping
Feb 9 16:34:28 ceph-iscsi1 kernel: [ 3709.731712] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26745
Feb 9 16:34:37 ceph-iscsi1 kernel: [ 3718.905787] libceph: osd1 up
Feb 9 16:34:40 ceph-iscsi1 kernel: [ 3721.904940] libceph: osd0 up
Feb 9 16:34:55 ceph-iscsi1 kernel: [ 3736.924123] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26754
Feb 9 16:35:22 ceph-iscsi1 kernel: [ 3764.102399] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26762
Feb 9 16:35:36 ceph-iscsi1 kernel: [ 3777.689640] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26768
Feb 9 16:36:03 ceph-iscsi1 kernel: [ 3804.866891] ABORT_TASK: Found referenced iSCSI task_tag: 26777
Feb 9 16:36:03 ceph-iscsi1 kernel: [ 3804.866896] ABORT_TASK: ref_tag: 26777 already complete, skipping
Feb 9 16:36:03 ceph-iscsi1 kernel: [ 3804.866897] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26777
Feb 9 16:36:30 ceph-iscsi1 kernel: [ 3832.011008] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000a
Feb 9 16:36:30 ceph-iscsi1 kernel: [ 3832.011137] ABORT_TASK: Found referenced iSCSI task_tag: 26792
Feb 9 16:36:30 ceph-iscsi1 kernel: [ 3832.011139] ABORT_TASK: ref_tag: 26792 already complete, skipping
Feb 9 16:36:30 ceph-iscsi1 kernel: [ 3832.011141] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26792
Feb 9 16:36:44 ceph-iscsi1 kernel: [ 3845.613219] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x0014
Feb 9 16:36:57 ceph-iscsi1 kernel: [ 3859.215023] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x001e
Feb 9 16:37:11 ceph-iscsi1 kernel: [ 3872.791732] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x0028
Feb 9 16:37:11 ceph-iscsi1 kernel: [ 3872.791826] ABORT_TASK: Found referenced iSCSI task_tag: 26809
Feb 9 16:37:11 ceph-iscsi1 kernel: [ 3872.791827] ABORT_TASK: ref_tag: 26809 already complete, skipping
Feb 9 16:37:11 ceph-iscsi1 kernel: [ 3872.791828] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 26809
Feb 9 16:37:24 ceph-iscsi1 kernel: [ 3886.378032] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x0032
Feb 9 16:37:38 ceph-iscsi1 kernel: [ 3899.958060] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x003c
Feb 9 16:37:38 ceph-iscsi1 kernel: [ 3899.958144] ABORT_TASK: Found referenced iSCSI task_tag: 26819

Nick

-----Original Message----- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mike Christie Sent: 06 February 2015 04:09
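Until LIO itself recovers cleanly, one partial mitigation, sketched here rather than tested (the 20 s default is from memory, verify it for your release), is to make Ceph declare dead OSDs down sooner and so shrink the stall the initiator has to ride out:

  [global]
  # peers normally wait ~20s of missed heartbeats before reporting an OSD down;
  # lowering this shortens the window during which initiator IO hangs
  osd heartbeat grace = 10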
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Here is the updated direct copy/paste dump:

ceph@ceph-node1:~$ ceph osd dump
epoch 25
fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
created 2015-02-08 16:59:07.050875
modified 2015-02-09 22:35:33.191218
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 6
osd.0 up in weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739 172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
osd.1 up in weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279 172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
osd.2 up in weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517 172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
osd.3 up in weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043 172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
osd.4 up in weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106 172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
osd.5 up in weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3

On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote: [...]
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi Beanos: So you have 3 OSD servers and each of them has 2 disks. I have a question: what is the result of ceph osd tree? It looks like the osd status may be down. Best wishes, Vickie

2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com: Here is the updated direct copy/paste dump [...]
Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)
On 10/02/15 11:38, pixelfairy wrote: since that card refuses jbod, we made them all single disk raid0, then pulled one as a test. putting it back, its state is foreign and there doesnt seem to be anything that can change this from omconfig, the om web ui, or idrac web ui. is there any way to restore this? does the machine need to be rebooted to fix it?

On real LSI MegaRAIDs, you can use megacli to show / clear the foreign state of drives. The following will clear the foreign state of any foreign drives on all controllers:

# megacli -cfgforeign -clear -aALL

I'm not sure if you can use megacli with PERC controllers, but I'm sure there will be something very similar.
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi Vickie, my OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id    weight  type name       up/down reweight
-1      0       root default
-2      0               host ceph-node1
0       0                       osd.0   up      1
1       0                       osd.1   up      1
-3      0               host ceph-node3
2       0                       osd.2   up      1
3       0                       osd.3   up      1
-4      0               host ceph-node2
4       0                       osd.4   up      1
5       0                       osd.5   up      1

On Feb 10, 2015, at 1:18 PM, Vickie ch mika.leaf...@gmail.com wrote: Hi Beanos: BTW, if your cluster is just for test, you may try to reduce the replica size and min_size. [...]
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi Beanos: BTW, if your cluster is just for test, you may try to reduce the replica size and min_size:

ceph osd pool set rbd size 2; ceph osd pool set data size 2; ceph osd pool set metadata size 2
ceph osd pool set rbd min_size 1; ceph osd pool set data min_size 1; ceph osd pool set metadata min_size 1

Open another terminal and use the command ceph -w to watch the pg status. Best wishes, Vickie

2015-02-10 19:16 GMT+08:00 Vickie ch mika.leaf...@gmail.com: Hi Beanos: So you have 3 OSD servers and each of them has 2 disks. [...]
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
I will try to change the replication size now as you suggested .. but how is that related to the non-healthy cluster?

On Feb 10, 2015, at 1:22 PM, B L super.itera...@gmail.com wrote: Hi Vickie, my OSD tree looks like this: [...]
[ceph-users] Too few pgs per osd - Health_warn for EC pool
Hi, we have created an EC pool (k=10, m=3) with 540 OSDs. We followed the following rule to calculate the PG count for the EC pool:

Total PGs = (OSDs * 100) / pool size

where pool size is either the number of replicas for replicated pools or the K+M sum for erasure coded pools.

Total PGs = 540 * 100 / 13 = 4153.8
Nearest power of 2: 8192

So we have configured 8192 as the EC pool PG count. But we are getting "health HEALTH_WARN; too few pgs per osd (15 < min 20)". We checked all the OSDs and each has more than 80 PGs. What's wrong here? - Regards, K.Mohamed Pakkeer
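The 15 in the warning can be reproduced from the figures above. This is a quick sanity check rather than an authoritative explanation, but it suggests the health check divides the raw PG count by the OSD count without weighting by the pool's K+M width:

  pgs=8192; osds=540
  echo $(( pgs / osds ))        # -> 15, the number in the warning
  echo $(( pgs * 13 / osds ))   # -> 197 PG shards per OSD, consistent with the >80 observed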
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hello Vickie, after changing the size and min_size on all the existing pools, the cluster seems to be working, and I can store objects to the cluster .. but the cluster still shows non-healthy:

 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs degraded; 256 pgs stuck unclean; recovery 1/2 objects degraded (50.000%); pool data pg_num 128 pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
 osdmap e31: 6 osds: 6 up, 6 in
 pgmap v99: 256 pgs, 3 pools, 10240 kB data, 1 objects
 210 MB used, 18155 MB / 18365 MB avail
 1/2 objects degraded (50.000%)
 256 active+degraded

I can see some changes like:
1- recovery 1/2 objects degraded (50.000%)
2- 1/2 objects degraded (50.000%)
3- 256 active+degraded

My questions are:
1- What do those changes mean?
2- How can changing the replication size cause the cluster to be unhealthy?

Thanks Vickie! Beanos

On Feb 10, 2015, at 1:28 PM, B L super.itera...@gmail.com wrote: I changed the size and min_size as you suggested, while watching ceph -w in a different window [...]
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
I changed the size and min_size as you suggested, while watching ceph -w in a different window, and I got this:

ceph@ceph-node1:~$ ceph -w
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs stuck unclean; pool data pg_num 128 pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
 osdmap e25: 6 osds: 6 up, 6 in
 pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
 192 incomplete
 64 creating+incomplete

2015-02-10 11:22:24.421000 mon.0 [INF] osdmap e26: 6 osds: 6 up, 6 in
2015-02-10 11:22:24.425906 mon.0 [INF] pgmap v83: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:25.432950 mon.0 [INF] osdmap e27: 6 osds: 6 up, 6 in
2015-02-10 11:22:25.437626 mon.0 [INF] pgmap v84: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:26.449640 mon.0 [INF] osdmap e28: 6 osds: 6 up, 6 in
2015-02-10 11:22:26.454749 mon.0 [INF] pgmap v85: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:27.474113 mon.0 [INF] pgmap v86: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:31.770385 mon.0 [INF] pgmap v87: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:41.695656 mon.0 [INF] osdmap e29: 6 osds: 6 up, 6 in
2015-02-10 11:22:41.700296 mon.0 [INF] pgmap v88: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:42.712288 mon.0 [INF] osdmap e30: 6 osds: 6 up, 6 in
2015-02-10 11:22:42.716877 mon.0 [INF] pgmap v89: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:43.723701 mon.0 [INF] osdmap e31: 6 osds: 6 up, 6 in
2015-02-10 11:22:43.732035 mon.0 [INF] pgmap v90: 256 pgs: 192 incomplete, 64 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:46.774217 mon.0 [INF] pgmap v91: 256 pgs: 256 active+degraded; 0 bytes data, 199 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:08.232686 mon.0 [INF] pgmap v92: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:27.767358 mon.0 [INF] pgmap v93: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:40.769794 mon.0 [INF] pgmap v94: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:45.530713 mon.0 [INF] pgmap v95: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail

On Feb 10, 2015, at 1:24 PM, B L super.itera...@gmail.com wrote: I will try to change the replication size now as you suggested .. but how is that related to the non-healthy cluster? [...]
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Thanks Vikhyat, as suggested ..

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
Invalid command: osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map
Error EINVAL: invalid command

What do you think?

On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote: sudo ceph osd crush reweight 0.0095 osd.0 to osd.5
Re: [ceph-users] combined ceph roles
A similar setup works well for me: 2 VM hosts, 1 mon-only node, 6 OSDs (3 per VM host), using rbd and cephfs. The more memory on your VM hosts, the better. Lindsay Mathieson

-----Original Message----- From: David Graham xtn...@gmail.com Sent: 11/02/2015 3:07 AM To: ceph-us...@ceph.com Subject: [ceph-users] combined ceph roles [...]
[ceph-users] cannot obtain keys from the nodes : [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-vm01']
Hello! I am a novice with Ceph and I am getting desperate. My problem is that I cannot obtain the keys from the nodes. I found a similar problem in the mailing list (http://www.spinics.net/lists/ceph-users/msg03843.html), but I did not succeed in solving it the same way. There, Francesc Alted writes: "I tracked down my problem. It turned out that I was setting different names for the ceph servers in /etc/hosts than their own `hostname`." I renamed all hostnames on all my nodes, but it did not help. My admin server and nodes are: adm, ceph-vm01, ceph-vm02, ceph-vm03, ceph-vm04. Below are my ceph.conf, the log, and the hosts file.

[ceph@adm ~]$ more ceph.conf
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.10.210,192.168.10.211,192.168.10.212,192.168.10.213
mon_initial_members = ceph-vm01, ceph-vm02, ceph-vm03, ceph-vm04
fsid = 0a5be896-bbd8-4bea-9ca9-486d93222164
osd pool default size = 2

[ceph@adm ~]$ ceph-deploy gatherkeys ceph-vm01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.11): /usr/bin/ceph-deploy gatherkeys ceph-vm01
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for /etc/ceph/ceph.client.admin.keyring
[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph-vm01']
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for /var/lib/ceph/bootstrap-osd/ceph.keyring
[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['ceph-vm01']
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for /var/lib/ceph/bootstrap-mds/ceph.keyring
[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-vm01']

[ceph@adm ~]$ more /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.214 ceph-vm05.2memory.ru ceph-vm05 adm
192.168.10.210 ceph-vm01.2memory.ru ceph-vm01 node01 mon01 osd01
192.168.10.211 ceph-vm02.2memory.ru ceph-vm02 node02 mon02 osd02
192.168.10.212 ceph-vm03.2memory.ru ceph-vm03 node03 mon03 osd03
192.168.10.213 ceph-vm04.2memory.ru ceph-vm04 node04 mon04 osd04

-- Best regards, Konstantin Khatskevich
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
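One thing worth ruling out first: gatherkeys can only fetch keyrings that already exist, and the bootstrap keyrings are only created once the monitors have formed a quorum. A quick sanity-check sketch, using standard commands and the default paths:

    hostname -s                                      # on each mon node; must match mon_initial_members
    sudo ceph daemon mon.$(hostname -s) mon_status   # is this mon running and in quorum?
    ls /var/lib/ceph/bootstrap-osd /var/lib/ceph/bootstrap-mds   # do the keyrings exist yet?

If the keyrings are missing on every node, re-running `ceph-deploy mon create-initial` (which deploys the mons, waits for quorum, and then gathers the keys itself) is a common way out.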
Re: [ceph-users] Too few pgs per osd - Health_warn for EC pool
Hi Greg, Do you have any idea about the health warning? Regards, K. Mohamed Pakkeer

On Tue, Feb 10, 2015 at 4:49 PM, Mohamed Pakkeer mdfakk...@gmail.com wrote:

Hi, We have created an EC pool (k=10 and m=3) with 540 OSDs. We followed this rule to calculate the PG count for the EC pool:

    Total PGs = (OSDs * 100) / pool size

where pool size is either the number of replicas for replicated pools or the K+M sum for erasure coded pools.

    Total PGs = 540 * 100 / 13 = 4153.8
    Nearest power of 2: 8192

So we have configured 8192 as the EC pool PG count. But we are getting "health HEALTH_WARN; too few pgs per osd (15 < min 20)". We checked all the OSDs and each has more than 80 PGs. What's wrong here?

- Regards, K. Mohamed Pakkeer
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
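One plausible reading of the numbers -- assuming the warning simply divides the raw PG count by the OSD count and does not multiply by the pool's K+M width -- is:

    8192 / 540       ≈ 15    (what the warning reports)
    8192 * 13 / 540  ≈ 197   (PG shards actually stored per OSD)

That would explain both the "15 < min 20" warning and the observation that every OSD holds far more than 80 PGs.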
Re: [ceph-users] Reply: Re: can not add osd
Just wondering if this was ever resolved -- I am seeing the exact same issue when I moved from CentOS 6.5 (firefly) to CentOS 7 (giant): using "ceph-deploy osd prepare . . ." the script fails to umount and then posts a "device is busy" message. Details are below in yang bin18's posting. Ubuntu Trusty with giant seems OK. I have redeployed the cluster and also tried deploying on virtual machines as well as physical ones. The setup is minimal: 3 x OSD nodes with one monitor node.

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of yang.bi...@zte.com.cn
Sent: Monday, December 22, 2014 2:58 AM
To: Karan Singh
Cc: ceph-users
Subject: [ceph-users] Reply: Re: can not add osd

Hi, I have deployed the ceph osd according to the official Ceph docs, and the same error came out again.

From: Karan Singh karan.si...@csc.fi
To: yang.bi...@zte.com.cn
Cc: ceph-users ceph-users@lists.ceph.com
Date: 2014/12/16 22:51
Subject: Re: [ceph-users] can not add osd

Hi, your logs do not provide much information. If you are following any other documentation for Ceph, I would recommend that you follow the official Ceph docs: http://ceph.com/docs/master/start/quick-start-preflight/

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science, Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758, tel. +358 9 4572001, fax +358 9 4572302
http://www.csc.fi/

On 16 Dec 2014, at 09:55, yang.bi...@zte.com.cn wrote:

Hi, when I execute "ceph-deploy osd prepare node3:/dev/sdb", an error like this always comes out:

[node3][WARNIN] INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.u2KXW3
[node3][WARNIN] umount: /var/lib/ceph/tmp/mnt.u2KXW3: target is busy.

Then I execute "/bin/umount -- /var/lib/ceph/tmp/mnt.u2KXW3" by hand, and the result is ok.

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
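When a freshly created OSD mount refuses to unmount like this, it helps to see what is holding it before retrying (a sketch using standard tools, with the mount point taken from the log above):

    fuser -vm /var/lib/ceph/tmp/mnt.u2KXW3    # list processes using the mount
    lsof +D /var/lib/ceph/tmp/mnt.u2KXW3      # alternative view; can be slow on large trees

On CentOS 7 a udev-triggered scan of the new partition is a plausible culprit, which would also explain why the umount succeeds a moment later when run by hand, exactly as described below.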
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hello, your OSDs do not have weights; please assign some weight to your ceph cluster OSDs, as Udo said in his last comment:

osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map

sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

Regards, Vikhyat

On 02/10/2015 06:11 PM, B L wrote:

Hello Udo, Thanks for your answer .. 2 questions here:

1- Does what you say mean that I have to remove my drive devices (8GB each) and add new ones with at least 10GB?
2- Shall I manually re-weight after disk creation and preparation using this command (ceph osd reweight osd.2 1.0), or will things work automatically without too much fuss when the disk drives are 10GB or bigger?

Beanos

On Feb 10, 2015, at 2:26 PM, Udo Lembke ulem...@polarzone.de wrote:

Hi, you will get further trouble because your weights are not correct. You need a weight >= 0.01 for each OSD. This means each OSD must be 10GB or greater!

Udo

Am 10.02.2015 12:22, schrieb B L:

Hi Vickie, My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id    weight  type name        up/down  reweight
-1      0       root default
-2      0       host ceph-node1
0       0       osd.0            up       1
1       0       osd.1            up       1
-3      0       host ceph-node3
2       0       osd.2            up       1
3       0       osd.3            up       1
-4      0       host ceph-node2
4       0       osd.4            up       1
5       0       osd.5            up       1

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi, you will get further trouble because your weights are not correct. You need a weight >= 0.01 for each OSD. This means each OSD must be 10GB or greater!

Udo

Am 10.02.2015 12:22, schrieb B L:

Hi Vickie, My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id    weight  type name        up/down  reweight
-1      0       root default
-2      0       host ceph-node1
0       0       osd.0            up       1
1       0       osd.1            up       1
-3      0       host ceph-node3
2       0       osd.2            up       1
3       0       osd.3            up       1
-4      0       host ceph-node2
4       0       osd.4            up       1
5       0       osd.5            up       1

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
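Udo's numbers line up with the usual convention of one CRUSH weight unit per TB of disk, displayed to two decimal places (the two-decimal rounding is an assumption here, but it is consistent with the 10GB threshold):

    8 GB:   8 / 1024  ≈ 0.0078  -> displays as 0.00
    10 GB: 10 / 1024  ≈ 0.0098  -> ≈ 0.01, the smallest size that survives rounding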
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi, to add to Udo's point: do remember that by default journals take ~6GB. For this reason I suggest making the virtual disks larger than 20GB for testing, although that is slightly bigger than absolutely necessary.

Best regards, Owen

On 02/10/2015 01:26 PM, Udo Lembke wrote:

Hi, you will get further trouble because your weights are not correct. You need a weight >= 0.01 for each OSD. This means each OSD must be 10GB or greater!

Udo

Am 10.02.2015 12:22, schrieb B L:

Hi Vickie, My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id    weight  type name        up/down  reweight
-1      0       root default
-2      0       host ceph-node1
0       0       osd.0            up       1
1       0       osd.1            up       1
-3      0       host ceph-node3
2       0       osd.2            up       1
3       0       osd.3            up       1
-4      0       host ceph-node2
4       0       osd.4            up       1
5       0       osd.5            up       1

--
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg), Maxfeldstraße 5, 90409 Nürnberg, Germany

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
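For tiny test disks the journal can also be shrunk instead of growing the disks; a sketch (osd journal size is the standard option, in MB, and must be set before the OSDs are created -- the 1 GB value here is just an illustration):

    [osd]
    osd journal size = 1024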
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hello Udo, Thanks for your answer .. 2 questions here:

1- Does what you say mean that I have to remove my drive devices (8GB each) and add new ones with at least 10GB?
2- Shall I manually re-weight after disk creation and preparation using this command (ceph osd reweight osd.2 1.0), or will things work automatically without too much fuss when the disk drives are 10GB or bigger?

Beanos

On Feb 10, 2015, at 2:26 PM, Udo Lembke ulem...@polarzone.de wrote:

Hi, you will get further trouble because your weights are not correct. You need a weight >= 0.01 for each OSD. This means each OSD must be 10GB or greater!

Udo

Am 10.02.2015 12:22, schrieb B L:

Hi Vickie, My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id    weight  type name        up/down  reweight
-1      0       root default
-2      0       host ceph-node1
0       0       osd.0            up       1
1       0       osd.1            up       1
-3      0       host ceph-node3
2       0       osd.2            up       1
3       0       osd.3            up       1
-4      0       host ceph-node2
4       0       osd.4            up       1
5       0       osd.5            up       1

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi, maybe the other way around: <name> <float>, i.e. osd.0 0.0095.

Kind regards, Micha Kersloot

Stay up to date and receive the latest tips about Zimbra/KovoKs Contact: http://twitter.com/kovoks
KovoKs B.V. is registered under KvK number: 1104

From: B L super.itera...@gmail.com
To: Vikhyat Umrao vum...@redhat.com, Udo Lembke ulem...@polarzone.de
Cc: ceph-users@lists.ceph.com
Sent: Tuesday, February 10, 2015 3:01:34 PM
Subject: Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

Thanks Vikhyat, As suggested ..

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
Invalid command: osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map
Error EINVAL: invalid command

What do you think?

On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:

sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Hi, use:

ceph osd crush set 0 0.01 pool=default host=ceph-node1
ceph osd crush set 1 0.01 pool=default host=ceph-node1
ceph osd crush set 2 0.01 pool=default host=ceph-node3
ceph osd crush set 3 0.01 pool=default host=ceph-node3
ceph osd crush set 4 0.01 pool=default host=ceph-node2
ceph osd crush set 5 0.01 pool=default host=ceph-node2

Udo

Am 10.02.2015 15:01, schrieb B L:

Thanks Vikhyat, As suggested ..

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
Invalid command: osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map
Error EINVAL: invalid command

What do you think?

On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:

sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in
Oh, I misplaced the positions of the OSD name and the weight. It should be:

ceph osd crush reweight osd.0 0.0095

and so on ..

Regards, Vikhyat

On 02/10/2015 07:31 PM, B L wrote:

Thanks Vikhyat, As suggested ..

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
Invalid command: osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map
Error EINVAL: invalid command

What do you think?

On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:

sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
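Putting the corrected syntax together for all six OSDs (a sketch; 0.0095 is the weight suggested above):

    for i in 0 1 2 3 4 5; do
        ceph osd crush reweight osd.$i 0.0095
    done
    ceph osd tree    # the weight column should now be non-zero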