Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

2015-02-10 Thread Alexandre DERUMIER
Hi,

You need to import the foreign config from the OpenManage web UI; it is
somewhere under the storage controller section.
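If you prefer the CLI, something along these lines may also work. This is a
sketch from memory (I have not verified the exact omconfig syntax on an
H710P, and controller=0 is just an example id):

# list controllers to find the right id
omreport storage controller
# import the foreign configuration seen by controller 0
omconfig storage controller action=importforeignconfig controller=0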



BTW, I'm currently testing a new Dell R630 with a PERC H330 (LSI 3008).

With this controller, it's possible to do hardware RAID for some disks and
passthrough for the other disks.

So, perfect for ceph :)


- Original message -
From: pixelfairy pixelfa...@gmail.com
To: ceph-users ceph-us...@ceph.com
Sent: Tuesday, 10 February 2015 11:38:48
Subject: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

I'm stuck with these servers with Dell PERC 710p RAID cards. 8 bays; we're
looking at a pair of 256 GB SSDs in RAID 1 for / and the journals, with the
rest as the 4 TB SAS drives we already have.

Since that card refuses JBOD, we made them all single-disk RAID 0, then
pulled one as a test. Putting it back, its state is "foreign" and there
doesn't seem to be anything that can change this from omconfig, the OM web
UI, or the iDRAC web UI. Is there any way to restore this? Does the machine
need to be rebooted to fix it?

lspci reports these to be LSI MegaRAID 2208 "Thunderbolt". Has anyone here
used the PERC 710p with the MegaRAID utilities under Ubuntu 14.04?

If it really is only fixable from the BIOS (needing a reboot), we're looking
at putting in LSI 9211 cards.


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Thanks, everyone!!

After applying the re-weighting command (ceph osd crush reweight osd.0 0.0095), 
my cluster is getting healthy now :))

But I have one question: if I have hundreds of OSDs, shall I do the
re-weighting on each device, or is there some way to make this happen
automatically? In other words, why would I need to do the weighting in the
first place?
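In case it helps anyone hitting the same zero-weight issue, a minimal shell
sketch (assuming six OSDs numbered osd.0 to osd.5 that should all get the
same small weight, which is only true for identical test disks):

# set a small CRUSH weight on every OSD of this test cluster
for i in 0 1 2 3 4 5; do
    ceph osd crush reweight osd.$i 0.0095
done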




 On Feb 10, 2015, at 4:00 PM, Vikhyat Umrao vum...@redhat.com wrote:
 
 Oh, I have misplaced the osd name and the weight
 
 ceph osd crush reweight osd.0 0.0095  and so on ..
 
 Regards,
 Vikhyat
 
 On 02/10/2015 07:31 PM, B L wrote:
 Thanks Vikhyat,
 
 As suggested .. 
 
 ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
 
 Invalid command:  osd.0 doesn't represent a float
 osd crush reweight <name> <float[0.0-]> :  change <name>'s weight to 
 <weight> in crush map
 Error EINVAL: invalid command
 
 What do you think?
 
 
 On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com 
 mailto:vum...@redhat.com wrote:
 
 sudo ceph osd crush reweight 0.0095 osd.0 to osd.5
 
 



[ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-02-10 Thread 杨万元
Hello!
We use Ceph + OpenStack in our private cloud. Recently we upgraded our
CentOS 6.5 based cluster from Ceph Emperor to Ceph Firefly.
At first we used the Red Hat EPEL yum repo to upgrade; that Ceph version is
0.80.5. We first upgraded the monitors, then the OSDs, and finally the
clients. When we completed this upgrade, we booted a VM on the cluster and
used fio to test the IO performance. The IO performance was as good as
before. Everything was OK!
Then we upgraded the cluster from 0.80.5 to 0.80.8. When that was completed,
we rebooted the VM to load the newest librbd, and again used fio to test the
IO performance. We found that randwrite and write are as good as before, but
randread and read have become worse: randread IOPS dropped from 4000-5000 to
300-400 and the latency got worse, and read bandwidth dropped from around
430 MB/s to 115 MB/s. I then downgraded the ceph client version from 0.80.8
to 0.80.5, and the results became normal again.
So I think it is probably something in librbd. I compared the 0.80.8 release
notes with 0.80.5 (http://ceph.com/docs/master/release-notes/#v0-80-8-firefly)
and the only read-related change I found in 0.80.8 is "librbd: cap memory
utilization for read requests (Jason Dillaman)". Who can explain this?


My ceph cluster is 400 OSDs, 5 mons:
ceph -s
 health HEALTH_OK
 monmap e11: 5 mons at {BJ-M1-Cloud71=
172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0},
election epoch 198, quorum 0,1,2,3,4
BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85
 osdmap e120157: 400 osds: 400 up, 400 in
 pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects
41084 GB used, 323 TB / 363 TB avail
   29288 active+clean
 client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s


The following is my ceph client conf:
 [global]
 auth_service_required = cephx
 filestore_xattr_use_omap = true
 auth_client_required = cephx
 auth_cluster_required = cephx
 mon_host =
 172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73
 mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55,
ZR-F8-Cloud58, ZR-F9-Cloud73
 fsid = c01c8e28-304e-47a4-b876-cb93acc2e980
 mon osd full ratio = .85
 mon osd nearfull ratio = .75
 public network = 172.29.204.0/24
 mon warn on legacy crush tunables = false

 [osd]
 osd op threads = 12
 filestore journal writeahead = true
 filestore merge threshold = 40
 filestore split multiple = 8

 [client]
 rbd cache = true
 rbd cache writethrough until flush = false
 rbd cache size = 67108864
 rbd cache max dirty = 50331648
 rbd cache target dirty = 33554432

 [client.cinder]
 admin socket = /var/run/ceph/rbd-$pid.asok



My VM is 8 cores / 16 GB RAM; the fio scripts we use are:
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=write -size=60G
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
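To double-check what the client-side cache is doing during these runs, the
admin socket configured in [client.cinder] above can be queried from the
hypervisor. This is only a sketch, assuming the rbd-$pid.asok socket is
actually being created:

# find the librbd admin sockets created by the client
ls /var/run/ceph/rbd-*.asok
# dump the client performance counters (librbd / objectcacher stats)
ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok perf dump
# confirm which rbd cache settings the client actually loaded
ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd_cache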

 The following is the IO test result:
 ceph client version: 0.80.5
 read:      bw=430MB
 write:     bw=420MB
 randread:  iops=4875   latency=65ms
 randwrite: iops=6844   latency=46ms

 ceph client version: 0.80.8
 read:      bw=115MB
 write:     bw=480MB
 randread:  iops=381    latency=83ms
 randwrite: iops=4843   latency=68ms


Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

2015-02-10 Thread pixelfairy
It turns out you can do some of this with omconfig as long as you enable
auto-import in the card's BIOS utility. You still need the web UI to turn
the new disk into a usable block device.

Have you been able to automate the whole recovery process? I'd like to just
put the new disk in and have the system notice it and automatically set it
up.

On Tue, Feb 10, 2015 at 7:28 AM, Don Doerner don.doer...@quantum.com wrote:
 I've been involved on projects over many years that use the MegaRAID 2208, in 
 any of many forms, including the Dell H710.  Without resorting to a 
 BIOS-time utility, I only know of one way to manage them: MegaCLI.  Look 
 around on the web a bit: you can download it from LSI, and there is full 
 documentation on-line also.
 I am presently doing some prototyping work with Ceph using these RAID 
 controllers (H710, not H710P, but that's a negligible difference; and H810), 
 but I haven't started investigating failure scenarios yet...
 ___
 Don Doerner
 Technical Director, Advanced Projects
 Quantum Corporation


 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 pixelfairy
 Sent: 10 February, 2015 02:39
 To: ceph-users
 Subject: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

 Im stuck with these servers with dell perc 710p raid cards. 8 bays, looking 
 at a pair of 256gig ssds in raid 1 for / and journals, the rest as 4tb sas we 
 already have.

 since that card refuses jbod, we made them all single disk raid0, then pulled 
 one as a test. putting it back, its state is foreign and there doesnt seem 
 to be anything that can change this from omconfig, the om web ui, or idrac 
 web ui. is there any way to restore this? does the machine need to be 
 rebooted to fix it?

 lspci reports these to be lsi mega raid 2208 thunderbolt, has anyone here 
 used the perc 710p with the mega raid utilities under ubuntu 14.04?

 if it really is only fixable from the bios (needing to reboot), were looking 
 at putting in lsi 9211 cards



Re: [ceph-users] ceph Performance vs PG counts

2015-02-10 Thread Vikhyat Umrao

Hi,

Just a heads up: I hope you are aware of this tool:
http://ceph.com/pgcalc/

Regards,
Vikhyat

On 02/11/2015 09:11 AM, Sumit Gaur wrote:

Hi ,
I am not sure why PG numbers have not been given more importance in the 
ceph documents; I am seeing huge variation in performance numbers when 
changing PG counts.

Just an example

Without SSD:
36 OSD HDD = PG count 2048 gives me random write (1024K block size) 
performance of 550 MBps


With SSD:
6 SSD for journals + 24 OSD HDD = PG count 2048 gives me random write 
(1024K block size) performance of 250 MBps

If I change it to
6 SSD for journals + 24 OSD HDD = PG count 512 gives me random write 
(1024K block size) performance of 700 MBps


This variation with PG numbers makes the SSD setup look bad. I am a bit 
confused by this behaviour.


Thanks
sumit




On Mon, Feb 9, 2015 at 11:36 AM, Gregory Farnum g...@gregs42.com wrote:


On Sun, Feb 8, 2015 at 6:00 PM, Sumit Gaur sumitkg...@gmail.com wrote:
 Hi
 I have installed 6 node ceph cluster and doing a performance
bench mark for
 the same using Nova VMs. What I have observed that FIO random
write reports
 around 250 MBps for 1M block size and PGs 4096 and 650MBps for
iM block size
 and PG counts 2048  . Can some body let me know if I am missing
any ceph
 Architecture point here ? As per my understanding PG numbers are
mainly
 involved in calculating the hash and should not effect
performance so much.

PGs are also serialization points within the codebase, so depending on
how you're testing you can run into contention if you have multiple
objects within a single PG that you're trying to write to at once.
This isn't normally a problem, but for a single benchmark run the
random collisions can become noticeable.
-Greg






Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Vickie ch
Hi
The weight reflects the capacity (and so the share of data) of each disk.
For example, the weight of a 100 GB OSD disk is 0.100 (100 GB / 1 TB).
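As a small sketch of that rule (assuming /dev/sdb is the data disk behind
osd.0; this is just the same size-in-TB arithmetic, not an official tool):

# derive a CRUSH weight of "size in TB" from the raw device size
size_bytes=$(blockdev --getsize64 /dev/sdb)
weight=$(echo "scale=4; $size_bytes / (1000^4)" | bc)
ceph osd crush reweight osd.0 "$weight"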


Best wishes,
Vickie

2015-02-10 22:25 GMT+08:00 B L super.itera...@gmail.com:

 Thanks, everyone!!

 After applying the re-weighting command (*ceph osd crush reweight osd.0
 0.0095*), my cluster is getting healthy now :))

 But I have one question: if I have hundreds of OSDs, shall I do the
 re-weighting on each device, or is there some way to make this happen
 automatically? In other words, why would I need to do the weighting in the
 first place?




 On Feb 10, 2015, at 4:00 PM, Vikhyat Umrao vum...@redhat.com wrote:

  Oh, I have misplaced the osd name and the weight

 ceph osd crush reweight osd.0 0.0095  and so on ..

 Regards,
 Vikhyat

  On 02/10/2015 07:31 PM, B L wrote:

 Thanks Vikhyat,

  As suggested ..

  ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0

   Invalid command:  osd.0 doesn't represent a float
  osd crush reweight <name> <float[0.0-]> :  change <name>'s weight to
  <weight> in crush map
  Error EINVAL: invalid command

   What do you think?


  On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:

 sudo ceph osd crush reweight 0.0095 osd.0 to osd.5







[ceph-users] wider rados namespace support?

2015-02-10 Thread Blair Bethwaite
Just came across this in the docs:
Currently (i.e., firefly), namespaces are only useful for
applications written on top of librados. Ceph clients such as block
device, object storage and file system do not currently support this
feature.

Then found:
https://wiki.ceph.com/Planning/Sideboard/rbd%3A_namespace_support

Is there any progress or plans to address this (particularly for rbd
clients but also cephfs)?

-- 
Cheers,
~Blairo


Re: [ceph-users] ceph Performance vs PG counts

2015-02-10 Thread Sumit Gaur
Hi ,
I am not sure why PG numbers have not been given more importance in the
ceph documents; I am seeing huge variation in performance numbers when
changing PG counts.
Just an example

Without SSD:
36 OSD HDD = PG count 2048 gives me random write (1024K block size) performance of
550 MBps

With SSD:
6 SSD for journals + 24 OSD HDD = PG count 2048 gives me random write
(1024K block size) performance of 250 MBps
If I change it to
6 SSD for journals + 24 OSD HDD = PG count 512 gives me random write
(1024K block size) performance of 700 MBps

This variation with PG numbers makes the SSD setup look bad. I am a bit confused
by this behaviour.

Thanks
sumit




On Mon, Feb 9, 2015 at 11:36 AM, Gregory Farnum g...@gregs42.com wrote:

 On Sun, Feb 8, 2015 at 6:00 PM, Sumit Gaur sumitkg...@gmail.com wrote:
  Hi
  I have installed 6 node ceph cluster and doing a performance bench mark
 for
  the same using Nova VMs. What I have observed that FIO random write
 reports
  around 250 MBps for 1M block size and PGs 4096 and 650MBps for iM block
 size
  and PG counts 2048  . Can some body let me know if I am missing any ceph
  Architecture point here ? As per my understanding PG numbers are mainly
  involved in calculating the hash and should not effect performance so
 much.

 PGs are also serialization points within the codebase, so depending on
 how you're testing you can run into contention if you have multiple
 objects within a single PG that you're trying to write to at once.
 This isn't normally a problem, but for a single benchmark run the
 random collisions can become noticeable.
 -Greg



[ceph-users] combined ceph roles

2015-02-10 Thread David Graham
Hello, I'm giving thought to a minimal-footprint scenario with full
redundancy. I realize it isn't ideal -- and may impact overall performance
-- but I am wondering whether the example below would work, be supported,
or be known to cause issues?

Example, 3x hosts each running:
-- OSD's
-- Mon
-- Client


I thought I read a post a while back about Client+OSD on the same host
possibly being an issue -- but I am having difficulty finding that
reference.

I would appreciate if anyone has insight into such a setup,

thanks!


Re: [ceph-users] Ceph vs Hardware RAID: No battery backed cache

2015-02-10 Thread Mark Kirkwood

On 10/02/15 20:40, Thomas Güttler wrote:

Hi,

does the lack of a battery backed cache in Ceph introduce any
disadvantages?

We use PostgreSQL and our servers have UPS.

But I want to survive a power outage, although it is unlikely. But hope
is not an option ...



You can certainly make use of adapter cards that have a battery-backed 
cache with Ceph - either using RAID as usual, or creating single-disk RAID 0 
arrays that let you use the nice battery-backed cache + writeback options 
on the card and still have a 1-OSD-to-1-disk topology.


Without such cards it is still quite possible to have a power-loss-safe 
setup. These days (with reasonably modern 3.* kernels), using SATA or SAS 
plus mount options that do *not* disable write barriers will leave you 
with a consistent, safe state in the event of power loss. You might 
want to test your SATA disk of choice to be sure, but SAS should be safe!
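As an illustration only (a hypothetical fstab entry, not taken from this
thread): keeping barriers is simply a matter of not adding nobarrier to the
OSD mount options, e.g. for an XFS-backed OSD:

# /etc/fstab -- write barriers stay enabled because nobarrier is NOT passed
/dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  rw,noatime,inode64  0 0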


Cheers

Mark



[ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
I am having a problem with my fresh, non-healthy cluster; my cluster status 
summary shows this:

ceph@ceph-node1:~$ ceph -s

cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs 
stuck unclean; pool data pg_num 128 > pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
quorum 0 ceph-node1
 osdmap e25: 6 osds: 6 up, 6 in
  pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
198 MB used, 18167 MB / 18365 MB avail
 192 incomplete
  64 creating+incomplete


Where shall I start troubleshooting this?

P.S. I’m new to CEPH.

Thanks!
Beanos


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L

 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 Thanks for your reply!
 
 You can find the dump in this link:
 
  https://gist.github.com/anonymous/706d4a1ec81c93fd1eca
 
 Thanks!
 B.
 
 
 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
Would you post the result of $ceph osd dump?
 
 Best wishes,
 Vickie
 
 2015-02-10 16:36 GMT+08:00 B L super.itera...@gmail.com:
 Having problem with my fresh non-healthy cluster, my cluster status summary 
 shows this:
 
 ceph@ceph-node1:~$ ceph -s
 
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
   health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs 
  stuck unclean; pool data pg_num 128 > pgp_num 64
   monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
  quorum 0 ceph-node1
  osdmap e25: 6 osds: 6 up, 6 in
   pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
  192 incomplete
   64 creating+incomplete
 
 
 Where shall I start troubleshooting this?
 
 P.S. I’m new to CEPH.
 
 Thanks!
 Beanos
 
 
 
 



Re: [ceph-users] ISCSI LIO hang after 2-3 days of working

2015-02-10 Thread Nick Fisk
Hi Mike,

I can also seem to reproduce this behaviour. If I shutdown a Ceph node, the
delay while Ceph works out that the OSD's are down seems to trigger similar
error messages. It seems fairly reliable that if an OSD is down for more than
10 seconds, LIO will have this problem.

Below is an excerpt from the kernel log, showing the OSD's going down,
timing out and then coming back up. Even when the OSD's are back up ESXi
never seems to be able to resume IO to the LUN (at least not within 24
hours).

However if you remember I am working on the Active/Standby ALUA solution,
when I promote the other LIO node to active for the LUN, access resumes
immediately. 

This is a test/dev cluster for my ALUA resource agents, so please let me
know if you need me to run any commands. Ideally it would be best to make
LIO recover cleanly, but extending the timeout if possible would probably
help in the majority of day to day cases.

Feb  9 16:33:34 ceph-iscsi1 kernel: [ 3656.412071] libceph: osd1
10.3.2.31:6800 socket closed (con state OPEN)
Feb  9 16:33:34 ceph-iscsi1 kernel: [ 3656.418621] libceph: osd0
10.3.2.31:6805 socket closed (con state OPEN)
Feb  9 16:33:45 ceph-iscsi1 kernel: [ 3666.927352] ABORT_TASK: Found
referenced iSCSI task_tag: 26731
Feb  9 16:33:45 ceph-iscsi1 kernel: [ 3666.927356] ABORT_TASK: ref_tag:
26731 already complete, skipping
Feb  9 16:33:45 ceph-iscsi1 kernel: [ 3666.927369] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26731
Feb  9 16:33:56 ceph-iscsi1 kernel: [ 3678.230684] libceph: osd0 down
Feb  9 16:33:56 ceph-iscsi1 kernel: [ 3678.230689] libceph: osd1 down
Feb  9 16:33:57 ceph-iscsi1 kernel: [ 3679.279227] ABORT_TASK: Found
referenced iSCSI task_tag: 26733
Feb  9 16:33:57 ceph-iscsi1 kernel: [ 3679.279855] ABORT_TASK: Sending
TMR_FUNCTION_COMPLETE for ref_tag: 26733
Feb  9 16:33:57 ceph-iscsi1 kernel: [ 3679.279861] ABORT_TASK: Found
referenced iSCSI task_tag: 26735
Feb  9 16:33:57 ceph-iscsi1 kernel: [ 3679.279862] ABORT_TASK: ref_tag:
26735 already complete, skipping
Feb  9 16:33:57 ceph-iscsi1 kernel: [ 3679.279863] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26735
Feb  9 16:34:14 ceph-iscsi1 kernel: [ 3696.143515] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26739
Feb  9 16:34:28 ceph-iscsi1 kernel: [ 3709.731706] ABORT_TASK: Found
referenced iSCSI task_tag: 26745
Feb  9 16:34:28 ceph-iscsi1 kernel: [ 3709.731711] ABORT_TASK: ref_tag:
26745 already complete, skipping
Feb  9 16:34:28 ceph-iscsi1 kernel: [ 3709.731712] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26745
Feb  9 16:34:37 ceph-iscsi1 kernel: [ 3718.905787] libceph: osd1 up
Feb  9 16:34:40 ceph-iscsi1 kernel: [ 3721.904940] libceph: osd0 up
Feb  9 16:34:55 ceph-iscsi1 kernel: [ 3736.924123] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26754
Feb  9 16:35:22 ceph-iscsi1 kernel: [ 3764.102399] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26762
Feb  9 16:35:36 ceph-iscsi1 kernel: [ 3777.689640] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26768
Feb  9 16:36:03 ceph-iscsi1 kernel: [ 3804.866891] ABORT_TASK: Found
referenced iSCSI task_tag: 26777
Feb  9 16:36:03 ceph-iscsi1 kernel: [ 3804.866896] ABORT_TASK: ref_tag:
26777 already complete, skipping
Feb  9 16:36:03 ceph-iscsi1 kernel: [ 3804.866897] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26777
Feb  9 16:36:30 ceph-iscsi1 kernel: [ 3832.011008] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x000a
Feb  9 16:36:30 ceph-iscsi1 kernel: [ 3832.011137] ABORT_TASK: Found
referenced iSCSI task_tag: 26792
Feb  9 16:36:30 ceph-iscsi1 kernel: [ 3832.011139] ABORT_TASK: ref_tag:
26792 already complete, skipping
Feb  9 16:36:30 ceph-iscsi1 kernel: [ 3832.011141] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26792
Feb  9 16:36:44 ceph-iscsi1 kernel: [ 3845.613219] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x0014
Feb  9 16:36:57 ceph-iscsi1 kernel: [ 3859.215023] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x001e
Feb  9 16:37:11 ceph-iscsi1 kernel: [ 3872.791732] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x0028
Feb  9 16:37:11 ceph-iscsi1 kernel: [ 3872.791826] ABORT_TASK: Found
referenced iSCSI task_tag: 26809
Feb  9 16:37:11 ceph-iscsi1 kernel: [ 3872.791827] ABORT_TASK: ref_tag:
26809 already complete, skipping
Feb  9 16:37:11 ceph-iscsi1 kernel: [ 3872.791828] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 26809
Feb  9 16:37:24 ceph-iscsi1 kernel: [ 3886.378032] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x0032
Feb  9 16:37:38 ceph-iscsi1 kernel: [ 3899.958060] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x003c
Feb  9 16:37:38 ceph-iscsi1 kernel: [ 3899.958144] ABORT_TASK: Found
referenced iSCSI task_tag: 26819


Nick



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Mike Christie
Sent: 06 February 2015 04:09

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Here is the updated direct copy/paste dump

ceph@ceph-node1:~$ ceph osd dump
epoch 25
fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
created 2015-02-08 16:59:07.050875
modified 2015-02-09 22:35:33.191218
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins 
pg_num 128 pgp_num 64 last_change 24 flags hashpspool crash_replay_interval 45 
stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 6
osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval 
[0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739 
172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval [0,0) 
172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279 
172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval 
[0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517 
172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval 
[0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043 
172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval 
[0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106 
172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval 
[0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 
172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3


 On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote:
 
 
 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 Thanks for your reply!
 
 You can find the dump in this link:
 
  https://gist.github.com/anonymous/706d4a1ec81c93fd1eca
 
 Thanks!
 B.
 
 
 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
Would you post the result of $ceph osd dump?
 
 Best wishes,
 Vickie
 
 2015-02-10 16:36 GMT+08:00 B L super.itera...@gmail.com:
 Having problem with my fresh non-healthy cluster, my cluster status summary 
 shows this:
 
 ceph@ceph-node1:~$ ceph -s
 
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
   health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs 
  stuck unclean; pool data pg_num 128 > pgp_num 64
   monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
  quorum 0 ceph-node1
  osdmap e25: 6 osds: 6 up, 6 in
   pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
  192 incomplete
   64 creating+incomplete
 
 
 Where shall I start troubleshooting this?
 
 P.S. I’m new to CEPH.
 
 Thanks!
 Beanos
 
 
 
 
 



Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Vickie ch
Hi Beanos:
So you have 3 OSD servers and each of them has 2 disks.
I have a question: what is the result of "ceph osd tree"? It looks like the
osd status is down.


Best wishes,
Vickie

2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com:

 Here is the updated direct copy/paste dump

 ceph@ceph-node1:~$ ceph osd dump
 epoch 25
 fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 created 2015-02-08 16:59:07.050875
 modified 2015-02-09 22:35:33.191218
 flags
 pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash
 rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval
 [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739
 172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
 osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279
 172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
 osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517
 172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
 osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043
 172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
 osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106
 172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
 osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019
 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3


 On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote:


 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:

 Hi Vickie,

 Thanks for your reply!

 You can find the dump in this link:

 https://gist.github.com/anonymous/706d4a1ec81c93fd1eca

 Thanks!
 B.


 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:

 Hi Beanos:
Would you post the result of $ceph osd dump?

 Best wishes,
 Vickie

 2015-02-10 16:36 GMT+08:00 B L super.itera...@gmail.com:

 Having problem with my fresh non-healthy cluster, my cluster status
 summary shows this:

 ceph@ceph-node1:~$ ceph -s

 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
  health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256
 pgs stuck unclean; pool data pg_num 128 > pgp_num 64
  monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election
 epoch 2, quorum 0 ceph-node1
  osdmap e25: 6 osds: 6 up, 6 in
   pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
  192 incomplete
   64 creating+incomplete


 Where shall I start troubleshooting this?

 P.S. I’m new to CEPH.

 Thanks!
 Beanos









Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

2015-02-10 Thread Daniel Swarbrick
On 10/02/15 11:38, pixelfairy wrote:
 since that card refuses jbod, we made them all single disk raid0, then
 pulled one as a test. putting it back, its state is foreign and
 there doesnt seem to be anything that can change this from omconfig,
 the om web ui, or idrac web ui. is there any way to restore this? does
 the machine need to be rebooted to fix it?

On real LSI MegaRAIDs, you can use megacli to show / clear the foreign
state of drives.

The following will clear the foreign state of any foreign drives on all
controllers:

# megacli -cfgforeign -clear -aALL

I'm not sure if you can use megacli with PERC controllers, but I'm sure
there will be something very similar.
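For completeness, a possible sequence to inspect and then import (rather
than clear) the foreign config would be something like the following. Treat
it as a sketch: the exact flag spelling varies between megacli builds and I
have not verified it on a PERC.

# show any foreign configurations the controller sees
megacli -cfgforeign -scan -aALL
# preview what would be imported
megacli -cfgforeign -preview -aALL
# import the foreign config (or use -clear as above to discard it)
megacli -cfgforeign -import -aALL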



Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Hi Vickie,

My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id   weight  type name       up/down reweight
-1  0   root default
-2  0   host ceph-node1
0   0   osd.0   up  1
1   0   osd.1   up  1
-3  0   host ceph-node3
2   0   osd.2   up  1
3   0   osd.3   up  1
-4  0   host ceph-node2
4   0   osd.4   up  1
5   0   osd.5   up  1


 On Feb 10, 2015, at 1:18 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
 BTW, if your cluster just for test. You may try to reduce replica size and 
 min_size. 
 ceph osd pool set rbd size 2;ceph osd pool set data size 2;ceph osd pool set 
 metadata size 2 
 ceph osd pool set rbd min_size 1;ceph osd pool set data min_size 1;ceph osd 
 pool set metadata min_size 1
 Open another terminal and use command ceph -w watch pg and pgs status .
 
 Best wishes,
 Vickie
 
 2015-02-10 19:16 GMT+08:00 Vickie ch mika.leaf...@gmail.com:
 Hi Beanos:
 So you have 3 OSD servers and each of them have 2 disks. 
 I have a question. What result of ceph osd tree. Look like the osd status 
 is down.
 
 
 Best wishes,
 Vickie
 
 2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com:
 Here is the updated direct copy/paste dump
 
 ceph@ceph-node1:~$ ceph osd dump
 epoch 25
 fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 created 2015-02-08 16:59:07.050875
 modified 2015-02-09 22:35:33.191218
 flags
 pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool 
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval 
 [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739 
 172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
 osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279 
 172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
 osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517 
 172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
 osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043 
 172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
 osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106 
 172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
 osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 
 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3
 
 
 On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote:
 
 
 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 Thanks for your reply!
 
 You can find the dump in this link:
 
  https://gist.github.com/anonymous/706d4a1ec81c93fd1eca
 
 Thanks!
 B.
 
 
 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
Would you post the result of $ceph osd dump?
 
 Best wishes,
 Vickie
 
 2015-02-10 16:36 GMT+08:00 B L super.itera...@gmail.com:
 Having problem with my fresh non-healthy cluster, my cluster status 
 summary shows this:
 

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Vickie ch
Hi Beanos:
BTW, if your cluster is just for testing, you may try to reduce the replica
size and min_size:
ceph osd pool set rbd size 2; ceph osd pool set data size 2; ceph osd pool
set metadata size 2
ceph osd pool set rbd min_size 1; ceph osd pool set data min_size 1; ceph
osd pool set metadata min_size 1
Open another terminal and use the command "ceph -w" to watch the pg and pgs
status.
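To confirm the new values took effect, you can also query each pool
afterwards (standard commands; the pool names are the ones in this cluster):

# check the replica settings per pool
for p in data metadata rbd; do
    ceph osd pool get $p size
    ceph osd pool get $p min_size
done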

Best wishes,
Vickie

2015-02-10 19:16 GMT+08:00 Vickie ch mika.leaf...@gmail.com:

 Hi Beanos:
 So you have 3 OSD servers and each of them have 2 disks.
 I have a question. What result of ceph osd tree. Look like the osd
 status is down.


 Best wishes,
 Vickie

 2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com:

 Here is the updated direct copy/paste dump

 ceph@ceph-node1:~$ ceph osd dump
 epoch 25
 fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 created 2015-02-08 16:59:07.050875
 modified 2015-02-09 22:35:33.191218
 flags
 pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash
 rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0
 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
 stripe_width 0
 pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0
 last_clean_interval [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739
 172.31.0.84:6802/11739 172.31.0.84:6803/11739 exists,up
 765f5066-d13e-4a9e-a446-8630ee06e596
 osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval
 [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279
 172.31.0.84:6807/12279 172.31.0.84:6808/12279 exists,up
 e1d073e5-9397-4b63-8b7c-a4064e430f7a
 osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0
 last_clean_interval [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517
 172.31.3.57:6802/5517 172.31.3.57:6803/5517 exists,up
 5af5deed-7a6d-4251-aa3c-819393901d1f
 osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0
 last_clean_interval [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043
 172.31.3.57:6807/6043 172.31.3.57:6808/6043 exists,up
 958f37ab-b434-40bd-87ab-3acbd3118f92
 osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0
 last_clean_interval [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106
 172.31.3.56:6802/5106 172.31.3.56:6803/5106 exists,up
 ce5c0b86-96be-408a-8022-6397c78032be
 osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0
 last_clean_interval [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019
 172.31.3.56:6807/7019 172.31.3.56:6808/7019 exists,up
 da67b604-b32a-44a0-9920-df0774ad2ef3


 On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote:


 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:

 Hi Vickie,

 Thanks for your reply!

 You can find the dump in this link:

 https://gist.github.com/anonymous/706d4a1ec81c93fd1eca

 Thanks!
 B.


 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:

 Hi Beanos:
Would you post the result of $ceph osd dump?

 Best wishes,
 Vickie

 2015-02-10 16:36 GMT+08:00 B L super.itera...@gmail.com:

 Having problem with my fresh non-healthy cluster, my cluster status
 summary shows this:

 ceph@ceph-node1:~$ ceph -s

 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
  health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256
 pgs stuck unclean; pool data pg_num 128 > pgp_num 64
  monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election
 epoch 2, quorum 0 ceph-node1
  osdmap e25: 6 osds: 6 up, 6 in
   pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
  192 incomplete
   64 creating+incomplete


 Where shall I start troubleshooting this?

 P.S. I’m new to CEPH.

 Thanks!
 Beanos










Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
I will try to change the replication size now as you suggested .. but how is 
that related to the non-healthy cluster?


 On Feb 10, 2015, at 1:22 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 My OSD tree looks like this:
 
 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id  weight  type name   up/down reweight
 -10   root default
 -20   host ceph-node1
 0 0   osd.0   up  1
 1 0   osd.1   up  1
 -30   host ceph-node3
 2 0   osd.2   up  1
 3 0   osd.3   up  1
 -40   host ceph-node2
 4 0   osd.4   up  1
 5 0   osd.5   up  1
 
 
 On Feb 10, 2015, at 1:18 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
 BTW, if your cluster just for test. You may try to reduce replica size and 
 min_size. 
 ceph osd pool set rbd size 2;ceph osd pool set data size 2;ceph osd pool 
 set metadata size 2 
 ceph osd pool set rbd min_size 1;ceph osd pool set data min_size 1;ceph osd 
 pool set metadata min_size 1
 Open another terminal and use command ceph -w watch pg and pgs status .
 
 Best wishes,
 Vickie
 
 2015-02-10 19:16 GMT+08:00 Vickie ch mika.leaf...@gmail.com:
 Hi Beanos:
 So you have 3 OSD servers and each of them have 2 disks. 
 I have a question. What result of ceph osd tree. Look like the osd status 
 is down.
 
 
 Best wishes,
 Vickie
 
 2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com:
 Here is the updated direct copy/paste dump
 
 ceph@ceph-node1:~$ ceph osd dump
 epoch 25
 fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 created 2015-02-08 16:59:07.050875
 modified 2015-02-09 22:35:33.191218
 flags
 pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool 
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval 
 [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739 
 172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
 osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279 
 172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
 osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517 
 172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
 osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043 
 172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
 osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106 
 172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
 osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval 
 [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 
 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3
 
 
 On Feb 10, 2015, at 12:55 PM, B L super.itera...@gmail.com wrote:
 
 
 On Feb 10, 2015, at 12:37 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 Thanks for your reply!
 
 You can find the dump in this link:
 
  https://gist.github.com/anonymous/706d4a1ec81c93fd1eca
 
 Thanks!
 B.
 
 
 On Feb 10, 2015, at 12:23 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
Would you post the result of $ceph 

[ceph-users] Too few pgs per osd - Health_warn for EC pool

2015-02-10 Thread Mohamed Pakkeer
Hi

We have created an EC pool (k=10 and m=3) with 540 OSDs. We followed the
following rule to calculate the PG count for the EC pool:

Total PGs = (OSDs * 100) / pool size

where pool size is either the number of replicas for replicated pools or
the k+m sum for erasure coded pools.

Total PGs = 540 * 100 / 13 = 4153.8; nearest power of 2: 8192
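The same arithmetic as a quick sanity check (plain shell, nothing
Ceph-specific):

# suggested PG count for an EC pool: (OSDs * 100) / (k + m)
echo "540 * 100 / 13" | bc -l     # ~4153.8
# round up to the next power of two by hand: 4096 < 4153.8 <= 8192, so 8192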

So we have configured 8192 as the EC pool pg_num. But we are getting
HEALTH_WARN: too few pgs per osd (15 < min 20). We checked all the OSDs
and each has more than 80 pgs.

What's wrong here?

-
Regards
K.Mohamed Pakkeer


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Hello Vickie,

After changing the size and min_size on all the existing pools, the cluster
seems to be working and I can store objects in the cluster, but the cluster
still shows as non-healthy:

cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs degraded; 256 pgs stuck unclean; recovery 1/2 
objects degraded (50.000%); pool data pg_num 128 > pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
quorum 0 ceph-node1
 osdmap e31: 6 osds: 6 up, 6 in
  pgmap v99: 256 pgs, 3 pools, 10240 kB data, 1 objects
210 MB used, 18155 MB / 18365 MB avail
1/2 objects degraded (50.000%)
 256 active+degraded

I can see some changes like:
1- recovery 1/2 objects degraded (50.000%)
2- 1/2 objects degraded (50.000%)
3- 256 active+degraded

My questions are:
 1- What do those changes mean?
 2- How can changing the replication size cause the cluster to be unhealthy?

Thanks Vickie!
Beanos


 On Feb 10, 2015, at 1:28 PM, B L super.itera...@gmail.com wrote:
 
 I changed the size and min_size as you suggested while opening the ceph -w on 
 a different window, and I got this:
 
 
 ceph@ceph-node1:~$ ceph -w
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
  health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs 
 stuck unclean; pool data pg_num 128 > pgp_num 64
  monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
 quorum 0 ceph-node1
  osdmap e25: 6 osds: 6 up, 6 in
   pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
 198 MB used, 18167 MB / 18365 MB avail
  192 incomplete
   64 creating+incomplete
 
 2015-02-10 11:22:24.421000 mon.0 [INF] osdmap e26: 6 osds: 6 up, 6 in
 2015-02-10 11:22:24.425906 mon.0 [INF] pgmap v83: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:25.432950 mon.0 [INF] osdmap e27: 6 osds: 6 up, 6 in
 2015-02-10 11:22:25.437626 mon.0 [INF] pgmap v84: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:26.449640 mon.0 [INF] osdmap e28: 6 osds: 6 up, 6 in
 2015-02-10 11:22:26.454749 mon.0 [INF] pgmap v85: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:27.474113 mon.0 [INF] pgmap v86: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:31.770385 mon.0 [INF] pgmap v87: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:41.695656 mon.0 [INF] osdmap e29: 6 osds: 6 up, 6 in
 2015-02-10 11:22:41.700296 mon.0 [INF] pgmap v88: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:42.712288 mon.0 [INF] osdmap e30: 6 osds: 6 up, 6 in
 2015-02-10 11:22:42.716877 mon.0 [INF] pgmap v89: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:43.723701 mon.0 [INF] osdmap e31: 6 osds: 6 up, 6 in
 2015-02-10 11:22:43.732035 mon.0 [INF] pgmap v90: 256 pgs: 192 incomplete, 64 
 creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
 2015-02-10 11:22:46.774217 mon.0 [INF] pgmap v91: 256 pgs: 256 
 active+degraded; 0 bytes data, 199 MB used, 18165 MB / 18365 MB avail
 2015-02-10 11:23:08.232686 mon.0 [INF] pgmap v92: 256 pgs: 256 
 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
 2015-02-10 11:23:27.767358 mon.0 [INF] pgmap v93: 256 pgs: 256 
 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
 2015-02-10 11:23:40.769794 mon.0 [INF] pgmap v94: 256 pgs: 256 
 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
 2015-02-10 11:23:45.530713 mon.0 [INF] pgmap v95: 256 pgs: 256 
 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
 
 
 
 On Feb 10, 2015, at 1:24 PM, B L super.itera...@gmail.com wrote:
 
 I will try to change the replication size now as you suggested .. but how is 
 that related to the non-healthy cluster?
 
 
 On Feb 10, 2015, at 1:22 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 My OSD tree looks like this:
 
 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id   weight  type name       up/down reweight
 -1  0   root default
 -2  0   host ceph-node1
 0   0   osd.0   up  1
 1   0   osd.1   up  1
 -3  0   host ceph-node3
 2   0   osd.2   up  1
 3   0   osd.3   up  1
 -4  0   host ceph-node2
 4   0   osd.4   up  1
 5   0   osd.5   up  1
 
 
 On Feb 10, 2015, 

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
I changed the size and min_size as you suggested while opening the ceph -w on a 
different window, and I got this:


ceph@ceph-node1:~$ ceph -w
cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs 
stuck unclean; pool data pg_num 128 > pgp_num 64
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, 
quorum 0 ceph-node1
 osdmap e25: 6 osds: 6 up, 6 in
  pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
198 MB used, 18167 MB / 18365 MB avail
 192 incomplete
  64 creating+incomplete

2015-02-10 11:22:24.421000 mon.0 [INF] osdmap e26: 6 osds: 6 up, 6 in
2015-02-10 11:22:24.425906 mon.0 [INF] pgmap v83: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:25.432950 mon.0 [INF] osdmap e27: 6 osds: 6 up, 6 in
2015-02-10 11:22:25.437626 mon.0 [INF] pgmap v84: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:26.449640 mon.0 [INF] osdmap e28: 6 osds: 6 up, 6 in
2015-02-10 11:22:26.454749 mon.0 [INF] pgmap v85: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:27.474113 mon.0 [INF] pgmap v86: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:31.770385 mon.0 [INF] pgmap v87: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:41.695656 mon.0 [INF] osdmap e29: 6 osds: 6 up, 6 in
2015-02-10 11:22:41.700296 mon.0 [INF] pgmap v88: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:42.712288 mon.0 [INF] osdmap e30: 6 osds: 6 up, 6 in
2015-02-10 11:22:42.716877 mon.0 [INF] pgmap v89: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:43.723701 mon.0 [INF] osdmap e31: 6 osds: 6 up, 6 in
2015-02-10 11:22:43.732035 mon.0 [INF] pgmap v90: 256 pgs: 192 incomplete, 64 
creating+incomplete; 0 bytes data, 198 MB used, 18167 MB / 18365 MB avail
2015-02-10 11:22:46.774217 mon.0 [INF] pgmap v91: 256 pgs: 256 active+degraded; 
0 bytes data, 199 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:08.232686 mon.0 [INF] pgmap v92: 256 pgs: 256 active+degraded; 
0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:27.767358 mon.0 [INF] pgmap v93: 256 pgs: 256 active+degraded; 
0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:40.769794 mon.0 [INF] pgmap v94: 256 pgs: 256 active+degraded; 
0 bytes data, 200 MB used, 18165 MB / 18365 MB avail
2015-02-10 11:23:45.530713 mon.0 [INF] pgmap v95: 256 pgs: 256 active+degraded; 
0 bytes data, 200 MB used, 18165 MB / 18365 MB avail



 On Feb 10, 2015, at 1:24 PM, B L super.itera...@gmail.com wrote:
 
 I will try to change the replication size now as you suggested .. but how is 
 that related to the non-healthy cluster?
 
 
 On Feb 10, 2015, at 1:22 PM, B L super.itera...@gmail.com wrote:
 
 Hi Vickie,
 
 My OSD tree looks like this:
 
 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id weight  type name   up/down reweight
 -1   0   root default
 -2   0   host ceph-node1
 0    0   osd.0   up  1
 1    0   osd.1   up  1
 -3   0   host ceph-node3
 2    0   osd.2   up  1
 3    0   osd.3   up  1
 -4   0   host ceph-node2
 4    0   osd.4   up  1
 5    0   osd.5   up  1
 
 
 On Feb 10, 2015, at 1:18 PM, Vickie ch mika.leaf...@gmail.com wrote:
 
 Hi Beanos:
 BTW, if your cluster just for test. You may try to reduce replica size and 
 min_size. 
 ceph osd pool set rbd size 2;ceph osd pool set data size 2;ceph osd pool 
 set metadata size 2 
 ceph osd pool set rbd min_size 1;ceph osd pool set data min_size 1;ceph 
 osd pool set metadata min_size 1
 Open another terminal and use command ceph -w watch pg and pgs status .
 
 Best wishes,
 Vickie
 
 2015-02-10 19:16 GMT+08:00 Vickie ch mika.leaf...@gmail.com:
 Hi Beanos:
 So you have 3 OSD servers and each of them has 2 disks. 
 I have a question: what is the result of ceph osd tree? It looks like the osd status 
 is down.
 
 
 Best wishes,
 Vickie
 
 2015-02-10 19:00 GMT+08:00 B L super.itera...@gmail.com:
 Here is the updated direct copy/paste dump
 
 ceph@ceph-node1:~$ ceph osd dump
 epoch 25
 fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 created 2015-02-08 16:59:07.050875
 modified 2015-02-09 22:35:33.191218
 flags
 pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash 
 rjenkins 

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Thanks Vikhyat,

As suggested .. 

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0

Invalid command:  osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> :  change <name>'s weight to <weight> in crush map
Error EINVAL: invalid command

What do you think?


 On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:
 
 sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] combined ceph roles

2015-02-10 Thread Lindsay Mathieson
A similar setup works well for me - 2 VM hosts, 1 mon-only node, 6 OSDs (3 per 
VM host), using rbd and cephfs.

The more memory on your vm hosts, the better.

Lindsay Mathieson 

-Original Message-
From: David Graham xtn...@gmail.com
Sent: ‎11/‎02/‎2015 3:07 AM
To: ceph-us...@ceph.com ceph-us...@ceph.com
Subject: [ceph-users] combined ceph roles

Hello, I'm giving thought to a minimal-footprint scenario with full redundancy. 
I realize it isn't ideal, and may impact overall performance, but I am wondering 
if the example below would work, be supported, or be known to cause issues.


Example, 3x hosts each running:
-- OSD's
-- Mon
-- Client



I thought I read a post a while back about Client+OSD on the same host possibly 
being an issue, but I am having difficulty finding that reference.


I would appreciate if anyone has insight into such a setup,

thanks!
___ 
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cannot obtain keys from the nodes : [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-vm01']

2015-02-10 Thread Konstantin Khatskevich

Hello!

I am a novice with Ceph and I am getting desperate.

My problem in the fact that I cannot obtain keys from the nodes.
I found a similar problem in the mailing list 
(http://www.spinics.net/lists/ceph-users/msg03843.html), but I did not 
succeed in solving it.


There Francesc Alted writes: "I tracked down my problem. It turned out that I 
was setting different names for the ceph servers in /etc/hosts than their own 
`hostname`."

I renamed all hostnames on all my nodes, but it did not help.
My admin server and nodes are:
adm ceph-vm01 ceph-vm02 ceph-vm03 ceph-vm04

Below my ceph.conf, log and hosts file.

[ceph@adm ~]$ more ceph.conf
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.10.210,192.168.10.211,192.168.10.212,192.168.10.213
mon_initial_members = ceph-vm01, ceph-vm02, ceph-vm03, ceph-vm04
fsid = 0a5be896-bbd8-4bea-9ca9-486d93222164
osd pool default size = 2

[ceph@adm ~]$ ceph-deploy gatherkeys ceph-vm01
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.11): /usr/bin/ceph-deploy 
gatherkeys ceph-vm01
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for 
/etc/ceph/ceph.client.admin.keyring

[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/etc/ceph/ceph.client.admin.keyring on ['ceph-vm01']

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for 
/var/lib/ceph/bootstrap-osd/ceph.keyring

[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/var/lib/ceph/bootstrap-osd/ceph.keyring on ['ceph-vm01']
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-vm01 for 
/var/lib/ceph/bootstrap-mds/ceph.keyring

[ceph-vm01][DEBUG ] connected to host: ceph-vm01
[ceph-vm01][DEBUG ] detect platform information from remote host
[ceph-vm01][DEBUG ] detect machine type
[ceph-vm01][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-vm01']


[ceph@adm ~]$ more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 
localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 
localhost6.localdomain6


192.168.10.214  ceph-vm05.2memory.ru ceph-vm05 adm
192.168.10.210  ceph-vm01.2memory.ru ceph-vm01 node01 mon01 osd01
192.168.10.211  ceph-vm02.2memory.ru ceph-vm02 node02 mon02 osd02
192.168.10.212  ceph-vm03.2memory.ru ceph-vm03 node03 mon03 osd03
192.168.10.213  ceph-vm04.2memory.ru ceph-vm04 node04 mon04 osd04
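
Before re-running gatherkeys, a quick sanity check, sketched under the 
assumption of the standard firefly paths and the ceph-deploy workflow shown 
above (adjust hostnames as needed), run on each node:

hostname -s                                   # must print ceph-vm01, ceph-vm02, ... exactly as in mon_initial_members
grep "$(hostname -s)" /etc/hosts              # the short name should resolve to the node's own address
ls /var/lib/ceph/mon/ceph-"$(hostname -s)"    # mon data dir created by ceph-deploy mon create-initial
sudo ceph daemon mon."$(hostname -s)" mon_status   # the mons must form a quorum

If the monitors never reached quorum, the bootstrap-osd and bootstrap-mds 
keyrings are never generated, which would match the "Unable to find ... keyring" 
warnings above.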

--
Best regards,
Konstantin Khatskevich

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Too few pgs per osd - Health_warn for EC pool

2015-02-10 Thread Mohamed Pakkeer
Hi Greg,

Do you have any idea about the health warning?

Regards
K.Mohamed Pakkeer

On Tue, Feb 10, 2015 at 4:49 PM, Mohamed Pakkeer mdfakk...@gmail.com
wrote:

 Hi

 We have created an EC pool (k=10 and m=3) with 540 osds. We followed the
 following rule to calculate the pg count for the EC pool.

 Total PGs = (OSDs * 100) / pool size

 Where *pool size* is either the number of replicas for replicated pools
 or the K+M sum for erasure coded pools

 Total pgs = 540 * 100 / 13 = 4153.8; nearest power of 2: 8192

 So we have configured 8192 as the EC pool pg count. But we are getting
 HEALTH_WARN: too few pgs per osd (15 < min 20). We checked all the osds
 and each has more than 80 pgs.

 What's wrong here?

 -
 Regards
 K.Mohamed Pakkeer
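
A hedged reading of those numbers, assuming the monitor's "too few pgs per osd" 
check divides the raw PG count by the number of OSDs without multiplying by k+m:

8192 / 540       = 15.2   (matches the reported 15 < min 20)
8192 * 13 / 540  = 197.2  (PG shards actually held per OSD, consistent with 80+ pgs seen on each OSD)

If that assumption holds, the warning counts undivided PGs and is largely 
cosmetic for a wide EC pool; it could probably be silenced by lowering 
mon_pg_warn_min_per_osd, e.g. ceph tell mon.* injectargs '--mon_pg_warn_min_per_osd 10'.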


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Re: can not add osd

2015-02-10 Thread Alan Johnson
Just wondering if this was ever resolved - I am seeing the exact same issue 
when I moved from CentOS 6.5 (firefly) to CentOS 7 on the giant release. Using 
"ceph-deploy osd prepare . . ." the script fails to umount and then posts a 
"device is busy" message. Details are in yang bin18's posting below. Ubuntu 
Trusty with giant seems OK. I have redeployed the cluster and also tried 
deploying on virtual machines as well as physical ones. Setup is minimal: 3 x 
OSD nodes with one monitor node.
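
Not a fix, but a quick way to see what is pinning the temporary mount before 
ceph-disk calls umount (using the mnt.* path from the log below; both tools are 
standard Linux utilities):

sudo fuser -vm /var/lib/ceph/tmp/mnt.u2KXW3    # processes holding the mount
sudo lsof /var/lib/ceph/tmp/mnt.u2KXW3         # the same, listed per open file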



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
yang.bi...@zte.com.cn
Sent: Monday, December 22, 2014 2:58 AM
To: Karan Singh
Cc: ceph-users
Subject: [ceph-users] Re: can not add osd

Hi

I have deployed the ceph osd according to the official Ceph docs, and the same 
error came out again.




From: Karan Singh karan.si...@csc.fi
To: yang.bi...@zte.com.cn,
Cc: ceph-users ceph-users@lists.ceph.com
Date: 2014/12/16 22:51
Subject: Re: [ceph-users] can not add osd




Hi

You logs does not provides much information , if you are following any other 
documentation for Ceph , i would recommend you to follow official Ceph docs.

http://ceph.com/docs/master/start/quick-start-preflight/




Karan Singh
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/


On 16 Dec 2014, at 09:55, yang.bi...@zte.com.cn 
wrote:

hi

When I execute ceph-deploy osd prepare node3:/dev/sdb, an error like this 
always comes out:

[node3][WARNIN] INFO:ceph-disk:Running command: /bin/umount -- 
/var/lib/ceph/tmp/mnt.u2KXW3
[node3][WARNIN] umount: /var/lib/ceph/tmp/mnt.u2KXW3: target is busy.

Then I execute /bin/umount -- /var/lib/ceph/tmp/mnt.u2KXW3 by hand, and the result is ok.


ZTE Information Security Notice: The information contained in this mail (and 
any attachment transmitted herewith) is privileged and confidential and is 
intended for the exclusive use of the addressee(s).  If you are not an intended 
recipient, any disclosure, reproduction, distribution or other dissemination or 
use of the information contained is strictly prohibited.  If you have received 
this mail in error, please delete it and notify us immediately.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com









___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Vikhyat Umrao

Hello,

Your osds do not have weights; please assign some weight to your ceph 
cluster osds, as Udo said in his last comment.


osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map

sudo ceph osd crush reweight 0.0095 osd.0 to osd.5.

Regards,
Vikhyat

On 02/10/2015 06:11 PM, B L wrote:

Hello Udo,

Thanks for your answer .. 2 questions here:

1- Does what you say mean that I have to remove my drive devices (8GB 
each) and add new ones with at least 10GB?
2- Shall I manually re-weight after disk creation and preparation 
using this command (*ceph osd reweight osd.2 1.0*), or will things 
work automatically without too much fuss when the disk drives are bigger 
than or equal to 10GB?


Beanos


On Feb 10, 2015, at 2:26 PM, Udo Lembke ulem...@polarzone.de wrote:


Hi,
you will get further trouble, because your weight is not correct.

You need a weight >= 0.01 for each OSD. This means your OSD must be 10GB
or greater!


Udo

On 10.02.2015 12:22, B L wrote:

Hi Vickie,

My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id weight  type name   up/down reweight
-1   0   root default
-2   0   host ceph-node1
0    0   osd.0   up  1
1    0   osd.1   up  1
-3   0   host ceph-node3
2    0   osd.2   up  1
3    0   osd.3   up  1
-4   0   host ceph-node2
4    0   osd.4   up  1
5    0   osd.5   up  1








___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi,
you will get further trouble, because your weight is not correct.

You need a weight >= 0.01 for each OSD. This means your OSD must be 10GB
or greater!


Udo

On 10.02.2015 12:22, B L wrote:
 Hi Vickie,
 
 My OSD tree looks like this:
 
 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id weight  type name   up/down reweight
 -1   0   root default
 -2   0   host ceph-node1
 0    0   osd.0   up  1
 1    0   osd.1   up  1
 -3   0   host ceph-node3
 2    0   osd.2   up  1
 3    0   osd.3   up  1
 -4   0   host ceph-node2
 4    0   osd.4   up  1
 5    0   osd.5   up  1
 
 
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Owen Synge
Hi,

To add to Udo's point,

Do remember that by default journals take ~6Gb.

For this reason I suggest making virtual disks larger than 20Gb for 
testing, although that is slightly bigger than absolutely necessary.
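
If disk space is the constraint, a sketch of a ceph.conf override for a 
throwaway test cluster; the 1 GB value is only an assumption for testing, not a 
production recommendation, and it applies to OSDs created after the change:

[osd]
osd journal size = 1024    ; journal size in MB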

Best regards

Owen



On 02/10/2015 01:26 PM, Udo Lembke wrote:
 Hi,
 you will get further trouble, because your weight is not correct.
 
 You need a weight >= 0.01 for each OSD. This means your OSD must be 10GB
 or greater!
 
 
 Udo
 
 On 10.02.2015 12:22, B L wrote:
 Hi Vickie,

 My OSD tree looks like this:

 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id weight  type name   up/down reweight
 -1   0   root default
 -2   0   host ceph-node1
 0    0   osd.0   up  1
 1    0   osd.1   up  1
 -3   0   host ceph-node3
 2    0   osd.2   up  1
 3    0   osd.3   up  1
 -4   0   host ceph-node2
 4    0   osd.4   up  1
 5    0   osd.5   up  1



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 

-- 
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5
90409 Nürnberg
Germany
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Hello Udo,

Thanks for your answer .. 2 questions here:

1- Does what you say mean that I have to remove my drive devices (8GB each) and 
add new ones with at least 10GB?
2- Shall I manually re-weight after disk creation and preparation using this 
command (ceph osd reweight osd.2 1.0), or will things work automatically without 
too much fuss when the disk drives are bigger than or equal to 10GB?

Beanos


 On Feb 10, 2015, at 2:26 PM, Udo Lembke ulem...@polarzone.de wrote:
 
 Hi,
 you will get further trouble, because your weight is not correct.
 
 You need a weight >= 0.01 for each OSD. This means your OSD must be 10GB
 or greater!
 
 
 Udo
 
 On 10.02.2015 12:22, B L wrote:
 Hi Vickie,
 
 My OSD tree looks like this:
 
 ceph@ceph-node3:/home/ubuntu$ ceph osd tree
 # id weight  type name   up/down reweight
 -1   0   root default
 -2   0   host ceph-node1
 0    0   osd.0   up  1
 1    0   osd.1   up  1
 -3   0   host ceph-node3
 2    0   osd.2   up  1
 3    0   osd.3   up  1
 -4   0   host ceph-node2
 4    0   osd.4   up  1
 5    0   osd.5   up  1
 
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Micha Kersloot
Hi, 

maybe the other way around: <name> <float> = osd.0 0.0095 

Kind regards, 

Micha Kersloot 

Stay up to date and receive the latest tips about Zimbra/KovoKs Contact: 
http://twitter.com/kovoks 

KovoKs B.V. is registered under KvK number: 1104 

 From: B L super.itera...@gmail.com
 To: Vikhyat Umrao vum...@redhat.com, Udo Lembke ulem...@polarzone.de
 Cc: ceph-users@lists.ceph.com
 Sent: Tuesday, February 10, 2015 3:01:34 PM
 Subject: Re: [ceph-users] Placement Groups fail on fresh Ceph cluster
 installation with all OSDs up and in

 Thanks Vikhyat,

 As suggested ..

 ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0

 Invalid command: osd.0 doesn't represent a float
 osd crush reweight <name> <float[0.0-]> : change <name>'s weight to <weight> in crush map
 Error EINVAL: invalid command

 What do you think?

 On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao  vum...@redhat.com  wrote:

 sudo ceph osd crush reweight 0.0095 osd.0 to osd.5

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi,
use:
ceph osd crush set 0 0.01 pool=default host=ceph-node1
ceph osd crush set 1 0.01 pool=default host=ceph-node1
ceph osd crush set 2 0.01 pool=default host=ceph-node3
ceph osd crush set 3 0.01 pool=default host=ceph-node3
ceph osd crush set 4 0.01 pool=default host=ceph-node2
ceph osd crush set 5 0.01 pool=default host=ceph-node2
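
If those apply cleanly, a quick way to confirm (assuming the same six OSDs):

ceph osd tree    # each osd should now show weight 0.01 under its host
ceph -w          # the pgs should move from incomplete/degraded to active+clean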

Udo
On 10.02.2015 15:01, B L wrote:
 Thanks Vikhyat,
 
 As suggested .. 
 
 ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0
 
 Invalid command:  osd.0 doesn't represent a float
 osd crush reweight <name> <float[0.0-]> :  change <name>'s weight to <weight> in crush map
 Error EINVAL: invalid command
 
 What do you think?
 
 
 On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:

 sudo ceph osd crush reweight 0.0095 osd.0 to osd.5
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Vikhyat Umrao

Oh, I have misplaced the positions of the osd name and weight.

ceph osd crush reweight osd.0 0.0095  and so on ..
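
For more than a handful of OSDs, a minimal shell sketch (the 0.0095 value and 
the osd ids come from this thread; adjust both to your disk sizes):

for i in 0 1 2 3 4 5; do
    ceph osd crush reweight osd.$i 0.0095
done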

Regards,
Vikhyat

On 02/10/2015 07:31 PM, B L wrote:

Thanks Vikhyat,

As suggested ..

ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0

Invalid command:  osd.0 doesn't represent a float
osd crush reweight <name> <float[0.0-]> :  change <name>'s weight to <weight> in crush map

Error EINVAL: invalid command

What do you think?


On Feb 10, 2015, at 3:18 PM, Vikhyat Umrao vum...@redhat.com wrote:


sudo ceph osd crush reweight 0.0095 osd.0 to osd.5




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com