[ceph-users] Fw: Re: Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David Young
Forgot to reply to the list! ‐‐‐ Original Message ‐‐‐ On Thursday, January 17, 2019 8:32 AM, David Young wrote: > Thanks David, > > "ceph osd df" looks like this: > > - > root@node1:~# ceph osd df > ID CLASS WEIGHT REWEIGHT SIZE USE AV
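
A minimal sketch of the comparison being made in this follow-up, assuming a standard Ceph CLI install; the exact columns are whatever your Ceph release prints:

---
# Per-OSD raw usage; the fullest OSD effectively caps the free
# space a CephFS mount can report.
ceph osd df

# Cluster-wide and per-pool view; the per-pool MAX AVAIL column is
# the figure CephFS bases its statfs numbers on.
ceph df
---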

[ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-15 Thread David Young
Hi folks, My ceph cluster is used exclusively for cephfs, as follows: --- root@node1:~# grep ceph /etc/fstab node2:6789:/ /ceph ceph auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret root@node1:~# --- "rados df" shows me the following: --- root@node1:~# rados df POOL_NAME USE
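
A sketch of reproducing the comparison described in this thread, assuming the same kernel-client mount at /ceph shown in the quoted fstab line:

---
# Free space as the filesystem sees it (statfs on the mount):
df -h /ceph

# Raw per-pool usage as RADOS reports it:
rados df

# Per-pool MAX AVAIL, which already accounts for replication/EC
# overhead and the fullest OSD:
ceph df
---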

[ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread David Young
Hi all, One of my OSD hosts recently ran into RAM contention (was swapping heavily), and after rebooting, I'm seeing this error on random OSDs in the cluster: --- Jan 08 03:34:36 prod1 ceph-osd[3357939]: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable) Jan 08 03:34
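
A sketch of pulling the full backtrace for one of the crashing daemons, assuming systemd-managed OSDs as the journald-style log lines above suggest (the OSD id is a placeholder):

---
# Full log for one OSD unit since the last boot, including the part
# the preview above truncates:
journalctl -u ceph-osd@12 -b --no-pager

# Which OSDs are currently down:
ceph osd tree | grep down
---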

[ceph-users] Why does "df" against a mounted cephfs report (vastly) different free space?

2018-12-12 Thread David Young
Hi all, I have a cluster used exclusively for cephfs (an EC "media" pool and a standard metadata pool for the cephfs). "ceph -s" shows me: --- data: pools: 2 pools, 260 pgs objects: 37.18 M objects, 141 TiB usage: 177 TiB used, 114 TiB / 291 TiB avail pgs: 260 active+c
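
As a rough worked example, assuming the 4+1 erasure-coded data pool described in the related threads: "ceph -s" reports raw capacity, while "df" on the mount reports usable capacity after EC overhead.

---
# Raw free space from "ceph -s":     114 TiB
# EC 4+1 data fraction:              k / (k + m) = 4 / 5
# Approximate usable free space:     114 * 4 / 5 ≈ 91 TiB
# "ceph df detail" shows the per-pool figure as MAX AVAIL, further
# reduced by the fullest OSD and the full ratios.
ceph df detail
---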

Re: [ceph-users] Lost 1/40 OSDs at EC 4+1, now PGs are incomplete

2018-12-11 Thread David Young
will cause some PG’s to not >> have at least K size available as you only have 1 extra M. >> >> As per the error you can get your pool back online by setting min_size to 4. >> >> However this would only be a temp fix while you get the OSD back online / >> rebuilt so
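
A sketch of the temporary workaround being suggested in this reply, using the "media" pool name from the original report; it should be reverted once the lost OSD is recovered or rebuilt:

---
# Let the 4+1 EC PGs go active with only k=4 shards present:
ceph osd pool set media min_size 4

# ...recover or re-create the failed OSD and let backfill finish...

# Then restore the safer k+1 setting:
ceph osd pool set media min_size 5
---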

[ceph-users] Lost 1/40 OSDs at EC 4+1, now PGs are incomplete

2018-12-11 Thread David Young
Hi all, I have a small 2-node cluster with 40 OSDs, using erasure coding 4+1. I lost osd38, and now I have 39 incomplete PGs. --- PG_AVAILABILITY Reduced data availability: 39 pgs inactive, 39 pgs incomplete pg 22.2 is incomplete, acting [19,33,10,8,29] (reducing pool media min_size from 5 m
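
A sketch of inspecting one of the incomplete PGs quoted above (standard Ceph subcommands; the PG id comes from the report itself):

---
# Which PGs are inactive/incomplete and why:
ceph health detail

# Peering state of a single PG, including which OSD shards it is
# still waiting for:
ceph pg 22.2 query
---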

[ceph-users] OSD fails to start after power failure (with FAILED assert(num_unsent <= log_queue.size()) error)

2018-07-14 Thread David Young
Hey folks, Sorry, posting this from a second account, since for some reason my primary account doesn't seem to be able to post to the list... I have a Luminous 12.2.6 cluster which suffered a power failure recently. On recovery, one of my OSDs is continually crashing and restarting, with the e

[ceph-users] OSD fails to start after power failure

2018-07-14 Thread David Young
Hey folks, I have a Luminous 12.2.6 cluster which suffered a power failure recently. On recovery, one of my OSDs is continually crashing and restarting, with the error below: 9ae00 con 0     -3> 2018-07-15 09:50:58.313242 7f131c5a9700 10 monclient: tick     -2> 2018-07-15 09:50:58.313277
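
A sketch of capturing more context around this crash, assuming a stock Luminous install logging under /var/log/ceph (the OSD id is a placeholder):

---
# In /etc/ceph/ceph.conf, raise debug levels for the failing OSD
# before restarting it:
#   [osd.7]
#   debug osd = 20
#   debug monc = 20

systemctl restart ceph-osd@7
less /var/log/ceph/ceph-osd.7.log
---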