Re: [ceph-users] Fedora 29 Issues.

2019-03-26 Thread Brad Hubbard
https://bugzilla.redhat.com/show_bug.cgi?id=1662496 On Wed, Mar 27, 2019 at 5:00 AM Andrew J. Hutton wrote: > > More or less followed the install instructions with modifications as > needed; but I'm suspecting that either a dependency was missed in the > F29 package or something else is up. I

Re: [ceph-users] scrub errors

2019-03-26 Thread Brad Hubbard
http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/ Did you try repairing the pg? On Tue, Mar 26, 2019 at 9:08 AM solarflow99 wrote: > > yes, I know it's old. I intend to have it replaced but that's a few months > away and was hoping to get past this. the other OSDs
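For readers hitting the same thing, a minimal repair sketch (the PG id below is a placeholder, taken from `ceph health detail`):

    # list the inconsistent PG(s)
    ceph health detail | grep inconsistent
    # ask the primary OSD to repair it; 2.5 is a placeholder PG id
    ceph pg repair 2.5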

Re: [ceph-users] Resizing a cache tier rbd

2019-03-26 Thread Jason Dillaman
When using cache pools (which are essentially deprecated functionality BTW), you should always reference the base tier pool. The fact that a cache tier sits in front of a slower base tier is transparently handled. On Tue, Mar 26, 2019 at 5:41 PM Götz Reinicke wrote: > > Hi, > > I have a rbd in
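A minimal sketch of that, assuming a base tier pool named rbd-slow and an image named vol1 (both placeholder names):

    # resize against the base tier pool; the cache tier in front is handled transparently
    rbd resize --size 2048G rbd-slow/vol1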

[ceph-users] Resizing a cache tier rbd

2019-03-26 Thread Götz Reinicke
Hi, I have an rbd in a cache tier setup which I need to extend. The question is, do I resize it through the cache pool or directly on the slow/storage pool? Or doesn't that matter at all? Thanks for feedback and regards. Götz

[ceph-users] PG stuck in active+clean+remapped

2019-03-26 Thread Vladimir Prokofev
CEPH 12.2.11, pool size 3, min_size 2. One node went down today (private network interface started flapping, and after a while OSD processes crashed), no big deal, cluster recovered, but not completely. 1 PG stuck in active+clean+remapped state. PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED
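For anyone debugging a similar state, the usual starting points are something like the following (the PG id is a placeholder):

    # list PGs and filter for the remapped one
    ceph pg dump pgs_brief | grep remapped
    # detailed state of that PG; 1.2f is a placeholder id
    ceph pg 1.2f query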

[ceph-users] Fedora 29 Issues.

2019-03-26 Thread Andrew J. Hutton
I more or less followed the install instructions, with modifications as needed, but I suspect that either a dependency was missed in the F29 package or something else is up. I don't see anything obvious; any ideas? When I try to start setting up my first node I get the following:

Re: [ceph-users] "No space left on device" when deleting a file

2019-03-26 Thread Toby Darling
Hi Dan Thanks! ceph tell mds.ceph1 config set mds_bal_fragment_size_max 200000 got us running again. Cheers toby On 26/03/2019 16:56, Dan van der Ster wrote: > See http://tracker.ceph.com/issues/38849 > > As an immediate workaround you can increase `mds bal fragment size > max` to 200000

Re: [ceph-users] "No space left on device" when deleting a file

2019-03-26 Thread Dan van der Ster
See http://tracker.ceph.com/issues/38849 As an immediate workaround you can increase `mds bal fragment size max` to 200000 (which will increase the max number of strays to 2 million.) (Try injecting that option into the MDSs -- I think it is read at runtime). And you don't need to stop the MDSs
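A minimal sketch of the runtime injection plus a check that it took effect (mds.ceph1 is the daemon name from this thread; repeat per MDS):

    # inject the suggested value at runtime
    ceph tell mds.ceph1 config set mds_bal_fragment_size_max 200000
    # verify via the admin socket on the MDS host
    ceph daemon mds.ceph1 config get mds_bal_fragment_size_max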

[ceph-users] "No space left on device" when deleting a file

2019-03-26 Thread Toby Darling
Hi [root@ceph1 ~]# ceph version ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable) We've run into a "No space left on device" issue when trying to delete a file, despite there being free space: [root@ceph1 ~]# ceph df GLOBAL: SIZE AVAIL RAW USED
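For anyone wanting to see how close they are to the stray limit, the counter is exposed on the MDS admin socket (using the mds name from this thread):

    # number of stray dentries currently held by this MDS
    ceph daemon mds.ceph1 perf dump | grep num_strays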

Re: [ceph-users] Ceph nautilus upgrade problem

2019-03-26 Thread Stadsnet
On 26-3-2019 16:39, Ashley Merrick wrote: Have you upgraded any OSDs? No, we didn't go through with the OSDs yet. On a test cluster I saw the same, and as I upgraded / restarted the OSDs the PGs started to show online till it was 100%. I know it says to not change anything to do with pools

Re: [ceph-users] Ceph nautilus upgrade problem

2019-03-26 Thread Ashley Merrick
Have you upgraded any OSDs? On a test cluster I saw the same, and as I upgraded / restarted the OSDs the PGs started to show online till it was 100%. I know it says to not change anything to do with pools during the upgrade, so I am guessing there is a code change that causes this till all is
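One quick way to confirm this is just the mixed-version window is to check which release each daemon reports (a sketch, run from any admin/mon node):

    # per-daemon release breakdown; OSDs keep reporting the old release until restarted
    ceph versions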

[ceph-users] Ceph nautilus upgrade problem

2019-03-26 Thread Stadsnet
We did an upgrade from luminous to nautilus; after upgrading the three monitors we found that all our PGs were inactive. cluster: id: 5bafad08-31b2-4716-be77-07ad2e2647eb health: HEALTH_ERR noout flag(s) set 1 scrub errors Reduced data
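For completeness, the documented Nautilus upgrade sequence finishes, once every daemon has been restarted on the new release, with roughly:

    # only after all OSDs run nautilus
    ceph osd require-osd-release nautilus
    # remove the flag set for the upgrade
    ceph osd unset noout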

Re: [ceph-users] How to config mclock_client queue?

2019-03-26 Thread J. Eric Ivancich
So I do not think mclock_client queue works the way you’re hoping it does. For categorization purposes it joins the operation class and the client identifier, with the intent that operations will be executed among clients more evenly (i.e., it won’t favor one client over another). However, it
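For context, the tunables that exist in mimic are per operation class, not per client identity; a hedged ceph.conf sketch (values are only illustrative):

    [osd]
    osd_op_queue = mclock_client
    # reservation / weight / limit for the client-op class as a whole,
    # not for individual client identifiers (illustrative values)
    osd_op_queue_mclock_client_op_res = 1000.0
    osd_op_queue_mclock_client_op_wgt = 500.0
    osd_op_queue_mclock_client_op_lim = 0.0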

Re: [ceph-users] OS Upgrade now monitor wont start

2019-03-26 Thread Brent Kennedy
Thanks Brad! I completely forgot about that trick! I copied the output and modified the command as suggested, and the monitor came up. So at least that does work; now I just need to figure out why the normal service setup is borked. I was quite concerned that it wouldn’t come back at all and

Re: [ceph-users] RBD Mirror Image Resync

2019-03-26 Thread Jason Dillaman
On Fri, Mar 22, 2019 at 8:38 AM Vikas Rana wrote: > > Hi Jason, > > Thank you for your help and support. > > > One last question: after the demotion and promotion, when you do a resync > again, does it copy the whole image again or send just the changes since > the last journal update?
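For reference, the resync itself is requested on the non-primary side; a minimal sketch with placeholder pool/image names:

    # flag the non-primary image to be resynced from the primary
    rbd mirror image resync mypool/myimage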

[ceph-users] How to config mclock_client queue?

2019-03-26 Thread Wang Chuanwen
I am now trying to run tests to see how the mclock_client queue works on mimic. But when I tried to configure the (r, w, l) tag for each client, I found there are no options to distinguish different clients. All I found are the following options for mclock_opclass, which are used to distinguish different types of

Re: [ceph-users] Checking cephfs compression is working

2019-03-26 Thread Frank Schilder
Hi Rhian, not sure if you found an answer already. I believe in luminous and mimic it is only possible to extract compression information at the OSD device level. According to the recent announcement of nautilus, this seems to get better in the future. If you want to check if anything is
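A minimal sketch of that per-OSD check (osd.0 is a placeholder id; run on the host carrying the OSD):

    # bluestore compression counters for this OSD
    ceph daemon osd.0 perf dump | grep -i compress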

[ceph-users] Dealing with SATA resets and consequently slow ops

2019-03-26 Thread Christian Balzer
Hello, We've got some Intel DC S3610s 800GB in operation on cache tiers. On the ones with G2010150 firmware we've seen _very_ infrequent SATA bus resets [1]. On the order of once per year and these are fairly busy critters with an average of 400 IOPS and peaks much higher than that. Funnily