Re: [ceph-users] MDS corruption

2019-08-12 Thread ☣Adam
Pierre Dittes helped me with adding --rank=yourfsname:all and I ran the following steps from the disaster recovery page: journal export, dentry recovery, journal truncation, mds table wipes (session, snap and inode), scan_extents, scan_inodes, scan_links, and cleanup. Now all three of my MDS serve
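
For reference, a rough sketch of those steps as they appear on the disaster recovery page, assuming a filesystem named "myfs" and a data pool named "cephfs_data" (placeholder names, not a verbatim copy of what was run):

  cephfs-journal-tool --rank=myfs:all journal export backup.bin       # journal export (keep a backup first)
  cephfs-journal-tool --rank=myfs:all event recover_dentries summary  # dentry recovery
  cephfs-journal-tool --rank=myfs:all journal reset                   # journal truncation
  cephfs-table-tool myfs:all reset session                            # table wipes (plain "all" also works on a single-fs cluster)
  cephfs-table-tool myfs:all reset snap
  cephfs-table-tool myfs:all reset inode
  cephfs-data-scan scan_extents cephfs_data
  cephfs-data-scan scan_inodes cephfs_data
  cephfs-data-scan scan_links
  cephfs-data-scan cleanup cephfs_data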

[ceph-users] ceph osd crash help needed

2019-08-12 Thread response
Hi all, I have a production cluster and I recently purged all snaps. Now, on a set of OSDs, I'm getting an assert like the one below when they backfill: -4> 2019-08-13 00:25:14.577 7ff4637b1700 5 osd.99 pg_epoch: 206049 pg[0.12ed( v 206047'25372641 (199518'25369560,206047'25372641] local-lis/les
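
If the assert keeps taking OSDs down during backfill, a common holding pattern (sketched here assuming osd.99 is one of the affected OSDs and logs are in the default location) is to pause backfill while the full stack trace is collected for the list or a tracker:

  ceph osd set nobackfill                               # pause backfill cluster-wide while investigating
  grep -A 30 'FAILED' /var/log/ceph/ceph-osd.99.log     # pull the complete assert message and stack trace
  ceph osd unset nobackfill                             # re-enable backfill once the trace is captured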

Re: [ceph-users] Possibly a bug on rocksdb

2019-08-12 Thread Neha Ojha
Hi Samuel, You can use https://tracker.ceph.com/issues/41211 to provide the information that Brad requested; along with debug_osd=20, setting debug_rocksdb=20 and debug_bluestore=20 might be useful. Thanks, Neha On Sun, Aug 11, 2019 at 4:18 PM Brad Hubbard wrote: > > Could you create a tracker
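
A sketch of raising those levels on a specific daemon with the Nautilus-style config system (osd.0 is just a placeholder; injectargs or ceph.conf entries work as well on older releases):

  ceph config set osd.0 debug_osd 20
  ceph config set osd.0 debug_rocksdb 20
  ceph config set osd.0 debug_bluestore 20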

Re: [ceph-users] optane + 4x SSDs for VM disk images?

2019-08-12 Thread jesper
>> Could performance of Optane + 4x SSDs per node ever exceed that of >> pure Optane disks? > > No. With Ceph, the results for Optane and just for good server SSDs are > almost the same. One thing is that you can run more OSDs per Optane > than per a usual SSD. However, the latency you get from
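
For the latency point, the usual device-level comparison between an Optane and an ordinary SSD is a queue-depth-1 sync write test; a sketch with fio (destructive to the target device, /dev/nvme0n1 is only a placeholder):

  fio --name=qd1-write --filename=/dev/nvme0n1 --direct=1 --sync=1 \
      --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based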

Re: [ceph-users] New CRUSH device class questions

2019-08-12 Thread Robert LeBlanc
On Wed, Aug 7, 2019 at 7:05 AM Paul Emmerich wrote: > ~ is the internal implementation of device classes. Internally it's > still using separate roots; that's how it stays compatible with older > clients that don't know about device classes. > That makes sense. > And since it wasn't mentioned
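
The shadow roots Paul describes can be inspected directly, and pools are normally pointed at a class through a CRUSH rule rather than by editing roots by hand; a sketch with placeholder names (rule "fast-ssd", pool "mypool"):

  ceph osd crush tree --show-shadow                                  # shows the internal "~class" shadow roots
  ceph osd crush rule create-replicated fast-ssd default host ssd    # rule name, root, failure domain, device class
  ceph osd pool set mypool crush_rule fast-ssd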

Re: [ceph-users] optane + 4x SSDs for VM disk images?

2019-08-12 Thread vitalif
Could performance of Optane + 4x SSDs per node ever exceed that of pure Optane disks? No. With Ceph, the results for Optane and just for good server SSDs are almost the same. One thing is that you can run more OSDs per Optane than per a usual SSD. However, the latency you get from both is a
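
To see the cluster-level latency being discussed (as opposed to raw device latency), a single-threaded rados bench gives a rough number; a sketch, assuming a throwaway pool named "testbench" exists:

  rados bench -p testbench 30 write -t 1 -b 4096    # 30 s of 4 KiB writes at queue depth 1; the summary reports average latency
  rados -p testbench cleanup                        # remove the benchmark objects afterwards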

Re: [ceph-users] optane + 4x SSDs for VM disk images?

2019-08-12 Thread Mark Lehrer
The problem with caching is that if the performance delta between the two storage types isn't large enough, the cost of the caching algorithms and the complexity of managing everything outweigh the performance gains. With Optanes vs. SSDs, the main thing to consider is how busy the devices are in
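
A quick way to check how busy the devices actually are under normal client load is iostat from sysstat; a sketch:

  iostat -xmt 5     # watch %util and await per device; if the SSDs sit well below saturation, a caching layer gains little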

Re: [ceph-users] optane + 4x SSDs for VM disk images?

2019-08-12 Thread Maged Mokhtar
On 11/08/2019 19:46, Victor Hooi wrote: Hi I am building a 3-node Ceph cluster to store VM disk images. We are running Ceph Nautilus with KVM. Each node has: a Xeon 4116, 512 GB RAM, and an Optane 905p NVMe disk with 980 GB. Previously, I was creating four OSDs per Optane disk, and using only
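
For reference, splitting one NVMe device into several OSDs is typically done with ceph-volume; a sketch matching the four-OSDs-per-Optane layout described (/dev/nvme0n1 is a placeholder):

  ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1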

[ceph-users] Scrub start-time and end-time

2019-08-12 Thread Torben Hørup
Hi I have a few questions regarding the options for limiting scrubbing to a certain time frame: "osd scrub begin hour" and "osd scrub end hour". Is it allowed to have the scrub period cross midnight? E.g. have the start time at 22:00 and the end time at 07:00 the next morning. I assume that if you on
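
For reference, the window in the question would be configured roughly like this (values in whole hours); a sketch using the Nautilus-style config commands, though the same options can also live in ceph.conf:

  ceph config set osd osd_scrub_begin_hour 22
  ceph config set osd osd_scrub_end_hour 7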

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-12 Thread Janek Bevendorff
I've been copying happily for days now (not very fast, but the MDSs were stable), but eventually the MDSs started flapping again due to large cache sizes (they are being killed after 11M inodes). I could solve the problem by temporarily increasing the cache size in order to allow them to rejoin,
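
The knob being raised here is presumably mds_cache_memory_limit (in bytes); a sketch of bumping it temporarily so the MDSs can rejoin, with 32 GiB as a purely illustrative value:

  ceph config set mds mds_cache_memory_limit 34359738368   # 32 GiB; revert once the MDSs are stable again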