[ceph-users] Re: How to get num ops blocked per OSD

2020-03-13 Thread Anthony D'Atri
Yeah the removal of that was annoying for sure. ISTR that one can gather the information from the OSDs’ admin sockets. Envision a Prometheus exporter that polls the admin sockets (in parallel) and Grafana panels that graph slow requests by OSD and by node. > On Mar 13, 2020, at 4:14 PM, Robert
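A rough sketch of the kind of per-OSD poll such an exporter could do (socket paths and dump command names vary by release; dump_blocked_ops may not exist on older OSDs):

    for sock in /var/run/ceph/ceph-osd.*.asok; do
        echo "== ${sock} =="
        # ops currently in flight on this OSD
        ceph daemon "$sock" dump_ops_in_flight | jq '.ops | length'
        # ops flagged as blocked (command/field names differ slightly between releases)
        ceph daemon "$sock" dump_blocked_ops | jq '.ops | length'
    done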

[ceph-users] How to get num ops blocked per OSD

2020-03-13 Thread Robert LeBlanc
For Jewel I wrote a script to take the output of `ceph health detail --format=json` and send alerts to our system, ordering the OSDs by how long their ops had been blocked and by how many ops were blocked. This was really helpful to quickly identify which OSD out of a list of 100 would be
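A minimal sketch of the same idea (not Robert's script) against the plain-text health output, assuming Jewel-era messages of the form "N ops are blocked > X sec on osd.Y"; the JSON layout and exact wording changed between releases:

    ceph health detail \
      | grep -E 'ops are blocked .* on osd\.[0-9]+' \
      | awk '{print $NF, $1, $6}' \
      | sort -k3,3 -rn        # columns: osd, ops blocked, seconds blocked (worst first)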

[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-13 Thread Seth Galitzer
Thanks to all who have offered advice on this. I have been looking at using vfs_ceph in samba, but I'm unsure how to get it on CentOS 7. As I understand it, it's optional at compile time. When searching for a package for it, I see one for glusterfs (samba-vfs-glusterfs), but nothing for ceph. Is it
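One way to check whether the module is available on CentOS 7 (the paths and package query are assumptions; vfs modules normally live under samba's vfs directory when built in):

    ls /usr/lib64/samba/vfs/ | grep -i ceph     # already installed?
    yum provides '*/samba/vfs/ceph.so'          # shipped by any available package?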

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Jason Dillaman
On Fri, Mar 13, 2020 at 3:31 PM Jason Dillaman wrote: > On Fri, Mar 13, 2020 at 2:48 PM Matt Dunavant wrote: > > Jason Dillaman wrote: > > > On Fri, Mar 13, 2020 at 11:36 AM Matt Dunavant > > > > Jason Dillaman wrote: > > > > > On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Jason Dillaman
On Fri, Mar 13, 2020 at 2:48 PM Matt Dunavant wrote: > Jason Dillaman wrote: > > On Fri, Mar 13, 2020 at 11:36 AM Matt Dunavant > > > Jason Dillaman wrote: > > > > On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant > > > > > I'm not sure of the last known good release

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Matt Dunavant
Jason Dillaman wrote: > On Fri, Mar 13, 2020 at 11:36 AM Matt Dunavant > > Jason Dillaman wrote: > > > On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant > > > > I'm not sure of the last known good release of the rbd CLI where this worked. I just ran the sha1sum ag

[ceph-users] Re: Inactive PGs

2020-03-13 Thread Wido den Hollander
On 3/13/20 5:44 PM, Peter Eisch wrote:

[ceph-users] Re: Inactive PGs

2020-03-13 Thread Peter Eisch

[ceph-users] Re: Inactive PGs

2020-03-13 Thread Wido den Hollander
On 3/13/20 4:09 PM, Peter Eisch wrote: > Full cluster is 14.2.8. I had some OSDs drop overnight, which now results in 4 inactive PGs. The pools had three participating OSDs (2 ssd, 1 sas). In each pool at least 1 ssd and 1 sas OSD is working without issue. I’ve tried ‘ceph pg repair ’ but it doe

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-13 Thread Janek Bevendorff
Indeed. I just had another MGR go bye-bye. I don't think host clock skew is the problem. On 13/03/2020 15:29, Anthony D'Atri wrote: Chrony does converge faster, but I doubt this will solve your problem if you don’t have quality peers. Or if it’s not really a time problem. On Mar 13, 2020,

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Jason Dillaman
On Fri, Mar 13, 2020 at 11:36 AM Matt Dunavant wrote: > Jason Dillaman wrote: > > On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant > > > I'm not sure of the last known good release of the rbd CLI where this worked. I just ran the sha1sum against the images and they always co

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Matt Dunavant
Jason Dillaman wrote: > On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant > > I'm not sure of the last known good release of the rbd CLI where this worked. I just ran the sha1sum against the images and they always come up as different. Might be worth knowing, this is a volume

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Jason Dillaman
On Fri, Mar 13, 2020 at 11:17 AM Matt Dunavant wrote: > I'm not sure of the last known good release of the rbd CLI where this worked. I just ran the sha1sum against the images and they always come up as different. Might be worth knowing, this is a volume that's provisioned at 512GB (wit

[ceph-users] Re: Is there a better way to make a samba/nfs gateway? (Marc Roos)

2020-03-13 Thread Chad William Seys
A while back I thought there were some limitations that prevented us from trying this, but I cannot remember... What does the ceph vfs gain you over exporting via the cephfs kernel module (kernel 4.19)? What does it lose you? (I.e. pros and cons versus the kernel module?) Thanks! C. It's based on v

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-13 Thread Matt Dunavant
I'm not sure of the last known good release of the rbd CLI where this worked. I just ran the sha1sum against the images and they always come up as different. Might be worth knowing, this is a volume that's provisioned at 512GB (with much less actually used) but after export, it only shows up as
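A hedged way to compare the contents directly, independent of how sparse the exported file ends up on disk (pool/image names are placeholders; rbd export can stream to stdout):

    rbd export mypool/original-image - | sha1sum
    rbd export mypool/imported-image - | sha1sum
    rbd du mypool/original-image        # provisioned vs. actually allocated size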

[ceph-users] Inactive PGs

2020-03-13 Thread Peter Eisch
Full cluster is 14.2.8. I had some OSDs drop overnight, which now results in 4 inactive PGs. The pools had three participating OSDs (2 ssd, 1 sas). In each pool at least 1 ssd and 1 sas OSD is working without issue. I’ve tried ‘ceph pg repair ’ but it doesn’t seem to make any changes. PG_AVAILABILITY
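For reference, 'ceph pg repair' targets scrub inconsistencies rather than inactive/peering PGs; a generic first look (pgid is a placeholder) would be:

    ceph health detail
    ceph pg dump_stuck inactive
    ceph pg <pgid> query      # check "up", "acting" and any "blocked_by" in recovery_state
    ceph osd df tree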

[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-13 Thread Marc Roos
Can you also create snapshots via the vfs_ceph solution?

[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-13 Thread Martin Verges
Hello, we have a CTDB based HA Samba in our Ceph Management Solution. It works like a charm and we connect it to existing active directories as well. It's based on vfs_ceph and you can read more about how to configure it yourself on https://www.samba.org/samba/docs/current/man-html/vfs_ceph.8.htm
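For reference, a minimal share definition along the lines of that man page (the cephx user and paths are assumptions, not Martin's actual configuration):

    [cephfs]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        kernel share modes = no
        read only = no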

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-13 Thread Janek Bevendorff
I replaced ntpd with chronyd and will let you know if it changes anything. Thanks. On 13/03/2020 06:25, Konstantin Shalygin wrote: On 3/13/20 12:57 AM, Janek Bevendorff wrote: NTPd is running, all the nodes have the same time to the second. I don't think that is the problem. As always in s
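Two quick checks that can help correlate the MGR failures with clock state (commands as on Nautilus-era clusters; output wording may differ):

    ceph time-sync-status     # monitor-side view of clock skew
    chronyc tracking          # per-node offset and stratum once chronyd is running
    chronyc sources -v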

[ceph-users] Cancelled: Ceph Day Oslo May 13th

2020-03-13 Thread Wido den Hollander
Hi, Due to the recent developments around the COVID-19 virus we (the organizers) have decided to cancel the Ceph Day in Oslo on May 13th. Although it's still 8 weeks away, we don't know how the situation will develop and whether travel will be possible or people will be willing to travel. Therefore we thoug

[ceph-users] Re: Ceph Performance of Micron 5210 SATA?

2020-03-13 Thread vitalif
Hi, Can you test it slightly differently (and simpler)? Like in this googledoc: https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit#gid=0 As we know that it's a QLC drive, first let it fill the SLC cache: fio -ioengine=libaio -direct=1 -name=test -bs=4M -
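For anyone following along, the invocations in that spreadsheet are roughly of this shape (reconstructed, not quoted; device name and runtime are placeholders, and this overwrites /dev/sdX):

    # 1) sequential fill so the SLC cache is exhausted before measuring
    fio -ioengine=libaio -direct=1 -name=fill -bs=4M -iodepth=32 -rw=write -filename=/dev/sdX
    # 2) single-threaded sync 4k writes, roughly what a Ceph journal/WAL generates
    fio -ioengine=libaio -direct=1 -fsync=1 -name=syncwrite -bs=4k -iodepth=1 -rw=randwrite -runtime=60 -filename=/dev/sdX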

[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-13 Thread Nathan Fish
Note that we have had issues with deadlocks when re-exporting CephFS via Samba. It appears to only occur with Mac clients, though. In some cases it has hung on a request for a high-level directory and hung that branch for all clients. On Fri, Mar 13, 2020 at 1:56 AM Konstantin Shalygin wrote: > >

[ceph-users] ceph qos

2020-03-13 Thread 展荣臻(信泰)
Hi everyone: There are two QoS implementations in Ceph (one based on the token bucket algorithm, another based on mClock). Which one can I use in a production environment? Thank you
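As far as I understand, these are the two mechanisms being referred to; the option names below are taken from the Nautilus-era docs and should be verified against your release before any production use:

    # mClock-based op scheduling inside the OSD (requires an OSD restart to take effect)
    ceph config set osd osd_op_queue mclock_client      # default is "wpq"
    # token-bucket QoS applied per RBD image on the librbd side
    rbd config image set mypool/myimage rbd_qos_iops_limit 500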

[ceph-users] Re: Ceph Performance of Micron 5210 SATA?

2020-03-13 Thread Marc Roos
Hi Mourik Jan, > So, ran the fio commands, and pasted output (as it's quite a lot) here: > I hope someone here can draw some conclusions from this output... Now you know, it sort of performs similarly to other enterprise drives. And you know your Ceph solution will never perform beyond this

[ceph-users] Re: EC pool 4+2 - failed to guarantee a failure domain

2020-03-13 Thread Eugen Block
Hi, this is unexpected, of course, but it can happen if one OSD is full (or also nearfull?). Have you checked 'ceph osd df'? PG availability has higher priority than placement, so it's possible that during a failure some chunks are recreated on the same OSD or host even if the crush
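A sketch of the checks being suggested (the rule and pg names are placeholders):

    ceph osd df tree                        # per-OSD fullness within the CRUSH hierarchy
    ceph osd crush rule dump <rule_name>    # which failure domain the EC rule enforces
    ceph pg <pgid> query                    # "up"/"acting" show where the chunks actually are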

[ceph-users] Point-in-Time Recovery

2020-03-13 Thread Ml Ml
Hello List, when reading: https://docs.ceph.com/docs/master/rbd/rbd-mirroring/ it says: (...)Journal-based: This mode uses the RBD journaling image feature to ensure point-in-time, crash-consistent replication between clusters(...) Does this mean that we have some kind of transactio
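For context, journal-based mirroring is enabled roughly like this (pool/image names are placeholders; the full setup, including rbd-mirror daemons on both clusters, is in the linked doc):

    rbd feature enable mypool/myimage journaling    # writes now go through the RBD journal
    rbd mirror pool enable mypool image             # per-image mirroring mode on this pool
    rbd mirror image enable mypool/myimage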