[ceph-users] Ceph nvme timeout and then aborting

2021-02-19 Thread zxcs
Hi, I have one Ceph cluster on Nautilus 14.2.10, and each node has 3 SSDs and 4 HDDs. Each node also has two NVMes as cache (meaning nvme0n1 caches SSDs 0-2 and nvme1n1 caches HDDs 3-7), but one node's nvme0n1 always hits the issue below (see nvme … I/O … timeout, aborting), and suddenly this nv
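The "I/O … timeout, aborting" lines reported above come from the kernel nvme driver, which aborts commands that exceed its I/O timeout (30 seconds by default). While the drive itself is being investigated, one common stopgap is raising that timeout via the `nvme_core.io_timeout` kernel module parameter. A minimal sketch for a GRUB-based system such as Ubuntu 16.04 (the 255-second value and the `quiet splash` options are illustrative, not taken from this thread):

```shell
# /etc/default/grub: append nvme_core.io_timeout (in seconds) to the kernel
# command line, then run `update-grub` and reboot for it to take effect.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.io_timeout=255"
```

Raising the timeout only masks slow completions; it does not address why the device is stalling in the first place.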

[ceph-users] Re: Ceph nvme timeout and then aborting

2021-02-19 Thread Konstantin Shalygin
Please paste your `nvme smart-log /dev/nvme0n1` output k > On 19 Feb 2021, at 12:53, zxcs wrote: > > I have one ceph cluster with nautilus 14.2.10 and one node has 3 SSD and 4 > HDD each. > Also has two nvmes as cache. (Means nvme0n1 cache for 0-2 SSD and Nvme1n1 > cache for 3-7 HDD) >

[ceph-users] Re: Ceph nvme timeout and then aborting

2021-02-19 Thread zxcs
Thank you very much, Konstantin! Here is the output of `nvme smart-log /dev/nvme0n1` Smart Log for NVME device:nvme0n1 namespace-id: critical_warning: 0 temperature : 27 C available_spare : 100% available_spare_threshold
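The smart-log output is quoted here as one flattened line; when comparing fields across nodes it can help to pull out individual values programmatically. A small awk sketch, using sample lines matching the values quoted in this thread (only the fields visible above are included):

```shell
# Sample `nvme smart-log` lines, with the colon-aligned spacing nvme-cli uses.
smartlog='critical_warning : 0
temperature : 27 C
available_spare : 100%'

# Print the value of one smart-log field, trimming padding around the colon.
get_field() {
    printf '%s\n' "$smartlog" | awk -F':' -v k="$1" '
        { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
        $1 == k { sub(/^[ \t]+/, "", $2); print $2 }'
}

get_field temperature        # prints: 27 C
get_field critical_warning   # prints: 0
```

On a live system the `smartlog` variable would instead be filled with `smartlog=$(nvme smart-log /dev/nvme0n1)`.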

[ceph-users] Re: Ceph nvme timeout and then aborting

2021-02-19 Thread zxcs
BTW, I actually have two nodes with the same issue, and the other error node's nvme output is as below Smart Log for NVME device:nvme0n1 namespace-id: critical_warning: 0 temperature : 29 C available_spare : 100% available_spare_thre

[ceph-users] Re: Ceph nvme timeout and then aborting

2021-02-19 Thread Konstantin Shalygin
Looks good; what is your hardware? Server model & NVMes? k > On 19 Feb 2021, at 13:22, zxcs wrote: > > BTW, actually i have two nodes has same issues, and another error node's nvme > output as below > > Smart Log for NVME device:nvme0n1 namespace-id: > critical_warning

[ceph-users] Re: Ceph nvme timeout and then aborting

2021-02-19 Thread zxcs
You mean the OS? It's Ubuntu 16.04, and the NVMe is a Samsung 970 PRO 1TB. Thanks, zx > On 19 Feb 2021, at 6:56 PM, Konstantin Shalygin > wrote: > > Looks good, what is your hardware? Server model & NVMes? > > > > k > >> On 19 Feb 2021, at 13:22, zxcs > > wrote:

[ceph-users] Re: Data Missing with RBD-Mirror

2021-02-19 Thread Vikas Rana
Friends, Any help or suggestion here for missing data? Thanks, -Vikas From: Vikas Rana Sent: Tuesday, February 16, 2021 12:20 PM To: 'ceph-users@ceph.io' Subject: Data Missing with RBD-Mirror Hi Friends, We have a very weird issue with rbd-mirror replication. As per the comm

[ceph-users] Re: Data Missing with RBD-Mirror

2021-02-19 Thread Eugen Block
Did you see the responses yet? There were two of them. Quoting Vikas Rana: Friends, Any help or suggestion here for missing data? Thanks, -Vikas From: Vikas Rana Sent: Tuesday, February 16, 2021 12:20 PM To: 'ceph-users@ceph.io' Subject: Data Missing with RBD-Mirror Hi Friend

[ceph-users] Re: Data Missing with RBD-Mirror

2021-02-19 Thread Vikas Rana
Hello Mykola and Eugen, There was no interruption, and we are on a campus with a 10G backbone. We are on 12.2.10, I believe. We wanted to check the data on the DR side, so we created a snapshot on the primary, which was available on the DR side very quickly. It kind of gave me the feeling that rbd-mirror is not s
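When a snapshot propagates quickly but older data seems missing, it is worth confirming whether rbd-mirror is actually replaying the image journal on the DR side. A sketch of the standard status checks (pool name `rbd` and image name `vol1` are placeholders, not names from this thread; requires a live cluster):

```shell
# Summarize mirroring health for every image in the pool on the DR cluster.
rbd mirror pool status rbd --verbose

# Per-image detail; a healthy replica should report "state: up+replaying".
rbd mirror image status rbd/vol1
```

A state such as `up+stopped` or a large `entries_behind_master` in the journal status would point at replication lag rather than lost data.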

[ceph-users] How to get ceph-volume to take pre-existing, working auth?

2021-02-19 Thread Philip Brown
I'm trying to use ceph-volume to do various things. It works fine locally, for things like `ceph-volume lvm zap`. But when I want it to do OSD-level things, it is unhappy. To use a trivial example, it wants to do things like /usr/bin/ceph --cluster ceph --keyring /var/lib/ceph/bootstrap-osd/ceph.

[ceph-users] Re: How to get ceph-volume to take pre-existing, working auth?

2021-02-19 Thread Philip Brown
Guess I was trying to do things in a non-optimal way. Found the answer on Reddit, of all places. From https://www.reddit.com/r/ceph/comments/kv3z7h/cephadm_osd_deploy_errors/ (edited a bit): ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring - Original Message ---
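The Reddit fix amounts to re-exporting the bootstrap-osd key into the path ceph-volume reads by default. A sketch of the full sequence (run as root on the affected node; assumes the default cluster name `ceph`, an available admin keyring, and the standard `ceph:ceph` ownership convention):

```shell
# Recreate the bootstrap-osd keyring that ceph-volume expects to find.
mkdir -p /var/lib/ceph/bootstrap-osd
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring

# Restrict the keyring to the ceph user, since it contains a secret key.
chown ceph:ceph /var/lib/ceph/bootstrap-osd/ceph.keyring
chmod 600 /var/lib/ceph/bootstrap-osd/ceph.keyring
```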