[ceph-users] NVMe's

2020-09-22 Thread Brent Kennedy
We currently run an SSD cluster and HDD clusters and are looking at possibly creating a cluster for NVMe storage. For spinners and SSDs, the recommended maximum per OSD host seemed to be 16 OSDs (I know it depends on the CPUs and RAM, e.g. 1 CPU core and 2 GB of memory per OSD). Questions: 1. If

[ceph-users] Re: Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread Lindsay Mathieson
On 23/09/2020 2:29 pm, Виталий Филиппов wrote: Not RBD, it has its own qemu driver. How have you integrated it into Qemu? From memory, Qemu doesn't support plugin drivers. Do we need to custom-patch Qemu? -- Lindsay

[ceph-users] Re: Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread Виталий Филиппов
Yes. Not RBD, it has its own qemu driver. On 23 September 2020 at 3:24:23 GMT+03:00, Lindsay Mathieson wrote: >On 23/09/2020 8:44 am, vita...@yourcmc.ru wrote: >> There are more details in the README file which currently opens from >the domain https://vitastor.io > >that redirects to

[ceph-users] Re: Low level bluestore usage

2020-09-22 Thread tri
You can also expand the OSD. ceph-bluestore-tool has an option for expanding the OSD. I'm not 100% sure whether that would solve the RocksDB out-of-space issue; I think it will, though. If not, you can move RocksDB to a separate block device. September 22, 2020 7:31 PM, "George Shuklin" wrote: >
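For reference, a minimal sketch of that expansion path on a non-containerized, LVM-backed BlueStore OSD; the OSD id 3 and the VG/LV names are placeholders, the OSD should be stopped first, and the ceph-bluestore-tool man page for your release is worth checking before running this:

  systemctl stop ceph-osd@3                     # stop the OSD before touching bluefs
  lvextend -L +20G /dev/ceph-vg/osd-block-3     # grow the underlying LV (placeholder names)
  ceph-bluestore-tool bluefs-bdev-expand \
      --path /var/lib/ceph/osd/ceph-3           # let BlueFS pick up the new device size
  systemctl start ceph-osd@3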

[ceph-users] Re: ceph-volume lvm cannot zap???

2020-09-22 Thread Brent Kennedy
I run this command, e.g. "ceph-volume lvm zap --destroy /dev/sdm", when zapping, but I haven't run into any locks since going to Nautilus; it seems to know when a disk is dead. Removing an OSD requires you to stop the process, so I can't imagine anything was still running against it. Perhaps a bad

[ceph-users] Re: Low level bluestore usage

2020-09-22 Thread Alexander E. Patrakov
On Wed, Sep 23, 2020 at 3:03 AM Ivan Kurnosov wrote: > > Hi, > > this morning I woke up to a degraded test ceph cluster (managed by rook, > but it does not really change anything for the question I'm about to ask). > > After checking logs I have found that bluestore on one of the OSDs ran out >

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Frank Schilder
No, the recipe I gave was for trying to recover a healthy status for all PGs in the current situation. I would avoid moving OSDs at all costs, because it will always imply rebalancing. Any change to the crush map changes how PGs are hashed onto OSDs, which in turn triggers a rebalancing. If
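If a crush change cannot be avoided, one common way to keep it from immediately shuffling data is to set the rebalance/backfill flags around the change; this is a general sketch, not part of the recipe referenced above:

  ceph osd set norebalance     # pause rebalancing while the crush map is edited
  ceph osd set nobackfill
  # ... apply the planned crush change here ...
  ceph osd unset nobackfill    # then let data movement run in a controlled window
  ceph osd unset norebalance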

[ceph-users] Re: Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread Lindsay Mathieson
On 23/09/2020 8:44 am, vita...@yourcmc.ru wrote: There are more details in the README file which currently opens from the domain https://vitastor.io that redirects to https://yourcmc.ru/git/vitalif/vitastor Is that your own site? -- Lindsay

[ceph-users] Re: Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread Lindsay Mathieson
On 23/09/2020 8:44 am, vita...@yourcmc.ru wrote: After almost a year of development in my spare time I present my own software-defined block storage system: Vitastor - https://vitastor.io Interesting, thanks. It supports qemu connecting via rbd? -- Lindsay

[ceph-users] Re: Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread Alexander E. Patrakov
On Wed, Sep 23, 2020 at 3:44 AM wrote: > > Hi! > > After almost a year of development in my spare time I present my own > software-defined block storage system: Vitastor - https://vitastor.io > > I designed it similar to Ceph in many ways, it also has Pools, PGs, OSDs, > different coding

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Frank Schilder
> > Is the crush map aware about that? > > Yes, it correctly shows the osds at server8 (previously server15). > > > I didn't ever try that, but don't you need to crush move it? > > I originally imagined this, too. But as soon as the osd starts on a new > server it is automatically put into the

[ceph-users] Re: Low level bluestore usage

2020-09-22 Thread George Shuklin
As far as I know, bluestore doesn't like super small sizes. Normally the OSD should stop doing funny things at the full mark, but if the device is too small it may be too late and bluefs runs out of space. Two things: 1. Don't use too-small OSDs. 2. Keep a spare area on the drive. I usually reserve 1% for
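A minimal sketch of reserving such a spare area when provisioning an LVM-backed OSD; /dev/sdX and the VG/LV names are placeholders:

  pvcreate /dev/sdX
  vgcreate ceph-vg /dev/sdX
  lvcreate -l 99%VG -n osd-block ceph-vg          # leave ~1% of the device unallocated as head room
  ceph-volume lvm create --data ceph-vg/osd-block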

[ceph-users] Vitastor, a fast Ceph-like block storage for VMs

2020-09-22 Thread vitalif
Hi! After almost a year of development in my spare time I present my own software-defined block storage system: Vitastor - https://vitastor.io I designed it to be similar to Ceph in many ways: it also has Pools, PGs, OSDs, different coding schemes, rebalancing and so on. However it's much simpler

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Follow-up on the tell hanging: iterating over all osds and trying to raise the max-backfills gives hanging ceph tell processes like this: root 1007846 15.3 1.2 918388 50972 pts/5 Sl 00:03 0:48 /usr/bin/python3 /usr/bin/ceph tell osd.4 injectargs --osd-max-backfill root 1007890

[ceph-users] Re: Troubleshooting stuck unclean PGs?

2020-09-22 Thread Lindsay Mathieson
On 23/09/2020 12:51 am, Lenz Grimmer wrote: It's on the OSD page, click the "Cluster-wide configuration -> Recovery priority" option on top of the table. Thanks Lenz! Totally missed that (and the PG Scrub options) -- Lindsay

[ceph-users] Low level bluestore usage

2020-09-22 Thread Ivan Kurnosov
Hi, this morning I woke up to a degraded test ceph cluster (managed by rook, but that does not really change anything for the question I'm about to ask). After checking the logs I found that bluestore on one of the OSDs ran out of space. Some cluster details: ceph version 15.2.4

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Hey Andreas, thanks for the insights. Maybe a bit more background: we are running a variety of pools; the majority of data is stored on the "hdd" and "ssd" pools, which make use of the "ssd" and "hdd-big" (as in 3.5") classes. Andreas John writes: > On 22.09.20 22:09, Nico Schottelius

[ceph-users] Re: Remove separate WAL device from OSD

2020-09-22 Thread Andreas John
Hello, isn't "ceph-osd -i osdnum... --flush-journal" and then removing the journal enough? On 22.09.20 21:09, Michael Fladischer wrote: > Hi, > > Is it possible to remove an existing WAL device from an OSD? I saw > that ceph-bluestore-tool has a command bluefs-bdev-migrate, but it's > not clear to

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Hey Frank, Frank Schilder writes: >> > Is the crush map aware about that? >> >> Yes, it correctly shows the osds at server8 (previously server15). >> >> > I didn't ever try that, but don't you need to crush move it? >> >> I originally imagined this, too. But as soon as the osd starts on a new

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Hey Andreas, Andreas John writes: > Hey Nico, > > maybe you "pinned" the IP of the OSDs in question in ceph.conf to the IP > of the old chassis? That would be nice - unfortunately our ceph.conf is almost empty: [22:11:59] server15.place6:/sys/class/block/sdg# cat /etc/ceph/ceph.conf #

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Update: restarting other osds on the server that we took osds from seems to have reduced the number of unknown pgs down to 170. However, the peering and activating states seem to persist for a very long time on these OSDs: cluster: id: 1ccd84f6-e362-4c50-9ffe-59436745e445 health: HEALTH_ERR

[ceph-users] Re: Understanding what ceph-volume does, with bootstrap-osd/ceph.keyring, tmpfs

2020-09-22 Thread Marc Roos
At least ceph taught you the essence of doing proper testing first ;) If you test your use case, you either get a positive or a negative result, not a problem. However, I do have to admit that ceph could be more transparent about publishing testing and performance results. I have

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Hey Andreas, Andreas John writes: > Hello, > > On 22.09.20 20:45, Nico Schottelius wrote: >> Hello, >> >> after having moved 4 ssds to another host (+ the ceph tell hanging issue >> - see previous mail), we ran into 241 unknown pgs: > > You mean, that you re-seated the OSDs into another

[ceph-users] Remove separate WAL device from OSD

2020-09-22 Thread Michael Fladischer
Hi, Is it possible to remove an existing WAL device from an OSD? I saw that ceph-bluestore-tool has a command bluefs-bdev-migrate, but it's not clear to me if this can only move a WAL device or if it can be used to remove it ... Regards, Michael
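For what it's worth, a sketch of how bluefs-bdev-migrate could be used to fold a separate WAL back into the main device; OSD id 7 is a placeholder, the OSD must be stopped, and leftover block.wal symlinks/LV tags may still need manual cleanup, so check the ceph-bluestore-tool man page for your release first:

  systemctl stop ceph-osd@7
  ceph-bluestore-tool bluefs-bdev-migrate \
      --path /var/lib/ceph/osd/ceph-7 \
      --devs-source /var/lib/ceph/osd/ceph-7/block.wal \
      --dev-target /var/lib/ceph/osd/ceph-7/block   # move the WAL contents onto the main block device
  systemctl start ceph-osd@7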

[ceph-users] Re: Unknown PGs after osd move

2020-09-22 Thread Andreas John
Hello, On 22.09.20 20:45, Nico Schottelius wrote: > Hello, > > after having moved 4 ssds to another host (+ the ceph tell hanging issue > - see previous mail), we ran into 241 unknown pgs: You mean, that you re-seated the OSDs into another chassis/host? Is the crush map aware about that? I

[ceph-users] Unknown PGs after osd move

2020-09-22 Thread Nico Schottelius
Hello, after having moved 4 ssds to another host (+ the ceph tell hanging issue - see previous mail), we ran into 241 unknown pgs: cluster: id: 1ccd84f6-e362-4c50-9ffe-59436745e445 health: HEALTH_WARN noscrub flag(s) set 2 nearfull osd(s) 1
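A few read-only commands that help narrow down where unknown/inactive PGs come from; the PG id is just an example taken from later in this thread:

  ceph health detail | head -40      # lists the affected PGs and OSDs
  ceph pg dump_stuck inactive        # PGs that are not active (peering, activating, unknown)
  ceph pg 2.5b2 query                # per-PG detail; may hang while no OSD claims the PG
  ceph osd tree                      # confirm where the moved OSDs ended up in the crush map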

[ceph-users] Documentation broken

2020-09-22 Thread Frank Schilder
Hi all, during the migration of documentation, would it be possible to make the old documentation available somehow? A lot of pages are broken and I can't access the documentation for mimic at all any more. Is there an archive or something similar? Thanks and best regards, =

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
So the same problem happens with pgs which are in the "unknown" state: [19:31:08] black2.place6:~# ceph pg 2.5b2 query | tee query_2.5b2 hangs until the pg actually becomes active again. I assume that this should not be the case, should it? Nico Schottelius writes: > Update to the update:

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Stefan Kooman
Hi, > However as soon as we issue either of the above tell commands, it just > hangs. Furthermore when ceph tell hangs, pg are also becoming stuck in > "Activating" and "Peering" states. > > It seems to be related, as soon as we stop ceph tell (ctrl-c it), a few > minutes later the pgs are

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Update to the update: currently debugging why pgs are stuck in the peering state: [18:57:49] black2.place6:~# ceph pg dump all | grep 2.7d1 dumped all 2.7d1 1 0 0 0 0 69698617344 0 0 3002 3002

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
I have now started iterating over all osds in the tree, and some of the osds are completely unresponsive: [18:27:18] black1.place6:~# for osd in $(ceph osd tree | grep osd. | awk '{ print $4 }'); do echo $osd; ceph tell $osd injectargs '--osd-max-backfills 1'; done osd.20 osd.56 osd.62 osd.63

[ceph-users] Re: Understanding what ceph-volume does, with bootstrap-osd/ceph.keyring, tmpfs

2020-09-22 Thread Kevin Myers
Tbh, ceph caused us more problems than it tried to fix. YMMV, good luck. > On 22 Sep 2020, at 13:04, t...@postix.net wrote: > > The key is stored in the ceph cluster config db. It can be retrieved by > > KEY=`/usr/bin/ceph --cluster ceph --name client.osd-lockbox.${OSD_FSID} > --keyring

[ceph-users] Re: Ceph MDS stays in "up:replay" for hours. MDS failover takes 10-15 hours.

2020-09-22 Thread Patrick Donnelly
On Tue, Sep 22, 2020 at 2:59 AM wrote: > > Hi there, > > We have a 9-node Ceph cluster. Ceph version is 15.2.5. The cluster has 175 OSDs > (HDD) + 3 NVMe as a cache tier for the "cephfs_data" pool. CephFS pools info: > POOL ID STORED OBJECTS USED %USED MAX AVAIL >

[ceph-users] Re: Understanding what ceph-volume does, with bootstrap-osd/ceph.keyring, tmpfs

2020-09-22 Thread tri
The key is stored in the ceph cluster config db. It can be retrieved with KEY=`/usr/bin/ceph --cluster ceph --name client.osd-lockbox.${OSD_FSID} --keyring $OSD_PATH/lockbox.keyring config-key get dm-crypt/osd/$OSD_FSID/luks` September 22, 2020 2:25 AM, "Janne Johansson" wrote: > On Mon, 21
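As an illustration of how that retrieved key is typically consumed; the fsid, OSD path, device path and mapper name below are placeholders, not from the original post:

  OSD_FSID=<fsid-of-the-osd>                  # placeholder
  OSD_PATH=/var/lib/ceph/osd/ceph-<id>        # placeholder
  KEY=$(ceph --cluster ceph --name client.osd-lockbox.${OSD_FSID} \
        --keyring ${OSD_PATH}/lockbox.keyring \
        config-key get dm-crypt/osd/${OSD_FSID}/luks)
  echo -n "$KEY" | cryptsetup luksOpen --key-file - /dev/ceph-vg/osd-block osd-block-open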

[ceph-users] Re: one-liner getting block device from mounted osd

2020-09-22 Thread Marc Roos
ceph device ls -Original Message- To: ceph-users Subject: [ceph-users] one-liner getting block device from mounted osd I have an optimize script that I run after the reboot of a ceph node. It sets among other things /sys/block/sdg/queue/read_ahead_kb and

[ceph-users] Re: rgw.none vs quota

2020-09-22 Thread Jean-Sebastien Landry
Manuel & Konstantin, thank you for confirming this. I should upgrade to Nautilus in the next few weeks. I'll just live with it for now. Thanks!

[ceph-users] one-liner getting block device from mounted osd

2020-09-22 Thread Marc Roos
I have an optimize script that I run after the reboot of a ceph node. It sets, among other things, /sys/block/sdg/queue/read_ahead_kb and /sys/block/sdg/queue/nr_requests of the block devices being used for OSDs. Normally I use the mount command to discover these, but with the tmpfs and
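One possible approach for BlueStore OSDs, since the mount is a tmpfs: follow the block symlink and let lsblk walk back to the physical disk. This is only a sketch, not a tested one-liner:

  for p in /var/lib/ceph/osd/ceph-*; do
      lv=$(readlink -f "$p/block")        # the LV backing this OSD
      echo "== $p -> $lv"
      lsblk -s -no NAME "$lv"             # prints the LV and then its parents, down to e.g. sdg
  done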

[ceph-users] Slow cluster and incorrect peers

2020-09-22 Thread Nico Schottelius
Hello again, following up on the previous mail: one cluster is getting rather slow at the moment and we have spotted something "funny": when checking ceph pg dump we see some osds have HB peers with osds that they should not have any pg in common with. When restarting one of the affected osds, we

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Hello Stefan, Stefan Kooman writes: > Hi, > >> However as soon as we issue either of the above tell commands, it just >> hangs. Furthermore when ceph tell hangs, pg are also becoming stuck in >> "Activating" and "Peering" states. >> >> It seems to be related, as soon as we stop ceph tell

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread Stefan Kooman
On 2020-09-21 21:12, Wout van Heeswijk wrote: > Hi Rene, > > Yes, cephfs is a good filesystem for concurrent writing. When using CephFS > with ganesha you can even scale out. > > It will perform better but why don't you mount CephFS inside the VM? ^^ This. But it depends on the VMs you are

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread Andreas John
Hello, https://docs.ceph.com/en/latest/rados/operations/erasure-code/ but you could probably intervene manually if you want an erasure-coded pool. rgds, j. On 22.09.20 14:55, René Bartsch wrote: > On Tuesday, 22.09.2020 at 14:43 +0200, Andreas John wrote: >> Hello, >> >> yes, it does.

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread Lindsay Mathieson
On 22/09/2020 10:55 pm, René Bartsch wrote: What do you mean with EC? Proxmox doesn't support creating EC pools via its GUI, as EC is not considered a good fit for VM hosting. However, you can create EC pools via the command line as normal. -- Lindsay
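A rough sketch of creating such a pool on the command line and using it for RBD images; the profile name, pool names, k/m values and image name are examples only:

  ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
  ceph osd pool create ecpool 64 64 erasure ec-2-1
  ceph osd pool set ecpool allow_ec_overwrites true            # needed for RBD on an EC data pool
  rbd create --size 10G --data-pool ecpool rbd/vm-100-disk-0   # image metadata stays in the replicated 'rbd' pool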

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread René Bartsch
On Tuesday, 22.09.2020 at 14:43 +0200, Andreas John wrote: > Hello, > > yes, it does. It even comes with a GUI to manage ceph and its own basic-setup > tool. No EC support. What do you mean by EC? Regards, Renne

[ceph-users] Re: RBD-Mirror: snapshots automatically created?

2020-09-22 Thread Eugen Block
Thank you, Jason. As usual you're very helpful. :-) Quoting Jason Dillaman: On Tue, Sep 22, 2020 at 7:23 AM Eugen Block wrote: It just hit me when I pushed the "send" button: the (automatically created) first snapshot initiates the first full sync to catch up on the remote site, but

[ceph-users] Re: RBD-Mirror: snapshots automatically created?

2020-09-22 Thread Jason Dillaman
On Tue, Sep 22, 2020 at 7:23 AM Eugen Block wrote: > > It just hit me when I pushed the "send" button: the (automatically > created) first snapshot initiates the first full sync to catch up on > the remote site, but from then it's either a manual process or the > snapshot schedule. Is that it?

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread Andreas John
Hello, yes, it does. It even comes with a GUI to manage ceph and its own basic-setup tool. The only issue is with the backup stuff, which uses "vzdump" under the hood and can cause high load. The reason is not really known yet, but some suspect that small block sizes

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread René Bartsch
On Tuesday, 22.09.2020 at 08:50 +0200, Robert Sander wrote: > Do you know that Proxmox is able to store VM images as RBD directly > in a > Ceph cluster? Does Proxmox support snapshots, backups and thin provisioning with RBD VM images? Regards, Renne

[ceph-users] [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Hello, recently we wanted to re-adjust the rebalancing speed in one cluster with ceph tell osd.* injectargs '--osd-max-backfills 4' and ceph tell osd.* injectargs '--osd-recovery-max-active 4'. The first osds responded, and after about 6-7 osds ceph tell stopped progressing, just after it encountered a
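A small workaround sketch for the hang described above: issue the tell per OSD and bound each call with a timeout so one unresponsive OSD doesn't block the rest (the 15-second limit is an arbitrary choice):

  for id in $(ceph osd ls); do
      echo "osd.$id"
      timeout 15 ceph tell osd.$id injectargs '--osd-max-backfills 4' \
          || echo "osd.$id did not answer within 15s"
  done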

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread René Bartsch
On Tuesday, 22.09.2020 at 11:25 +0200, Stefan Kooman wrote: > On 2020-09-21 21:12, Wout van Heeswijk wrote: > > Hi Rene, > > > > Yes, cephfs is a good filesystem for concurrent writing. When using > > CephFS with ganesha you can even scale out. > > > > It will perform better but why don't

[ceph-users] Re: Troubleshooting stuck unclean PGs?

2020-09-22 Thread Lindsay Mathieson
On 22/09/2020 7:10 pm, Lenz Grimmer wrote: Alternatively, you could have used the Dashboard's OSD Recovery Priority feature (see screenshot) Whereabouts is that? Not seeing it on the ceph nautilus dashboard -- Lindsay

[ceph-users] Re: RBD-Mirror: snapshots automatically created?

2020-09-22 Thread Eugen Block
It just hit me when I pushed the "send" button: the (automatically created) first snapshot initiates the first full sync to catch up on the remote site, but from then it's either a manual process or the snapshot schedule. Is that it? Quoting Eugen Block: Hi *, I encountered a rather

[ceph-users] RBD-Mirror: snapshots automatically created?

2020-09-22 Thread Eugen Block
Hi *, I encountered a rather strange (or at least unexpected) behaviour of the rbd-mirror. Maybe I don't fully understand the feature so please correct me if my assumptions are wrong. My two (virtual one-node) clusters are still on ceph version 15.2.4-864-g0f510cb110 and the following

[ceph-users] Ceph MDS stays in "up:replay" for hours. MDS failover takes 10-15 hours.

2020-09-22 Thread heilig . oleg
Hi there, We have a 9-node Ceph cluster. Ceph version is 15.2.5. The cluster has 175 OSDs (HDD) + 3 NVMe as a cache tier for the "cephfs_data" pool. CephFS pools info: POOL ID STORED OBJECTS USED %USED MAX AVAIL cephfs_data 1 350 TiB 179.53M 350 TiB 66.93

[ceph-users] Re: virtual machines crashes after upgrade to octopus

2020-09-22 Thread Michael Bisig
Hello all, We are also facing this problem and would like to upgrade the clients to the specific release. @Jason, can you point us to the respective commit and the point release that contains the fix? Thanks in advance for your help. Best regards, Michael On 18.09.20, 15:12, "Lomayani S.

[ceph-users] Re: Troubleshooting stuck unclean PGs?

2020-09-22 Thread Wout van Heeswijk
Hi Matt, Looks like you are on the right track. Kind regards, Wout 42on From: Matt Larson Sent: Tuesday, September 22, 2020 5:44 AM To: Wout van Heeswijk Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Troubleshooting stuck unclean PGs? I tried this:

[ceph-users] Re: ceph docs redirect not good

2020-09-22 Thread Marc Roos
Also https://docs.ceph.com/docs/mimic/rados/configuration/ceph-conf/ -Original Message- From: Marc Roos Sent: Sunday, 20 September 2020 15:36 To: ceph-users Subject: [ceph-users] ceph docs redirect not good https://docs.ceph.com/docs/mimic/man/8/ceph-volume-systemd/

[ceph-users] Re: Mount CEPH-FS on multiple hosts with concurrent access to the same data objects?

2020-09-22 Thread Robert Sander
On 21.09.20 18:44, René Bartsch wrote: > We're planning a Proxmox-Cluster. The data-center operator advised to > use a virtual machine with NFS on top of a single CEPH-FS instance to > mount the shared CEPH-FS storage on multiple hosts/VMs. For what purpose do you plan to use CephFS? Do you

[ceph-users] Re: Understanding what ceph-volume does, with bootstrap-osd/ceph.keyring, tmpfs

2020-09-22 Thread Janne Johansson
On Mon, 21 Sep 2020 at 16:15, Marc Roos wrote: > When I create a new encrypted osd with ceph-volume [1] > > Q4: Where is this luks passphrase stored? > I think the OSD asks the mon for it after authenticating, so "in the mon DBs" somewhere. -- May the most significant bit of your life be positive.
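If you want to verify that, the stored entries can be listed and fetched from the monitors; the key path below matches the retrieval command quoted earlier in this thread, with the fsid left as a placeholder:

  ceph config-key ls | grep dm-crypt              # should show dm-crypt/osd/<fsid>/luks entries
  ceph config-key get dm-crypt/osd/<fsid>/luks    # the passphrase itself, handle with care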