More issues:
ceph-volume simple activate --file /etc/ceph/osd/33-615e6d0c-e3e9-4f55-9b6a-94243faa848b.json --no-systemd
Running command: /usr/bin/mount -v /dev/sdb1 /var/lib/ceph/osd/ceph-33
stderr: mount: mount point /var/lib/ceph/osd/ceph-33 does not exist
Shouldn't the mount point creation b
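As a workaround, creating the directory by hand before re-running the activation should get past that error; a rough sketch, reusing the OSD id and JSON file from the command above:

  mkdir -p /var/lib/ceph/osd/ceph-33
  ceph-volume simple activate --file \
      /etc/ceph/osd/33-615e6d0c-e3e9-4f55-9b6a-94243faa848b.json --no-systemd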
After disabling the insights module in the mgr, the mons' rocksdb submit sync latency
went down and my problem is solved!
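For anyone searching later, the module toggle is a one-liner; a quick sketch (check first that insights is actually enabled on your mgr):

  ceph mgr module ls                # 'insights' appears under enabled_modules if active
  ceph mgr module disable insights  # stop the module
  ceph mgr module enable insights   # to turn it back on later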
On Fri, Feb 5, 2021 at 2:36 PM Seena Fallah wrote:
> Is there any suggestion on disk spec? I can't find any doc about it in
> Ceph either!
>
> On Fri, Feb 5, 2021 at 11:37 AM Eugen Block wrote:
Hi all,
I'm experimenting with ceph-volume on CentOS 7, Ceph Mimic 13.2.10. When I
execute "ceph-volume deactivate ..." on a previously activated OSD, I get this
error:
# ceph-volume lvm deactivate 12 0bbf481c-6a3d-4724-9a27-3a845eb05911
stderr: /usr/bin/findmnt: invalid option -- 'M'
stderr: U
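The error is coming from the local findmnt, not from Ceph itself; the util-linux shipped with CentOS 7 may simply be too old to know the -M/--mountpoint option. A quick check, plus a manual fallback for taking the OSD down (a rough sketch; OSD id 12 is taken from the command above):

  findmnt --version                 # util-linux version in use
  findmnt --help | grep -e '-M'     # is --mountpoint supported at all?
  # manual fallback if deactivate keeps failing:
  systemctl stop ceph-osd@12
  umount /var/lib/ceph/osd/ceph-12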
> Redhat/Micron/Samsung/Supermicro have all put out white papers backing the
> idea of 2 copies on NVMe's as safe for production.
It's not like you can just jump from "unsafe" to "safe" -- it is about
comparing the probability of losing data against how valuable that
data is.
A vendor's decision
On 05/02/2021 20:10, Mario Giammarco wrote:
It is not that one morning I wake up and put some random hardware together;
I followed guidelines.
The result should be:
- if a disk (or more) breaks, work goes on
- if a server breaks, the VMs on that server start on another server and
work goes on.
The
I have just one more suggestion for you:
> but even our Supermicro contact that we worked the
> config out with was in agreement with 2x on NVMe
These kinds of settings aren't set in stone; it is a one-line command
to rebalance (admittedly you wouldn't want to just do this casually).
I don't kno
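For reference, the one-liner being referred to is just the pool size setting; changing it is what triggers the rebalance (a sketch, pool name is a placeholder):

  ceph osd pool set <pool> size 3      # move from 2x to 3x replication
  ceph osd pool set <pool> min_size 2
  ceph -s                              # watch the backfill it kicks off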
On Thu, Feb 4, 2021 at 12:19 PM Eneko Lacunza wrote:
> Hi all,
>
> El 4/2/21 at 11:56, Frank Schilder wrote:
> >> - three servers
> >> - three monitors
> >> - 6 osd (two per server)
> >> - size=3 and min_size=2
> > This is a set-up that I would not run at all. The first one is,
I don't run a secondary site and don't know if short windows of read-only
access are terrible. From the data security point of view, min_size 2 is fine.
It's the min_size 1 that is really dangerous, because it accepts non-redundant
writes.
Even if you lose the second site entirely, you can alwa
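To make that concrete, the setting is checked and enforced per pool; a sketch (pool name is a placeholder):

  ceph osd pool get <pool> min_size    # see the current value
  ceph osd pool set <pool> min_size 2  # refuse writes with fewer than 2 copies available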
Hi,
I am running a Ceph Octopus (15.2.8) cluster primarily for CephFS.
Metadata is stored on SSD, data is stored in three different pools on
HDD. Currently, I use 22 subvolumes.
I am rotating snapshots on 16 subvolumes, all in the same pool, which is
the primary data pool for CephFS. Current
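For context, the rotation itself goes through the subvolume snapshot interface; roughly like this, with volume, subvolume and snapshot names as placeholders:

  ceph fs subvolume snapshot create cephfs sub01 snap-$(date +%F)
  ceph fs subvolume snapshot ls cephfs sub01
  ceph fs subvolume snapshot rm cephfs sub01 snap-2021-01-01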
Why would you use RAID underneath Ceph? The only reason I've seen to
do that is if you don't have enough CPU to run enough OSDs.
On Fri, Feb 5, 2021 at 11:09 AM Jack wrote:
>
> Is RAID1 dangerous?
> Is RAID5 dangerous?
>
> They both allow non-redundant writes
>
>
> On 2/5/21 4:19 PM, Frank Schi
Analogies between a distributed system and one that isn’t can be a bit strained
or nuanced.
The question really isn’t IF a given solution is dangerous, but HOW dangerous
it is. There is always a long tail ; one picks a point along it based on
capex, business needs, etc.
I sometimes read t
> Picture this, using size=3, min_size=2:
> - One node is down for maintenance
> - You lose a couple of devices
> - You lose data
>
> Is it likely that an NVMe device dies during a short maintenance window?
> Is it likely that two devices die at the same time?
If you just look at it from this
Is RAID1 dangerous?
Is RAID5 dangerous?
They both allow non-redundant writes
On 2/5/21 4:19 PM, Frank Schilder wrote:
I don't run a secondary site and don't know if short windows of read-only
access are terrible. From the data security point of view, min_size 2 is fine.
It's the min_size 1
I'll power the cluster up today or tomorrow and take a look again, Dan, but
the initial problem is that many of the PGs can't be queried -- the requests
time out. I don't know if it's only the stale PGs, or just the unknown ones,
that can't be queried, but I'll investigate if there's something wrong wi
Eeek! Don't run `osd_find_best_info_ignore_history_les = true` -- that
can lead to data loss in ways you don't expect.
Are you sure all OSDs are up?
Query a PG to find out why it is unknown: `ceph pg query`. Feel
free to share that
In fact, the 'unknown' state means the MGR doesn't know th
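For completeness, the checks being suggested look roughly like this (the PG id is a placeholder):

  ceph osd stat                 # are all OSDs up and in?
  ceph health detail            # lists the unknown/stale PGs
  ceph pg dump_stuck stale      # stuck PGs and the OSDs they map to
  ceph pg <pgid> query          # ask the acting primary why it is in that state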
I was in the middle of a rebalance on a small test cluster with about 1% of
pgs degraded, and shut the cluster entirely down for maintenance.
On startup, many pgs are entirely unknown, and most stale. In fact most pgs
can't be queried! No mon failures. Would osd logs tell me why pgs aren't
even mo
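The OSD logs do record peering decisions if the debug level is raised; a rough sketch (the debug level and OSD id are just examples):

  ceph tell 'osd.*' injectargs '--debug-osd 10'   # raise OSD debug logging cluster-wide
  less /var/log/ceph/ceph-osd.12.log              # peering / activation messages land here
  ceph tell 'osd.*' injectargs '--debug-osd 1/5'  # restore the default afterwards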
Those are my thoughts as well. We have 40Gbit/s of dedicated dark fiber that we
manage between the two sites.
From: "Frank Schilder"
To: "adamb"
Cc: "Jack", "ceph-users"
Sent: Friday, February 5, 2021 10:19:06 AM
Subject: Re: [ceph-users] Re: NVMe and 2x Replica
I don't run a seconda
Hi Kenneth,
I managed to succeed with this just now. It's a lab environment and
the OSDs are not encrypted but I was able to get the OSDs up again.
The ceph-volume commands also worked (just activation didn't) so I had
the required information about those OSDs.
What I did was
- collect the
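The collection step is essentially ceph-volume's own inventory; a sketch (the device name is an example):

  ceph-volume lvm list              # OSD id, OSD fsid and block device for every LVM OSD
  ceph-volume lvm list /dev/sdb     # the same, limited to one device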
I think the answer is very simple: data loss. You are setting yourself up for
data loss. Having only +1 redundancy is a design flaw and you will be fully
responsible for losing data on such a set-up. If this is not a problem, then
that's an option. If this will get you fired, it's not.
> There
This turned into a great thread. Lots of good information and clarification.
I am 100% on board with 3 copies for the primary.
What does everyone think about possibly only doing 2 copies on the secondary?
Keeping in mind that I would keep min=2, which I think will be reasonable for a
secondary
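One caveat worth spelling out: with size=2 and min_size=2, a PG stops serving I/O as soon as one of its two OSDs is down, so the durability you keep is traded for availability. The pool settings on the secondary would be roughly (pool name is a placeholder):

  ceph osd pool set <pool> size 2
  ceph osd pool set <pool> min_size 2   # I/O on a PG pauses while one replica is missing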
In the end, this is nothing but probability.
Picture this, using size=3, min_size=2:
- One node is down for maintenance
- You lose a couple of devices
- You lose data
Is it likely that an NVMe device dies during a short maintenance window?
Is it likely that two devices die at the same
On 04/02/2021 18:57, Adam Boyhan wrote:
All great input and points, guys.
Helps me lean towards 3 copies a bit more.
I mean honestly NVMe cost per TB isn't that much more than SATA SSD now.
Somewhat surprised the salesmen aren't pitching 3x replication as it makes them
more money.
To add to
Is there any suggestion on disk spec? I can't find any doc about it in Ceph
either!
On Fri, Feb 5, 2021 at 11:37 AM Eugen Block wrote:
> Hi,
>
> > My disk latency is 25ms because of the high block size that rocksdb is
> > using.
> > Should I provide a higher-performance disk than the one I'm using for my mon
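There's no hard spec in the docs; a first look at what the mon disk actually has to handle can come from the mon's own rocksdb counters (the mon id is an example, taken from the local hostname):

  ceph daemon mon.$(hostname -s) perf dump rocksdb   # submit_latency / submit_sync_latency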
On Fri, Feb 5, 2021 at 07:38, Pascal Ehlert wrote:
> Sorry to jump in here, but would you care to explain why the total disk
> usage should stay under 60%?
> This is not something I have heard before and a quick Google search
> didn't return anything useful.
>
If you have 3 hosts with 3 drives ea
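To spell out the arithmetic behind that rule of thumb with illustrative numbers: with 3 hosts of 3 drives each and the usual one-replica-per-host rule, a failed drive can only backfill onto the two drives left in the same host, so their utilization becomes roughly u * 3 / 2:

  # at 60% average utilization, losing one of three drives in a host gives:
  echo "scale=2; 0.60 * 3 / 2" | bc    # -> .90, right at the default backfillfull ratio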
On Thu, Feb 4, 2021 at 10:30 PM huxia...@horebdata.cn
wrote:
>
> >IMO with a cluster this size, you should not ever mark out any OSDs --
> >rather, you should leave the PGs degraded, replace the disk (keep the
> >same OSD ID), then recover those objects to the new disk.
> >Or, keep it <40% used (w
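The replace-in-place flow being described is roughly the destroy / re-create path, which keeps the OSD id (id and device are examples):

  ceph osd set noout                                   # don't auto-out the OSD while the disk is swapped
  ceph osd destroy 12 --yes-i-really-mean-it           # wipes the cephx key, keeps id 12 reusable
  ceph-volume lvm create --osd-id 12 --data /dev/sdX   # new disk comes up as osd.12 again
  ceph osd unset noout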
Hi,
My disk latency is 25ms because of the high block size that rocksdb is
using.
Should I provide a higher-performance disk than the one I'm using for my monitor
nodes?
what are you currently using on the MON nodes? There are
recommendations out there [1] to set up MONs with SSDs:
An SSD or other su
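In the meantime, a quick look at how big the mon store is and how busy its disk gets already says a lot (paths are the defaults; iostat needs sysstat installed):

  du -sh /var/lib/ceph/mon/*/store.db   # size of the mon's rocksdb store
  iostat -x 5                           # watch await/%util on the device holding it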