There is no difference in allocation between replication and EC. If the failure
domain is host, one OSD per host is used for a PG. So if you use a 2+1 EC
profile with a host failure domain, you need 3 hosts for a healthy cluster.
The pool will go read-only when you have a failure (host or disk), or ar
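As a concrete sketch of the 2+1 case above (profile and pool names are made up; this requires a running cluster):

```shell
# Create a 2+1 erasure-code profile with a host failure domain,
# then an EC pool using it. Needs at least 3 hosts to stay healthy.
ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
ceph osd pool create ecpool 64 64 erasure ec-2-1
```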
Bastiaan,
Regarding EC pools: our concern at 3 nodes is that 2-way replication
seems risky: if the two copies don't match, which one is corrupted?
However, 3-way replication on a 3-node cluster triples the price per
TB. Doing EC pools that are the equivalent of RAID-5 2+1 seems like
th
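To put rough numbers on the cost trade-off (a sketch; assumes equal-sized disks and ignores Ceph's own overhead):

```shell
# Usable fraction of raw capacity: replication size=N keeps N full
# copies; an EC k+m profile stores (k+m)/k times the data.
awk 'BEGIN {
  printf "3x replication: %.0f%% of raw space usable\n", 100 / 3
  printf "2x replication: %.0f%% of raw space usable\n", 100 / 2
  printf "EC 2+1:         %.0f%% of raw space usable\n", 100 * 2 / (2 + 1)
}'
```

So 2+1 EC roughly doubles the usable space of 3x replication at the price of RAID-5-like failure characteristics.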
This debugging started because the ceph-provisioner from k8s was making
those users...but what we found was doing something similar by hand caused
the same issue. Just surprised no one else using k8s and ceph backed
PVC/PVs ran into this issue.
Thanks again for all your help!
Cheers
Aaron
On Th
No worries, can definitely do that.
Cheers
Aaron
On Thu, Jan 16, 2020 at 8:08 PM Jeff Layton wrote:
> On Thu, 2020-01-16 at 18:42 -0500, Jeff Layton wrote:
> > On Wed, 2020-01-15 at 08:05 -0500, Aaron wrote:
> > > Seeing a weird mount issue. Some info:
> > >
> > > No LSB modules are available.
Paul;
So is the 3/30/300GB a limit of RocksDB, or of Bluestore?
The percentages you list, are they used DB / used data? If so... Where do you
get the used DB data from?
Thank you,
Dominic L. Hilsbos, MBA
Director – Information Technology
Perform Air International Inc.
dhils...@performair.co
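The thresholds come from RocksDB's level sizing as BlueStore configures it, not from a hard BlueStore limit: a level only benefits from the fast device if the whole level fits there. A back-of-the-envelope sketch, assuming the commonly cited defaults of a ~250 MB base level and a 10x per-level multiplier:

```shell
# Each RocksDB level is ~10x the previous one; a DB partition is only
# fully exploited when it can hold all levels up to some depth, which
# is where the oft-quoted ~3 / ~30 / ~300 GB figures come from.
awk 'BEGIN {
  base = 0.25; mult = 10; total = 0
  for (i = 1; i <= 4; i++) {
    lvl = base * mult ^ (i - 1); total += lvl
    printf "L%d: %7.2f GB  (levels so far: %7.2f GB)\n", i, lvl, total
  }
}'
```

The cumulative sums land near 0.3, 3, 30 and 300 GB; anything between two steps is mostly wasted on the fast device.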
Discussing DB size requirements without knowing the exact cluster
requirements doesn't work.
Here are some real-world examples:
cluster1: CephFS, mostly large files, replicated x3
0.2% used for metadata
cluster2: radosgw, mix between replicated and erasure, mixed file sizes
(lots of tiny files,
Hello,
We are trying to route backups & snapshots of cinder volumes and nova instances
into the S3 buckets hosted on Ceph. Currently Ceph is the block storage target
as well.
What we want to achieve:
1. All snapshots of cinder volumes / nova instances to be routed to S3 buckets
of that ten
Dave made a good point: WAL + DB might end up a little over 60G, so I would
probably go with ~70G partitions/LVs per OSD in your case (if the NVMe
drive is smart enough to spread the writes over all available capacity,
most recent NVMe drives are). I have not yet seen a WAL larger than or even
close to
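Carving those ~70G chunks out of the NVMe could look roughly like this (device, VG, and LV names are made up; repeat the LV per OSD):

```shell
# One ~70G logical volume per OSD for WAL+DB on the shared NVMe.
pvcreate /dev/nvme0n1
vgcreate vg-nvme /dev/nvme0n1
lvcreate -L 70G -n osd-db-0 vg-nvme
# Data on the spinner, DB (the WAL colocates with it) on the NVMe LV.
ceph-volume lvm create --data /dev/sda --block.db vg-nvme/osd-db-0
```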
Hello Stefan, but if I want to use rbd mirroring I must have site-a.conf
and site-b.conf on one of my nodes, probably one of the mon nodes. Is it
only a configuration on the Ceph client side?
Thanks
Ignazio
On Thu, 16 Jan 2020, 22:13 Stefan Kooman wrote:
> Quoting Ignazio Cassano (ignazioca
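As I understand it, the two conf files (plus keyrings) are only needed on whichever node runs the rbd commands and the rbd-mirror daemon, so each peer can be addressed by name. A rough sketch, with made-up pool and site names:

```shell
# With /etc/ceph/site-a.conf and /etc/ceph/site-b.conf in place,
# --cluster selects which cluster an rbd command talks to.
rbd --cluster site-a mirror pool enable mypool pool
rbd --cluster site-b mirror pool enable mypool pool
rbd --cluster site-a mirror pool peer add mypool client.site-b@site-b
rbd --cluster site-b mirror pool peer add mypool client.site-a@site-a
```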
Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> Hello, I just deployed nautilus with ceph-deploy.
> I did not find any option to give a cluster name to my ceph so its name is
> "ceph".
> Please, how can I change my cluster name without reinstalling?
>
> Please, how can I set the cluster nam
Hello, I just deployed nautilus with ceph-deploy.
I did not find any option to give a cluster name to my ceph so its name is
"ceph".
Please, how can I change my cluster name without reinstalling?
Please, how can I set the cluster name in the installation phase?
Many thanks for help
Ignazio
Hi Igor,
answers inline.
Am 16.01.20 um 21:34 schrieb Igor Fedotov:
> you may want to run fsck against failing OSDs. Hopefully it will shed
> some light.
fsck just says everything fine:
# ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
fsck success
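If the quick fsck is clean, a deep variant can be tried as well (a sketch per the ceph-bluestore-tool synopsis; the OSD must be stopped first):

```shell
# --deep also reads and validates object data, not just metadata,
# so it takes much longer than the default fsck.
ceph-bluestore-tool fsck --deep --path /var/lib/ceph/osd/ceph-27/
```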
> Also wondering if OSD
Stefan,
you may want to run fsck against the failing OSDs. Hopefully it will shed
some light.
Also wondering whether the OSD is able to recover (start up and proceed
working) after hitting the issue?
If so, do you have any that failed multiple times? Do you have logs
for these occurrences?
Also pl
Dave;
I don't like reading inline responses, so...
I have zero experience with EC pools, so I won't pretend to give advice in that
area.
I would think that a small NVMe for the DB would be better than nothing, but I
don't know.
Once I got the hang of building clusters, it was relatively easy to wip
Dominic,
We ended up with a 1.6TB PCIe NVMe in each node. For 8 drives this
worked out to a DB size of something like 163GB per OSD. Allowing for
expansion to 12 drives brings it down to 124GB. So maybe just put the
WALs on NVMe and leave the DBs on the platters?
Understood that we will wan
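A WAL-only layout like that could be sketched as follows (VG/LV and device names are made up; a WAL partition only needs a few GB):

```shell
# Small NVMe LV per OSD for just the WAL; the DB stays on the data disk.
lvcreate -L 4G -n osd-wal-0 vg-nvme
ceph-volume lvm create --data /dev/sda --block.wal vg-nvme/osd-wal-0
```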
Dave;
I'd like to expand on this answer, briefly...
The information in the docs is wrong. There have been many discussions about
changing it, but no good alternative has been suggested, thus it hasn't been
changed.
The 3rd party project that Ceph's BlueStore uses for its database (RocksDB),
Paul, Bastiaan,
Thank you for your responses and for alleviating my concerns about
Nautilus. The good news is that I can still easily move up to Debian
10. BTW, I assume that this is still with the 4.19 kernel?
Also, I'd like to inject additional customizations into my Debian
configs via c
Don't use Mimic; support for it is far worse than for Nautilus or Luminous. I
think we were the only company who built a product around Mimic; both
Red Hat and SUSE enterprise storage were based on Luminous and then Nautilus,
skipping Mimic entirely.
We only offered Mimic as a default for a limited time and immed
Hi Igor,
ouch sorry. Here we go:
-1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
submit_transaction error: Corruption: block checksum mismatch code = 2
Rocksdb transaction:
Put( Prefix = M key =
0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e
I would definitely go for Nautilus; quite a few optimizations went in
after Mimic.
Bluestore DB size usually ends up at either 30 or 60 GB.
30 GB is one of the sweet spots during normal operation. But during
compaction, ceph writes the new data before removing the old, hence the
60GB
Hello all.
Sorry for the beginner questions...
I am in the process of setting up a small (3 nodes, 288TB) Ceph cluster
to store some research data. It is expected that this cluster will grow
significantly in the next year, possibly to multiple petabytes and 10s
of nodes. At this time I'm ex
Hi Stefan,
would you please share a log snippet from just before the assertions? It
looks like RocksDB is failing during transaction submission...
Thanks,
Igor
On 1/16/2020 11:56 AM, Stefan Priebe - Profihost AG wrote:
Hello,
does anybody know a fix for this ASSERT / crash?
2020-01-16 02:02:31.316394 7f
We upgraded to 14.2.4 back in October and this week to v14.2.6.
But I don't think the cluster had a network outage until yesterday, so I
wouldn't have thought this is a .6 regression.
If it happens again I'll look for the waiting for map message.
-- dan
On Thu, Jan 16, 2020 at 12:08 PM Nick Fis
On Thursday, January 16, 2020 09:15 GMT, Dan van der Ster
wrote:
> Hi Nick,
>
> We saw the exact same problem yesterday after a network outage -- a few of
> our down OSDs were stuck down until we restarted their processes.
>
> -- Dan
>
>
> On Wed, Jan 15, 2020 at 3:37 PM Nick Fisk wrote:
Hi Nick,
We saw the exact same problem yesterday after a network outage -- a few of
our down OSDs were stuck down until we restarted their processes.
-- Dan
On Wed, Jan 15, 2020 at 3:37 PM Nick Fisk wrote:
> Hi All,
>
> Running 14.2.5, currently experiencing some network blips isolated to a
>
Hello,
does anybody know a fix for this ASSERT / crash?
2020-01-16 02:02:31.316394 7f8c3f5ab700 -1
/build/ceph/src/os/bluestore/BlueStore.cc: In function 'void
BlueStore::_kv_sync_thread()' thread 7f8c3f5ab700 time 2020-01-16
02:02:31.304993
/build/ceph/src/os/bluestore/BlueStore.cc: 8808: FAILED
And I confirm that a repair is not useful. As far as I can see, it simply
"cleans" the error (without modifying the big object), but the error of
course reappears when the deep scrub runs again on that PG
Cheers, Massimo
On Thu, Jan 16, 2020 at 9:35 AM Massimo Sgaravatto <
massimo.sgarava...@gmail.
In my cluster I saw that the problematic objects have been uploaded by a
specific application (onedata), which I think used to upload the files
doing something like:
rados --pool &lt;pool&gt; put &lt;obj-name&gt; &lt;file-name&gt;
Now (since Luminous ?) the default object size is 128MB but if I am not
wrong it was 100GB before.
This would e
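That recollection matches osd_max_object_size, whose default dropped from 100 GB to 128 MB in Luminous. The effective value on a running cluster can be checked with something like (a sketch; run on the host of the given OSD):

```shell
# Ask a daemon for its effective object-size limit, in bytes.
ceph daemon osd.0 config get osd_max_object_size
```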