Hello Oliver,
as 512e requires the drive to read a 4k block, modify the 512 bytes, and then
write the 4k block back to the disk, it should have a significant
performance impact. However, costs are the same, so always choose 4Kn drives.
By the way, this might not affect you, as long as you write 4k
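The read-modify-write cycle described above can be modeled in a few lines (an illustrative sketch, not Ceph or firmware code; the sector sizes are the standard 512e values):

```python
# Toy model of write handling on a 512e drive: 4k physical sectors exposed
# as 512-byte logical sectors. Illustrative only.
PHYS = 4096  # physical sector size
LOG = 512    # logical sector size exposed by a 512e drive

def media_ops(write_size: int, phys_aligned: bool) -> dict:
    """Count physical-media operations needed for a single write."""
    if phys_aligned and write_size % PHYS == 0:
        # Whole physical sectors: write straight through, no read needed.
        return {"reads": 0, "writes": write_size // PHYS}
    # Partial sector: read the 4k block, splice in the new bytes, write it back.
    touched = -(-write_size // PHYS)  # ceil division
    return {"reads": touched, "writes": touched}
```

So an aligned 4k write costs one media write, while a single 512-byte update costs a read plus a write of the full 4k sector, which is the penalty you avoid as long as writes are 4k-aligned.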
Dear all,
in real-world use, is there a significant performance
benefit in using 4kn instead of 512e HDDs (using
Ceph bluestore with block-db on NVMe-SSD)?
Cheers and thanks for any advice,
Oliver
___
ceph-users mailing list
Hi Igor,
On 11/01/2019 20:16, Igor Fedotov wrote:
In short - we're planning to support main device expansion for Nautilus+
and to introduce better error handling for the case in Mimic and
Luminous. Nautilus PR has been merged, M & L PRs are pending review at
the moment:
Got it. No problem
Hi Hector,
just realized that you're trying to expand main (and exclusive) device
which isn't supported in mimic.
Here is the bluestore_tool complaint (which is pretty confusing, and does not
prevent the partial expansion) seen while expanding:
expanding dev 1 from 0x1df2eb0 to 0x3a38120
Sorry for the late reply,
Here's what I did this time around. osd.0 and osd.1 should be identical,
except osd.0 was recreated (that's the first one that failed) and I'm
trying to expand osd.1 from its original size.
# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0 | grep size
Hector,
One more thing to mention - after expansion, please run fsck using
ceph-bluestore-tool prior to running the osd daemon, and collect another log
using the CEPH_ARGS variable.
Thanks,
Igor
On 12/27/2018 2:41 PM, Igor Fedotov wrote:
Hi Hector,
I've never tried bluefs-bdev-expand over encrypted volumes but it works
absolutely fine for me in other cases.
So it would be nice to troubleshoot this a bit.
I suggest doing the following:
1) Back up the first 8K of all OSD.1 devices (block, db and wal) using dd.
This will probably
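The backup in step (1) could look like this (a sketch; the pure-Python copy is just the equivalent of `dd if=<device> of=<backup> bs=4096 count=2`, and any real device paths are up to the reader):

```python
# Back up the first 8 KiB of a device before experimenting, so the on-disk
# label/superblock region can be restored if expansion goes wrong.
# Shell equivalent: dd if=<device> of=<backup> bs=4096 count=2

def backup_first_8k(device_path: str, backup_path: str) -> int:
    """Copy the first 8 KiB of device_path to backup_path; return bytes copied."""
    with open(device_path, "rb") as src, open(backup_path, "wb") as dst:
        data = src.read(8192)
        dst.write(data)
        return len(data)
```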
Hi list,
I'm slightly expanding the underlying LV for two OSDs and figured I
could use ceph-bluestore-tool to avoid having to re-create them from
scratch.
I first shut down the OSD, expanded the LV, and then ran:
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0
I
On 22.11.2018 17:06, Eddy Castillon wrote:
Hello dear ceph users:
We are running a ceph cluster with Luminous (BlueStore). As you may know,
this new ceph version has a new feature called "Checksums". I would like
to ask whether this feature replaces deep-scrub. In our cluster, we run
deep-scrub every month, however the impact in the
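Conceptually the two mechanisms catch different things, which a small sketch can illustrate (simplified, not BlueStore's actual code; BlueStore uses crc32c per block, `zlib.crc32` merely stands in here):

```python
import zlib

# Sketch: BlueStore-style checksumming verifies data *on every read*, so bit
# rot is caught when an object is accessed. Deep-scrub complements this by
# proactively reading all objects on all replicas, including cold data that
# nobody reads. zlib.crc32 stands in for BlueStore's crc32c.

def store(data: bytes) -> tuple[bytes, int]:
    """Return the data together with its stored checksum."""
    return data, zlib.crc32(data)

def read(data: bytes, stored_csum: int) -> bytes:
    """Verify the checksum on read, as BlueStore does per block."""
    if zlib.crc32(data) != stored_csum:
        raise IOError("checksum mismatch - corruption detected on read")
    return data
```

So checksums do not replace deep-scrub: without scrubbing, corruption in rarely-read objects would only surface when they are finally read.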
"description": "bluefs wal"
},
"/var/lib/ceph/osd/ceph-2/block.db": {
"osd_uuid": "6d999288-a4a4-4088-b764-bf2379b4492b",
"size": 524288000,
"btime": "2018-10-18 15:59:06.175997",
You might want to try the --path option instead of the --dev one.
On 10/31/2018 7:29 AM, ST Wong (ITSC) wrote:
Hi all,
We deployed a testing mimic CEPH cluster using bluestore. We can't run
ceph-bluestore-tool on an OSD due to the following error:
---
# ceph-bluestore-tool show-label --dev *device*
2018-10-31 09:42:01.712 7f3ac5bb4a00 -1 auth: unable to find a keyring on
First, I'd suggest inspecting bluestore performance counters before and
after adjusting cache parameters (and after running the same test suite).
Namely:
"bluestore_buffer_bytes"
"bluestore_buffer_hit_bytes"
"bluestore_buffer_miss_bytes"
Is hit ratio (bluestore_buffer_hit_bytes) much
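The ratio Igor asks about can be computed from the counters he names (a sketch; the nesting of the counters under a "bluestore" section of the `perf dump` output is an assumption for illustration):

```python
# Compute the buffer-cache hit ratio from the counters listed above, as
# taken from `ceph daemon osd.N perf dump` output. The exact dict layout
# (a "bluestore" section holding the counters) is assumed.

def buffer_hit_ratio(perf: dict) -> float:
    """Fraction of buffered reads served from cache, 0.0 if no traffic."""
    bs = perf["bluestore"]
    hit = bs["bluestore_buffer_hit_bytes"]
    miss = bs["bluestore_buffer_miss_bytes"]
    total = hit + miss
    return hit / total if total else 0.0
```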
Hi Team,
We need a mechanism to have some data cache on OSDs built on bluestore. Is
there an option available to enable the data cache?
With default configurations, the OSD logs state that the data cache is
disabled by default,
bluestore(/var/lib/ceph/osd/ceph-66) _set_cache_sizes cache_size 1073741824
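For reference, the Luminous-era options controlling this behaviour could be set along these lines (a sketch of a ceph.conf fragment; the values are examples only, and option names should be verified against your release):

```ini
[osd]
# Total BlueStore cache per OSD (the 1073741824 seen in the log line above).
bluestore_cache_size_hdd = 1073741824
# Cache object *data* (not just metadata) on reads and writes.
bluestore_default_buffered_read = true
bluestore_default_buffered_write = true
```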
On 24.02.2018 at 07:00, David Turner wrote:
Your 6.7GB of DB partition for each 4TB osd is on the very small side of
things. It's been discussed a few times in the ML and the general use case
seems to be about 10GB DB per 1TB of osd. That would be about 40GB DB
partition for each of your osds. This general rule covers most things
except for
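The rule of thumb quoted above is simple arithmetic (a sketch; the ~10GB-per-1TB figure is the mailing list's rule of thumb, not an official sizing formula):

```python
# Rough BlueStore DB sizing per the mailing-list rule of thumb:
# about 10 GB of DB per 1 TB of OSD capacity.

def recommended_db_gb(osd_capacity_tb: float, gb_per_tb: float = 10.0) -> float:
    """DB partition size in GB suggested for an OSD of the given capacity."""
    return osd_capacity_tb * gb_per_tb
```

For the 4TB osds in question this gives 40GB, versus the 6.7GB actually provisioned.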
Hi Vadim,
many thanks for these benchmark results!
This indeed looks extremely similar to what we achieve after enabling connected
mode.
Our 6 OSD-hosts are Supermicro systems with 2 HDDs (Raid 1) for the OS, and 32
HDDs (4 TB) + 2 SSDs for the OSDs.
The 2 SSDs have 16 LVM volumes each
Hi Oliver,
I also use Infiniband and CephFS for HPC purposes.
My setup:
* 4x Dell R730xd and expansion shelf, 24 OSDs of 8TB each, 128GB RAM,
2x10Core Intel 4th Gen, Mellanox ConnectX-3, no SSD-Cache
* 7x Dell R630 Clients
* Ceph-Cluster running on Ubuntu Xenial and Ceph Jewel deployed with
Answering the first RDMA question myself...
On 18.02.2018 at 16:45, Oliver Freyermuth wrote:
> This leaves me with two questions:
> - Is it safe to use RDMA with 12.2.2 already? Reading through this mail
> archive,
> I gathered it may lead to memory exhaustion and in any case needs some
"I checked and the OSD-hosts peaked at a load average of about 22 (they
have 24+24HT cores) in our dd benchmark,
but stayed well below that (only about 20 % per OSD daemon) in the rados
bench test."
Maybe because your dd test uses bs=1M and rados bench is using 4M as
default block size?
Caspar
Hi Stijn,
> the IPoIB network is not 56gb, it's probably a lot less (20gb or so).
> the ib_write_bw test is verbs/rdma based. do you have iperf tests
> between hosts, and if so, can you share those results?
Wow - indeed, yes, I was completely mistaken about ib_write_bw.
Good that I asked!
hi oliver,
the IPoIB network is not 56gb, it's probably a lot less (20gb or so).
the ib_write_bw test is verbs/rdma based. do you have iperf tests
between hosts, and if so, can you share those results?
stijn
Dear Cephalopodians,
we are just getting started with our first Ceph cluster (Luminous 12.2.2) and
doing some basic benchmarking.
We have two pools:
- cephfs_metadata, living on 4 SSD devices (each is a bluestore OSD, 240 GB) on
2 hosts (i.e. 2 SSDs each), setup as:
- replicated, min size
"Gregory Farnum" <gfar...@redhat.com>
CC: "Wido den Hollander" <w...@42on.com>, "ceph-users"
<ceph-users@lists.ceph.com>, "Marcus Haarmann" <marcus.haarm...@midoco.de>
Sent: Tuesday, 8 August 2017 17:50:44
Subject: Re: [ceph-use
Marcus,
You may want to look at the bluestore_min_alloc_size setting as well
as the respective bluestore_min_alloc_size_ssd and
bluestore_min_alloc_size_hdd. By default bluestore sets a 64k block
size for ssds. I'm also using ceph for small objects and I've seen my
OSD usage go down from 80% to
Don't forget that at those sizes the internal journals and rocksdb size
tunings are likely to be a significant fixed cost.
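The space overhead that min_alloc_size causes for small objects can be sketched as follows (illustrative arithmetic, not BlueStore code):

```python
# Each object consumes at least one allocation unit on disk, so with many
# tiny objects the min_alloc_size, not the object size, dominates usage.

def allocated_bytes(object_size: int, min_alloc: int) -> int:
    """On-disk space for one object, rounded up to the allocation unit."""
    units = max(1, -(-object_size // min_alloc))  # ceil division, at least 1
    return units * min_alloc
```

For example, a 2 KB object stored with a 64 KB allocation unit occupies 64 KB, a 32x overhead, which is why lowering the min_alloc_size settings helped the small-object workload discussed above.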
On Thu, Aug 3, 2017 at 3:13 AM Wido den Hollander wrote:
>
> > On 2 August 2017 at 17:55, Marcus Haarmann <marcus.haarm...@midoco.de> wrote:
> >
Hi,
we are doing some tests here with a Kraken setup using bluestore backend (on
Ubuntu 64 bit).
We are trying to store > 10 mio very small objects using RADOS.
(no fs, no rdb, only osd and monitors)
The setup was done with ceph-deploy, using the standard bluestore option, no
separate
- Original Message -
> From: "Benoit GEORGELIN" <benoit.george...@yulpa.io>
> To: "ceph-users" <ceph-users@lists.ceph.com>
> Sent: Saturday, 13 May 2017 19:57:41
> Subject: [ceph-users] ceph bluestore RAM over used - luminous
Hi dear members of the list,
I'm discovering CEPH and doing some testing.
I came across some strange behavior regarding the RAM used by the OSD processes.
Configuration:
ceph version 12.0.2
3x OSD nodes, 2 OSDs per node, 6 OSDs and 6 disks in total
4 vCPU
6 GB of RAM
64 PGs
Ubuntu 16.04
From the
Hello,
On Wed, 15 Mar 2017 09:07:10 +0100 Michał Chybowski wrote:
On 15.03.2017 at 09:05, Eneko Lacunza wrote:
Hello,
your subject line has little relevance to your rather broad questions.
On Tue, 14 Mar 2017 23:45:26 +0100 Michał Chybowski wrote:
Hi,
I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per
node) to test if ceph in such small scale is going to perform good
enough to
Hi Michal,
On 14/03/17 at 23:45, Michał Chybowski wrote:
I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs
per node) to test if ceph in such small scale is going to perform good
enough to put it into production environment (or does it perform well
only if there are
Hi,
I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per
node) to test if ceph at such a small scale is going to perform well
enough to put it into production environment (or does it perform well
only if there are tens of OSDs, etc.).
Are there any "do's" and "don'ts" in