Can you get the value of the osd_beacon_report_interval option? The default
is 300; you can set it to 60. You can also turn on debug_ms=1 and debug_mon=10
to get more information.
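For reference, a sketch of how to inspect and change these at runtime (osd.0 is just an example target; adjust to your daemons):

$ ceph daemon osd.0 config get osd_beacon_report_interval
$ ceph tell osd.* injectargs '--osd_beacon_report_interval 60'
$ ceph tell mon.* injectargs '--debug_ms 1 --debug_mon 10'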
Zhenshi Zhou wrote on Wed, Mar 13, 2019 at 1:20 PM:
>
> Hi,
>
> The servers are connected to the same switch.
> I can ping from any one of the servers to other servers
I have a cluster with SSD and HDD storage. I wonder how to configure S3
buckets on HDD storage backends only.
Do I need to create pools on this particular storage and define radosgw
placement with those, or is there a better or easier way to achieve this?
Just assign your "crush hdd rule" to you
Hi,
The servers are connected to the same switch.
I can ping from any one of the servers to the other servers
without any packet loss, and the average round-trip time
is under 0.1 ms.
Thanks
Ashley Merrick wrote on Wed, Mar 13, 2019 at 12:06 PM:
> Can you ping all your OSD servers from all your mons, and ping your mons
> from all your OSD servers?
Can you ping all your OSD servers from all your mons, and ping your mons
from all your OSD servers?
I’ve seen this where a route wasn’t working in one direction, so it made OSDs
flap when it used that mon to check availability:
On Wed, 13 Mar 2019 at 11:50 AM, Zhenshi Zhou wrote:
> After checking
After checking the network and syslog/dmesg, I think it's not a network or
hardware issue. Now there are some
osds being marked down every 15 minutes.
here is ceph.log:
2019-03-13 11:06:26.290701 mon.ceph-mon1 mon.0 10.39.0.34:6789/0 6756 :
cluster [INF] Cluster is now healthy
2019-03-13 11:21:21.
Hi there,
We are replicating an RBD image from the primary to the DR site using RBD mirroring.
On the primary, we were using 10.2.10.
The DR site is on luminous, and we promoted the DR copy to test failover.
Everything checked out fine.
Now we are trying to restart the replication, and we did the demote
One pool per storage class is enough, you can share the metadata pools
across different placement policies.
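As a rough sketch of what that looks like (rule, pool, and placement names here are made up; pg counts are examples):

$ ceph osd crush rule create-replicated hdd-only default host hdd
$ ceph osd pool create default.rgw.hdd.data 64 64 replicated hdd-only
$ radosgw-admin zonegroup placement add --rgw-zonegroup=default --placement-id=hdd-placement
$ radosgw-admin zone placement add --rgw-zone=default --placement-id=hdd-placement --data-pool=default.rgw.hdd.data
$ radosgw-admin period update --commit

New buckets can then select this placement target (e.g. via the S3 LocationConstraint "default:hdd-placement") while the metadata pools stay shared.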
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Tue, Mar 12, 2019 at 8:56 PM David C wrote:
>
> Out of curiosity, are you guys re-exporting the fs to clients over something
> like nfs or running applications directly on the OSD nodes?
Kernel NFS + kernel CephFS can fall apart and deadlock itself in
exciting ways...
nfs-ganesha is so much better for this.
Both, in my case (single host; both local services and the NFS export use
the same CephFS mount). I use the in-kernel NFS server (not nfs-ganesha).
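For what it's worth, exporting a CephFS mount via the in-kernel NFS server needs an explicit fsid= in the export options, since there is no backing block device to derive one from. A hypothetical /etc/exports line (path and network are placeholders):

/mnt/cephfs 192.168.0.0/24(rw,no_subtree_check,fsid=100)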
On 13/03/2019 04.55, David C wrote:
> Out of curiosity, are you guys re-exporting the fs to clients over
> something like nfs or running applications directly on the OSD nodes?
Out of curiosity, are you guys re-exporting the fs to clients over
something like nfs or running applications directly on the OSD nodes?
On Tue, 12 Mar 2019 at 18:28, Paul Emmerich wrote:
> Mounting kernel CephFS on an OSD node works fine with recent kernels
> (4.14+) and enough RAM in the servers
Dear Ceph users,
I have a cluster with SSD and HDD storage. I wonder how to configure S3
buckets on HDD storage backends only.
Do I need to create pools on this particular storage and define radosgw
placement with those, or is there a better or easier way to achieve this?
Regards,
Mounting kernel CephFS on an OSD node works fine with recent kernels
(4.14+) and enough RAM in the servers.
We did encounter problems with older kernels, though.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
Answering my own question (getting help from Pavan), I see that all
the details are in this PR: https://github.com/ceph/ceph/pull/11051
So, the zone was updated to set metadata_heap: "" with
$ radosgw-admin zone get --rgw-zone=default > zone.json
[edit zone.json]
$ radosgw-admin zone set --rgw-zone=default --infile=zone.json
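You can then verify that the setting took effect (jq used only for readability):

$ radosgw-admin zone get --rgw-zone=default | jq .metadata_heap
""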
I bet you'd see better memstore results with my vector-based object
implementation instead of bufferlists.
Where can I find it?
Nick Fisk noticed the same
thing you did. One interesting observation he made was that disabling
CPU C/P states helped bluestore immensely in the iodepth=1 case.
Hi all,
We have an S3 cluster with >10 million objects in default.rgw.meta.
# radosgw-admin zone get | jq .metadata_heap
"default.rgw.meta"
From these old tickets I realized that this setting is obsolete and
that those objects are probably useless:
http://tracker.ceph.com/issues/17256
http://tra
On 3/12/19 8:40 AM, vita...@yourcmc.ru wrote:
One way or another we can only have a single thread sending writes to
rocksdb. A lot of the prior optimization work on the write side was
to get as much processing out of the kv_sync_thread as possible.
That's still a worthwhile goal, as it's typically what bottlenecks with
high amounts of concurrency.
Yeah, thank you xD
You just answered another thread where I asked about the kv_sync_thread.
Consider this done; I know what to do now.
Thank you
On 12.03.19 at 14:43, Mark Nelson wrote:
> Our default of 4 256MB WAL buffers is arguably already too big. On one
> hand we are making these buffers large to hopefully avoid short-lived
> data going into the DB (pglog writes).
Our default of 4 256MB WAL buffers is arguably already too big. On one
hand we are making these buffers large to hopefully avoid short-lived
data going into the DB (pglog writes). I.e., if a pglog write comes in and
later a tombstone invalidating it comes in, we really want those to land
in the same WAL buffer so they cancel out before being flushed to the DB.
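Those buffers are configured via bluestore_rocksdb_options; trimmed to the relevant options, the default of that era looks roughly like this (a sketch, check your release's actual default string):

bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,write_buffer_size=268435456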
Sorry, I meant L2.
On 12.03.19 at 14:25, Benjamin Zapiec wrote:
> May I configure the size of the WAL to increase block.db usage?
> For example, if I configure 20GB, I would get a usage of about 48GB on L3.
>
> Or should I stay with ceph defaults?
> Is there a maximal size for WAL that makes sense?
>
>
One way or another we can only have a single thread sending writes to
rocksdb. A lot of the prior optimization work on the write side was
to get as much processing out of the kv_sync_thread as possible.
That's still a worthwhile goal, as it's typically what bottlenecks with
high amounts of concurrency.
On 3/12/19 7:31 AM, vita...@yourcmc.ru wrote:
Decreasing the min_alloc size isn't always a win, but it can be in some
cases. Originally bluestore_min_alloc_size_ssd was set to 4096, but we
increased it to 16384 because at the time our metadata path was slow
and increasing it resulted in a pretty significant performance win.
May I configure the size of the WAL to increase block.db usage?
For example, if I configure 20GB, I would get a usage of about 48GB on L3.
Or should I stay with the Ceph defaults?
Is there a maximal WAL size that makes sense?
On 3/12/19 7:24 AM, Benjamin Zapiec wrote:
Hello,
I was wondering why Ceph's block.db is nearly empty, and I started
to investigate.
The recommendation from Ceph is that block.db should be at least
4% of the size of block. So my OSD configuration looks like this:
wal.db - not explicitly specified
I looked further into historic slow ops (thanks to some other posts on the
list) and I am a bit confused by the following event:
{
"description": "osd_repop(client.85322.0:86478552 7.1b e502/466
7:d8d149b7:::rbd_data.ff7e3d1b58ba.0316:head v 502'10665506)",
The amount of metadata depends on the amount of data. But RocksDB only
puts metadata on the fast storage when it thinks all the metadata on
the same level of the DB is going to fit there. So all sizes except
roughly 4, 30, and 286 GB are useless.
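Those thresholds follow from Ceph's default RocksDB options (write_buffer_size=256MB with 4 buffers, max_bytes_for_level_base=256MB, max_bytes_for_level_multiplier=10), assuming you haven't changed them:

WAL: 4 x 256 MB = 1 GB
L1:  256 MB
L2:  10 x L1 = 2.56 GB   -> WAL+L1+L2 is ~4 GB
L3:  10 x L2 = 25.6 GB   -> ~30 GB total
L4:  10 x L3 = 256 GB    -> ~286 GB total

Any block.db size between two of these thresholds buys nothing: the next level won't fit, so it spills to the slow device anyway.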
Okay, so I think I don't understand the mechanism by which Ceph's RocksDB
decides whether to place data on block.db or not.
So the amount of data in block.db depends on the WAL size?
I thought it depended on the objects saved to the storage.
In that case, say we have a 1GB file: would it have a size
of 10GB in block.db?
block.db is very unlikely to ever grow to 250GB with a 6TB data device.
However, there seems to be a funny "issue" with all block.db sizes
except 4, 30, and 286 GB being useless, because RocksDB puts the data on
the fast storage only if it thinks the whole LSM level will fit there.
Ceph's RocksDB defaults put the level sizes at roughly 256 MB, 2.56 GB,
25.6 GB, and so on (x10 per level), which is where those numbers come from.
Decreasing the min_alloc size isn't always a win, but it can be in some
cases. Originally bluestore_min_alloc_size_ssd was set to 4096, but we
increased it to 16384 because at the time our metadata path was slow
and increasing it resulted in a pretty significant performance win
(along with increasing other related settings at the same time).
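If you want to experiment with it, the option must be in place before the OSD is created; existing OSDs keep the value they were built with. A ceph.conf sketch:

[osd]
# only affects OSDs created after this change
bluestore_min_alloc_size_ssd = 4096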
Hello,
I was wondering why Ceph's block.db is nearly empty, and I started
to investigate.
The recommendation from Ceph is that block.db should be at least
4% of the size of block. So my OSD configuration looks like this:
wal.db - not explicitly specified
block.db - 250GB of SSD storage
block - 6TB of HDD storage
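As a sanity check of the 4% guideline against the 6TB device (simple arithmetic):

0.04 x 6 TB = 240 GB, so the 250 GB block.db partition satisfies the recommendation.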
What exact error are you seeing after adding admin caps?
I tried the following steps on master and they worked fine (TESTER1 is
adding a user policy to TESTER):
1. radosgw-admin --uid TESTER --display-name "TestUser" --access_key TESTER
--secret test123 user create
2. radosgw-admin --uid TESTER1 -
Hi Kevin,
I'm sure firewalld is disabled on each host.
Well, the network is not a problem. The servers are connected
to the same switch, and the connection is fine even when the osds
are marked as down. There was no interruption or delay.
I restarted the leader monitor daemon and it seems to have returned to normal.
Are you sure that firewalld is stopped and disabled?
It looks exactly like the case where I missed one host in a test cluster.
Kevin
On Tue, Mar 12, 2019 at 09:31, Zhenshi Zhou wrote:
> Hi,
>
> I deployed a ceph cluster with good performance. But the logs
> indicate that the cluster is not as stable as I think it should be.
Quoting Zack Brenton (z...@imposium.com):
> Types of devices:
> We run our Ceph pods on 3 AWS i3.2xlarge nodes. We're running 3 OSDs, 3
> Mons, and 2 MDS pods (1 active, 1 standby-replay). Currently, each pod runs
> with the following resources:
> - osds: 2 CPU, 6Gi RAM, 1.7Ti NVMe disk
> - mds: 3
Hi,
I have a problem with starting two of my OSDs, which fail with this error:
osd.19 pg_epoch: 8887 pg[1.2b5(unlocked)] enter Initial
0> 2019-03-01 09:41:30.259485 7f303486be00 -1
/build/ceph-12.2.11/src/osd/PGLog.h: In function 'static void
PGLog::read_log_and_missing(ObjectStore*, coll_t, coll_t, ghobject_t, c
It's worth noting that most containerized deployments can effectively
limit RAM for containers (cgroups), and the kernel has limits on how
many dirty pages it can keep around.
In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20%
of your total RAM can be dirty FS pages. If you have a lot of RAM, that
can still be a very large amount of unflushed data.
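To check and tighten this, a sketch (10 is just an example value):

$ sysctl vm.dirty_ratio
$ sysctl -w vm.dirty_ratio=10    # persist in /etc/sysctl.conf if it helps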
Yep, I think it may be a network issue as well. I'll check the connections.
Thanks Eugen:)
Eugen Block wrote on Tue, Mar 12, 2019 at 4:35 PM:
> Hi,
>
> my first guess would be a network issue. Double-check your connections
> and make sure the network setup works as expected. Check syslogs,
> dmesg, switches etc. for hints that a network interruption may have
> occurred.
Hi,
my first guess would be a network issue. Double-check your connections
and make sure the network setup works as expected. Check syslogs,
dmesg, switches etc. for hints that a network interruption may have
occurred.
Regards,
Eugen
Quoting Zhenshi Zhou:
Hi,
I deployed a ceph cluster with good performance. But the logs
indicate that the cluster is not as stable as I think it should be.
Hi,
I deployed a ceph cluster with good performance. But the logs
indicate that the cluster is not as stable as I think it should be.
The log shows the monitors marking some osds as down periodically:
[image: image.png]
I didn't find any useful information in osd logs.
ceph version 13.2.4 mimic (stable)
Hi everyone,
I have an Intel D3-S4610 SSD with 1.92 TB here for testing and I get some pretty
bad numbers when running the fio benchmark suggested by Sébastien Han
(http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/):
Intel D3-S4610 1.92 TB
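For context, the benchmark from that blog post is essentially a single-threaded O_DSYNC write test along these lines (the device path is a placeholder; note this write test destroys data on the target):

$ fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test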