Yeah, I think the main reason is the setting of pg_num and pgp_num for some key
pools.
This site will tell you the correct values: http://ceph.com/pgcalc/
Before you adjust pg_num and pgp_num, if this is a production environment, you
should first set, as Christian Balzer said:
---
osd_max_backfills = 1
osd_ba
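To make that concrete (the pool name "rbd" and the value 512 below are only
placeholders; use whatever pgcalc suggests for your cluster), the adjustment
would look roughly like this:

  # throttle recovery/backfill impact first
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  # then raise pg_num, and raise pgp_num once the new PGs have been created
  ceph osd pool set rbd pg_num 512
  ceph osd pool set rbd pgp_num 512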
On Fri, 25 Mar 2016 09:17:08 + Zhang Qiang wrote:
> Hi Christian, Thanks for your reply, here're the test specs:
> >>>
> [global]
> ioengine=libaio
> runtime=90
> direct=1
There it is.
You do understand what that flag does and what latencies are, right?
You're basically telling the I/O stack
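To illustrate (this is purely an example job, not your workload), compare your
result against something like the following, which takes the per-write round
trip out of the picture:

  [buffered_w_4k_compare]
  ioengine=libaio
  bs=4k
  rw=write
  size=1G
  filename=buffered_w_4k_compare
  # buffered instead of O_DIRECT, so the page cache absorbs the 4k writes
  direct=0
  # or keep direct=1 and raise iodepth further to hide per-operation latency
  iodepth=64

With direct=1, every 4k write has to travel the full ceph-fuse/network/replication
path before the next one in that queue slot can be issued.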
On Fri, 25 Mar 2016 14:14:37 -0700 Bob R wrote:
> Mike,
>
> Recovery would be based on placement groups and those degraded groups
> would only exist on the storage pool(s) rather than the cache tier in
> this scenario.
>
Precisely.
They are entirely different entities.
There may be partially id
Hello,
this was of course discussed here in the very recent thread
"data corruption with hammer"
Read it, it contains fixes and a workaround as well.
Also from that thread: http://tracker.ceph.com/issues/12814
You don't need to remove the cache tier to fix things.
And as also discussed here,
Help, my Ceph cluster is losing data slowly over time. I keep finding files
that are the same length as they should be, but all the content has been
lost & replaced by nulls.
Here is an example:
(from a backup I have the original file)
[root@blotter docker]# ls -lart
/backup/space/docker/ceph-m
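For anyone wanting to check their own files: a quick way to spot a zero-filled
file with plain POSIX tools, nothing Ceph-specific (paths below are just
examples), is

  # prints 0 if the file contains nothing but NUL bytes
  tr -d '\0' < /path/to/suspect/file | wc -c
  # and compare against the copy from backup
  cmp /path/to/suspect/file /path/to/backup/copy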
So one more update.
I suspect I may need to do more than force the secondary osd to become
the primary, because of the reported state of the pg. The pg reports a
state that it believes is correct, but that state is actually inaccurate.
In the dump for one of the pg's below the version tim
So I think I know what might have gone wrong.
When I took my osds out of the cluster and shut them down, the first
set of osds likely came back up and into the cluster before the 300 seconds
expired. This would have prevented the cluster from triggering recovery of
the pg from the replica osd.
So the quest
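For anyone following along, the commands I've been using to look at these pgs
are nothing exotic (the pg id and osd id below are just examples):

  ceph pg dump_stuck inactive
  ceph pg 3.1f query
  # and only as a last resort, once a replica is known to be gone for good:
  # ceph osd lost 12 --yes-i-really-mean-it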
Mike,
Recovery would be based on placement groups and those degraded groups would
only exist on the storage pool(s) rather than the cache tier in this
scenario.
Bob
On Fri, Mar 25, 2016 at 8:30 AM, Mike Miller
wrote:
> Hi,
>
> in case of a failure in the storage tier, say single OSD disk failu
Hi Folks,
One last dip into my old bobtail cluster. (new hardware is on order)
I have three pgs in an incomplete state. The cluster was previously
stable, but in a health warn state due to a few near-full osds. I
started resizing drives on one host to expand space after taking the
osds that se
Hi,
in case of a failure in the storage tier, say a single OSD disk failure or
a complete system failure with several OSD disks, will the remaining cache
tier (on other nodes) be used for rapid backfilling/recovering first
until it is full? Or is backfill/recovery done directly to the storage tier
FYI when I performed testing on our cluster I saw the same thing.
A fio randwrite 4k test over a large volume was a lot faster with a larger RBD
object size (8 MB was marginally better than the default 4 MB). It makes no sense
to me unless there is a huge overhead from the increasing number of objects. Or
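For reference, this is roughly how an image with 8 MB objects can be created
(pool and image names are just examples):

  # --order 23 gives 2^23 = 8 MiB objects; the default is --order 22 (4 MiB)
  rbd create test-8m --pool rbd --size 102400 --order 23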
V5 is supposedly stable, but that only means it will be just as bad as any
other XFS.
I recommend avoiding XFS whenever possible. Ext4 works perfectly and I never
lost any data with it, even when it got corrupted, while XFS still likes to eat
the data when something goes wrong (and it will, lik
Before adding/replacing new OSDs:
What version of xfs is preferred by ceph developers/testers now?
Some time ago I moved everything to v5 (crc=1,finobt=1); it works, except for
"logbsize=256k,logbufs=8" on 4.4. Now I see v5 is the default mode (xfsprogs &
kernel 4.5 at least).
I'm in doubt: should I make new OSDs old-style v
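To be explicit about what I mean by v5 here (device and mount point are just
examples):

  mkfs.xfs -f -m crc=1,finobt=1 /dev/sdX1
  # mounted with something like the following
  # (minus logbsize=256k,logbufs=8 on 4.4, as noted above)
  mount -o rw,noatime,inode64,logbsize=256k,logbufs=8 /dev/sdX1 /var/lib/ceph/osd/ceph-N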
Sorry, it's my fault. I have found the problem: on that host there was a wrong
version of librados.so, built a long time ago; I had forgotten about this, so it
misled me. I have removed it and linked to the right one.
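In case anyone else hits this, a quick way to check which librados a host is
actually picking up is simply:

  # see which librados the rbd binary resolves to
  ldd $(which rbd) | grep librados
  ldconfig -p | grep librados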
2016-03-25
archer.wudong
From: Dong Wu
Sent: 2016-03-25 16:02
Subject: af
Hi Christian, Thanks for your reply, here're the test specs:
>>>
[global]
ioengine=libaio
runtime=90
direct=1
group_reporting
iodepth=16
ramp_time=5
size=1G
[seq_w_4k_20]
bs=4k
filename=seq_w_4k_20
rw=write
numjobs=20
[seq_w_1m_20]
bs=1m
filename=seq_w_1m_20
rw=write
numjobs=20
Test results
Hello,
On Fri, 25 Mar 2016 08:11:27 + Zhang Qiang wrote:
> Hi all,
>
> According to fio,
Exact fio command please.
> with 4k block size, the sequential write performance of
> my ceph-fuse mount
Exact mount options, ceph config (RBD cache) please.
> is just about 20+ M/s, only 200 Mb of 1 G
Hi all,
According to fio, with a 4k block size the sequential write performance of my
ceph-fuse mount is only about 20+ MB/s; at most only 200 Mb of the 1 Gb full-duplex
NIC's outgoing bandwidth was used. But with a 1M block size the performance could
reach as high as 1000 M/s, approaching the limit of t
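(For scale: throughput is roughly IOPS times block size, so 20 MB/s at 4k is on
the order of 5,000 write ops per second, while even a full 1 Gb/s at a 1M block
size needs only about 125 ops per second.)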
Hi all, I upgraded my cluster from 0.80.11 to 0.94.6. Everything is OK
except that the rbd command core dumps on one host but succeeds on the others.
I have disabled auth in ceph.conf:
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
Here is the core dump message:
$ sudo rbd ls