Hi,
I've configured an erasure-coded pool (k=3, m=2) in our Ceph lab environment
(Ceph version 14.2.4), and I'm trying to verify the behaviour of
bluestore_min_alloc_size.
Our OSDs are HDDs, so by default min_alloc_size is 64 KiB:
ceph daemon osd.X config show | grep bluestore_min_alloc_size_hdd
"bluestore_min_alloc_size_hdd": "65536",
According to the documentation, the unwritten area in each chunk is filled with
zeroes when it is written to the raw partition, which can lead to space
amplification when writing small objects.
In other words, a 4 KiB object stored in my cluster should theoretically use
64 KiB * 5 (k+m) = 320 KiB, i.e. 64 KiB per chunk.
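To spell that expectation out as a quick back-of-the-envelope check (just a
sketch using the values from my profile, k=3, m=2 and min_alloc_size=65536):

# each of the k+m = 5 chunks should be padded out to min_alloc_size
awk 'BEGIN { k = 3; m = 2; alloc = 65536; print (k + m) * alloc " bytes expected" }'
# -> 327680 bytes, i.e. 320 KiB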
To test this, I uploaded a 4 KiB object and used ceph-objectstore-tool to
output the size of the object on one of the OSDs:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 --pgid 15.93s2
b458a7bf-0643-4c04-bccc-f7f8feb0bd20.4889853.3_ceph.txt dump | jq '.stat'
{
  "size": 4096,
  "blksize": 4096,
  "blocks": 1,
  "nlink": 1
}
I was expecting the size to be 64 KiB, but perhaps it doesn't take the
zero-filled area into account? Note that in this case size = 4 KiB because that
is the stripe unit specified in my erasure code profile.
Is there any other way of querying the object to verify that each chunk is
using 64 KiB, or that the object is using 320 KiB in total?
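The closest thing I can think of is comparing the BlueStore perf counters,
assuming bluestore_allocated vs. bluestore_stored is the right pair to look at,
but those are per OSD rather than per object:

# allocated vs. stored bytes for the whole OSD, not for a single object
ceph daemon osd.20 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'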
Obviously, if I only have one object in the pool, then I can use "rados df",
but as soon as I add more objects of different sizes, I lose this ability.
rados df
POOL_NAME  USED     OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS  RD   WR_OPS  WR     USED COMPR  UNDER COMPR
ec32       320 KiB  1        0       5       0                   0        1         0       0 B  1       4 KiB  0 B         0 B
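In principle I could script the expected total from the object sizes instead
(a rough sketch only, assuming k=3, m=2, stripe_unit=4096 and
min_alloc_size=65536, and parsing the "size" field of "rados stat"), but that
still only tells me what I expect, not what BlueStore actually allocated:

rados -p ec32 ls | while read obj; do
    rados -p ec32 stat "$obj"                # prints "...mtime <t>, size <bytes>"
done | awk '{
    size = $NF; k = 3; m = 2; su = 4096; alloc = 65536
    chunk = int((size + k*su - 1) / (k*su)) * su        # per-chunk share, padded to the stripe unit
    chunk = int((chunk + alloc - 1) / alloc) * alloc    # rounded up to min_alloc_size
    total += (k + m) * chunk
} END { print total " bytes expected across all chunks" }'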
Thanks and regards,
James.