Hi Yuan,
Thanks for sharing the link; it is an interesting read. My understanding of the 
test results is that with a fixed total xattr size, a smaller stripe size incurs 
larger read latency, which makes sense since there are more k-v pairs to fetch, 
and at that total size the xattrs end up in extents anyway.
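
To make sure I am reading it right, here is a rough sketch of what I think the 
striping test does (plain Python against a file on XFS, using os.setxattr / 
os.getxattr; the key names and sizes here are made up for illustration, not 
what Swift or Ceph actually use):

    import os

    def write_striped_xattrs(path, value, stripe_size):
        # Split one logical value into fixed-size chunks, one xattr per chunk.
        # A smaller stripe_size means more k-v pairs for the same total payload.
        chunks = [value[i:i + stripe_size]
                  for i in range(0, len(value), stripe_size)]
        for idx, chunk in enumerate(chunks):
            os.setxattr(path, 'user.meta.%d' % idx, chunk)
        return len(chunks)

    def read_striped_xattrs(path, nchunks):
        # Each chunk is a separate getxattr call, so a small stripe size turns
        # one fixed-size payload into many round trips to the filesystem.
        return b''.join(os.getxattr(path, 'user.meta.%d' % i)
                        for i in range(nchunks))

    payload = os.urandom(4096)                      # fixed total xattr size
    n = write_striped_xattrs('/mnt/xfs/testobj', payload, 255)
    assert read_striped_xattrs('/mnt/xfs/testobj', n) == payload

With a 255-byte stripe that 4 KB payload becomes 17 xattrs, while a 65535 
boundary keeps it in a single one, which would explain the latency gap in the 
tests.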

Correct me if I am wrong here...

Thanks,
Guang
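
P.S. For clarity on what I meant by offloading metadata to omap in my original 
mail, I am thinking of something along these lines (going from memory of the 
python-rados bindings, so please treat the exact calls as an assumption; the 
pool name, object name and attribute values are made up):

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rgw-metadata-pool')      # hypothetical pool

    ioctx.write_full('myobject', b'payload')             # create the object

    # Today: rgw metadata goes into xattrs, which XFS may push out to extents.
    ioctx.set_xattr('myobject', 'user.rgw.etag',
                    b'0f343b0931126a20f133d67c2b018a3b')

    # The idea: keep the same k-v pairs in omap (leveldb) instead, so the
    # inode stays small and the filesystem never sees the metadata.
    with rados.WriteOpCtx() as write_op:
        ioctx.set_omap(write_op,
                       ('rgw.etag', 'rgw.content_type'),
                       (b'0f343b0931126a20f133d67c2b018a3b', b'image/jpeg'))
        ioctx.operate_write_op(write_op, 'myobject')

    ioctx.close()
    cluster.shutdown()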

> From: yuan.z...@intel.com
> To: s...@newdream.net; yguan...@outlook.com
> CC: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: RE: xattrs vs. omap with radosgw
> Date: Wed, 17 Jun 2015 01:32:35 +0000
> 
> FWIW, there was some discussion in OpenStack Swift, and their performance 
> tests showed that 255 is not the best boundary size on recent XFS. They 
> decided to use a large xattr boundary size (65535).
> 
> https://gist.github.com/smerritt/5e7e650abaa20599ff34
> 
> 
> -----Original Message-----
> From: ceph-devel-ow...@vger.kernel.org 
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sage Weil
> Sent: Wednesday, June 17, 2015 3:43 AM
> To: GuangYang
> Cc: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: Re: xattrs vs. omap with radosgw
> 
> On Tue, 16 Jun 2015, GuangYang wrote:
>> Hi Cephers,
>> While looking at disk utilization on the OSDs, I noticed the disks were 
>> constantly busy with a large number of small writes. Further investigation 
>> showed that because radosgw uses xattrs to store metadata (e.g. etag, 
>> content-type, etc.), the xattrs spill from inline (local) storage out to 
>> extents, which incurs extra I/O.
>> 
>> I would like to check if anybody has experience with offloading the metadata 
>> to omap:
>>   1> Offload everything to omap? If so, should we make the inode size 
>> 512 bytes (instead of 2k)?
>>   2> Partially offload the metadata to omap, e.g. only offload the 
>> rgw-specific metadata to omap.
>> 
>> Any sharing is deeply appreciated. Thanks!
> 
> Hi Guang,
> 
> Is this hammer or firefly?
> 
> With hammer, the size of object_info_t crossed the 255-byte boundary, which is 
> the largest xattr value that XFS can keep inline. We've since merged a change 
> that stripes the value over several small xattrs so that we can keep things 
> inline, but it hasn't been backported to hammer yet. See 
> c6cdb4081e366f471b372102905a1192910ab2da. Perhaps this is what you're seeing?
> 
> I think we're still better off with larger XFS inodes and inline xattrs if it 
> means we avoid leveldb entirely for most objects.
> 
> sage