On 1/18/19 7:26 AM, Igor Fedotov wrote:
> Hi Kevin,
>
> On 1/17/2019 10:50 PM, KEVIN MICHAEL HRPCEK wrote:
>> Hey,
>>
>> I recall reading about this somewhere, but I can't find it in the docs or the list archive, and confirmation from a dev or someone who knows for sure would be nice.
>>
>> What I recall is that bluestore has a max 4GB file size limit based on the design of bluestore, not the osd_max_object_size setting. The bluestore source seems to suggest that by setting OBJECT_MAX_SIZE to a 32-bit max, giving an error if osd_max_object_size is >= OBJECT_MAX_SIZE, and not writing the data if offset+length >= OBJECT_MAX_SIZE. So it seems like the in-OSD file size int can't exceed 32 bits, which is 4GB, like FAT32. Am I correct, or maybe I'm reading all this wrong..?
>
> You're correct, BlueStore doesn't support objects larger than OBJECT_MAX_SIZE (i.e. 4 GB).

Thanks for confirming that!

>> If bluestore has a hard 4GB object limit, using radosstriper to break up an object would work, but does using an EC pool that breaks up the object into shards smaller than OBJECT_MAX_SIZE have the same effect as radosstriper to get around a 4GB limit? We use rados directly and would like to move to bluestore, but we have some large objects (<= 13G) that may need attention if this 4GB limit does exist and an EC pool doesn't get around it.
>
> Theoretically, object splitting using EC might help. But I'm not sure whether one needs to set osd_max_object_size greater than 4 GB to permit 13 GB object usage in an EC pool. If that's needed, then the osd_max_object_size < OBJECT_MAX_SIZE constraint is violated and BlueStore wouldn't start.

In my experience I had to increase osd_max_object_size from the 128M default (which it changed to a couple of versions ago) to ~20G to be able to write our largest objects with some margin.

Do you think there is another way to handle osd_max_object_size > OBJECT_MAX_SIZE so that bluestore will start, and EC pools or striping can be used to write objects that are larger than OBJECT_MAX_SIZE as long as each stripe/shard ends up smaller than OBJECT_MAX_SIZE after striping or being placed in an EC pool?

>> https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L88
>> #define OBJECT_MAX_SIZE 0xffffffff // 32 bits
>>
>> https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L4395
>> // sanity check(s)
>> auto osd_max_object_size =
>>   cct->_conf.get_val<Option::size_t>("osd_max_object_size");
>> if (osd_max_object_size >= (size_t)OBJECT_MAX_SIZE) {
>>   derr << __func__ << " osd_max_object_size >= 0x" << std::hex << OBJECT_MAX_SIZE
>>     << "; BlueStore has hard limit of 0x" << OBJECT_MAX_SIZE << "." << std::dec << dendl;
>>   return -EINVAL;
>> }
>>
>> https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L12331
>> if (offset + length >= OBJECT_MAX_SIZE) {
>>   r = -E2BIG;
>> } else {
>>   _assign_nid(txc, o);
>>   r = _do_write(txc, c, o, offset, length, bl, fadvise_flags);
>>   txc->write_onode(o);
>> }
>>
>> Thanks!
>> Kevin
>>
>> --
>> Kevin Hrpcek
>> NASA SNPP Atmosphere SIPS
>> Space Science & Engineering Center
>> University of Wisconsin-Madison
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> Thanks,
> Igor
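For anyone following the numbers in this thread, here is a minimal sketch of the arithmetic, not Ceph code. It mirrors the two checks quoted from BlueStore.cc and estimates the per-shard size of a large object in an EC pool. The names kObjectMaxSize, config_accepted, write_accepted and approx_shard_size are hypothetical helpers made up for illustration. The sketch assumes each of the k data shards holds roughly ceil(object_size / k) bytes, ignores stripe padding, and treats "13G" as 13 GiB. It says nothing about whether the OSD applies osd_max_object_size to the whole object before the EC split, which is the open question above.

```cpp
#include <cstdint>

// Mirrors `#define OBJECT_MAX_SIZE 0xffffffff` from the quoted BlueStore.cc.
constexpr std::uint64_t kObjectMaxSize = 0xffffffffULL;
constexpr std::uint64_t GiB = 1ULL << 30;

// Same shape as the quoted mount-time sanity check: a configured
// osd_max_object_size at or above the limit is rejected (-EINVAL there).
constexpr bool config_accepted(std::uint64_t osd_max_object_size) {
  return osd_max_object_size < kObjectMaxSize;
}

// Same shape as the quoted write-path check: a write whose end offset
// reaches the limit is rejected (-E2BIG there).
constexpr bool write_accepted(std::uint64_t offset, std::uint64_t length) {
  return offset + length < kObjectMaxSize;
}

// ASSUMPTION (hypothetical helper): each of the k data shards stores roughly
// ceil(object_size / k) bytes; stripe padding is ignored. k must be > 0.
constexpr std::uint64_t approx_shard_size(std::uint64_t object_size, unsigned k) {
  return (object_size + k - 1) / k;
}

// A ~20G osd_max_object_size would trip the mount-time check.
static_assert(!config_accepted(20 * GiB), "20 GiB exceeds the 32-bit limit");
// A 13 GiB object split across k=4 data shards stays under the limit per shard...
static_assert(write_accepted(0, approx_shard_size(13 * GiB, 4)), "k=4 shard fits");
// ...but with k=3 each shard would be about 4.33 GiB and would not.
static_assert(!write_accepted(0, approx_shard_size(13 * GiB, 3)), "k=3 shard too big");
```

Under these assumptions, a 13 GiB object only yields shards below the 4 GB limit when k is at least 4, while the ~20G osd_max_object_size needed to accept the logical object is what the mount-time check rejects.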