This appears to be exactly the issue. After mounting the XFS bricks with the option 'allocsize=64k' (and also testing ext4), the extraneous filesystem usage disappears; removing 'allocsize=64k' brings the additional space usage back. The linked article also references /proc/sys/fs/xfs/speculative_prealloc_lifetime, which sets the lifetime of the speculative preallocation and matches the timing of the behavior I was seeing.
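For anyone hitting the same thing, the mount option can be made persistent via fstab. A sketch only - the device and brick path below are placeholders, not from my setup:

```
# /etc/fstab - cap XFS speculative preallocation at 64k per file
# (device and mount point are illustrative placeholders)
/dev/sdb1  /data/brick1  xfs  defaults,allocsize=64k  0 0
```

The reclaim interval for the preallocation (in seconds) can be read from /proc/sys/fs/xfs/speculative_prealloc_lifetime.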
Thank you for this most excellent find!

On Wed, Feb 9, 2022 at 10:01 AM Xavi Hernandez <jaher...@redhat.com> wrote:

> Hi,
>
> this problem is most likely caused by the XFS speculative preallocation (
> https://linux-xfs.oss.sgi.narkive.com/jjjfnyI1/faq-xfs-speculative-preallocation
> )
>
> Regards,
>
> Xavi
>
> On Sat, Feb 5, 2022 at 10:19 AM Strahil Nikolov <hunter86...@yahoo.com> wrote:
>
>> It seems quite odd.
>> I'm adding the devel list, as it looks like a bug - but it could be a
>> feature ;)
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> ----- Forwarded message -----
>> *From:* Fox <foxxz....@gmail.com>
>> *To:* Gluster Users <gluster-us...@gluster.org>
>> *Sent:* Saturday, February 5, 2022, 05:39:36 GMT+2
>> *Subject:* Re: [Gluster-users] Distributed-Disperse Shard Behavior
>>
>> I tried setting the shard size to 512MB. It slightly improved the space
>> utilization during creation - not quite double space utilization. And I
>> didn't run out of space creating a file that occupied 6GB of the 8GB
>> volume (I even tried 7168MB just fine). See attached command line log.
>>
>> On Fri, Feb 4, 2022 at 6:59 PM Strahil Nikolov <hunter86...@yahoo.com> wrote:
>>
>> It sounds like a bug to me.
>> In virtualization, sharding is quite common (though on replica volumes),
>> and I have never observed such behavior.
>> Can you increase the shard size to 512M and check if the situation is
>> better?
>> Also, share the volume info.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Fri, Feb 4, 2022 at 22:32, Fox <foxxz....@gmail.com> wrote:
>>
>> Using gluster v10.1 and creating a Distributed-Dispersed volume with
>> sharding enabled.
>>
>> I create a 2GB file on the volume using the 'dd' tool. The file size
>> shows 2GB with 'ls'. However, 'df' shows 4GB of space utilized on the
>> volume. After several minutes the volume utilization drops to the 2GB I
>> would expect.
>>
>> This is repeatable for different large file sizes and different
>> disperse/redundancy brick configurations.
>>
>> I've also encountered a situation, as configured above, where I utilize
>> close to full disk capacity and am momentarily unable to delete the file.
>>
>> I have attached a command line log of an example of the above, using a
>> set of test VMs set up in a glusterfs cluster.
>>
>> Is this initial 2x space utilization anticipated behavior for sharding?
>>
>> It would mean that I can never create a file bigger than half my volume
>> size, as I get an I/O error with no space left on disk.
>> ________
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>> Gluster-users mailing list
>> gluster-us...@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> -------
>
> Community Meeting Calendar:
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
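As a side note on diagnosing this: the gap between what 'ls' reports and what 'df'/'du' account for can be checked per file by comparing the apparent size with the blocks actually allocated. A minimal sketch (plain shell, runs on any filesystem; on XFS with speculative preallocation active, the allocated figure can stay well above the apparent size until the preallocation is reclaimed):

```shell
# Compare a file's apparent size with the blocks actually allocated to it.
f=$(mktemp)
dd if=/dev/urandom of="$f" bs=1M count=8 status=none
apparent=$(stat -c %s "$f")                 # bytes, as 'ls -l' reports
allocated=$(( $(stat -c %b "$f") * 512 ))   # what 'du' and 'df' account for
echo "apparent=$apparent allocated=$allocated"
rm -f "$f"
```

If 'allocated' is much larger than 'apparent' on an XFS brick and shrinks on its own after a few minutes, that is consistent with speculative preallocation rather than a gluster-side accounting bug.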