Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11471



We recently observed a problem on one of our systems after a user set the
default striping on one of his directories to stripe all files 160 wide.  He
then proceeded to create many thousands of multi-gigabyte files in the
directory.  This worked fine for roughly 44 days, until system memory became
so fragmented that the order-4 allocations required for the LOV for each file
started failing regularly.  As a result, normal system tools such as cp, mv,
and ls got ENOMEM errors when manipulating any of these files.
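
To make "order-4" concrete, here is a rough user-space sketch of how an
allocation size maps to a page order, similar in spirit to the kernel's
get_order().  The helper names alloc_order() and lov_bytes() are hypothetical,
and the per-stripe size passed to lov_bytes() is a placeholder, not the real
size of any Lustre structure.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= bytes.
 * User-space analogue of the kernel's get_order(); page_size is passed in
 * explicitly instead of using the kernel's PAGE_SIZE.
 */
static unsigned int alloc_order(size_t bytes, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(page_size > 0);
    while (span < bytes) {
        span <<= 1;   /* each order doubles the contiguous span */
        order++;
    }
    return order;
}

/*
 * Hypothetical size of a per-file striping descriptor: a fixed header plus
 * one entry per stripe.  Both byte counts are assumptions for illustration.
 */
static size_t lov_bytes(size_t header_bytes, size_t per_stripe_bytes,
                        unsigned int stripe_count)
{
    return header_bytes + per_stripe_bytes * stripe_count;
}
```

With 4 KiB pages, order 4 means 16 physically contiguous pages (64 KiB), so
any descriptor larger than 32 KiB and no larger than 64 KiB needs an order-4
allocation.  For example, assuming 256 bytes per stripe, 160 stripes come to
40960 bytes, and alloc_order(40960, 4096) falls in that range.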

For now we've advised the user not to stripe quite so widely by default, and
we'll be rebooting the client to clear up the fragmentation.  That said, I think
we're going to need to change how the LOV is allocated: instead of a single
kmalloc(), rely on a page array to keep each allocation small.  Sadly, I
think that's going to complicate how the LOV is packed when it needs to be sent
between the client and server, but it seems like the right fix.
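
As a rough illustration of the page-array idea, here is a user-space sketch
that stores a buffer as an array of page-sized chunks and copies it out chunk
by chunk when a contiguous wire buffer is needed.  Everything here is
hypothetical (struct chunk_array, CHUNK_SIZE, and the helper functions are
made up for this example, with calloc() standing in for single-page kernel
allocations); it is not Lustre code and only shows where the extra packing
step would come from.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096   /* stand-in for the page size (assumption) */

/* A logical buffer of nbytes stored as separately allocated chunks. */
struct chunk_array {
    size_t nbytes;
    size_t nchunks;
    unsigned char **chunks;
};

static void chunk_array_free(struct chunk_array *ca)
{
    size_t i;

    if (ca == NULL)
        return;
    if (ca->chunks != NULL)
        for (i = 0; i < ca->nchunks; i++)
            free(ca->chunks[i]);          /* free(NULL) is a no-op */
    free(ca->chunks);
    free(ca);
}

/* Allocate nbytes as ceil(nbytes / CHUNK_SIZE) independent small chunks. */
static struct chunk_array *chunk_array_alloc(size_t nbytes)
{
    struct chunk_array *ca = malloc(sizeof(*ca));
    size_t i;

    if (ca == NULL)
        return NULL;
    ca->nbytes = nbytes;
    ca->nchunks = (nbytes + CHUNK_SIZE - 1) / CHUNK_SIZE;
    ca->chunks = calloc(ca->nchunks ? ca->nchunks : 1, sizeof(*ca->chunks));
    if (ca->chunks == NULL) {
        free(ca);
        return NULL;
    }
    for (i = 0; i < ca->nchunks; i++) {
        ca->chunks[i] = calloc(1, CHUNK_SIZE);
        if (ca->chunks[i] == NULL) {
            chunk_array_free(ca);
            return NULL;
        }
    }
    return ca;
}

/* Copy len bytes from src to logical offset off, crossing chunk boundaries. */
static void chunk_array_write(struct chunk_array *ca, size_t off,
                              const void *src, size_t len)
{
    const unsigned char *p = src;

    assert(off <= ca->nbytes && len <= ca->nbytes - off);
    while (len > 0) {
        size_t idx = off / CHUNK_SIZE;
        size_t in = off % CHUNK_SIZE;
        size_t n = CHUNK_SIZE - in;

        if (n > len)
            n = len;
        memcpy(ca->chunks[idx] + in, p, n);
        p += n;
        off += n;
        len -= n;
    }
}

/* Pack the chunks into a contiguous buffer of at least nbytes for the wire. */
static void chunk_array_pack(const struct chunk_array *ca, unsigned char *wire)
{
    size_t left = ca->nbytes;
    size_t i;

    for (i = 0; i < ca->nchunks && left > 0; i++) {
        size_t n = left < CHUNK_SIZE ? left : CHUNK_SIZE;

        memcpy(wire + i * CHUNK_SIZE, ca->chunks[i], n);
        left -= n;
    }
}
```

No single allocation here is larger than one chunk apart from the small
pointer array, which is the property we want on a fragmented client.  The cost
is that every access has to go through an index/offset calculation, and
sending the data means either a copy like chunk_array_pack() or a
scatter/gather style send.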

_______________________________________________
Lustre-devel mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-devel
