Re: [SCIENTIFIC-LINUX-USERS] XFS memory allocation deadlock
On 22/08/2019 11:20 pm, Pat Riehecky wrote:
> I believe the solution is twofold:
>
> - The SL 7.7 kernel will help prevent the problem from reoccurring (currently in sl-testing, scheduled for release Monday)
> - Existing fragmentation should probably be cleaned up via xfs_fsr[1]

Thanks Pat and others. I will await the updates and make some space available for defragging.

--
Cheers
Bill
Re: [SCIENTIFIC-LINUX-USERS] XFS memory allocation deadlock
I believe the solution is twofold:

- The SL 7.7 kernel will help prevent the problem from reoccurring (currently in sl-testing, scheduled for release Monday)
- Existing fragmentation should probably be cleaned up via xfs_fsr[1]

Pat

[1] https://blog.codecentric.de/en/2017/04/xfs-possible-memory-allocation-deadlock-kmem_alloc/

On 8/21/19 8:10 PM, Bill Maidment wrote:
> Hi
>
> While copying a large file (about 200GB) to a backup hard drive, I am getting a multitude of XFS possible memory allocation deadlock messages.
>
> RedHat Portal shows the following:
>
> XFS issues "possible memory allocation deadlock in kmem_alloc" messages
> Solution Verified - Updated August 9 2019 at 2:51 AM - English
>
> Issue
> Seeing file system access issues on XFS based file systems.
> dmesg shows continuous entries with:
>
>     XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
>
> Does anyone know what the solution is? And if SL7 will get this solution soon?

--
Pat Riehecky
Fermi National Accelerator Laboratory
www.fnal.gov
www.scientificlinux.org
Re: XFS memory allocation deadlock
On 8/21/19 8:10 PM, Bill Maidment wrote:
> Hi
>
> While copying a large file (about 200GB) to a backup hard drive, I am getting a multitude of XFS possible memory allocation deadlock messages.
>
> RedHat Portal shows the following:
>
> XFS issues "possible memory allocation deadlock in kmem_alloc" messages
> Solution Verified - Updated August 9 2019 at 2:51 AM - English
>
> Issue
> Seeing file system access issues on XFS based file systems.
> dmesg shows continuous entries with:
>
>     XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
>
> Does anyone know what the solution is? And if SL7 will get this solution soon?

Here are the good bits from that page... Also, you can create a free developer account and get access to these support topics and similar resources. There is a bit more info listed on the page: root cause, diagnostics, etc.

Resolution

This is a long-standing issue with XFS and highly fragmented files.

kernel-3.10.0-1062.el7 from Errata RHSA-2019:2029 contains fixes to mitigate this issue when caused by individual file fragmentation. Please upgrade to this kernel or later.

Workarounds

There are several solutions that can be used to avoid high file fragmentation:

- Preallocate the space to be used by the file with unwritten extents. This gives the allocator the opportunity to allocate the whole file in one go and use the fewest extents. As the blocks are written they will break up the unwritten extents into written/unwritten space, and when all of the unwritten space has been converted the extent map will match the original optimal preallocated state.

- Use the extent size hint feature of XFS. This feature tells the allocator to allocate more space than may be needed by the current write request so that a minimum extent size is used. The extent will initially be allocated as an unwritten extent and will be converted as the individual blocks within the extent are written. As with preallocated files, when the entire extent has been written the extent size will match the original unwritten extent. The extent size hint feature can be set on a file or directory with this command:

      $ xfs_io -c "extsize "

  If set on a directory, then all files created within that directory after the hint is set will inherit the feature. You cannot set the hint on files that already have extents allocated. If it is not possible to modify the application, then this is the suggested option to use.

- Use asynchronous buffered I/O. This offers the chance to have many logically consecutive pages build up in the cache before being written out. Extents can then be allocated for the entire range of outstanding pages instead of each page individually. This not only reduces fragmentation but also means fewer I/Os need to be issued to the storage device.

- Avoid writing the file in a random order. If blocks can be coalesced within the application before being written out using direct I/O, then there's a chance the file can be written sequentially, which the allocator can use to allocate extents contiguously.

- Use xfs_fsr to defragment individual large files. Note that xfs_fsr is unable to defragment files that are currently in use; using the -v option is recommended to report on any issues that prevent defragmentation.

      xfs_fsr -v /path/to/large/file
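
For anyone who can modify the program doing the copy, the first workaround above (preallocating the destination before writing) could look roughly like the Python sketch below. This is only an illustration, not something taken from the Red Hat page: the function name `copy_with_preallocation` and the 1 MiB chunk size are made up for the example, and it assumes a Linux/Unix system where `os.posix_fallocate` is available. Whether the reserved range ends up as unwritten extents depends on the filesystem.

```python
import os

CHUNK = 1024 * 1024  # 1 MiB per write; an arbitrary size chosen for illustration


def copy_with_preallocation(src_path, dst_path, chunk_size=CHUNK):
    """Copy src_path to dst_path, reserving the full size before writing.

    Hypothetical helper: reserving the whole range first gives the
    allocator a chance to lay the file out in few extents, and the
    data is then written sequentially in large chunks.
    """
    size = os.path.getsize(src_path)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if size > 0:
            # posix_fallocate rejects a zero length, hence the guard.
            os.posix_fallocate(dst.fileno(), 0, size)
        while True:
            block = src.read(chunk_size)
            if not block:
                break
            dst.write(block)
    return size
```

Note that preallocating reserves space for the whole file, so this approach does not preserve holes the way cp --sparse=always does.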
XFS memory allocation deadlock
Hi

While copying a large file (about 200GB) to a backup hard drive, I am getting a multitude of XFS possible memory allocation deadlock messages.

RedHat Portal shows the following:

XFS issues "possible memory allocation deadlock in kmem_alloc" messages
Solution Verified - Updated August 9 2019 at 2:51 AM - English

Issue
Seeing file system access issues on XFS based file systems.
dmesg shows continuous entries with:

    XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

Does anyone know what the solution is? And if SL7 will get this solution soon?

--
Cheers
Bill
Re: XFS memory allocation deadlock
On 27/06/2019 1:06 am, Denice Deatrich wrote:
> On Wed, 26 Jun 2019, Bill Maidment wrote:
>> Hi friends
>> I have run into a problem in SL7.6 copying a large KVM guest lvm snapshot file using cp --sparse=always
>> I get flooded with the following message:
>>
>> XFS: cp(12985) possible memory allocation deadlock size 131088 in kmem_realloc (mode:0x250)
>>
>> The copy ends eventually, but it takes much longer than I expected.
>> Has anyone else come across this? Googling doesn't throw much light on this.
>
> Did you come across this article in your searches:
>
> https://blog.codecentric.de/en/2017/04/xfs-possible-memory-allocation-deadlock-kmem_alloc/
>
> However their deadlock is in kmem_alloc instead, and there is no mention of LVM. It's a nice analysis - it might shed some light on the problem.
>
> cheers, etc.

Hi Denice

Thank you very much. This explains a lot. This guest server has never been reorganised in recent years. A defrag is definitely in order.

--
Cheers
Bill
Re: XFS memory allocation deadlock
On Wed, 26 Jun 2019, Bill Maidment wrote:

> Hi friends
> I have run into a problem in SL7.6 copying a large KVM guest lvm snapshot
> file using cp --sparse=always
> I get flooded with the following message:
>
> XFS: cp(12985) possible memory allocation deadlock size 131088 in
> kmem_realloc (mode:0x250)
>
> The copy ends eventually, but it takes much longer than I expected.
> Has anyone else come across this? Googling doesn't throw much light on this.

Did you come across this article in your searches:

https://blog.codecentric.de/en/2017/04/xfs-possible-memory-allocation-deadlock-kmem_alloc/

However their deadlock is in kmem_alloc instead, and there is no mention of LVM. It's a nice analysis - it might shed some light on the problem.

cheers, etc.

--
Denice Deatrich, TRIUMF/Science/ATLAS
Ph: +1 604 222 7665
<*> This moment's fortune cookie:
"What the scientists have in their briefcases is terrifying."
-- Nikita Khrushchev
XFS memory allocation deadlock
Hi friends

I have run into a problem in SL7.6 copying a large KVM guest lvm snapshot file using cp --sparse=always
I get flooded with the following message:

XFS: cp(12985) possible memory allocation deadlock size 131088 in kmem_realloc (mode:0x250)

The copy ends eventually, but it takes much longer than I expected.
Has anyone else come across this? Googling doesn't throw much light on this.

--
Cheers
Bill
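
To gauge how often the warning fires and which allocation function is involved, the two message formats quoted in this thread can be tallied with a small Python sketch. The regular expression is an assumption built only from those two sample lines (other kernel versions may word the message differently), and the names `PATTERN` and `summarize` are made up for the example.

```python
import re
from collections import Counter

# Matches both forms seen in this thread, with or without the
# "process(pid)" prefix and the "size N" field. Assumed format only.
PATTERN = re.compile(
    r"XFS: (?:(?P<proc>[^\s(]+)\((?P<pid>\d+)\) )?"
    r"possible memory allocation deadlock "
    r"(?:size (?P<size>\d+) )?"
    r"in (?P<func>\w+) \(mode:(?P<mode>0x[0-9a-fA-F]+)\)"
)


def summarize(lines):
    """Count warnings per (process name, kernel function) pair.

    The process name is None when the message carries no prefix.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("proc"), match.group("func"))] += 1
    return counts
```

The function takes any iterable of text lines, for example the lines of saved dmesg output read from a file.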