We continue to see hangs on one or more servers every night. The hang occurs 
while copying a file into the btrfs partition. During the hang, the CPU sits 
at 50% I/O wait or higher for hours, and only a reboot clears it.
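
For anyone who wants to confirm the same symptom, here is a minimal sketch of 
how the I/O wait percentage could be sampled on Linux. It is not the tool we 
used. It assumes the standard /proc/stat layout, where the aggregate "cpu" line 
lists user, nice, system, idle and iowait as its first five counters. The 
function names and the 5-second interval are only illustrative.

```python
import time


def parse_cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum only user..steal (first 8 fields); guest time is already counted in user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is iowait: user=0, nice=1, system=2, idle=3, iowait=4.
    return 100.0 * deltas[4] / total


def read_cpu_times(path="/proc/stat"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_times(stat_file.read())


if __name__ == "__main__":
    interval = 5  # seconds between samples; an arbitrary choice
    previous = read_cpu_times()
    while True:
        time.sleep(interval)
        current = read_cpu_times()
        stamp = time.strftime("%H:%M:%S")
        print(stamp, "iowait %.1f%%" % iowait_percent(previous, current))
        previous = current
```

Left running in a second terminal during the copy, a value that stays at 50% 
or above for a long stretch would match what we see.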
 
Based on this behavior and the kernel errors, it looks like btrfs has a bug 
in managing extents and caching when compression is enabled. The bug appears 
to send it into an infinite loop that continually reads from or writes to the 
disk. This seems to be 100% reproducible once a triggering file is found: 
rebooting and immediately restarting the copy with the same file results in 
the same hang.
 
--
Russell Mosemann
 
