Re: [arch-general] khugepaged hangs and filesystem unresponsive

2012-08-28 Thread pants
On Tue, Aug 28, 2012 at 11:07:10AM +0200, Lukas Jirkovsky wrote:
> It is difficult to say where the problem lies. I'd go for the LKML
> mailing list [1] or the Kernel Bugzilla [2], as stated in [3]. You
> may try the XFS mailing list if you think it's an XFS-only issue (i.e. it
> doesn't happen with other filesystems).

I don't know how much I can bring to them; I ran xfs_repair on the
filesystem in question, deleted and replaced the problematic file, and
have had no further problems with it.
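
One way to confirm that a replaced file reads cleanly from start to end is
to read it sequentially and note how far the read gets. The following is
only a minimal sketch: the function name read_through and the chunk size
are made up for illustration, and it only catches errors that the kernel
actually returns. If the read blocks inside the kernel, as in the original
report, the script will hang as well.

```python
def read_through(path, chunk_size=1 << 20):
    """Read `path` sequentially and return (bytes_read, error).

    `error` is None if the end of the file was reached, otherwise the
    OSError raised by open() or read(). `bytes_read` is the offset of
    the last byte that was read successfully.
    """
    offset = 0
    try:
        # buffering=0 gives an unbuffered raw file object, so each
        # read() call maps onto a read of at most chunk_size bytes.
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:  # empty bytes object means end of file
                    return offset, None
                offset += len(chunk)
    except OSError as exc:
        return offset, exc
```

Calling read_through() on the file and comparing the returned offset with
the file size gives a rough check; it says nothing about whether the
contents are correct, only that they can be read.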

Thank you for your input regardless,

pants.




Re: [arch-general] khugepaged hangs and filesystem unresponsive

2012-08-28 Thread Lukas Jirkovsky
On 27 August 2012 09:10, pants wrote:
> Good evening,
>
> I just experienced a major problem with my system while listening to a
> music file in mpd from an XFS filesystem on top of an mdadm RAID6 array.
> A kernel error was thrown, with the following error.log entry:
>
> output: /var/log/error.log
>> Aug 26 23:34:50 localhost kernel: [283781.061258] xhci_hcd :0b:00.0: 
>> ERROR Transfer event TRB DMA ptr not part of current TD
>> Aug 26 23:34:50 localhost kernel: [283781.062268] xhci_hcd :0b:00.0: 
>> ERROR Transfer event TRB DMA ptr not part of current TD
>> Aug 26 23:34:50 localhost kernel: [283781.063273] xhci_hcd :0b:00.0: 
>> ERROR Transfer event TRB DMA ptr not part of current TD
>> Aug 26 23:34:50 localhost kernel: [283781.064245] xhci_hcd :0b:00.0: 
>> ERROR Transfer event TRB DMA ptr not part of current TD
>> Aug 26 23:34:51 localhost kernel: [283782.058901] timeout: still 1 active 
>> urbs..
>> Aug 26 23:38:48 localhost kernel: [284019.080666] INFO: task mpd:707 blocked 
>> for more than 120 seconds.
>> Aug 26 23:38:48 localhost kernel: [284019.080696] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:40:48 localhost kernel: [284139.071419] INFO: task khugepaged:32 
>> blocked for more than 120 seconds.
>> Aug 26 23:40:48 localhost kernel: [284139.071451] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:40:48 localhost kernel: [284139.071589] INFO: task mpd:525 blocked 
>> for more than 120 seconds.
>> Aug 26 23:40:48 localhost kernel: [284139.071613] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:40:48 localhost kernel: [284139.071721] INFO: task mpd:707 blocked 
>> for more than 120 seconds.
>> Aug 26 23:40:48 localhost kernel: [284139.071744] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:40:48 localhost kernel: [284139.071943] INFO: task mplayer:28316 
>> blocked for more than 120 seconds.
>> Aug 26 23:40:48 localhost kernel: [284139.071968] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:42:48 localhost kernel: [284259.062189] INFO: task khugepaged:32 
>> blocked for more than 120 seconds.
>> Aug 26 23:42:48 localhost kernel: [284259.062220] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:42:48 localhost kernel: [284259.062358] INFO: task mpd:525 blocked 
>> for more than 120 seconds.
>> Aug 26 23:42:48 localhost kernel: [284259.062382] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:42:48 localhost kernel: [284259.062489] INFO: task mpd:702 blocked 
>> for more than 120 seconds.
>> Aug 26 23:42:48 localhost kernel: [284259.062512] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:42:48 localhost kernel: [284259.062688] INFO: task mpd:703 blocked 
>> for more than 120 seconds.
>> Aug 26 23:42:48 localhost kernel: [284259.062712] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 26 23:42:48 localhost kernel: [284259.062829] INFO: task mpd:704 blocked 
>> for more than 120 seconds.
>> Aug 26 23:42:48 localhost kernel: [284259.062852] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> Attempts to access other files on the same filesystem after the incident
> caused the applications involved to go into uninterruptible sleep as well
> (see the mplayer processes that appear later in the log).  I was forced to
> kill and unmount what I could, then force the system down.  Afterwards,
> I could replicate the error by attempting to read the file in question
> at the same point.
>
> Even if you have no solution, pointing me towards the relevant kernel
> mailing list would be very helpful.
>
> Thanks,
>
> pants.

It is difficult to say where the problem lies. I'd go for the LKML
mailing list [1] or the Kernel Bugzilla [2], as stated in [3]. You
may try the XFS mailing list if you think it's an XFS-only issue (i.e. it
doesn't happen with other filesystems).

Lukas

[1] https://lkml.org/ (the email address is linux-ker...@vger.kernel.org)
[2] https://bugzilla.kernel.org/
[3] http://www.kernel.org/doc/man-pages/reporting_code_bugs.html


[arch-general] khugepaged hangs and filesystem unresponsive

2012-08-27 Thread pants
Good evening,

I just experienced a major problem with my system while listening to a
music file in mpd from an XFS filesystem on top of an mdadm RAID6 array.
A kernel error was thrown, with the following error.log entry:

output: /var/log/error.log
> Aug 26 23:34:50 localhost kernel: [283781.061258] xhci_hcd :0b:00.0: 
> ERROR Transfer event TRB DMA ptr not part of current TD
> Aug 26 23:34:50 localhost kernel: [283781.062268] xhci_hcd :0b:00.0: 
> ERROR Transfer event TRB DMA ptr not part of current TD
> Aug 26 23:34:50 localhost kernel: [283781.063273] xhci_hcd :0b:00.0: 
> ERROR Transfer event TRB DMA ptr not part of current TD
> Aug 26 23:34:50 localhost kernel: [283781.064245] xhci_hcd :0b:00.0: 
> ERROR Transfer event TRB DMA ptr not part of current TD
> Aug 26 23:34:51 localhost kernel: [283782.058901] timeout: still 1 active 
> urbs..
> Aug 26 23:38:48 localhost kernel: [284019.080666] INFO: task mpd:707 blocked 
> for more than 120 seconds.
> Aug 26 23:38:48 localhost kernel: [284019.080696] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:40:48 localhost kernel: [284139.071419] INFO: task khugepaged:32 
> blocked for more than 120 seconds.
> Aug 26 23:40:48 localhost kernel: [284139.071451] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:40:48 localhost kernel: [284139.071589] INFO: task mpd:525 blocked 
> for more than 120 seconds.
> Aug 26 23:40:48 localhost kernel: [284139.071613] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:40:48 localhost kernel: [284139.071721] INFO: task mpd:707 blocked 
> for more than 120 seconds.
> Aug 26 23:40:48 localhost kernel: [284139.071744] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:40:48 localhost kernel: [284139.071943] INFO: task mplayer:28316 
> blocked for more than 120 seconds.
> Aug 26 23:40:48 localhost kernel: [284139.071968] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:42:48 localhost kernel: [284259.062189] INFO: task khugepaged:32 
> blocked for more than 120 seconds.
> Aug 26 23:42:48 localhost kernel: [284259.062220] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:42:48 localhost kernel: [284259.062358] INFO: task mpd:525 blocked 
> for more than 120 seconds.
> Aug 26 23:42:48 localhost kernel: [284259.062382] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:42:48 localhost kernel: [284259.062489] INFO: task mpd:702 blocked 
> for more than 120 seconds.
> Aug 26 23:42:48 localhost kernel: [284259.062512] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:42:48 localhost kernel: [284259.062688] INFO: task mpd:703 blocked 
> for more than 120 seconds.
> Aug 26 23:42:48 localhost kernel: [284259.062712] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 26 23:42:48 localhost kernel: [284259.062829] INFO: task mpd:704 blocked 
> for more than 120 seconds.
> Aug 26 23:42:48 localhost kernel: [284259.062852] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Attempts to access other files on the same filesystem after the incident
caused the applications involved to go into uninterruptible sleep as well
(see the mplayer processes that appear later in the log).  I was forced to
kill and unmount what I could, then force the system down.  Afterwards,
I could replicate the error by attempting to read the file in question
at the same point.
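
For anyone reading a similar log, the hung-task reports above can be
tallied per process to see which tasks were stuck and how often they were
reported. This is only a rough sketch: the names HUNG_TASK and
summarize_hung_tasks are made up for illustration, and it assumes each log
entry sits on a single line, as it does in the log file itself, rather
than wrapped as in this mail.

```python
import re
from collections import Counter

# Matches the kernel's hung-task report, for example:
#   "INFO: task mpd:707 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per (command name, pid) pair."""
    counts = Counter()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            key = (match.group("comm"), int(match.group("pid")))
            counts[key] += 1
    return counts


# A few entries from the log above, rejoined onto single lines.
sample = [
    "Aug 26 23:38:48 localhost kernel: [284019.080666] INFO: task mpd:707 "
    "blocked for more than 120 seconds.",
    "Aug 26 23:40:48 localhost kernel: [284139.071419] INFO: task "
    "khugepaged:32 blocked for more than 120 seconds.",
    "Aug 26 23:40:48 localhost kernel: [284139.071721] INFO: task mpd:707 "
    "blocked for more than 120 seconds.",
    "Aug 26 23:40:48 localhost kernel: [284139.071943] INFO: task "
    "mplayer:28316 blocked for more than 120 seconds.",
]

summary = summarize_hung_tasks(sample)
```

Passing an open file instead of the sample list, for instance the
error.log mentioned above, works the same way, since the function only
iterates over lines.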

Even if you have no solution, pointing me towards the relevant kernel
mailing list would be very helpful.

Thanks,

pants.

