Hi all,

We are seeing a number of processes go into uninterruptible sleep when accessing NFS-mounted files. Sometimes the processes recover, but sometimes they do not and we have to reboot the machine. This is probably associated with quite high load on the NFS server. Below is an example of what we see in the kernel logs.
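
For anyone who wants to check whether they are hitting the same thing: uninterruptible sleep shows up as state "D" in /proc/<pid>/stat (the same "D" that appears after the task name in the kernel log). The following is only a rough sketch of how one might list such processes, assuming the usual Linux stat layout of pid, command name in parentheses, then state. The helper names parse_stat and d_state_processes are made up for illustration and are not part of any tool.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]  # first field after the command name
    return int(pid_text), comm, state


def d_state_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, IndexError, ValueError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling d_state_processes() on the affected client and printing the result would give one snapshot. A process that is in state D in one snapshot is not necessarily hung, so it is probably more useful to run it repeatedly and look for PIDs that stay in the list.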

We are running RHEL 6.1, kernel version 2.6.32-131.17.1.el6.x86_64. My question is: is anyone else running RHEL 6.1 seeing these problems, and are there any solutions? (There are several reports on the net about similar problems with kernels released within the last year, but I have found no solutions.)

------------------------------------------------------------------------
Nov 24 02:05:04 lclcx487 kernel: INFO: task java:12278 blocked for more than 120 seconds.
Nov 24 02:05:04 lclcx487 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 24 02:05:04 lclcx487 kernel: java D 0000000000000008 0 12278 12269 0x00000080
Nov 24 02:05:04 lclcx487 kernel: ffff88078bf05c78 0000000000000082 ffff88078bf05bf8 ffffffffa0257ca9
Nov 24 02:05:04 lclcx487 kernel: ffff8805d7c75c00 ffff880969800340 ffff8806872feb60 ffff8805d7c75c08
Nov 24 02:05:04 lclcx487 kernel: ffff88096931ba78 ffff88078bf05fd8 000000000000f598 ffff88096931ba78
Nov 24 02:05:04 lclcx487 kernel: Call Trace:
Nov 24 02:05:04 lclcx487 kernel: [<ffffffffa0257ca9>] ? rpc_run_task+0xd9/0x130 [sunrpc]
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff81098d19>] ? ktime_get_ts+0xa9/0xe0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d3d0>] ? sync_page+0x0/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff814db743>] io_schedule+0x73/0xc0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d40d>] sync_page+0x3d/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff814dbfaf>] __wait_on_bit+0x5f/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d5c3>] wait_on_page_bit+0x73/0x80
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8108e1c0>] ? wake_bit_function+0x0/0x50
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811232d5>] ? pagevec_lookup_tag+0x25/0x40
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110d9db>] wait_on_page_writeback_range+0xfb/0x190
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8110dba8>] filemap_write_and_wait_range+0x78/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811a0abe>] vfs_fsync_range+0x7e/0xe0
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff811a0b8d>] vfs_fsync+0x1d/0x20
Nov 24 02:05:04 lclcx487 kernel: [<ffffffffa0309410>] nfs_file_flush+0x70/0xa0 [nfs]
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8116f7cc>] filp_close+0x3c/0x90
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8116f8c5>] sys_close+0xa5/0x100
Nov 24 02:05:04 lclcx487 kernel: [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
------------------------------------------------------------------------

Regards,
-- 
Eiríkur Hjartarson          E-mail: [email protected]
deCODE genetics             Mobile: +3546641898
Sturlugötu 7
IS-101 Reykjavík

_______________________________________________
rhelv6-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv6-list
