Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-21 Thread Andrew Morton
On Thu, 22 Feb 2007 05:18:45 +0900
OGAWA Hirofumi <[EMAIL PROTECTED]> wrote:

> Kumar Gala <[EMAIL PROTECTED]> writes:
> 
> >>> I usually run the following twice to get the hang state:
> >>>
> >>> time ./trunc_test bar 1 &
> >>> time ./trunc_test baz 1 &
> >>>
> >>> I was wondering if anyone had any suggestions on what to poke at next
> >>> to try and figure out what is going on.
> >
> > So I realized I could use sysrq to provide some more debug  
> > information.  When the system locks up I get the following output  
> > from 't'
> >
> > [  497.499249] usb-storage   D  0   671  5   773   670 (L-TLB)
> > [  497.506930] Call Trace:
> > [  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
> > [  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
> > [  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
> > [  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
> > [  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
> > [  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
> > [  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
> > [  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
> > [  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
> > [  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
> > [  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
> > [  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
> > [  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
> > [  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
> > [  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
> > [  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
> > [  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
> > [  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> > [  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> > [  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
> > [  497.625285] shD 3009C7EC 0   718  1
> 
> [...]
> 
> > and from 'm'
> >
> > [  731.834529] Show Memory
> > [  731.836968] Mem-info:
> > [  731.839234] DMA per-cpu:
> > [  731.841768] CPU0: Hot: hi:   18, btch:   3 usd:   3   Cold:  
> > hi:6, btch:   1 usd:   2
> > [  731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330  
> > unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
> > [  731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB  
> > active:6040kB inactive:45236kB present:65024kB pages_scanned:292  
> > all_unreclaimable? no
> > [  731.874363] lowmem_reserve[]: 0 0
> > [  731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB  
> > 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
> > [  731.887669] Free swap:0kB
> > [  731.893913] 16384 pages of RAM
> > [  731.896963] 798 reserved pages
> > [  731.900011] 10946 pages shared
> > [  731.903058] 0 pages swap cached
> >
> > It seems like usb-storage and aio are completely off in the weeds.   
> > Ideas?
> 
> It seems usb-storage should drop some of its kmalloc() calls and use a
> mempool for URBs...  Is anyone working on this? Any ideas?

I think Pete said that we're supposed to be using GFP_NOIO in there.

Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO
and GFP_NOFS allocations, which is a bug: if the caller holds locks that
prevent filesystem or IO progress, we deadlock.

I'll fix the VM if someone else fixes USB ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-21 Thread OGAWA Hirofumi
Kumar Gala <[EMAIL PROTECTED]> writes:

>>> I usually run the following twice to get the hang state:
>>>
>>> time ./trunc_test bar 1 &
>>> time ./trunc_test baz 1 &
>>>
>>> I was wondering if anyone had any suggestions on what to poke at next
>>> to try and figure out what is going on.
>
> So I realized I could use sysrq to provide some more debug  
> information.  When the system locks up I get the following output  
> from 't'
>
> [  497.499249] usb-storage   D  0   671  5   773   670 (L-TLB)
> [  497.506930] Call Trace:
> [  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
> [  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
> [  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
> [  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
> [  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
> [  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
> [  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
> [  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
> [  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
> [  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
> [  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
> [  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
> [  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
> [  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
> [  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
> [  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
> [  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
> [  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> [  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> [  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
> [  497.625285] shD 3009C7EC 0   718  1

[...]

> and from 'm'
>
> [  731.834529] Show Memory
> [  731.836968] Mem-info:
> [  731.839234] DMA per-cpu:
> [  731.841768] CPU0: Hot: hi:   18, btch:   3 usd:   3   Cold:  
> hi:6, btch:   1 usd:   2
> [  731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330  
> unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
> [  731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB  
> active:6040kB inactive:45236kB present:65024kB pages_scanned:292  
> all_unreclaimable? no
> [  731.874363] lowmem_reserve[]: 0 0
> [  731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB  
> 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
> [  731.887669] Free swap:0kB
> [  731.893913] 16384 pages of RAM
> [  731.896963] 798 reserved pages
> [  731.900011] 10946 pages shared
> [  731.903058] 0 pages swap cached
>
> It seems like usb-storage and aio are completely off in the weeds.   
> Ideas?

It seems usb-storage should drop some of its kmalloc() calls and use a
mempool for URBs...  Is anyone working on this? Any ideas?
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-20 Thread OGAWA Hirofumi
Kumar Gala <[EMAIL PROTECTED]> writes:

> On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:
>
>> Kumar Gala <[EMAIL PROTECTED]> writes:
>>
>>> Once the system locks up I don't have any ability to do anything.
>>
>> Ah, doesn't sysrq work either? If sysrq works, it can be used to see
>> the IO request state with a patch.
>
> Yeah, got sysrq working today.  If you can point me at the patch I'm
> happy to apply it and get data.

OK, please try the attached patch. I hope it helps.
BTW, the new sysrq key is sysrq-j, and it shows disk stats.
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>



Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
---

 block/genhd.c|   27 +++
 drivers/char/sysrq.c |   15 ++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff -puN drivers/char/sysrq.c~debug-block drivers/char/sysrq.c
--- linux-2.6/drivers/char/sysrq.c~debug-block	2007-02-21 00:58:35.0 +0900
+++ linux-2.6-hirofumi/drivers/char/sysrq.c	2007-02-21 02:02:52.0 +0900
@@ -311,6 +311,19 @@ static struct sysrq_key_op sysrq_kill_op
 	.enable_mask	= SYSRQ_ENABLE_SIGNAL,
 };
 
+extern void block_req_callback(struct work_struct *ignored);
+static DECLARE_WORK(block_req_work, block_req_callback);
+static void sysrq_handle_block_req(int key, struct tty_struct *tty)
+{
+	schedule_work(&block_req_work);
+}
+static struct sysrq_key_op sysrq_block_req_op = {
+	.handler	= sysrq_handle_block_req,
+	.help_msg	= "block req (j)",
+	.action_msg	= "Block Req",
+	.enable_mask	= SYSRQ_ENABLE_DUMP,
+};
+
 static void sysrq_handle_unrt(int key, struct tty_struct *tty)
 {
 	normalize_rt_tasks();
@@ -351,7 +364,7 @@ static struct sysrq_key_op *sysrq_key_ta
 	NULL,				/* g */
 	NULL,				/* h */
 	&sysrq_kill_op,			/* i */
-	NULL,				/* j */
+	&sysrq_block_req_op,		/* j */
 	&sysrq_SAK_op,			/* k */
 	NULL,				/* l */
 	&sysrq_showmem_op,		/* m */
diff -puN block/genhd.c~debug-block block/genhd.c
--- linux-2.6/block/genhd.c~debug-block	2007-02-21 01:02:13.0 +0900
+++ linux-2.6-hirofumi/block/genhd.c	2007-02-21 02:15:56.0 +0900
@@ -555,6 +555,33 @@ static struct kset_uevent_ops block_ueve
 
 decl_subsys(block, &ktype_block, &block_uevent_ops);
 
+void block_req_callback(struct work_struct *ignored)
+{
+	struct gendisk *gp;
+	char buf[BDEVNAME_SIZE];
+
+	mutex_lock(&block_subsys_lock);
+	list_for_each_entry(gp, &block_subsys.kset.list, kobj.entry) {
+		printk("%4d %4d %s %lu %lu %llu %u %lu %lu %llu %u %u %u %u:"
+		   " %u %u %u\n",
+		   gp->major, gp->first_minor, disk_name(gp, 0, buf),
+		   disk_stat_read(gp, ios[0]),
+		   disk_stat_read(gp, merges[0]),
+		   (unsigned long long)disk_stat_read(gp, sectors[0]),
+		   jiffies_to_msecs(disk_stat_read(gp, ticks[0])),
+		   disk_stat_read(gp, ios[1]),
+		   disk_stat_read(gp, merges[1]),
+		   (unsigned long long)disk_stat_read(gp, sectors[1]),
+		   jiffies_to_msecs(disk_stat_read(gp, ticks[1])),
+		   gp->in_flight,
+		   jiffies_to_msecs(disk_stat_read(gp, io_ticks)),
+		   jiffies_to_msecs(disk_stat_read(gp, time_in_queue)),
+		   gp->queue->rq.count[0], gp->queue->rq.count[1],
+		   gp->queue->in_flight);
+	}
+	mutex_unlock(&block_subsys_lock);
+}
+
 /*
  * aggregate disk stat collector.  Uses the same stats that the sysfs
  * entries do, above, but makes them available through one seq_file.
_


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Robert Hancock

Kumar Gala wrote:
> [  497.499249] usb-storage   D  0   671  5   773   670 (L-TLB)
> [  497.506930] Call Trace:
> [  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
> [  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
> [  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
> [  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
> [  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
> [  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
> [  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
> [  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
> [  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
> [  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
> [  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
> [  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
> [  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
> [  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
> [  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
> [  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
> [  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
> [  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> [  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> [  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60


This looks like a problem: the usb-storage thread is trying to allocate
memory and ends up waiting for VM writeout, which obviously won't proceed
because this thread is the one that has to perform it. The allocation in
usb_stor_bulk_transfer_sglist appears to be done with GFP_NOIO, so I wonder
why we're getting into this state.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Kumar Gala


On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:

> Kumar Gala <[EMAIL PROTECTED]> writes:
>
>>>> I usually run the following twice to get the hang state:
>>>>
>>>> time ./trunc_test bar 1 &
>>>> time ./trunc_test baz 1 &
>>>>
>>>> I was wondering if anyone had any suggestions on what to poke at next
>>>> to try and figure out what is going on.
>>>
>>> Can you check /sys/block/xxx/stat or something to make sure there is
>>> no outstanding IO request?
>>>
>>> There seems to be no response from the lower layer...
>>
>> Once the system locks up I don't have any ability to do anything.
>
> Ah, doesn't sysrq work either? If sysrq works, it can be used to see
> the IO request state with a patch.

Yeah, got sysrq working today.  If you can point me at the patch I'm
happy to apply it and get data.

- k



Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread OGAWA Hirofumi
Kumar Gala <[EMAIL PROTECTED]> writes:

>>> I usually run the following twice to get the hang state:
>>>
>>> time ./trunc_test bar 1 &
>>> time ./trunc_test baz 1 &
>>>
>>> I was wondering if anyone had any suggestions on what to poke at next
>>> to try and figure out what is going on.
>>
>> Can you check /sys/block/xxx/stat or something to make sure there is
>> no outstanding IO request?
>>
>> There seems to be no response from the lower layer...
>
> Once the system locks up I don't have any ability to do anything.

Ah, doesn't sysrq work either? If sysrq works, it can be used to see
the IO request state with a patch.
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Kumar Gala


On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote:

> Kumar Gala <[EMAIL PROTECTED]> writes:
>
>> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
>> PPC.  I've got a usb flash drive plugged in and the filesystem on the
>> drive is vfat.  Running with 64M and no swap.
>>
>> If I execute a series of large (100M+) ftruncate() on the disk the
>> kernel will hang and never return.  It seems to be stuck in the idle
>> loop().
>>
>> The following is the test program I'm running:
>>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <stdlib.h>	/* exit(), strtoul() */
>> #include <unistd.h>
>> #include <errno.h>
>>
>> void usage (void)
>> {
>>     printf ("truncate_test <filename> <size>\n\n");
>> }
>>
>> int main(int argc, char *argv[])
>> {
>>     int fd, i;
>>     int ret = 0;
>>     unsigned int len;
>>
>>     if (argc != 3) {
>>         printf("Invalid number of arguments\n\n");
>>         usage();
>>         exit(1);
>>     }
>>
>>     fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
>>     len = strtoul(argv[2], NULL, 0);
>>
>>     ret = ftruncate(fd, len);
>>
>>     if (ret)
>>         printf ("ftruncate ret = %d %d\n", ret, errno);
>>
>>     close(fd);
>>
>>     return ret;
>> }
>>
>> I usually run the following twice to get the hang state:
>>
>> time ./trunc_test bar 1 &
>> time ./trunc_test baz 1 &
>>
>> I was wondering if anyone had any suggestions on what to poke at next
>> to try and figure out what is going on.

So I realized I could use sysrq to provide some more debug
information.  When the system locks up I get the following output
from 't':


[  496.901002] Show State
[  496.903356]
[  496.903360]  free sibling
[  496.911532]   task PC stack   pid father child younger older
[  496.918486] init  S 3009C7EC 0 1  0  2   (NOTLB)
[  496.926169] Call Trace:
[  496.928611] [C3FC7DA0] [C006F03C] __link_path_walk+0xd24/0x112c (unreliable)
[  496.935687] [C3FC7E60] [C00083AC] __switch_to+0x28/0x40
[  496.940931] [C3FC7E80] [C01F4B78] schedule+0x324/0x6bc
[  496.946086] [C3FC7EC0] [C001E164] do_wait+0x700/0x100c
[  496.951242] [C3FC7F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  496.956828] --- Exception: c01 at 0x3009c7ec
[  496.961099] LR = 0x3009c3e0
[  496.964234] ksoftirqd/0   S  0 2   1 3   (L-TLB)
[  496.971913] Call Trace:
[  496.974355] [C033DE80] [C0133F64] scsi_io_completion+0x74/0x318 (unreliable)
[  496.981428] [C033DF40] [C00083AC] __switch_to+0x28/0x40
[  496.986664] [C033DF60] [C01F4B78] schedule+0x324/0x6bc
[  496.991811] [C033DFA0] [C00210CC] ksoftirqd+0xfc/0x114
[  496.996960] [C033DFC0] [C0033E48] kthread+0xf4/0x130
[  497.001941] [C033DFF0] [C001093C] kernel_thread+0x44/0x60
[  497.007350] events/0  S  0 3   1 4 2 (L-TLB)
[  497.015030] Call Trace:
[  497.017472] [C033FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.022707] [C033FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.027855] [C033FF40] [C002F67C] worker_thread+0x144/0x148
[  497.033435] [C033FFC0] [C0033E48] kthread+0xf4/0x130
[  497.038409] [C033FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.043817] khelper   S  0 4   1 5 3 (L-TLB)
[  497.051497] Call Trace:
[  497.053940] [C3FE1E20] [C3FE] 0xc3fe (unreliable)
[  497.059351] [C3FE1EE0] [C00083AC] __switch_to+0x28/0x40
[  497.064586] [C3FE1F00] [C01F4B78] schedule+0x324/0x6bc
[  497.069734] [C3FE1F40] [C002F67C] worker_thread+0x144/0x148
[  497.075316] [C3FE1FC0] [C0033E48] kthread+0xf4/0x130
[  497.080291] [C3FE1FF0] [C001093C] kernel_thread+0x44/0x60
[  497.085697] kthread   S  0 5  137  617 4 (L-TLB)
[  497.093378] Call Trace:
[  497.095820] [C3FCBE20] [1032] 0x1032 (unreliable)
[  497.100881] --- Exception: c3fcbef0 at __switch_to+0x28/0x40
[  497.106545] LR = 0xc3fcbef0
[  497.109681] [C3FCBEE0] [C00083AC] __switch_to+0x28/0x40 (unreliable)
[  497.116051] [C3FCBF00] [C01F4B78] schedule+0x324/0x6bc
[  497.121201] [C3FCBF40] [C002F67C] worker_thread+0x144/0x148
[  497.126783] [C3FCBFC0] [C0033E48] kthread+0xf4/0x130
[  497.131758] [C3FCBFF0] [C001093C] kernel_thread+0x44/0x60
[  497.137165] kblockd/0 S  037  5 41   (L-TLB)
[  497.144845] Call Trace:
[  497.147286] [C3D9FE20] [C3EBF490] 0xc3ebf490 (unreliable)
[  497.152697] [C3D9FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.157933] [C3D9FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.163082] [C3D9FF40] [C002F67C] worker_thread+0x144/0x148
[  497.168663] [C3D9FFC0] [C0033E48] kthread+0xf4/0x130
[  497.173637] [C3D9FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.179045] khubd S  041  5 5337 (L-TLB)
[  497.186726] Call Trace:
[  497.189167] [C0341E00] [C3F03900] 0xc3f03900 (unreliable)
[  497.194578] [C0341EC0] [C00083AC] __switch_to+0x28/0x40
[  497.199813] [C0341EE0] [C01F4B78] schedule+0x324/0x6bc
[  497.204961] [C0341F20] 

Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Kumar Gala


On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote:

> Kumar Gala <[EMAIL PROTECTED]> writes:
>
>> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
>> PPC.  I've got a usb flash drive plugged in and the filesystem on the
>> drive is vfat.  Running with 64M and no swap.
>>
>> If I execute a series of large (100M+) ftruncate() on the disk the
>> kernel will hang and never return.  It seems to be stuck in the idle
>> loop().
>>
>> The following is the test program I'm running:
>>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <stdlib.h>	/* exit(), strtoul() */
>> #include <unistd.h>
>> #include <errno.h>
>>
>> void usage (void)
>> {
>>     printf ("truncate_test <filename> <size>\n\n");
>> }
>>
>> int main(int argc, char *argv[])
>> {
>>     int fd, i;
>>     int ret = 0;
>>     unsigned int len;
>>
>>     if (argc != 3) {
>>         printf("Invalid number of arguments\n\n");
>>         usage();
>>         exit(1);
>>     }
>>
>>     fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
>>     len = strtoul(argv[2], NULL, 0);
>>
>>     ret = ftruncate(fd, len);
>>
>>     if (ret)
>>         printf ("ftruncate ret = %d %d\n", ret, errno);
>>
>>     close(fd);
>>
>>     return ret;
>> }
>>
>> I usually run the following twice to get the hang state:
>>
>> time ./trunc_test bar 1 &
>> time ./trunc_test baz 1 &
>>
>> I was wondering if anyone had any suggestions on what to poke at next
>> to try and figure out what is going on.
>
> Can you check /sys/block/xxx/stat or something to make sure there is
> no outstanding IO request?
>
> There seems to be no response from the lower layer...

Once the system locks up I don't have any ability to do anything.

- k


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Kumar Gala


On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote:


Kumar Gala [EMAIL PROTECTED] writes:


I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
PPC.  I've got a usb flash drive plugged in and the filesystem on the
drive is vfat.  Running with 64M and no swap.

If I execute a series of large (100M+) ftruncate() on the disk the
kernel will hang and never return.  It seems to be stuck in the idle
loop().

The following is the test program I'm running:

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>

void usage (void)
{
    printf ("truncate_test <filename> <size>\n\n");
}

int main(int argc, char *argv[])
{
    int fd, i;
    int ret = 0;
    unsigned int len;

    if (argc != 3) {
        printf("Invalid number of arguments\n\n");
        usage();
        exit(1);
    }

    fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
    len = strtoul(argv[2], NULL, 0);

    ret = ftruncate(fd, len);

    if (ret)
        printf ("ftruncate ret = %d %d\n", ret, errno);

    close(fd);

    return ret;
}

I usually run the following twice to get the hang state:

time ./trunc_test bar 1 &
time ./trunc_test baz 1 &

I was wondering if anyone had any suggestions on what to poke at next
to try and figure out what is going on.


So I realized I could use sysrq to provide some more debug  
information.  When the system locks up I get the following output  
from 't'


[  496.901002] Show State
[  496.903356]
[  496.903360]  free sibling
[  496.911532]   task PCstack   pid father child younger older
[  496.918486] init  S 3009C7EC 0 1  0 2   (NOTLB)
[  496.926169] Call Trace:
[  496.928611] [C3FC7DA0] [C006F03C] __link_path_walk+0xd24/0x112c (unreliable)
[  496.935687] [C3FC7E60] [C00083AC] __switch_to+0x28/0x40
[  496.940931] [C3FC7E80] [C01F4B78] schedule+0x324/0x6bc
[  496.946086] [C3FC7EC0] [C001E164] do_wait+0x700/0x100c
[  496.951242] [C3FC7F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  496.956828] --- Exception: c01 at 0x3009c7ec
[  496.961099] LR = 0x3009c3e0
[  496.964234] ksoftirqd/0   S  0 2 1 3   (L-TLB)
[  496.971913] Call Trace:
[  496.974355] [C033DE80] [C0133F64] scsi_io_completion+0x74/0x318 (unreliable)
[  496.981428] [C033DF40] [C00083AC] __switch_to+0x28/0x40
[  496.986664] [C033DF60] [C01F4B78] schedule+0x324/0x6bc
[  496.991811] [C033DFA0] [C00210CC] ksoftirqd+0xfc/0x114
[  496.996960] [C033DFC0] [C0033E48] kthread+0xf4/0x130
[  497.001941] [C033DFF0] [C001093C] kernel_thread+0x44/0x60
[  497.007350] events/0  S  0 3 1 4 2 (L-TLB)
[  497.015030] Call Trace:
[  497.017472] [C033FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.022707] [C033FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.027855] [C033FF40] [C002F67C] worker_thread+0x144/0x148
[  497.033435] [C033FFC0] [C0033E48] kthread+0xf4/0x130
[  497.038409] [C033FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.043817] khelper   S  0 4 1 5 3 (L-TLB)
[  497.051497] Call Trace:
[  497.053940] [C3FE1E20] [C3FE] 0xc3fe (unreliable)
[  497.059351] [C3FE1EE0] [C00083AC] __switch_to+0x28/0x40
[  497.064586] [C3FE1F00] [C01F4B78] schedule+0x324/0x6bc
[  497.069734] [C3FE1F40] [C002F67C] worker_thread+0x144/0x148
[  497.075316] [C3FE1FC0] [C0033E48] kthread+0xf4/0x130
[  497.080291] [C3FE1FF0] [C001093C] kernel_thread+0x44/0x60
[  497.085697] kthread   S  0 5  137 617 4 (L-TLB)
[  497.093378] Call Trace:
[  497.095820] [C3FCBE20] [1032] 0x1032 (unreliable)
[  497.100881] --- Exception: c3fcbef0 at __switch_to+0x28/0x40
[  497.106545] LR = 0xc3fcbef0
[  497.109681] [C3FCBEE0] [C00083AC] __switch_to+0x28/0x40 (unreliable)
[  497.116051] [C3FCBF00] [C01F4B78] schedule+0x324/0x6bc
[  497.121201] [C3FCBF40] [C002F67C] worker_thread+0x144/0x148
[  497.126783] [C3FCBFC0] [C0033E48] kthread+0xf4/0x130
[  497.131758] [C3FCBFF0] [C001093C] kernel_thread+0x44/0x60
[  497.137165] kblockd/0 S  037  5 41   (L-TLB)
[  497.144845] Call Trace:
[  497.147286] [C3D9FE20] [C3EBF490] 0xc3ebf490 (unreliable)
[  497.152697] [C3D9FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.157933] [C3D9FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.163082] [C3D9FF40] [C002F67C] worker_thread+0x144/0x148
[  497.168663] [C3D9FFC0] [C0033E48] kthread+0xf4/0x130
[  497.173637] [C3D9FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.179045] khubd S  041  5 5337 (L-TLB)
[  497.186726] Call Trace:
[  497.189167] [C0341E00] [C3F03900] 0xc3f03900 (unreliable)
[  497.194578] [C0341EC0] [C00083AC] __switch_to+0x28/0x40
[  497.199813] 

Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread OGAWA Hirofumi
Kumar Gala [EMAIL PROTECTED] writes:

 I usually run the following twice to get the hang state:

 time ./trunc_test bar 1 &
 time ./trunc_test baz 1 &

 I was wondering if anyone had any suggestions on what to poke at next
 to try and figure out what is going on.

 Can you check /sys/block/xxx/stat or something to make sure there is
 no outstanding IO request?

 There seems to be no response from the lower layer...

 Once the system locks up I don't have any ability to do anything.

Ah, doesn't sysrq work either? If sysrq works, it can be used to see the
IO request state with a patch.
-- 
OGAWA Hirofumi [EMAIL PROTECTED]


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Kumar Gala


On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:


Kumar Gala [EMAIL PROTECTED] writes:


I usually run the following twice to get the hang state:

time ./trunc_test bar 1 &
time ./trunc_test baz 1 &

I was wondering if anyone had any suggestions on what to poke at next
to try and figure out what is going on.


Can you check /sys/block/xxx/stat or something to make sure there is
no outstanding IO request?

There seems to be no response from the lower layer...


Once the system locks up I don't have any ability to do anything.


Ah, doesn't sysrq work either? If sysrq works, it can be used to see the
IO request state with a patch.


Yeah, got sysrq working today.  If you can point me at the patch, I'm
happy to apply it and get data.


- k



Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-19 Thread Robert Hancock

Kumar Gala wrote:
[  497.499249] usb-storage   D  0   671  5 773   670 (L-TLB)
[  497.506930] Call Trace:
[  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
[  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
[  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
[  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
[  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
[  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
[  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
[  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
[  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
[  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
[  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
[  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
[  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
[  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
[  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
[  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
[  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
[  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60


This looks like a problem: the usb-storage thread is trying to allocate
memory and ends up waiting for VM writeout, which obviously won't
proceed, since this thread is the one that has to perform it. It
looks like the allocation in usb_stor_bulk_transfer_sglist is done with
GFP_NOIO, so I wonder why we're getting into this state?


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-18 Thread OGAWA Hirofumi
Kumar Gala <[EMAIL PROTECTED]> writes:

> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded  
> PPC.  I've got a usb flash drive plugged in and the filesystem on the  
> drive is vfat.  Running with 64M and no swap.
>
> If I execute a series of large (100M+) ftruncate() on the disk the  
> kernel will hang and never return.  It seems to be stuck in the idle  
> loop().
>
> The following is the test program I'm running:
>
> #include <sys/mman.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <errno.h>
>
> void usage (void)
> {
>  printf ("truncate_test <filename> <size>\n\n");
> }
>
> int main(int argc, char *argv[])
> {
>  int fd, i;
>  int ret = 0;
>  unsigned int len;
>
>  if (argc != 3) {
>  printf("Invalid number of arguments\n\n");
>  usage();
>  exit(1);
>  }
>
>  fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
>  len = strtoul(argv[2], NULL, 0);
>
>  ret = ftruncate(fd, len);
>
>  if (ret)
>  printf ("ftruncate ret = %d %d\n", ret, errno);
>
>  close(fd);
>
>  return ret;
> }
>
> I usually run the following twice to get the hang state:
>
> time ./trunc_test bar 1 &
> time ./trunc_test baz 1 &
>
> I was wondering if anyone had any suggestions on what to poke at next  
> to try and figure out what is going on.

Can you check /sys/block/xxx/stat or something to make sure there is
no outstanding IO request?

There seems to be no response from the lower layer...
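
For reference, a minimal sketch of that check. It assumes the usual
/sys/block/<dev>/stat layout, where the ninth whitespace-separated
field is the number of requests currently in flight. The helper names
parse_in_flight() and read_in_flight() are made up for illustration
and are not part of any kernel or library interface.

```c
#include <stdio.h>

/* Parse one line in /sys/block/<dev>/stat format and return the ninth
 * field (I/Os currently in flight), or -1 if the line is malformed.
 * Assumption: the ninth field is the in-flight count. */
long parse_in_flight(const char *line)
{
    unsigned long long f[9];
    int n = sscanf(line, "%llu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &f[0], &f[1], &f[2], &f[3], &f[4],
                   &f[5], &f[6], &f[7], &f[8]);

    return n == 9 ? (long)f[8] : -1;
}

/* Read /sys/block/<dev>/stat for the given device name.
 * Returns the in-flight count, or -1 if the file cannot be read. */
long read_in_flight(const char *dev)
{
    char path[256], line[512];
    long result = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/sys/block/%s/stat", dev);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    if (fgets(line, sizeof(line), fp))
        result = parse_in_flight(line);
    fclose(fp);
    return result;
}
```

Calling read_in_flight() with the name of the USB disk (whatever it is
on the test board) and getting a value that stays above zero would
suggest a request is stuck below the block layer.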
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-16 Thread Kumar Gala


On Feb 16, 2007, at 5:10 PM, Robert Hancock wrote:


Kumar Gala wrote:
I'm seeing an issue with a stock 2.6.20 kernel running on an  
embedded PPC.  I've got a usb flash drive plugged in and the  
filesystem on the drive is vfat.  Running with 64M and no swap.
If I execute a series of large (100M+) ftruncate() on the disk the  
kernel will hang and never return.  It seems to be stuck in the  
idle loop().


On FAT filesystems this forces the entire file contents of that  
size to be written out with zeros. Are you sure the kernel just  
isn't busy writing out all that data to the disk?


I'm pretty sure: when the test succeeds, it takes maybe 20-30
seconds to create the file.  However, I've waited 10 minutes and
still get no prompt.


I'm also able to break in with a HW debugger and am always in the  
idle loop.


- k


Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate

2007-02-16 Thread Robert Hancock

Kumar Gala wrote:
I'm seeing an issue with a stock 2.6.20 kernel running on an embedded 
PPC.  I've got a usb flash drive plugged in and the filesystem on the 
drive is vfat.  Running with 64M and no swap.


If I execute a series of large (100M+) ftruncate() on the disk the 
kernel will hang and never return.  It seems to be stuck in the idle 
loop().


On FAT filesystems this forces the entire file contents of that size to 
be written out with zeros. Are you sure the kernel just isn't busy 
writing out all that data to the disk?
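
One rough way to observe this from user space is to compare the file
size with the space actually allocated after the ftruncate(). This is
only a sketch: it assumes st_blocks is counted in 512-byte units, as on
Linux, and extend_and_measure() is a made-up helper name. On a
filesystem with sparse-file support the allocated figure should stay
near zero, while on FAT it should end up close to the file size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) path, extend it to len bytes with ftruncate(),
 * and report the resulting size and allocated bytes.
 * Returns 0 on success, -1 on any failure. */
int extend_and_measure(const char *path, off_t len,
                       off_t *size_out, long long *alloc_out)
{
    struct stat st;
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, len) != 0 || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);

    *size_out = st.st_size;
    /* Assumption: st_blocks is in 512-byte units (true on Linux). */
    *alloc_out = (long long)st.st_blocks * 512;
    return 0;
}
```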


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


