Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Linus Torvalds wrote:
> On Fri, 22 Sep 2000, Molnar Ingo wrote:
> >
> > i'm still getting VM related lockups during heavy write load, in
> > test9-pre5 + your 2.4.0-t9p2-vmpatch (which i understand as being your
> > last VM related fix-patch, correct?). Here is a histogram of such a
> > lockup:
>
> those VM patches are going away RSN if these issues do not get
> fixed. I'm really disappointed, and suspect that it would be
> easier to go back to the old VM with just page aging added, not
> your new code that seems to be full of deadlocks everywhere.

I've been away on a conference last week, so I haven't had much
chance to take a look at the code after you integrated it and
the test base got increased ;(

One thing I discovered are some UP-only deadlocks and the page
ping-pong thing, which I am fixing right now.

If I had a choice, I'd have chosen /next/ week as the time to
integrate the code ... doing this while I'm away at a conference
was really inconvenient ;)

I'm looking into the email backlog and the bug reports right
now (today, tuesday and wednesday I'm at /another/ conference
and thursday will be the next opportunity). It looks like there
are no fundamental issues left, just a bunch of small thinkos
that can be fixed in a (few?) week(s).

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/		http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Molnar Ingo wrote:
>
> i'm still getting VM related lockups during heavy write load, in
> test9-pre5 + your 2.4.0-t9p2-vmpatch (which i understand as being your
> last VM related fix-patch, correct?). Here is a histogram of such a
> lockup:

Rik,
 those VM patches are going away RSN if these issues do not get fixed. I'm
really disappointed, and suspect that it would be easier to go back to the
old VM with just page aging added, not your new code that seems to be full
of deadlocks everywhere.

		Linus
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
If the process that barfed is swapper then this is the oops that I got in
test9-pre4 w/o any patches.

http://marc.theaimsgroup.com/?l=linux-kernel&m=96936789621245&w=2

On Fri, 22 Sep 2000, André Dahlqvist wrote:

> On Fri, Sep 22, 2000 at 07:27:30AM -0300, Rik van Riel wrote:
>
> > Linus,
> >
> > could you please include this patch in the next
> > pre patch?
>
> Rik,
>
> I just had an oops with this patch applied. I ran into BUG at
> buffer.c:730. The machine was not under load when the oops occurred, I
> was just reading e-mail in Mutt. I had to type the oops down by hand,
> but I will provide ksymoops output soon if you need it.

--
Mohammad A. Haque                       http://www.haque.net/
                                        [EMAIL PROTECTED]

  "Alcohol and calculus don't mix.      Project Lead
   Don't drink and derive." --Unknown   http://wm.themes.org/
                                        [EMAIL PROTECTED]
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
> I had to type the oops down by hand, but I will provide ksymoops
> output soon if you need it.

Let's hope I typed down the oops from the screen without mistakes. Here
is the ksymoops output:

ksymoops 2.3.4 on i586 2.4.0-test9. Options used
     -V (default)
     -k 2922143001.ksyms (specified)
     -l 2922143001.modules (specified)
     -o /lib/modules/2.4.0-test9/ (default)
     -m /boot/System.map-2.4.0-test9 (default)

invalid operand:
CPU:    0
EIP:    0010:[<c012c1be>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: 001c   ebx: c31779e0   ecx:    edx: 0082
esi: c11f6f80   edi: 0008   ebp: 0001   esp: c01f3eec
ds: 0018   es: 0018   ss: 0018
Process swapper (pid:0, stackpage=c01f3000)
Stack: c01bb465 c01bb79a 02da c0150d3f e31779e0 0001 c11f6480 0046
       c1168360 c0248460 c01684e3 c11f6f80 0001 c0248584 c11f6f80 c02484a0
       c016e563 0001 c1168360 c02484a0 c1168360 0286 c0169cc7
Call Trace: [<c01bb4b5>] [<c01bb79a>] [<c0150d3f>] [<c01684e3>] [<c016e563>] [<c0169cc7>]
       [<c016e500>] [<c010a02c>] [<c010a18e>] [<c0107120>] [<c0108de0>] [<c0107120>]
       [<c0107143>] [<c01071a7>] [<c0105000>] [<c0100192>]
Code: 0f 0b 83 c4 0c c3 57 56 53 86 74 24 10 8b 54 24 14 85 d2 74

>>EIP; c012c1be <end_buffer_io_bad+42/48>   <=====
Trace; c01bb4b5 <tvecs+36dd/cde8>
Trace; c01bb79a <tvecs+39c2/cde8>
Trace; c0150d3f <end_that_request_first+5f/b8>
Trace; c01684e3 <ide_end_request+27/74>
Trace; c016e563 <ide_dma_intr+63/9c>
Trace; c0169cc7 <ide_intr+fb/150>
Trace; c016e500 <ide_dma_intr+0/9c>
Trace; c010a02c <handle_IRQ_event+30/5c>
Trace; c010a18e <do_IRQ+6e/b0>
Trace; c0107120 <default_idle+0/28>
Trace; c0108de0 <ret_from_intr+0/20>
Trace; c0107120 <default_idle+0/28>
Trace; c0107143 <default_idle+23/28>
Trace; c01071a7 <cpu_idle+3f/54>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100192 <L6+0/2>
Code;  c012c1be <end_buffer_io_bad+42/48>
<_EIP>:
Code;  c012c1be <end_buffer_io_bad+42/48>   <=====
   0:   0f 0b          ud2a      <=====
Code;  c012c1c0 <end_buffer_io_bad+44/48>
   2:   83 c4 0c       add    $0xc,%esp
Code;  c012c1c3 <end_buffer_io_bad+47/48>
   5:   c3             ret
Code;  c012c1c4 <end_buffer_io_async+0/b4>
   6:   57             push   %edi
Code;  c012c1c5
   7:   56             push   %esi
Code;  c012c1c6
   8:   53             push   %ebx
Code;  c012c1c7
   9:   86 74 24 10    xchg   %dh,0x10(%esp,1)
Code;  c012c1cb
   d:   8b 54 24 14    mov    0x14(%esp,1),%edx
Code;  c012c1cf
  11:   85 d2          test   %edx,%edx
Code;  c012c1d1
  13:   74 00          je     15 <_EIP+0x15> c012c1d3

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
--

// André
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, Sep 22, 2000 at 07:27:30AM -0300, Rik van Riel wrote:

> Linus,
>
> could you please include this patch in the next
> pre patch?

Rik,

I just had an oops with this patch applied. I ran into BUG at
buffer.c:730. The machine was not under load when the oops occurred, I
was just reading e-mail in Mutt. I had to type the oops down by hand,
but I will provide ksymoops output soon if you need it.
--

// André
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Molnar Ingo wrote:

> yep this has done the trick, the deadlock is gone. I've attached the full
> VM-fixes patch (this fix included) against vanilla test9-pre5.

Linus,

could you please include this patch in the next
pre patch?

(in the meantime, I'll go back to looking at the
balancing thing with shared memory ... which is
unrelated to this deadlock problem)

thanks,

Rik

--- linux/fs/buffer.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/fs/buffer.c	Fri Sep 22 02:31:13 2000
@@ -706,9 +706,7 @@
 static void refill_freelist(int size)
 {
 	if (!grow_buffers(size)) {
-		balance_dirty(NODEV);
-		wakeup_kswapd(0); /* We can't wait because of __GFP_IO */
-		schedule();
+		try_to_free_pages(GFP_BUFFER);
 	}
 }
--- linux/mm/filemap.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/filemap.c	Fri Sep 22 02:31:13 2000
@@ -255,7 +255,7 @@
 	 * up kswapd.
 	 */
 	age_page_up(page);
-	if (inactive_shortage() > (inactive_target * 3) / 4)
+	if (inactive_shortage() > inactive_target / 2 && free_shortage())
 		wakeup_kswapd(0);
 not_found:
 	return page;
--- linux/mm/page_alloc.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/page_alloc.c	Fri Sep 22 02:31:13 2000
@@ -444,7 +444,8 @@
 		 * processes, etc).
 		 */
 		if (gfp_mask & __GFP_WAIT) {
-			wakeup_kswapd(1);
+			try_to_free_pages(gfp_mask);
+			memory_pressure++;
 			goto try_again;
 		}
 	}
--- linux/mm/swap.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/swap.c	Fri Sep 22 02:31:13 2000
@@ -233,27 +233,11 @@
 	spin_lock(&pagemap_lru_lock);
 	if (!PageLocked(page))
 		BUG();
-	/*
-	 * Heisenbug Compensator(tm)
-	 * This bug shouldn't trigger, but for unknown reasons it
-	 * sometimes does. If there are no signs of list corruption,
-	 * we ignore the problem. Else we BUG()...
-	 */
-	if (PageActive(page) || PageInactiveDirty(page) ||
-					PageInactiveClean(page)) {
-		struct list_head * page_lru = &page->lru;
-		if (page_lru->next->prev != page_lru) {
-			printk("VM: lru_cache_add, bit or list corruption..\n");
-			BUG();
-		}
-		printk("VM: lru_cache_add, page already in list!\n");
-		goto page_already_on_list;
-	}
+	DEBUG_ADD_PAGE
 	add_page_to_active_list(page);
 	/* This should be relatively rare */
 	if (!page->age)
 		deactivate_page_nolock(page);
-page_already_on_list:
 	spin_unlock(&pagemap_lru_lock);
 }
--- linux/mm/vmscan.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/vmscan.c	Fri Sep 22 02:31:27 2000
@@ -377,7 +377,7 @@
 #define SWAP_SHIFT 5
 #define SWAP_MIN 8
-static int swap_out(unsigned int priority, int gfp_mask)
+static int swap_out(unsigned int priority, int gfp_mask, unsigned long idle_time)
 {
 	struct task_struct * p;
 	int counter;
@@ -407,6 +407,7 @@
 		struct mm_struct *best = NULL;
 		int pid = 0;
 		int assign = 0;
+		int found_task = 0;
 	select:
 		read_lock(&tasklist_lock);
 		p = init_task.next_task;
@@ -416,6 +417,11 @@
 				continue;
 			if (mm->rss <= 0)
 				continue;
+			/* Skip tasks which haven't slept long enough yet when idle-swapping. */
+			if (idle_time && !assign && (!(p->state & TASK_INTERRUPTIBLE) ||
+					time_before(p->sleep_time + idle_time * HZ, jiffies)))
+				continue;
+			found_task++;
 			/* Refresh swap_cnt? */
 			if (assign == 1) {
 				mm->swap_cnt = (mm->rss >> SWAP_SHIFT);
@@ -430,7 +436,7 @@
 		}
 		read_unlock(&tasklist_lock);
 		if (!best) {
-			if (!assign) {
+			if (!assign && found_task > 0) {
 				assign = 1;
 				goto select;
 			}
@@ -691,9 +697,9 @@
 			 * Now the page is really freeable, so we
 			 * move it to the inactive_clean list.
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
yep this has done the trick, the deadlock is gone. I've attached the full
VM-fixes patch (this fix included) against vanilla test9-pre5.

	Ingo

--- linux/fs/buffer.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/fs/buffer.c	Fri Sep 22 02:31:13 2000
@@ -706,9 +706,7 @@
 static void refill_freelist(int size)
 {
 	if (!grow_buffers(size)) {
-		balance_dirty(NODEV);
-		wakeup_kswapd(0); /* We can't wait because of __GFP_IO */
-		schedule();
+		try_to_free_pages(GFP_BUFFER);
 	}
 }
--- linux/mm/filemap.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/filemap.c	Fri Sep 22 02:31:13 2000
@@ -255,7 +255,7 @@
 	 * up kswapd.
 	 */
 	age_page_up(page);
-	if (inactive_shortage() > (inactive_target * 3) / 4)
+	if (inactive_shortage() > inactive_target / 2 && free_shortage())
 		wakeup_kswapd(0);
 not_found:
 	return page;
--- linux/mm/page_alloc.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/page_alloc.c	Fri Sep 22 02:31:13 2000
@@ -444,7 +444,8 @@
 		 * processes, etc).
 		 */
 		if (gfp_mask & __GFP_WAIT) {
-			wakeup_kswapd(1);
+			try_to_free_pages(gfp_mask);
+			memory_pressure++;
 			goto try_again;
 		}
 	}
--- linux/mm/swap.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/swap.c	Fri Sep 22 02:31:13 2000
@@ -233,27 +233,11 @@
 	spin_lock(&pagemap_lru_lock);
 	if (!PageLocked(page))
 		BUG();
-	/*
-	 * Heisenbug Compensator(tm)
-	 * This bug shouldn't trigger, but for unknown reasons it
-	 * sometimes does. If there are no signs of list corruption,
-	 * we ignore the problem. Else we BUG()...
-	 */
-	if (PageActive(page) || PageInactiveDirty(page) ||
-					PageInactiveClean(page)) {
-		struct list_head * page_lru = &page->lru;
-		if (page_lru->next->prev != page_lru) {
-			printk("VM: lru_cache_add, bit or list corruption..\n");
-			BUG();
-		}
-		printk("VM: lru_cache_add, page already in list!\n");
-		goto page_already_on_list;
-	}
+	DEBUG_ADD_PAGE
 	add_page_to_active_list(page);
 	/* This should be relatively rare */
 	if (!page->age)
 		deactivate_page_nolock(page);
-page_already_on_list:
 	spin_unlock(&pagemap_lru_lock);
 }
--- linux/mm/vmscan.c.orig	Fri Sep 22 02:31:07 2000
+++ linux/mm/vmscan.c	Fri Sep 22 02:31:27 2000
@@ -377,7 +377,7 @@
 #define SWAP_SHIFT 5
 #define SWAP_MIN 8
-static int swap_out(unsigned int priority, int gfp_mask)
+static int swap_out(unsigned int priority, int gfp_mask, unsigned long idle_time)
 {
 	struct task_struct * p;
 	int counter;
@@ -407,6 +407,7 @@
 		struct mm_struct *best = NULL;
 		int pid = 0;
 		int assign = 0;
+		int found_task = 0;
 	select:
 		read_lock(&tasklist_lock);
 		p = init_task.next_task;
@@ -416,6 +417,11 @@
 				continue;
 			if (mm->rss <= 0)
 				continue;
+			/* Skip tasks which haven't slept long enough yet when idle-swapping. */
+			if (idle_time && !assign && (!(p->state & TASK_INTERRUPTIBLE) ||
+					time_before(p->sleep_time + idle_time * HZ, jiffies)))
+				continue;
+			found_task++;
 			/* Refresh swap_cnt? */
 			if (assign == 1) {
 				mm->swap_cnt = (mm->rss >> SWAP_SHIFT);
@@ -430,7 +436,7 @@
 		}
 		read_unlock(&tasklist_lock);
 		if (!best) {
-			if (!assign) {
+			if (!assign && found_task > 0) {
 				assign = 1;
 				goto select;
 			}
@@ -691,9 +697,9 @@
 			 * Now the page is really freeable, so we
 			 * move it to the inactive_clean list.
 			 */
-			UnlockPage(page);
 			del_page_from_inactive_dirty_list(page);
 			add_page_to_inactive_clean_list(page);
+			UnlockPage(page);
 			cleaned_pages++;
 		} else {
 			/*
@@ -701,9 +707,9 @@
 			 * It's no use keeping it here, so we move it to
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Rik van Riel wrote:

> 894             if (current->need_resched && !(gfp_mask & __GFP_IO)) {
> 895                     __set_current_state(TASK_RUNNING);
> 896                     schedule();
> 897             }

> The idea was to not allow processes which have IO locks
> to schedule away, but as you can see, the check is
> reversed ...

thanks ... sounds good. Will have this tested in about 15 mins.

	Ingo
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Molnar Ingo wrote:

> i'm still getting VM related lockups during heavy write load, in
> test9-pre5 + your 2.4.0-t9p2-vmpatch (which i understand as being your
> last VM related fix-patch, correct?). Here is a histogram of such a
> lockup:

> this lockup happens both during vanilla test9-pre5 and with
> 2.4.0-t9p2-vmpatch. Your patch makes the lockup happen a bit
> later than previous, but it still happens. During the lockup all
> dirty buffers are written out to disk until it reaches such a
> state:

It seems that conference life has taken its toll, I seem to have
reversed the logic in the test if we can reschedule in
refill_inactive() ;(

On mm/vmscan.c, please remove the `!' in the following fragment
of code:

894             if (current->need_resched && !(gfp_mask & __GFP_IO)) {
895                     __set_current_state(TASK_RUNNING);
896                     schedule();
897             }

The idea was to not allow processes which have IO locks
to schedule away, but as you can see, the check is
reversed ...

With the above fix, can you still lock it up?
And if you can, does it lock up in the same way or in a
new and exciting way? ;)

regards,

Rik
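
To make the reversal concrete, here is a minimal user-space sketch of the two predicates. It is not kernel code: `SKETCH_GFP_IO` is a placeholder bit standing in for `__GFP_IO` (its value is arbitrary), and both function names are made up for illustration. The only point is that removing the `!' flips which callers are allowed to reach schedule().

```c
#include <stdbool.h>

/* Placeholder standing in for the kernel's __GFP_IO; the value is illustrative only. */
#define SKETCH_GFP_IO 0x04u

/* The check as quoted from mm/vmscan.c: yields the CPU only when __GFP_IO is NOT set. */
static bool resched_as_posted(bool need_resched, unsigned int gfp_mask)
{
	return need_resched && !(gfp_mask & SKETCH_GFP_IO);
}

/* The same check with the `!' removed: yields the CPU only when __GFP_IO is set. */
static bool resched_as_fixed(bool need_resched, unsigned int gfp_mask)
{
	return need_resched && (gfp_mask & SKETCH_GFP_IO);
}
```

With need_resched set, a caller passing a mask without the IO bit is rescheduled by the posted version and left alone by the fixed one; a caller passing the IO bit sees the opposite. Assuming that allocations made while holding IO locks are the ones issued without `__GFP_IO`, this matches the intent Rik describes.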
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
btw. - no swapdevice here.

	Ingo
test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
i'm still getting VM related lockups during heavy write load, in
test9-pre5 + your 2.4.0-t9p2-vmpatch (which i understand as being your
last VM related fix-patch, correct?). Here is a histogram of such a
lockup:

      1 Trace; 4010a720 <__switch_to+38/e8>
      5 Trace; 4010a74b <__switch_to+63/e8>
     13 Trace; 4010abc4 <poll_idle+10/2c>
    819 Trace; 4010abca <poll_idle+16/2c>
   1806 Trace; 4010abce <poll_idle+1a/2c>
      1 Trace; 4010abd0 <poll_idle+1c/2c>
      2 Trace; 4011af51 <schedule+45/884>
      1 Trace; 4011af77 <schedule+6b/884>
      1 Trace; 4011b010 <schedule+104/884>
      3 Trace; 4011b018 <schedule+10c/884>
      1 Trace; 4011b02d <schedule+121/884>
      1 Trace; 4011b051 <schedule+145/884>
      1 Trace; 4011b056 <schedule+14a/884>
      2 Trace; 4011b05c <schedule+150/884>
      3 Trace; 4011b06d <schedule+161/884>
      4 Trace; 4011b076 <schedule+16a/884>
    537 Trace; 4011b2bb <schedule+3af/884>
      2 Trace; 4011b2c6 <schedule+3ba/884>
      1 Trace; 4011b2c9 <schedule+3bd/884>
      4 Trace; 4011b2d5 <schedule+3c9/884>
     31 Trace; 4011b31a <schedule+40e/884>
      1 Trace; 4011b31d <schedule+411/884>
      1 Trace; 4011b32a <schedule+41e/884>
      1 Trace; 4011b346 <schedule+43a/884>
     11 Trace; 4011b378 <schedule+46c/884>
      2 Trace; 4011b381 <schedule+475/884>
      5 Trace; 4011b3f8 <schedule+4ec/884>
     17 Trace; 4011b404 <schedule+4f8/884>
      9 Trace; 4011b43f <schedule+533/884>
      1 Trace; 4011b450 <schedule+544/884>
      1 Trace; 4011b457 <schedule+54b/884>
      2 Trace; 4011b48c <schedule+580/884>
      1 Trace; 4011b49c <schedule+590/884>
    428 Trace; 4011b4cd <schedule+5c1/884>
      6 Trace; 4011b4f7 <schedule+5eb/884>
      4 Trace; 4011b500 <schedule+5f4/884>
      2 Trace; 4011b509 <schedule+5fd/884>
      1 Trace; 4011b560 <schedule+654/884>
      1 Trace; 4011b809 <__wake_up+79/3f0>
      1 Trace; 4011b81b <__wake_up+8b/3f0>
      8 Trace; 4011b81e <__wake_up+8e/3f0>
    310 Trace; 4011ba90 <__wake_up+300/3f0>
      1 Trace; 4011bb7b <__wake_up+3eb/3f0>
      2 Trace; 4011c32b <interruptible_sleep_on_timeout+283/290>
    244 Trace; 4011d40e <add_wait_queue+14e/154>
      1 Trace; 4011d411 <add_wait_queue+151/154>
      1 Trace; 4011d56c <remove_wait_queue+8/d0>
    618 Trace; 4011d62e <remove_wait_queue+ca/d0>
      2 Trace; 40122f28 <do_softirq+48/88>
      2 Trace; 40126c3c <del_timer_sync+6c/78>
      1 Trace; 401377ab <wakeup_kswapd+7/254>
      1 Trace; 401377c8 <wakeup_kswapd+24/254>
      5 Trace; 401377cc <wakeup_kswapd+28/254>
     15 Trace; 401377d4 <wakeup_kswapd+30/254>
     11 Trace; 401377dc <wakeup_kswapd+38/254>
      2 Trace; 401377e0 <wakeup_kswapd+3c/254>
      6 Trace; 401377ee <wakeup_kswapd+4a/254>
      8 Trace; 4013783c <wakeup_kswapd+98/254>
      1 Trace; 401378f8 <wakeup_kswapd+154/254>
      3 Trace; 4013792d <wakeup_kswapd+189/254>
      2 Trace; 401379af <wakeup_kswapd+20b/254>
      2 Trace; 401379f3 <wakeup_kswapd+24f/254>
      1 Trace; 40138524 <__alloc_pages+7c/4b8>
      1 Trace; 4013852b <__alloc_pages+83/4b8>

(first column is number of profiling hits, profiling hits taken on all
CPUs.)

unfortunately i haven't captured which processes are running. This is an
8-CPU SMP box, 8 write-intensive processes are running, they create new
1k-1MB files in new directories - a total of many gigabytes.

this lockup happens both during vanilla test9-pre5 and with
2.4.0-t9p2-vmpatch. Your patch makes the lockup happen a bit later than
previous, but it still happens. During the lockup all dirty buffers are
written out to disk until it reaches such a state:

2162688 pages of RAM
1343488 pages of HIGHMEM
116116 reserved pages
652826 pages shared
0 pages swap cached
0 pages in page table cache
Buffer memory:    52592kB
    CLEAN: 664 buffers, 2302 kbyte, 5 used (last=93), 0 locked, 0 protected, 0 dirty
   LOCKED: 661752 buffers, 2646711 kbyte, 37 used (last=661397), 0 locked, 0 protected, 0 dirty
    DIRTY: 17 buffers, 26 kbyte, 1 used (last=1), 0 locked, 0 protected, 17 dirty

no disk IO happens anymore, but the lockup persists. The histogram was
taken after all disk IO has stopped.

	Ingo
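
Because the histogram above lists one row per sampled address, it can help to add up the hits per function when reading it. The following is a small user-space sketch of that bookkeeping, assuming each row has the shape "hits Trace; address symbol+offset/size", with or without angle brackets around the symbol. The struct, the table limits and the helper name `add_histogram_line` are invented for this example and are not part of any profiling tool.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMS 64	/* arbitrary table size for the sketch */
#define SYM_LEN  64	/* arbitrary symbol name limit for the sketch */

struct sym_hits {
	char name[SYM_LEN];
	unsigned long hits;
};

/*
 * Parse one histogram row and add its hit count to the running total
 * for its symbol. Returns 0 on success, -1 if the row does not match
 * the expected shape or the table is full.
 */
static int add_histogram_line(struct sym_hits *table, size_t *count, const char *line)
{
	unsigned long hits;
	char sym[SYM_LEN];
	const char *name;
	size_t i;

	/* Skip the address (%*x) and read the symbol up to the '+' before the offset. */
	if (sscanf(line, "%lu Trace; %*x %63[^+]", &hits, sym) != 2)
		return -1;

	/* Tolerate the ksymoops-style leading '<'. */
	name = (sym[0] == '<') ? sym + 1 : sym;

	for (i = 0; i < *count; i++) {
		if (strcmp(table[i].name, name) == 0) {
			table[i].hits += hits;
			return 0;
		}
	}
	if (*count == MAX_SYMS)
		return -1;

	strcpy(table[*count].name, name);	/* fits: name is shorter than SYM_LEN */
	table[*count].hits = hits;
	(*count)++;
	return 0;
}
```

Feeding every row through this helper and printing the table gives one total per function, which makes it easier to see how the samples split between the idle loop, the scheduler, and the wait-queue helpers.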
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
> I had to type the oops down by hand, but I will provide ksymoops output soon if you need it.

Let's hope I typed down the oops from the screen without mistakes. Here is the ksymoops output:

ksymoops 2.3.4 on i586 2.4.0-test9. Options used
 -V (default)
 -k 2922143001.ksyms (specified)
 -l 2922143001.modules (specified)
 -o /lib/modules/2.4.0-test9/ (default)
 -m /boot/System.map-2.4.0-test9 (default)

invalid operand:
CPU:0
EIP:0010:[c012c1be]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: 001c ebx: c31779e0 ecx: edx: 0082
esi: c11f6f80 edi: 0008 ebp: 0001 esp: c01f3eec
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:0, stackpage=c01f3000)
Stack: c01bb465 c01bb79a 02da c0150d3f e31779e0 0001 c11f6480 0046 c1168360 c0248460 c01684e3 c11f6f80 0001 c0248584 c11f6f80 c02484a0 c016e563 0001 c1168360 c02484a0 c1168360 0286 c0169cc7
Call Trace: [c01bb4b5] [c01bb79a] [c0150d3f] [c01684e3] [c016e563] [c0169cc7] [c016e500] [c010a02c] [c010a18e] [c0107120] [c0108de0] [c0107120] [c0107143] [c01071a7] [c0105000] [c0100192]
Code: 0f 0b 83 c4 0c c3 57 56 53 86 74 24 10 8b 54 24 14 85 d2 74

EIP; c012c1be end_buffer_io_bad+42/48 =
Trace; c01bb4b5 tvecs+36dd/cde8
Trace; c01bb79a tvecs+39c2/cde8
Trace; c0150d3f end_that_request_first+5f/b8
Trace; c01684e3 ide_end_request+27/74
Trace; c016e563 ide_dma_intr+63/9c
Trace; c0169cc7 ide_intr+fb/150
Trace; c016e500 ide_dma_intr+0/9c
Trace; c010a02c handle_IRQ_event+30/5c
Trace; c010a18e do_IRQ+6e/b0
Trace; c0107120 default_idle+0/28
Trace; c0108de0 ret_from_intr+0/20
Trace; c0107120 default_idle+0/28
Trace; c0107143 default_idle+23/28
Trace; c01071a7 cpu_idle+3f/54
Trace; c0105000 empty_bad_page+0/1000
Trace; c0100192 L6+0/2
Code; c012c1be end_buffer_io_bad+42/48
_EIP:
Code; c012c1be end_buffer_io_bad+42/48 =
   0: 0f 0b          ud2a =
Code; c012c1c0 end_buffer_io_bad+44/48
   2: 83 c4 0c       add $0xc,%esp
Code; c012c1c3 end_buffer_io_bad+47/48
   5: c3             ret
Code; c012c1c4 end_buffer_io_async+0/b4
   6: 57             push %edi
Code; c012c1c5 end_buffer_io_async+1/b4
   7: 56             push %esi
Code; c012c1c6 end_buffer_io_async+2/b4
   8: 53             push %ebx
Code; c012c1c7 end_buffer_io_async+3/b4
   9: 86 74 24 10    xchg %dh,0x10(%esp,1)
Code; c012c1cb end_buffer_io_async+7/b4
   d: 8b 54 24 14    mov 0x14(%esp,1),%edx
Code; c012c1cf end_buffer_io_async+b/b4
  11: 85 d2          test %edx,%edx
Code; c012c1d1 end_buffer_io_async+d/b4
  13: 74 00          je 15 _EIP+0x15 c012c1d3 end_buffer_io_async+f/b4

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!

--
// André
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, Sep 22, 2000 at 07:27:30AM -0300, Rik van Riel wrote:
> Linus, could you please include this patch in the next pre patch?

Rik,

I just had an oops with this patch applied. I ran into BUG at buffer.c:730. The machine was not under load when the oops occurred, I was just reading e-mail in Mutt.

I had to type the oops down by hand, but I will provide ksymoops output soon if you need it.

--
// André
Re: test9-pre5+t9p2-vmpatch VM deadlock during write-intensive workload
On Fri, 22 Sep 2000, Molnar Ingo wrote:
> i'm still getting VM related lockups during heavy write load, in
> test9-pre5 + your 2.4.0-t9p2-vmpatch (which i understand as being your
> last VM related fix-patch, correct?). Here is a histogram of such a
> lockup:

Rik,

those VM patches are going away RSN if these issues do not get fixed. I'm really disappointed, and suspect that it would be easier to go back to the old VM with just page aging added, not your new code that seems to be full of deadlocks everywhere.

	Linus
Re: [PATCH] Fix sg in 2.4.0-test9-pre5 when builtin
Douglas Gilbert wrote:
> @@ -1298,18 +1302,20 @@
>  }
>
>  #ifdef MODULE
> -
>  MODULE_PARM(def_reserved_size, "i");
>  MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
> +#endif /* MODULE */

MODULE_xxx typically doesn't need to be surrounded by ifdef MODULE.

Also note that proc_fs.h provides no-op inline replacements for the !CONFIG_PROC_FS case. If you want to be really space conscious, you still need the ifdef's in the code for the most part, but the no-ops sometimes eliminate ifdefs if you look carefully.

	Jeff
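The no-op replacement idea mentioned above can be sketched in plain user-space C. This is only an illustration of the pattern, not the proc_fs.h interface: DEMO_HAVE_PROCFS, demo_proc_register() and demo_driver_init() are made-up names standing in for CONFIG_PROC_FS and the real registration calls.

```c
#include <stdio.h>

/* Hypothetical feature switch standing in for CONFIG_PROC_FS.
 * Leave it undefined to build the "feature compiled out" variant. */
/* #define DEMO_HAVE_PROCFS 1 */

#ifdef DEMO_HAVE_PROCFS
/* Feature present: pretend to register an entry and report one created. */
static inline int demo_proc_register(const char *name)
{
	printf("registered entry %s\n", name);
	return 1;
}
#else
/* Feature absent: a no-op stub with the same signature, so callers
 * compile unchanged and the compiler can drop the call entirely. */
static inline int demo_proc_register(const char *name)
{
	(void)name;
	return 0;
}
#endif

/* The caller carries no #ifdef of its own; it builds either way. */
int demo_driver_init(void)
{
	return demo_proc_register("scsi/sg");
}
```

With the switch left undefined, demo_driver_init() returns 0 and prints nothing. The trade-off is the one described above: the stub removes the ifdef at the call site, but any helper functions used only by the real implementation still need their own guards if the goal is to save space.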
[PATCH] Fix sg in 2.4.0-test9-pre5 when builtin
Linus,

This patch has been generated in response to the thread: "[2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken" on lkml today. Simon Kirby reported that the SCSI generic (sg) wasn't working in the latest pre-release when the driver was built into the kernel.

Torben Mathiasen wrote:
>
> On Thu, Sep 21 2000, Douglas Gilbert wrote:
> > Torben Mathiasen wrote:
> > >
> > > Ok, small patch cooked up. Not tested, not compiled. Give
> > > it a try, and if it works please send it off to Linus.
> > > I really need to get some work done on a project...
> >
> > Here is a very similar patch that has been tested
> > [with a USB zip drive using sg (builtin) to read it].
> > It worked and the /proc/scsi/sg directory was
> > properly populated.
> >
>
> Looks good, but you should make the init functions static.

Done. Also added conditionals to make it compile cleanly when procfs is not present. Tested as builtin and module, with and without procfs.

Doug Gilbert

--- linux/include/scsi/sg.h	Sun Jul 16 18:38:11 2000
+++ linux/include/scsi/sg.h3117	Thu Sep 21 20:00:08 2000
@@ -11,9 +11,13 @@
 Version 2 and 3 extensions to driver:
 * Copyright (C) 1998 - 2000 Douglas Gilbert
-Version: 3.1.16 (2716)
-This version is for 2.3/2.4 series kernels.
+Version: 3.1.17 (2921)
+This version is for 2.4 series kernels.
+Changes since 3.1.16 (2716)
+ - changes for new scsi subsystem initialization
+ - change Scsi_Cmnd usage to Scsi_Request
+ - cleanup for no procfs
 Changes since 3.1.15 (2528)
 - further (scatter gather) buffer length changes
 Changes since 3.1.14 (2503)
--- linux/drivers/scsi/sg.c	Wed Sep 20 22:06:26 2000
+++ linux/drivers/scsi/sg.c3117	Thu Sep 21 20:13:12 2000
@@ -17,8 +17,11 @@
  * any later version.
  *
  */
- static char * sg_version_str = "Version: 3.1.16 (2716)";
- static int sg_version_num = 30116; /* 2 digits for each component */
+#include
+#ifdef CONFIG_PROC_FS
+ static char * sg_version_str = "Version: 3.1.17 (2921)";
+#endif
+ static int sg_version_num = 30117; /* 2 digits for each component */
 /*
  * D. P. Gilbert ([EMAIL PROTECTED], [EMAIL PROTECTED]), notes:
  * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
@@ -38,7 +41,6 @@
  * # cat /proc/scsi/sg/debug
  *
  */
-#include
 #include
 #include
@@ -235,10 +237,12 @@
 static int sg_ms_to_jif(unsigned int msecs);
 static unsigned sg_jif_to_ms(int jifs);
 static int sg_allow_access(unsigned char opcode, char dev_type);
-static int sg_last_dev(void);
 static int sg_build_dir(Sg_request * srp, Sg_fd * sfp, int dxfer_len);
 static void sg_unmap_and(Sg_scatter_hold * schp, int free_also);
 static Sg_device * sg_get_dev(int dev);
+#ifdef CONFIG_PROC_FS
+static int sg_last_dev(void);
+#endif
 static Sg_device ** sg_dev_arr = NULL;
@@ -1298,18 +1302,20 @@
 }
 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif /* MODULE */
-int init_module(void) {
+static int __init init_sg(void) {
+#ifdef MODULE
     if (def_reserved_size >= 0)
         sg_big_buff = def_reserved_size;
+#endif /* MODULE */
     sg_template.module = THIS_MODULE;
     return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }
-void cleanup_module( void)
+static void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
     sg_proc_cleanup();
@@ -1324,7 +1330,6 @@
     }
     sg_template.dev_max = 0;
 }
-#endif /* MODULE */
 #if 0
@@ -1972,6 +1977,7 @@
     return resp;
 }
+#ifdef CONFIG_PROC_FS
 static Sg_request * sg_get_nth_request(Sg_fd * sfp, int nth)
 {
     Sg_request * resp;
@@ -1985,6 +1991,7 @@
     read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
     return resp;
 }
+#endif
 /* always adds to end of list */
 static Sg_request * sg_add_request(Sg_fd * sfp)
@@ -2064,6 +2071,7 @@
     return res;
 }
+#ifdef CONFIG_PROC_FS
 static Sg_fd * sg_get_nth_sfp(Sg_device * sdp, int nth)
 {
     Sg_fd * resp;
@@ -2077,6 +2085,7 @@
     read_unlock_irqrestore(&sg_dev_arr_lock, iflags);
     return resp;
 }
+#endif
 static Sg_fd * sg_add_sfp(Sg_device * sdp, int dev)
 {
@@ -2410,6 +2419,7 @@
 }
+#ifdef CONFIG_PROC_FS
 static int sg_last_dev()
 {
     int k;
@@ -2421,6 +2431,7 @@
     read_unlock_irqrestore(&sg_dev_arr_lock, iflags);
     return k + 1; /* origin 1 */
 }
+#endif
 static Sg_device * sg_get_dev(int dev)
 {
@@ -2782,3 +2793,7 @@
     return 1;
 }
 #endif /* CONFIG_PROC_FS */
+
+
+module_init(init_sg);
+module_exit(exit_sg);
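The "2 digits for each component" comment next to sg_version_num suggests the number is the dotted version packed in decimal, so 3.1.17 becomes 30117. A minimal sketch of that packing, assuming this reading of the comment is right; demo_version_num() is a hypothetical helper and does not exist in sg.c:

```c
/* Pack a dotted version into one integer, two decimal digits per
 * component after the first: major * 10000 + minor * 100 + patch.
 * Assumes minor and patch are each in the range 0..99. */
int demo_version_num(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}
```

Under that assumption, demo_version_num(3, 1, 16) and demo_version_num(3, 1, 17) give the old and new values that appear in the patch.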
USB behavior screwy for uhci.o in test9-pre5
Sometime between test9-pre3 and test9-pre5, the alternative UHCI driver (uhci.o) got screwed up - with my MS Natural Keyboard Pro in USB mode & using the keybdev + hid + uhci driver, pressing one of caps/num/scroll lock turns the appropriate light on, but then when pressing the same caps/num/scroll lock button again, the light stays on and I get the following error message:

kernel: hid.c: usb_submit_urb(out) failed

Using the usb-uhci.o driver instead of uhci.o corrects the problem, but I can't use usb-uhci.o properly on a regular basis because my ibm usb webcam does not work properly with it.
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote:
> Torben Mathiasen wrote:
> >
> > Ok, small patch cooked up. Not tested, not compiled. Give
> > it a try, and if it works please send it off to Linus.
> > I really need to get some work done on a project...
>
> Here is a very similar patch that has been tested
> [with a USB zip drive using sg (builtin) to read it].
> It worked and the /proc/scsi/sg directory was
> properly populated.
>

Looks good, but you should make the init functions static.

--
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk
Cannot boot with 2.4.0-test9-pre5
Cannot boot with 2.4.0-test9-pre5.
gcc 2.7.3, compiled as PIII.
The .config is the same as in my previous mails :)

Yuri

--
"I bambini nascono per essere felici" Jose' Marti'
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 09:39:07PM +0200, Torben Mathiasen wrote:
> Ok, small patch cooked up. Not tested, not compiled. Give
> it a try, and if it works please send it off to Linus.
> I really need to get some work done on a project...

This worked, thanks. :)

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[ Opinions expressed are not necessarily those of my employers. ]

> diff -ur --exclude-from=/root/torben /opt/kernel/kernels/linux/drivers/scsi/sg.c linux/drivers/scsi/sg.c
> --- /opt/kernel/kernels/linux/drivers/scsi/sg.c	Thu Sep 21 21:29:44 2000
> +++ linux/drivers/scsi/sg.c	Thu Sep 21 21:35:46 2000
> @@ -1298,18 +1298,18 @@
>  }
>
>  #ifdef MODULE
> -
>  MODULE_PARM(def_reserved_size, "i");
>  MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
> +#endif
>
> -int init_module(void) {
> +static int __init init_sg(void) {
>      if (def_reserved_size >= 0)
>          sg_big_buff = def_reserved_size;
>      sg_template.module = THIS_MODULE;
>      return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
>  }
>
> -void cleanup_module( void)
> +static void __exit exit_sg( void)
>  {
>  #ifdef CONFIG_PROC_FS
>      sg_proc_cleanup();
> @@ -1324,7 +1324,9 @@
>      }
>      sg_template.dev_max = 0;
>  }
> -#endif /* MODULE */
> +
> +module_init(init_sg);
> +module_exit(exit_sg);
>
>
>  #if 0
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Torben Mathiasen wrote:
>
> Ok, small patch cooked up. Not tested, not compiled. Give
> it a try, and if it works please send it off to Linus.
> I really need to get some work done on a project...

Here is a very similar patch that has been tested [with a USB zip drive using sg (builtin) to read it]. It worked and the /proc/scsi/sg directory was properly populated.

Doug Gilbert

--- linux/drivers/scsi/sg.c	Thu Sep 21 15:05:28 2000
+++ linux/drivers/scsi/sg.c3117	Thu Sep 21 15:22:08 2000
@@ -17,8 +17,8 @@
  * any later version.
  *
  */
- static char * sg_version_str = "Version: 3.1.16 (2716)";
- static int sg_version_num = 30116; /* 2 digits for each component */
+ static char * sg_version_str = "Version: 3.1.17 (2921)";
+ static int sg_version_num = 30117; /* 2 digits for each component */
 /*
  * D. P. Gilbert ([EMAIL PROTECTED], [EMAIL PROTECTED]), notes:
  * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
@@ -1298,18 +1298,20 @@
 }
 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif /* MODULE */
-int init_module(void) {
+int __init init_sg(void) {
+#ifdef MODULE
     if (def_reserved_size >= 0)
         sg_big_buff = def_reserved_size;
+#endif /* MODULE */
     sg_template.module = THIS_MODULE;
     return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }
-void cleanup_module( void)
+void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
     sg_proc_cleanup();
@@ -1324,7 +1326,6 @@
     }
     sg_template.dev_max = 0;
 }
-#endif /* MODULE */
 #if 0
@@ -2782,3 +2783,7 @@
     return 1;
 }
 #endif /* CONFIG_PROC_FS */
+
+
+module_init(init_sg);
+module_exit(exit_sg);
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Ok, small patch cooked up. Not tested, not compiled. Give it a try, and if it works please send it off to Linus. I really need to get some work done on a project...

--
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk

diff -ur --exclude-from=/root/torben /opt/kernel/kernels/linux/drivers/scsi/sg.c linux/drivers/scsi/sg.c
--- /opt/kernel/kernels/linux/drivers/scsi/sg.c	Thu Sep 21 21:29:44 2000
+++ linux/drivers/scsi/sg.c	Thu Sep 21 21:35:46 2000
@@ -1298,18 +1298,18 @@
 }

 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif

-int init_module(void) {
+static int __init init_sg(void) {
     if (def_reserved_size >= 0)
         sg_big_buff = def_reserved_size;
     sg_template.module = THIS_MODULE;
     return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }

-void cleanup_module( void)
+static void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
     sg_proc_cleanup();
@@ -1324,7 +1324,9 @@
     }
     sg_template.dev_max = 0;
 }
-#endif /* MODULE */
+
+module_init(init_sg);
+module_exit(exit_sg);


 #if 0
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote:
[delete]
> > At one point before I followed some of the debug/logging commands listed
> > at the top of sg.c and got an Oops as well...
>
> Seems as though I've got a lot of retesting to do.
>

Please note that the changes to the scsi midlayer require all upper layers to use the module_init/exit functions. We do _not_ explicitly call the layers' init functions anymore. Adding the module stuff will probably fix most problems (assuming module and builtin do not differ). The link order should make sure everything gets called in order.

--
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk
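The point about nobody calling the init functions explicitly any more can be modelled in user space. The sketch below is only an analogy under stated assumptions: the array stands in for the table of init functions that the build collects in link order, and every name in it (demo_initcall_t, demo_initcalls, demo_do_initcalls and the three demo_*_init functions) is hypothetical, not kernel code.

```c
#include <stddef.h>

typedef int (*demo_initcall_t)(void);

/* Records the order in which the init functions actually ran. */
int demo_order[3];
int demo_count;

static int demo_record(int id)
{
	if (demo_count < 3)
		demo_order[demo_count++] = id;
	return 0;	/* 0 means the init succeeded */
}

static int demo_midlayer_init(void) { return demo_record(1); }
static int demo_sd_init(void)       { return demo_record(2); }
static int demo_sg_init(void)       { return demo_record(3); }

/* Stand-in for the table produced by link order: a driver that never
 * registers itself here is simply never initialized when built in. */
static const demo_initcall_t demo_initcalls[] = {
	demo_midlayer_init,
	demo_sd_init,
	demo_sg_init,
};

/* Walk the table once at "boot"; returns the number of failed inits. */
int demo_do_initcalls(void)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < sizeof(demo_initcalls) / sizeof(demo_initcalls[0]); i++)
		if (demo_initcalls[i]() != 0)
			failures++;
	return failures;
}
```

Removing demo_sg_init from the table reproduces, in miniature, the symptom reported in this thread: the function exists and compiles, but nothing ever calls it.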
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 02:34:01PM -0400, Douglas Gilbert wrote:
> I do nearly all of my testing with sg as a module.
> So this looks like (another recent) breakage.
>
> It is beginning to look like the sg driver is not
> (properly) initialized when it is built into the
> kernel. Perhaps you could put a printk in
> sg_init() and sg_attach() to see if they are called.

Actually, I also had a printk in sg_init() and it never got printed. I didn't have one in sg_attach, but I can try that.

> > At one point before I followed some of the debug/logging commands listed
> > at the top of sg.c and got an Oops as well...
>
> Seems as though I've got a lot of retesting to do.

The oops may have been the result of it not being properly initialized or something...

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[ Opinions expressed are not necessarily those of my employers. ]
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Simon Kirby wrote:
>
> On Thu, Sep 21, 2000 at 01:12:27PM -0400, Douglas Gilbert wrote:
>
> > Interesting. 'cat /proc/scsi/scsi' should show the same
> > devices as 'cat /proc/scsi/sg/device_strs' [and
> > 'cat /proc/scsi/sg/devices']. If not, then the SCSI
> > mid-level is not calling sg_detect() [in sg.c] for
> > all new scsi devices detected by the mid-level.
> >
> > The sg_detect() routine is silent for all devices that
> > are "owned" by other upper level drivers (i.e. disks,
> > cdroms and tapes) but outputs a line for any other
> > scsi type (e.g. scanners which are scsi type 6).
>
> I didn't fiddle with it too much, but I added a printk to sg_detect and
> verified it was not getting called at all. I notice now, however, that I
> don't even have a /proc/scsi/sg. Does that mean it's not getting
> initialized at all? CONFIG_CHR_DEV_SG=y, assuming that's what needs to
> be set (config didn't change between kernel versions).

I do nearly all of my testing with sg as a module. So this looks like (another recent) breakage.

It is beginning to look like the sg driver is not (properly) initialized when it is built into the kernel. Perhaps you could put a printk in sg_init() and sg_attach() to see if they are called.

> At one point before I followed some of the debug/logging commands listed
> at the top of sg.c and got an Oops as well...

Seems as though I've got a lot of retesting to do.

Doug Gilbert
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Torben Mathiasen wrote:
>
> On Thu, Sep 21 2000, Douglas Gilbert wrote:
>
> [deleted]
>
> > It is not clear to me what "hacking" sg requires as
> > Torben Mathiasen suggested in his response. This seems
> > like a mid level problem. I'll check with my scsi
> > scanner this evening.
> >
>
> Well first of all the sg driver needs to be updated the
> same way sd and sr was.

Well, looking at sr in test9-pre5, the only changes are the addition of 'static' before the sr_template definition and various functions. Sg already has the corresponding functions declared static and the sg_template definition has been changed to 'static'.

So as far as I can see the mid level has failed to call sg_detect() when it should have. Simon has now confirmed with a printk that sg_detect() was not called for the scanner which the mid level obviously knows about.

Doug Gilbert
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 01:12:27PM -0400, Douglas Gilbert wrote:

> Interesting. 'cat /proc/scsi/scsi' should show the same
> devices as 'cat /proc/scsi/sg/device_strs' [and
> 'cat /proc/scsi/sg/devices']. If not, then the SCSI
> mid-level is not calling sg_detect() [in sg.c] for
> all new scsi devices detected by the mid-level.
>
> The sg_detect() routine is silent for all devices that
> are "owned" by other upper level drivers (i.e. disks,
> cdroms and tapes) but outputs a line for any other
> scsi type (e.g. scanners which are scsi type 6).

I didn't fiddle with it too much, but I added a printk to sg_detect and verified it was not getting called at all. I notice now, however, that I don't even have a /proc/scsi/sg. Does that mean it's not getting initialized at all? CONFIG_CHR_DEV_SG=y, assuming that's what needs to be set (config didn't change between kernel versions).

At one point before I followed some of the debug/logging commands listed at the top of sg.c and got an Oops as well...

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[ Opinions expressed are not necessarily those of my employers. ]
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote:

[deleted]

> It is not clear to me what "hacking" sg requires as
> Torben Mathiasen suggested in his response. This seems
> like a mid level problem. I'll check with my scsi
> scanner this evening.
>

Well, first of all the sg driver needs to be updated the same way sd and sr were.

>
> Other random scsi notes:
> - scsi modules were completely broken in 2.4.0-test9-pre4
> but worked again in pre5 [Makefile hacks?]

[snip]

Yes, check the scsi scanning thread, and the patch I sent yesterday.

--
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk
scsi-ide lockup is fixed in test9-pre5
Just tested it with a plain 2.4.0-test9-pre5 kernel and the problem is now fixed.

Thanks to all involved,
Frank.

--
Frank van de Pol
[EMAIL PROTECTED]
Linux - Why use Windows, since there is a door?
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Simon Kirby wrote:
> Around 2.4.0-test9-pre2 (or so, definitely in pre3) both my SCSI scanner
> and trident sound card stopped being happy. They are still both broken
> in pre5. On test8, both work perfectly.
>
> On test8:
>
> (scsi0:6:0) Synchronous Data Transfer Request was rejected
> Vendor: Model: Scanner Rev: 1.70
> Type: Scanner ANSI SCSI revision: 04
> Detected scsi generic sg0 at scsi0, channel 0, id 6, lun 0, type 6
> (scsi1:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
> Vendor: YAMAHA Model: CRW4416S Rev: 1.0e
> Type: CD-ROM ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi1, channel 0, id 3, lun 0
> scsi : detected 1 SCSI cdrom total.
> sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
>
> ... on test9pre5 and test9pre3:
>
> (scsi0:6:0) Synchronous Data Transfer Request was rejected
> Vendor: Model: Scanner Rev: 1.70
> Type: Scanner ANSI SCSI revision: 04
> (scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
> Vendor: YAMAHA Model: CRW4416S Rev: 1.0e
> Type: CD-ROM ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0
> sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
>
> ("Detected scsi generic..." line missing.)

[snipped trident problem report]

Interesting. 'cat /proc/scsi/scsi' should show the same devices as 'cat /proc/scsi/sg/device_strs' [and 'cat /proc/scsi/sg/devices']. If not, then the SCSI mid-level is not calling sg_detect() [in sg.c] for all new scsi devices detected by the mid-level.

The sg_detect() routine is silent for all devices that are "owned" by other upper level drivers (i.e. disks, cdroms and tapes) but outputs a line for any other scsi type (e.g. scanners which are scsi type 6).

It is not clear to me what "hacking" sg requires as Torben Mathiasen suggested in his response. This seems like a mid level problem. I'll check with my scsi scanner this evening.

Other random scsi notes:
 - scsi modules were completely broken in 2.4.0-test9-pre4 but worked again in pre5 [Makefile hacks?]
 - the sd module's name has now reverted to its historic name of "sd_mod.o"
 - the imm module (scsi over parallel port for ZIP drives) works on a UP machine but locks up a SMP machine (until the NMI notices)
 - the sg "stall" problem (plugged queues) has not been addressed yet

Doug Gilbert
test9-pre5: Mouse doesn't work with uhci
My IntelliMouse Explorer USB doesn't work with test9-pre5 if I use the uhci.o module. With usb-uhci.o, it does work fine. The last kernel I tried was test9-pre2, which was ok.

Other details:
modules loaded: usbcore uhci (or usb-uhci) input usbmouse mousedev
XFree 4.0.1

Jan
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Simon Kirby wrote:
> ... on test9pre5 and test9pre3:
>
> (scsi0:6:0) Synchronous Data Transfer Request was rejected
> Vendor: Model: Scanner Rev: 1.70
> Type: Scanner ANSI SCSI revision: 04
> (scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
> Vendor: YAMAHA Model: CRW4416S Rev: 1.0e
> Type: CD-ROM ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0
> sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
>

I suspect this to only be a minor issue. sg needs the same overhauls as the other layers. Unfortunately I won't be doing much hacking today, so if someone else could take a look. Otherwise I'll take a look sometime tonight.

Does it work when using modules?

--
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Simon Kirby wrote: ... on test9pre5 and test9pre3: (scsi0:6:0) Synchronous Data Transfer Request was rejected Vendor: Model: Scanner Rev: 1.70 Type: ScannerANSI SCSI revision: 04 (scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31. Vendor: YAMAHAModel: CRW4416S Rev: 1.0e Type: CD-ROM ANSI SCSI revision: 02 Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0 sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray I suspect this to only be a minor issue. sg needs the same overhauls as the other layers. Unfortunately I won't be doing much hacking today, so if someone else could take a look. Otherwise I'll take a look, sometime tonight. Does it work when using modules? -- Torben Mathiasen [EMAIL PROTECTED] Linux ThunderLAN maintainer http://tlan.kernel.dk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Simon Kirby wrote: Around 2.4.0-test9-pre2 (or so, definitely in pre3) both my SCSI scanner and trident sound card stopped being happy. They are still both broken in pre5. On test8, both work perfectly. On test8: (scsi0:6:0) Synchronous Data Transfer Request was rejected Vendor: Model: Scanner Rev: 1.70 Type: ScannerANSI SCSI revision: 04 Detected scsi generic sg0 at scsi0, channel 0, id 6, lun 0, type 6 (scsi1:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31. Vendor: YAMAHAModel: CRW4416S Rev: 1.0e Type: CD-ROM ANSI SCSI revision: 02 Detected scsi CD-ROM sr0 at scsi1, channel 0, id 3, lun 0 scsi : detected 1 SCSI cdrom total. sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray ... on test9pre5 and test9pre3: (scsi0:6:0) Synchronous Data Transfer Request was rejected Vendor: Model: Scanner Rev: 1.70 Type: ScannerANSI SCSI revision: 04 (scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31. Vendor: YAMAHAModel: CRW4416S Rev: 1.0e Type: CD-ROM ANSI SCSI revision: 02 Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0 sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray ("Detected scsi generic..." line missing.) [snipped trident problem report] Interesting. 'cat /proc/scsi/scsi' should show the same devices as 'cat /proc/scsi/sg/device_strs' [and 'cat /proc/scsi/sg/devices']. If not, then the SCSI mid-level is not calling sg_detect() [in sg.c] for all new scsi devices detected by the mid-level. The sg_detect() routine is silent for all devices that are "owned" by other upper level drivers (i.e. disks, cdroms and tapes) but outputs a line for any other scsi type (e.g. scanners which are scsi type 6). It is not clear to me what "hacking" sg requires as Torben Mathiasen suggested in his response. This seems like a mid level problem. I'll check with my scsi scanner this evening. Other random scsi notes: - scsi modules were completely broken in 2.4.0-test9-pre4 but worked again in pre5 [Makefile hacks?] 
- the sd module's name has now reverted to its historic name of "sd_mod.o" - the imm module (scsi over parallel port for ZIP drives) works on a UP machine but locks up a SMP machine (until the NMI notices) - the sg "stall" problem (plugged queues) has not been addressed yet Doug Gilbert - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
scsi-ide lockup is fixed in test9-pre5
Just tested it with a plain 2.4.0-test9-pre5 kernel and the problem is now fixed. Thanks to all involved, Frank. -- + --- -- - - -- |Frank van de Pol -o) | [EMAIL PROTECTED] /\\ | _\_v |Linux - Why use Windows, since there is a door? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote: [deleted] It is not clear to me what "hacking" sg requires as Torben Mathiasen suggested in his response. This seems like a mid level problem. I'll check with my scsi scanner this evening. Well first of all the sg driver needs to be updated the same way sd and sr was. Other random scsi notes: - scsi modules were completely broken in 2.4.0-test9-pre4 but worked again in pre5 [Makefile hacks?] [snip] Yes, check the scsi scanning thread, and the patch I sent yesterday. -- Torben Mathiasen [EMAIL PROTECTED] Linux ThunderLAN maintainer http://tlan.kernel.dk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 01:12:27PM -0400, Douglas Gilbert wrote:
> Interesting. 'cat /proc/scsi/scsi' should show the same devices as
> 'cat /proc/scsi/sg/device_strs' [and 'cat /proc/scsi/sg/devices'].
> If not, then the SCSI mid-level is not calling sg_detect() [in sg.c]
> for all new scsi devices detected by the mid-level. The sg_detect()
> routine is silent for all devices that are "owned" by other upper
> level drivers (i.e. disks, cdroms and tapes) but outputs a line for
> any other scsi type (e.g. scanners which are scsi type 6).

I didn't fiddle with it too much, but I added a printk to sg_detect
and verified it was not getting called at all. I notice now, however,
that I don't even have a /proc/scsi/sg. Does that mean it's not
getting initialized at all? CONFIG_CHR_DEV_SG=y, assuming that's what
needs to be set (config didn't change between kernel versions).

At one point before I followed some of the debug/logging commands
listed at the top of sg.c and got an Oops as well...

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED]]
[ Opinions expressed are not necessarily those of my employers. ]
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Torben Mathiasen wrote:
> On Thu, Sep 21 2000, Douglas Gilbert wrote:
> [deleted]
> > It is not clear to me what "hacking" sg requires as Torben Mathiasen
> > suggested in his response. This seems like a mid level problem.
> > I'll check with my scsi scanner this evening.
>
> Well first of all the sg driver needs to be updated the same way sd
> and sr was.

Well, looking at sr in test9-pre5, the only changes are the addition of
'static' before the sr_template definition and various functions.
Sg already has the corresponding functions declared static, and the
sg_template definition has been changed to 'static'.

So as far as I can see the mid level has failed to call sg_detect()
when it should have. Simon has now confirmed with a printk that
sg_detect() was not called for the scanner, which the mid level
obviously knows about.

Doug Gilbert
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Simon Kirby wrote:
> On Thu, Sep 21, 2000 at 01:12:27PM -0400, Douglas Gilbert wrote:
> > Interesting. 'cat /proc/scsi/scsi' should show the same devices as
> > 'cat /proc/scsi/sg/device_strs' [and 'cat /proc/scsi/sg/devices'].
> > If not, then the SCSI mid-level is not calling sg_detect() [in sg.c]
> > for all new scsi devices detected by the mid-level. The sg_detect()
> > routine is silent for all devices that are "owned" by other upper
> > level drivers (i.e. disks, cdroms and tapes) but outputs a line for
> > any other scsi type (e.g. scanners which are scsi type 6).
>
> I didn't fiddle with it too much, but I added a printk to sg_detect
> and verified it was not getting called at all. I notice now, however,
> that I don't even have a /proc/scsi/sg. Does that mean it's not
> getting initialized at all? CONFIG_CHR_DEV_SG=y, assuming that's what
> needs to be set (config didn't change between kernel versions).

I do nearly all of my testing with sg as a module. So this looks
like (another recent) breakage. It is beginning to look like the
sg driver is not (properly) initialized when it is built into the
kernel. Perhaps you could put a printk in sg_init() and sg_attach()
to see if they are called.

> At one point before I followed some of the debug/logging commands
> listed at the top of sg.c and got an Oops as well...

Seems as though I've got a lot of retesting to do.

Doug Gilbert
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 02:34:01PM -0400, Douglas Gilbert wrote:
> I do nearly all of my testing with sg as a module. So this looks
> like (another recent) breakage. It is beginning to look like the
> sg driver is not (properly) initialized when it is built into the
> kernel. Perhaps you could put a printk in sg_init() and sg_attach()
> to see if they are called.

Actually, I also had a printk in sg_init() and it never got printed.
I didn't have one in sg_attach, but I can try that.

> > At one point before I followed some of the debug/logging commands
> > listed at the top of sg.c and got an Oops as well...
>
> Seems as though I've got a lot of retesting to do.

The oops may have been the result of it not being properly initialized
or something...

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED]]
[ Opinions expressed are not necessarily those of my employers. ]
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote:
[delete]
> > At one point before I followed some of the debug/logging commands
> > listed at the top of sg.c and got an Oops as well...
>
> Seems as though I've got a lot of retesting to do.

Please note that the changes to the scsi midlayer require all upper
layers to use the module_init/exit functions. We do _not_ explicitly
call the layers' init functions anymore. Adding the module stuff will
probably fix most problems (assuming module and builtin do not differ).
The link order should make sure everything gets called in order.

--
Torben Mathiasen [EMAIL PROTECTED]
Linux ThunderLAN maintainer
http://tlan.kernel.dk
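To illustrate the point about init functions being run from a table in link order, rather than being called explicitly by the mid-layer, here is a minimal userspace sketch in plain C. It is not kernel code and does not reproduce how module_init() is actually implemented: the names demo_initcall_t, demo_initcalls, demo_init_* and demo_do_initcalls() are all hypothetical, and the array order only stands in for the role that link order plays for built-in drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an init-function pointer type. */
typedef int (*demo_initcall_t)(void);

/* Record of which init functions ran, in the order they ran. */
#define DEMO_MAX_LOG 8
static const char *demo_init_log[DEMO_MAX_LOG];
static size_t demo_init_count;

static void demo_record(const char *name)
{
    assert(demo_init_count < DEMO_MAX_LOG);
    demo_init_log[demo_init_count++] = name;
}

/* One init function per layer; each only records that it was called. */
static int demo_init_mid(void) { demo_record("mid-level"); return 0; }
static int demo_init_sd(void)  { demo_record("sd");        return 0; }
static int demo_init_sr(void)  { demo_record("sr");        return 0; }
static int demo_init_sg(void)  { demo_record("sg");        return 0; }

/* The table order plays the role of link order (an assumption of this sketch). */
static const demo_initcall_t demo_initcalls[] = {
    demo_init_mid, demo_init_sd, demo_init_sr, demo_init_sg,
};

/* Walk the table once; return the number of init functions that failed. */
static int demo_do_initcalls(void)
{
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof(demo_initcalls) / sizeof(demo_initcalls[0]); i++)
        if (demo_initcalls[i]() != 0)
            failures++;
    return failures;
}
```

In this model a driver whose init function is missing from the table is simply never initialized, which would be consistent with the symptom reported in this thread for a built-in sg.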
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Ok, small patch cooked up. Not tested, not compiled. Give it a try,
and if it works please send it off to Linus. I really need to get
some work done on a project...

--
Torben Mathiasen [EMAIL PROTECTED]
Linux ThunderLAN maintainer
http://tlan.kernel.dk

diff -ur --exclude-from=/root/torben /opt/kernel/kernels/linux/drivers/scsi/sg.c linux/drivers/scsi/sg.c
--- /opt/kernel/kernels/linux/drivers/scsi/sg.c Thu Sep 21 21:29:44 2000
+++ linux/drivers/scsi/sg.c Thu Sep 21 21:35:46 2000
@@ -1298,18 +1298,18 @@
 }

 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif

-int init_module(void) {
+static int __init init_sg(void) {
     if (def_reserved_size >= 0)
         sg_big_buff = def_reserved_size;
     sg_template.module = THIS_MODULE;
     return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }

-void cleanup_module( void)
+static void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
     sg_proc_cleanup();
@@ -1324,7 +1324,9 @@
     }
     sg_template.dev_max = 0;
 }
-#endif /* MODULE */
+
+module_init(init_sg);
+module_exit(exit_sg);


 #if 0
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Torben Mathiasen wrote:
> Ok, small patch cooked up. Not tested, not compiled. Give it a try,
> and if it works please send it off to Linus. I really need to get
> some work done on a project...

Here is a very similar patch that has been tested [with a
USB zip drive using sg (builtin) to read it]. It worked and
the /proc/scsi/sg directory was properly populated.

Doug Gilbert

--- linux/drivers/scsi/sg.c Thu Sep 21 15:05:28 2000
+++ linux/drivers/scsi/sg.c3117 Thu Sep 21 15:22:08 2000
@@ -17,8 +17,8 @@
  * any later version.
  *
  */
- static char * sg_version_str = "Version: 3.1.16 (2716)";
- static int sg_version_num = 30116; /* 2 digits for each component */
+ static char * sg_version_str = "Version: 3.1.17 (2921)";
+ static int sg_version_num = 30117; /* 2 digits for each component */
 /*
  * D. P. Gilbert ([EMAIL PROTECTED], [EMAIL PROTECTED]), notes:
  * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
@@ -1298,18 +1298,20 @@
 }

 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif /* MODULE */

-int init_module(void) {
+int __init init_sg(void) {
+#ifdef MODULE
     if (def_reserved_size >= 0)
         sg_big_buff = def_reserved_size;
+#endif /* MODULE */
     sg_template.module = THIS_MODULE;
     return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }

-void cleanup_module( void)
+void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
     sg_proc_cleanup();
@@ -1324,7 +1326,6 @@
     }
     sg_template.dev_max = 0;
 }
-#endif /* MODULE */


 #if 0
@@ -2782,3 +2783,7 @@
     return 1;
 }
 #endif /* CONFIG_PROC_FS */
+
+
+module_init(init_sg);
+module_exit(exit_sg);
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21, 2000 at 09:39:07PM +0200, Torben Mathiasen wrote:
> Ok, small patch cooked up. Not tested, not compiled. Give it a try,
> and if it works please send it off to Linus. I really need to get
> some work done on a project...

This worked, thanks. :)

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED]]
[ Opinions expressed are not necessarily those of my employers. ]
Cannot boot with 2.4.0-test9-pre5
Cannot boot with 2.4.0-test9-pre5.
gcc 2.7.3, compiled as PIII.
The .config is the same as in my previous mails :)

Yuri

--
"Children are born to be happy"
Jose' Marti'
Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
On Thu, Sep 21 2000, Douglas Gilbert wrote:
> Torben Mathiasen wrote:
> > Ok, small patch cooked up. Not tested, not compiled. Give it a try,
> > and if it works please send it off to Linus. I really need to get
> > some work done on a project...
>
> Here is a very similar patch that has been tested [with a
> USB zip drive using sg (builtin) to read it]. It worked and
> the /proc/scsi/sg directory was properly populated.

Looks good, but you should make the init functions static.

--
Torben Mathiasen [EMAIL PROTECTED]
Linux ThunderLAN maintainer
http://tlan.kernel.dk
USB behavior screwy for uhci.o in test9-pre5
Sometime between test9-pre3 and test9-pre5, the alternative UHCI driver
(uhci.o) got screwed up. With my MS Natural Keyboard Pro in USB mode,
using the keybdev + hid + uhci drivers, pressing one of caps/num/scroll
lock turns the appropriate light on. When I press the same
caps/num/scroll lock button again, the light stays on and I get the
following error message:

kernel: hid.c: usb_submit_urb(out) failed

Using the usb-uhci.o driver instead of uhci.o corrects the problem, but
I can't use usb-uhci.o on a regular basis because my IBM USB webcam
does not work properly with it.
Re: [PATCH] Fix sg in 2.4.0-test9-pre5 when builtin
Douglas Gilbert wrote:
> @@ -1298,18 +1302,20 @@
>  }
>
>  #ifdef MODULE
> -
>  MODULE_PARM(def_reserved_size, "i");
>  MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
> +#endif /* MODULE */

MODULE_xxx typically doesn't need to be surrounded by ifdef MODULE.

Also note that proc_fs.h provides no-op inline replacements for the
!CONFIG_PROC_FS case. If you want to be really space conscious, you
still need the ifdef's in the code for the most part, but the no-ops
sometimes eliminate ifdefs if you look carefully.

Jeff
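The no-op replacement idea mentioned above can be sketched in plain userspace C as follows. This is only an illustration of the general pattern, not the contents of proc_fs.h: DEMO_CONFIG_PROC_FS, demo_proc_register(), demo_proc_unregister() and the demo_driver_* functions are hypothetical names made up for the example.

```c
#include <assert.h>

/* Uncomment to build the "real" variant (hypothetical config symbol). */
/* #define DEMO_CONFIG_PROC_FS 1 */

/* Number of registered /proc-like entries in this toy model. */
static int demo_proc_entries;

#ifdef DEMO_CONFIG_PROC_FS
/* Real variant: actually track the entry. */
static inline int demo_proc_register(void)    { demo_proc_entries++; return 0; }
static inline void demo_proc_unregister(void) { demo_proc_entries--; }
#else
/* No-op replacements: callers compile unchanged and the calls do nothing. */
static inline int demo_proc_register(void)    { return 0; }
static inline void demo_proc_unregister(void) { }
#endif

/* Driver code calls the helpers unconditionally, with no #ifdef of its own. */
static int demo_driver_init(void)
{
    return demo_proc_register();
}

static void demo_driver_exit(void)
{
    demo_proc_unregister();
}
```

The header carries the single #ifdef, so the call sites in the driver stay free of conditional compilation in both configurations.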
[2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken
Hi n' stuff,

Around 2.4.0-test9-pre2 (or so, definitely in pre3) both my SCSI
scanner and trident sound card stopped being happy. They are still
both broken in pre5. On test8, both work perfectly.

On test8:

(scsi0:6:0) Synchronous Data Transfer Request was rejected
  Vendor:           Model: Scanner           Rev: 1.70
  Type:   Scanner                            ANSI SCSI revision: 04
Detected scsi generic sg0 at scsi0, channel 0, id 6, lun 0, type 6
(scsi1:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
  Vendor: YAMAHA    Model: CRW4416S          Rev: 1.0e
  Type:   CD-ROM                             ANSI SCSI revision: 02
Detected scsi CD-ROM sr0 at scsi1, channel 0, id 3, lun 0
scsi : detected 1 SCSI cdrom total.
sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray

... on test9pre5 and test9pre3:

(scsi0:6:0) Synchronous Data Transfer Request was rejected
  Vendor:           Model: Scanner           Rev: 1.70
  Type:   Scanner                            ANSI SCSI revision: 04
(scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
  Vendor: YAMAHA    Model: CRW4416S          Rev: 1.0e
  Type:   CD-ROM                             ANSI SCSI revision: 02
Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0
sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray

("Detected scsi generic..." line missing.)

The trident driver appears to be working, but the mixer (ac97_codec?)
appears to always keep everything muted, even though programs let the
levels be apparently adjusted. Turning up the volume all the way on my
receiver lets me hear some very faint sound leaking through, which
sounds like a mixer problem instead of a playback problem. An ALSA CVS
snapshot works fine.

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [EMAIL PROTECTED] ][ [EMAIL PROTECTED]]
[ Opinions expressed are not necessarily those of my employers. ]
test9-pre5
 - pre1:
    - USB: OHCI controller unlink and bandwidth reclamation fixes
    - USB: storage update
    - sparc64: register window race. Non-deadlock rwlocks.
    - name clash in hamradio/pi2.c and hamradio/pt.c
    - epic100 credits, 8139too driver update, sr.c initcalls
    - acenic update
    - NFS sillyrename fixups
    - mktime(). Do it just once - not 16 times.
    - misc small fixes to random drivers by Tigran
    - IDE driver picks up master/slave relationships on its own.
    - truncate unmapped/uptodate case handled correctly
    - don't do notifier locking at low level: higher levels do (or
      should do) this already.
    - ACPI interpreter updates (and file renames - making this part big)
    - SysKonnect gigabit driver update
    - 3c59x driver update
    - pcmcia debounce logic. Ugh.
    - MM balancing (Rik Riel)
 - pre2:
    - "extern inline" -> "static inline". It doesn't matter right now,
      but it's proactive for future gcc versions.
    - various net driver updates and fixes
    - more initcall updates
    - PPC updates (including PPC-related drivers etc)
    - disallow re-mounting same filesystem in same place multiple times.
      Too confusing. And /etc/mtab gets strange.
    - Riel VM update
    - sparc updates
    - PCI bridge scanning fix: assign numbers properly
    - network updates
    - scsi fixes
 - pre3:
    - uninitialized == zero. Remove extra initializers.
    - block_prepare_write and block_truncate_page: if the page is
      up-to-date, then so are the buffer heads inside it once they
      are mapped..
    - SCSI initialization - move over to the modular case. No more
      double initialization.
    - Sync up with Alan's 2.2.x driver changes
    - networking updates (ipv6 works non-modular etc)
    - netfilter update
    - adfs correct dentry operations
    - ARM update (including ARM drivers)
    - acenic driver update
    - floppy: we'd better hold the io_request_lock when playing
      with "CURRENT".
    - NFS cache coherency across file locking fix
    - NFS over TCP - handle TCP socket writability right..
    - USB updates
 - pre4:
    - more USB updates
    - continued SCSI cleanup
 - pre5:
    - more drivers synced to Alan's 2.2.x changes
    - sis900 driver update
    - Andries: net device name allocation as in 2.2.x
    - pmac SCSI driver init update
    - ixj telephony driver fixes
    - _fput/__fput are no longer used.
    - more USB updates
    - codafs update
    - byteorder: use statement expressions instead of macros, to
      avoid argument re-use.
    - don't disallow Onstream ide-scsi devices
    - fix cardbus bridge resources..
    - Make SCSI initialization order be same as before.