bug in __tcp_inherit_port ?

2001-07-01 Thread Bulent Abali


I get an occasional panic in __tcp_inherit_port(sk, child).  I believe the
reason is that tb = sk->prev is NULL.

sk->prev is set to NULL in only a few places, including __tcp_put_port(sk).
Perhaps there is a serialization problem between __tcp_inherit_port and
__tcp_put_port?  One possibility is that sk->num != child->num, so the
spin_locks in the two routines do not serialize on the same bucket.
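To make that concrete, here is a minimal sketch (invented names and sizes,
not the kernel code) of why two bucket locks do not serialize when the hash
inputs differ:

/* Minimal illustration of the hashing concern; everything here is invented. */
#define FAKE_BHASH_SIZE 512

struct fake_bucket_head {
	int lock;			/* stands in for a spinlock */
};

static struct fake_bucket_head fake_bhash[FAKE_BHASH_SIZE];

static struct fake_bucket_head *bucket_for(unsigned short port)
{
	return &fake_bhash[port & (FAKE_BHASH_SIZE - 1)];
}

/*
 * If sk_num != child_num these can be two different bucket heads, hence two
 * different locks, so __tcp_put_port(sk) and __tcp_inherit_port(sk, child)
 * would not exclude each other.
 */
static int locks_serialize(unsigned short sk_num, unsigned short child_num)
{
	return bucket_for(sk_num) == bucket_for(child_num);
}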

This code is out of my league, so I couldn't debug any further.  Ingo, this
is the same problem that I posted to linux-kernel a couple of weeks ago for
tcp_v4_syn_recv_sock.

The problem occurs when running TUX-B6 on 2.4.5-ac4 with SPECweb99, a dual
PIII, and one acenic adapter.  It is difficult to trigger but has occurred a
few times so far.  The oops and objdump output follow.
/bulent

=

/* Caller must disable local BH processing. */
static __inline__ void __tcp_inherit_port(struct sock *sk, struct sock *child)
{
	struct tcp_bind_hashbucket *head = &tcp_bhash[tcp_bhashfn(child->num)];
	struct tcp_bind_bucket *tb;

	spin_lock(&head->lock);
	tb = (struct tcp_bind_bucket *)sk->prev;	/* <-- line 149 */
	if ((child->bind_next = tb->owners) != NULL)	/* <-- panics here */
		tb->owners->bind_pprev = &child->bind_next;
	tb->owners = child;
	child->bind_pprev = &tb->owners;
	child->prev = (struct sock *) tb;
	spin_unlock(&head->lock);
}


__inline__ void __tcp_put_port(struct sock *sk)
{
	struct tcp_bind_hashbucket *head = &tcp_bhash[tcp_bhashfn(sk->num)];
	struct tcp_bind_bucket *tb;

	spin_lock(&head->lock);
	tb = (struct tcp_bind_bucket *) sk->prev;
	if (sk->bind_next)
		sk->bind_next->bind_pprev = sk->bind_pprev;
	*(sk->bind_pprev) = sk->bind_next;
	sk->prev = NULL;
	sk->num = 0;
	if (tb->owners == NULL) {
		if (tb->next)
			tb->next->pprev = tb->pprev;
		*(tb->pprev) = tb->next;
		kmem_cache_free(tcp_bucket_cachep, tb);
	}
	spin_unlock(&head->lock);
}



oops output

NULL pointer dereference at virtual address 0008
 printing eip:
c0247a34
*pde = 
Oops: 
CPU:0
EIP:0010:[]
EFLAGS: 00010246
eax:    ebx: f74224c0   ecx:    edx: f74224c0
esi: f750   edi: f71e6cf0   ebp: f74225b4   esp: c0313c00
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c0313000)
Stack: f2a55ec4 f2d6bf64 459d1162 f74224c0 c024aff9 f74224c0 f2a55ec4
f2d6bf64
    459d1163 459d1162 459d1163  1000 f74225b4
f740f58c
   f7760c00 c022a3c5 f740f58c c0231e76 e11d2a9c f7760cd8 f740083c

Call Trace: [trace addresses were lost in the archived copy]

Code: 8b 41 08 89 43 18 85 c0 74 09 8b 51 08 8d 43 18 89 42 1c 89
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing


ksymoops output

Code;  c0247a34
 <_EIP>:
Code;  c0247a34
   0:   8b 41 08             mov    0x8(%ecx),%eax    // panics here: child->bind_next = tb->owners
Code;  c0247a37
   3:   89 43 18             mov    %eax,0x18(%ebx)
Code;  c0247a3a
   6:   85 c0                test   %eax,%eax
Code;  c0247a3c
   8:   74 09                je     13 <_EIP+0x13> c0247a47
Code;  c0247a3e
   a:   8b 51 08             mov    0x8(%ecx),%edx
Code;  c0247a41
   d:   8d 43 18             lea    0x18(%ebx),%eax
Code;  c0247a44
  10:   89 42 1c             mov    %eax,0x1c(%edx)
Code;  c0247a47
  13:   89 00                mov    %eax,(%eax)



objdump -S

/usr/src/linux-2.4.5-ac4/include/asm/spinlock.h:104
c0247a21:   f0 fe 0e             lock decb (%esi)
c0247a24:   0f 88 85 79 03 00    js     c027f3af

/usr/src/linux-2.4.5-ac4/net/ipv4/tcp_ipv4.c:149
c0247a2a:   8b 54 24 14          mov    0x14(%esp,1),%edx
c0247a2e:   8b 8a a4 00 00 00    mov    0xa4(%edx),%ecx    // tb = sk->prev
/usr/src/linux-2.4.5-ac4/net/ipv4/tcp_ipv4.c:150
c0247a34:   8b 41 08             mov    0x8(%ecx),%eax     // child->bind_next = tb->owners
c0247a37:   89 43 18             mov    %eax,0x18(%ebx)
c0247a3a:   85 c0                test   %eax,%eax
c0247a3c:   74 09                je     c0247a47

/usr/src/linux-2.4.5-ac4/net/ipv4/tcp_ipv4.c:151
c0247a3e:   8b 51 08             mov    0x8(%ecx),%edx
c0247a41:   8d 43 18             lea    0x18(%ebx),%eax
c0247a44:   89 42 1c             mov    %eax,0x1c(%edx)
/usr/src/linux-2.4.5-ac4/net/ipv4/tcp_ipv4.c:152
c0247a47:   89 59 08             mov    %ebx,0x8(%ecx)
/usr/src/linux-2.4.5-ac4/net/ipv4/tcp_ipv4.c:153
c0247a4a:   8d 41 08             lea    0x8(%ecx),%eax
c0247a4d:   89 43 1c             mov    %eax,0x1c(%ebx)



Re: all processes waiting in TASK_UNINTERRUPTIBLE state

2001-06-26 Thread Bulent Abali



>> I am running into a problem, seemingly a deadlock situation, where almost
>> all the processes end up in the TASK_UNINTERRUPTIBLE state.   All the
>
>could you try to reproduce with this patch applied on top of
>2.4.6pre5aa1 or 2.4.6pre5 vanilla?

Andrea,
I would like to try your patch, but so far I can trigger the bug only when
running TUX 2.0-B6, which runs on 2.4.5-ac4.  /bulent






Re: all processes waiting in TASK_UNINTERRUPTIBLE state

2001-06-25 Thread Bulent Abali



>[EMAIL PROTECTED] said:
>> I am running into a problem, seemingly a deadlock situation, where
>> almost all the processes end up in the TASK_UNINTERRUPTIBLE state.
>> All the processes eventually stop responding, including the login shell:
>> no screen updates, keyboard, etc.  Ping works and the sysrq key works.  I
>> traced the tasks through the sysrq-t key.  The processors are in the idle
>> state.  The tasks all seem to get stuck in __wait_on_page or
>> __lock_page.
>
>I've seen this under UML, Rik van Riel has seen it on a physical box, and we
>suspect that they're the same problem (i.e. mine isn't a UML-specific bug).

Can you give more details?  Was there an aic7xxx SCSI driver on the box?
run_task_queue(&tq_disk) should eventually unlock those pages,
but they remain locked.  I am trying to narrow it down to the fs/buffer
code or the aic7xxx SCSI driver in my case.  Thanks. /bulent






all processes waiting in TASK_UNINTERRUPTIBLE state

2001-06-25 Thread Bulent Abali


keywords:  tux, aic7xxx, 2.4.5-ac4, specweb99, __wait_on_page, __lock_page

Greetings,

I am running into a problem, seemingly a deadlock situation, where almost
all the processes end up in the TASK_UNINTERRUPTIBLE state.  All the
processes eventually stop responding, including the login shell: no screen
updates, keyboard, etc.  Ping works and the sysrq key works.  I traced the
tasks through the sysrq-t key.  The processors are in the idle state.  The
tasks all seem to get stuck in __wait_on_page or __lock_page.  It appears
from the source that they are waiting for pages to be unlocked.
run_task_queue(&tq_disk) should eventually cause the pages to be unlocked,
but it doesn't happen.  Is anybody familiar with this problem, or has anyone
seen it before?  Thanks for any comments.
Bulent

Here are the conditions:
Dual PIII, 1 GHz, 1 GB of memory, aic7xxx SCSI driver, acenic ethernet.
This occurs while the TUX (2.4.5-B6) web server is being driven by the
SPECweb99 benchmark at a rate of 800 c/s.  The system is very busy doing
disk and network I/O.  The problem occurs sometimes within an hour and
sometimes 10-20 hours into the run.

Bulent


Process: 0, { swapper}
EIP: 0010:[] CPU: 1 EFLAGS: 0246
EAX:  EBX: c0105220 ECX: c2afe000 EDX: 0025
ESI: c2afe000 EDI: c2afe000 EBP: c0105220 DS: 0018 ES: 0018
CR0: 8005003b CR2: 08049df0 CR3: 268e CR4: 06d0
Call Trace: [] [] []
SysRq : Show Regs

Process: 0, { swapper}
EIP: 0010:[] CPU: 0 EFLAGS: 0246
EAX:  EBX: c0105220 ECX: c030a000 EDX: 
ESI: c030a000 EDI: c030a000 EBP: c0105220 DS: 0018 ES: 0018
CR0: 8005003b CR2: 08049f7c CR3: 37a63000 CR4: 06d0
Call Trace: [] [] []
SysRq : Show Regs

EIP: 0010:[] CPU: 1 EFLAGS: 0246
Using defaults from ksymoops -t elf32-i386 -a i386
EAX:  EBX: c0105220 ECX: c2afe000 EDX: 0025
ESI: c2afe000 EDI: c2afe000 EBP: c0105220 DS: 0018 ES: 0018
CR0: 8005003b CR2: 08049df0 CR3: 268e CR4: 06d0
Call Trace: [] [] []

EIP: 0010:[] CPU: 0 EFLAGS: 0246
EAX:  EBX: c0105220 ECX: c030a000 EDX: 
ESI: c030a000 EDI: c030a000 EBP: c0105220 DS: 0018 ES: 0018
CR0: 8005003b CR2: 08049f7c CR3: 37a63000 CR4: 06d0
Call Trace: [] [] []

>>EIP; c010524d<=
Trace; c01052d2 
Trace; c0119186 <__call_console_drivers+46/60>
Trace; c01192fb 

>>EIP; c010524d<=
Trace; c01052d2 
Trace; c0105000 
Trace; c01001cf 

=

SysRq : Show Memory
Mem-info:
Free pages:  4300kB (   792kB HighMem)
( Active: 200434, inactive_dirty: 26808, inactive_clean: 1472, free: 1075 (574 1148 1722) )
24*4kB 15*8kB 2*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 728kB)
493*4kB 3*8kB 1*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2780kB)
0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 792kB)
Swap cache: add 2711, delete 643, find 5301/6721
Free swap:   2087996kB
253932 pages of RAM
24556 pages of HIGHMEM
7212 reserved pages
221419 pages shared
2068 pages swap cached
0 pages in page table cache
Buffer memory:12164kB
   CLEAN: 2322 buffers, 9276 kbyte, 3 used (last=2322), 2 locked, 0 protected, 0 dirty
  LOCKED: 405 buffers, 1608 kbyte, 39 used (last=404), 348 locked, 0 protected, 0 dirty
   DIRTY: 322 buffers, 1288 kbyte, 0 used (last=0), 322 locked, 0 protected, 322 dirty

=

async IO 0/2  D 0013 0  1061   1059  1062   (NOTLB)
Call Trace: [trace addresses lost in the archived copy; ksymoops-decoded trace below]

Trace; c012e121 <___wait_on_page+91/c0>
Trace; c012f059 
Trace; c02614d7 
Trace; c0258c44 
Trace; c02588c0 
Trace; c025c65a 
Trace; c0256848 
Trace; c0258478 
Trace; c0105636 
Trace; c02582a0 


==

bash  D C2AE541C 0   920912 (NOTLB)
Call Trace: [trace addresses lost in the archived copy; ksymoops-decoded trace below]

Trace; c012e1e1 <__lock_page+91/c0>
Trace; c012e04d 
Trace; c016b880 
Trace; c012fdac 
Trace; c012a49a 
Trace; c012a76a 
Trace; c012a8cb 
Trace; c021814c 
Trace; c0113ed0 
Trace; c0114106 
Trace; c0118aa5 
Trace; c01417d2 
Trace; c011e25b 
Trace; c0113ed0 
Trace; c01075b8 



void ___wait_on_page(struct page *page)
{
	struct task_struct *tsk = current;
	DECLARE_WAITQUEUE(wait, tsk);

	add_wait_queue(&page->wait, &wait);
	do {
		sync_page(page);
		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
		if (!PageLocked(page))
			break;
		run_task_queue(&tq_disk);
		schedule();
	} while (PageLocked(page));
	tsk->state = TASK_RUNNING;
	remove_wait_queue(&page->wait, &wait);
}
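
For context on the run_task_queue(&tq_disk) call above: in 2.4 it drains a
list of deferred callbacks, and for tq_disk those callbacks are the block
layer's "unplug" routines that actually start the queued disk requests.  A
self-contained schematic of that draining (my own minimal structs and names,
not the kernel's):

#include <stddef.h>

/* Schematic only: a deferred-callback list in the spirit of the 2.4 task
 * queues.  All identifiers here are invented for illustration. */
struct fake_tq_entry {
	struct fake_tq_entry *next;
	void (*routine)(void *data);	/* e.g. a disk-queue unplug function */
	void *data;
};

static void run_fake_task_queue(struct fake_tq_entry **list)
{
	struct fake_tq_entry *t = *list;

	*list = NULL;			/* detach the whole queue */
	while (t != NULL) {
		struct fake_tq_entry *next = t->next;

		t->routine(t->data);	/* starts the queued I/O; if this never
					 * happens, the locked pages never see
					 * I/O completion and the waiters above
					 * sleep forever in TASK_UNINTERRUPTIBLE */
		t = next;
	}
}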

static void __lock_page(struct page *page)
{
	struct task_struct *tsk = current;
	DECLARE_WAITQUEUE(wait, tsk);

	add_wait_queue_exclusive(&page->wait, &wait);
	for (;;) {
		sync_page(page);
		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
		if (PageLocked(page)) {
			/* (the quoted mail was cut off here; the remainder
			 * below follows the 2.4 mm/filemap.c source) */
			run_task_queue(&tq_disk);
			schedule();
			continue;
		}
		if (!TryLockPage(page))
			break;
	}
	tsk->state = TASK_RUNNING;
	remove_wait_queue(&page->wait, &wait);
}

Re: [RFQ] aic7xxx driver panics under heavy swap.

2001-06-20 Thread Bulent Abali



Justin,
Your patch works for me.  The "Temporary Resource Shortage" printk
has to go, or maybe you can make it a debug option.

Here is the cleaned-up patch for 2.4.5-ac15, with the TAILQ
macros replaced with LIST macros.  Thanks for the help.
Bulent



--- aic7xxx_linux.c.save Mon Jun 18 20:25:35 2001
+++ aic7xxx_linux.c Tue Jun 19 17:35:55 2001
@@ -1516,7 +1516,11 @@
 }
 cmd->result = CAM_REQ_INPROG << 16;
 TAILQ_INSERT_TAIL(&dev->busyq, (struct ahc_cmd *)cmd, acmd_links.tqe);
-ahc_linux_run_device_queue(ahc, dev);
+if ((dev->flags & AHC_DEV_ON_RUN_LIST) == 0) {
+ LIST_INSERT_HEAD(&ahc->platform_data->device_runq, dev, links);
+ dev->flags |= AHC_DEV_ON_RUN_LIST;
+ ahc_linux_run_device_queues(ahc);
+}
 ahc_unlock(ahc, &flags);
 return (0);
 }
@@ -1532,6 +1536,9 @@
 struct ahc_tmode_tstate *tstate;
 uint16_t mask;

+if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
+ panic("running device on run list");
+
 while ((acmd = TAILQ_FIRST(&dev->busyq)) != NULL
 && dev->openings > 0 && dev->qfrozen == 0) {

@@ -1540,8 +1547,6 @@
   * running is because the whole controller Q is frozen.
   */
  if (ahc->platform_data->qfrozen != 0) {
-  if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-   return;

   LIST_INSERT_HEAD(&ahc->platform_data->device_runq,
  dev, links);
@@ -1552,8 +1557,6 @@
   * Get an scb to use.
   */
  if ((scb = ahc_get_scb(ahc)) == NULL) {
-  if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-   panic("running device on run list");
   LIST_INSERT_HEAD(&ahc->platform_data->device_runq,
  dev, links);
   dev->flags |= AHC_DEV_ON_RUN_LIST;











[RFQ] aic7xxx driver panics under heavy swap.

2001-06-19 Thread Bulent Abali


Justin,
When free memory is low, I get a series of aic7xxx messages followed by a
panic.  It appears to be a race condition in the code.  Should you really
panic here?  I tried the following patch to avoid the panic, but I am not
sure it is functionally correct.
Bulent


scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
Kernel panic: running device on run list


--- aic7xxx_linux.c.save Mon Jun 18 20:25:35 2001
+++ aic7xxx_linux.c Mon Jun 18 20:26:29 2001
@@ -1552,12 +1552,14 @@
   * Get an scb to use.
   */
  if ((scb = ahc_get_scb(ahc)) == NULL) {
+  ahc->flags |= AHC_RESOURCE_SHORTAGE;
   if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-   panic("running device on run list");
+   return;
+   // panic("running device on run list");
   LIST_INSERT_HEAD(&ahc->platform_data->device_runq,
  dev, links);
   dev->flags |= AHC_DEV_ON_RUN_LIST;
-  ahc->flags |= AHC_RESOURCE_SHORTAGE;
+  // ahc->flags |= AHC_RESOURCE_SHORTAGE;
   printf("%s: Temporary Resource Shortage\n",
  ahc_name(ahc));
   return;






Re: Please test: workaround to help swapoff behaviour

2001-06-10 Thread Bulent Abali



>The fix is to kill the dead/orphaned swap pages before we get to
>swapoff.  At shutdown time there is practically nothing active in
> ...
>Once the dead swap pages problem is fixed it is time to optimize
>swapoff.

I think fixing the orphaned swap pages problem will eliminate the
problem altogether.  There is probably no need to optimize
swapoff.

As the system shuts down, all the processes will be killed and their pages
in swap will be orphaned.  If those pages were reaped in a timely manner,
there wouldn't be any work left for swapoff.

Bulent





Re: Please test: workaround to help swapoff behaviour

2001-06-09 Thread Bulent Abali




>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?

Marcelo,

It works as expected.  It doesn't lock up the box; however, swapoff keeps
burning CPU cycles.  It took 4 1/2 minutes to swap off about 256 MB of swap
content.  Shutdown took just as long.  I was hoping that shutdown would
kill the swapoff process, but it doesn't; it just hangs there.  Shutdown
is the common case, so swapoff needs to be optimized for shutdowns.
You can imagine users' frustration waiting for a shutdown when there are
gigabytes in the swap.

So to summarize, the schedule patch is better than nothing but falls far
short.  I would put it in 2.4.6.  Read on.

--

The problem is with the try_to_unuse() algorithm, which is very inefficient.
I searched the linux-mm archives and Tweedie was on to this.  This is what
he wrote: "it is much cheaper to find a swap entry for a given page than
to find the swap cache page for a given swap entry."  And he posted a
patch: http://mail.nl.linux.org/linux-mm/2001-03/msg00224.html
His patch is in the Red Hat 7.1 kernel 2.4.2-2 but not in 2.4.5.

In any case, I believe the patch will not work as expected.
It seems to me that he is calling check_orphaned_swap(page)
in the wrong place.  He calls it while scanning the active_list in
refill_inactive_scan().  The problem with that is that if you wait
60 seconds or longer, the orphaned swap pages will have moved from the
active list to the inactive lists, so the function will miss the orphans
on the inactive lists.  Any comments?
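
To illustrate the concern with a generic sketch (made-up structures, not the
actual VM code): an orphan check that runs only while scanning the active
list never examines pages that have already aged onto an inactive list.

#include <stdbool.h>
#include <stddef.h>

/* Generic illustration only: two aging lists and a scanner that looks for
 * orphans on just one of them.  Names are invented, not kernel code. */
struct fake_page {
	struct fake_page *next;
	bool orphaned_swap;
};

static int reap_orphans_from_active_only(struct fake_page *active,
					 struct fake_page *inactive)
{
	int reaped = 0;
	struct fake_page *p;

	for (p = active; p != NULL; p = p->next)
		if (p->orphaned_swap)
			reaped++;	/* found: would be freed here */

	/* Pages already on 'inactive' are never looked at, so orphans that
	 * aged out of the active list (e.g. after ~60 seconds) are missed. */
	(void)inactive;
	return reaped;
}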






Re: Please test: workaround to help swapoff behaviour

2001-06-08 Thread Bulent Abali


>> I looked at try_to_unuse in swapfile.c.  I believe that the algorithm is
>> broken.  For each and every swap entry it is walking the entire process
>> list (for_each_task(p)).  It is also grabbing a whole bunch of locks
>> for each swap entry.  It might be worthwhile processing swap entries in
>> batches instead of one entry at a time.
>>
>> In any case, I think having this patch is worthwhile as a quick and dirty
>> remedy.
>
>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?

No problem.  I will check it tomorrow.  I don't think it can be any worse
than it is now.  The patch looks correct in principle.
I believe it should go into 2.4.6, but I will test it.

On small machines people don't notice it, but if you have a few
GB of memory it really hurts.  Shutdowns take forever since swapoff takes
forever.







Re: Please test: workaround to help swapoff behaviour

2001-06-07 Thread Bulent Abali





>This is for the people who has been experiencing the lockups while running
>swapoff.
>
>Please test. (against 2.4.6-pre1)
>
>
>--- linux.orig/mm/swapfile.c Wed Jun  6 18:16:45 2001
>+++ linux/mm/swapfile.c Thu Jun  7 16:06:11 2001
>@@ -345,6 +345,8 @@
> /*
>  * Find a swap page in use and read it in.
>  */
>+if (current->need_resched)
>+ schedule();
> swap_device_lock(si);
> for (i = 1; i < si->max ; i++) {
>  if (si->swap_map[i] > 0 && si->swap_map[i] != SWAP_MAP_BAD) {


I tested your patch against 2.4.5.  It works: no more lockups.  Without the
patch it took 14 minutes 51 seconds to complete swapoff (this is to recover
1.5 GB of swap space).  During this time the system was frozen: no keyboard,
no screen, etc.  Practically locked up.

With the patch there are no more lockups; swapoff kept running in the
background.  This is a winner.

But here is the caveat: swapoff keeps burning 100% of the cycles until it
completes.  This is not going to be a big deal during shutdowns; it is only
going to be a problem when you run swapoff from the command line.

I looked at try_to_unuse in swapfile.c.  I believe that the algorithm is
broken.  For each and every swap entry it is walking the entire process list
(for_each_task(p)).  It is also grabbing a whole bunch of locks
for each swap entry.  It might be worthwhile processing swap entries in
batches instead of one entry at a time.
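
As a rough illustration of the cost (a schematic with invented names, not
the kernel source), the loop structure being described is effectively "for
every in-use swap entry, walk every task", i.e. O(entries x tasks):

/* Schematic of the complexity only; all identifiers are invented. */
struct task;				/* stand-in for the kernel task list */
extern struct task *first_task;
extern struct task *next_task(struct task *t);
extern int nr_swap_entries;
extern void unuse_entry_in_task(struct task *t, int entry);

static void try_to_unuse_shape(void)
{
	int entry;
	struct task *t;

	for (entry = 1; entry < nr_swap_entries; entry++)	/* every swap entry... */
		for (t = first_task; t != NULL; t = next_task(t))	/* ...times every task */
			unuse_entry_in_task(t, entry);	/* page-table walk under locks */
}

Processing entries in batches would avoid walking the whole task list once
per individual entry.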

In any case, I think having this patch is worthwhile as a quick and dirty
remedy.

Bulent Abali






Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Bulent Abali



>> O.k.  I think I'm ready to nominate the dead swap pages for the big
>> 2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
>> instead of being IO bound?  Just wanting to understand this the cheap
>> way :)
>
>There's no IO being done whatsoever (that I can see with only a blinky).
>I can fire up ktrace and find out exactly what's going on if that would
>be helpful.  Eating the dead swap pages from the active page list prior
>to swapoff cures all but a short freeze.  Eating the rest (few of those)
>might cure the rest, but I doubt it.
>
>-Mike

1)  I second Mike's observation.  swapoff, either from the command line or
during shutdown, just hangs there.  No disk I/O is being done, as far as I
could see from the blinkers.  This is not an I/O-boundness issue; it is more
like a deadlock.

I happened to see this once with a debugger attached to the serial port.
The system was alive.  I think I was watching the free page count, and
it was decreasing very slowly, maybe a couple of pages per second.  The
bigger the swap usage, the longer it takes to do swapoff.  For example, if I
had 1 GB in the swap space, it would take maybe half an hour to shut down...


2)  Now, why I would have 1 GB in the swap space in the first place is
another problem.  Here is what I observe, and it doesn't make much sense to
me.  Let's say I have 1 GB of memory and plenty of swap, and let's say there
is a process a little less than 1 GB in size.  Suppose the system starts
swapping because it is short a few megabytes of memory.
Within *seconds* of swapping, I see the swap disk usage balloon to
nearly 1 GB.  Nearly the entire memory moves into the page cache.  If you
run xosview you will know what I mean: memory usage suddenly turns from
green to red :-).  And I know for a fact that my disk cannot do 1 GB per
second :-).  The SHARE column of the big process in "top" goes up by
hundreds of megabytes.
So it appears to me that the MM is marking the whole process memory to be
swapped out, probably reserving nearly 1 GB in the swap space, and
furthermore moving the entire process's pages, apparently, into the page
cache.  You would think that if you are short a few MB of memory the MM
would put a few MB worth of pages in swap, but it wants to move entire
processes into swap.

When the 1 GB process exits, the swap usage doesn't change (dead swap
pages?), and shutdown or swapoff will take forever due to #1 above.

Bulent







can I call wake_up_interruptible_all from an interrupt service routine?

2001-06-05 Thread Bulent Abali

A driver's interrupt service routine makes a wake_up_interruptible_all()
call to wake up a kernel thread.  Is that legitimate?  Thanks for any advice
you might have.  Please cc: your response to me if you decide to post to
the mailing list.
Bulent
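
For reference, the pattern being asked about looks roughly like this (a
sketch with hypothetical mydrv_* names, 2.4-era API; the wake_up_* helpers
do not sleep, which is why they are commonly called from interrupt
handlers):

/* Hypothetical 2.4-style driver fragment; the mydrv_* names are invented. */
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/interrupt.h>

static DECLARE_WAIT_QUEUE_HEAD(mydrv_wq);
static volatile int mydrv_event;

/* Interrupt service routine: record the event and wake the kernel thread. */
static void mydrv_isr(int irq, void *dev_id, struct pt_regs *regs)
{
	mydrv_event = 1;
	wake_up_interruptible_all(&mydrv_wq);	/* does not sleep */
}

/* Kernel thread: sleeps until the ISR signals an event. */
static int mydrv_thread(void *unused)
{
	for (;;) {
		wait_event_interruptible(mydrv_wq, mydrv_event != 0);
		mydrv_event = 0;
		/* ... handle the event ... */
	}
	return 0;
}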

