Hello Gentlemen,

I analyzed an oops that occurred on an "old" version of LiS (2-16) in a process that was closing a streams pipe. I've attached the oops and the corresponding section of code below.
The process is multithreaded, and we may have hit a race condition in which two threads are closing the same pipe at the same time.

I found similar errors reported in the Linux archives, and I understand that such race conditions have been fixed in later releases of LiS (i.e. 2-18).
In particular, the mail below refers to a patch for problem 3), "synchronization problem between pipes", and I think that could be the root cause of my problem.
http://www.mail-archive.com/[email protected]/msg01855.html

My question is: beyond the fact that LiS 2-18 will certainly fix this issue, could someone point me to the specific modification that fixed it (perhaps the patch mentioned in the email from Jeff)?

Best regards,
-Philippe

-----------------------------------------------------------------------------------------------------------------------------
== Section of code where the oops occurred
-----------------------------------------------------------------------------------------------------------------------------

int lis_safe_SAMESTR(queue_t *q, char *f, int l)
{
    if (   lis_check_q_magic(q,f,l)
    && q->q_next != NULL
    && lis_check_q_magic(q->q_next,f,l)
       )
    return ((q->q_flag&QREADR) == (q->q_next->q_flag&QREADR));
                                                            ^^^^^^^^^
                                                            ksymoops showed that q->q_next is NULL here,
                                                            in spite of the NULL test in the if statement.


    return 0;
}
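To illustrate why the NULL test does not help if another thread clears q->q_next in between: the function above reads q->q_next once for the test and again for the dereference, so the two reads can see different values. Below is a minimal user-space sketch of the "read the pointer once" idea. It is not the LiS code and not the 2-18 patch; demo_queue_t, demo_samestr and the QREADR value are hypothetical stand-ins for illustration only.

```c
#include <stddef.h>

#define QREADR 0x10 /* assumed flag value, for illustration only */

/* Hypothetical, stripped-down stand-in for the LiS queue_t. */
typedef struct demo_queue {
    struct demo_queue *q_next;
    unsigned long      q_flag;
} demo_queue_t;

/*
 * Take one snapshot of q->q_next so that the NULL test and the
 * dereference are guaranteed to use the same pointer value.
 */
static int demo_samestr(demo_queue_t *q)
{
    demo_queue_t *next;

    if (q == NULL)
        return 0;

    /* The volatile access forces a single load of the pointer. */
    next = *(demo_queue_t * volatile *)&q->q_next;
    if (next == NULL)
        return 0;

    return (q->q_flag & QREADR) == (next->q_flag & QREADR);
}
```

A snapshot like this only avoids dereferencing NULL; it does not stop the next queue from being freed by the other thread while it is still in use. A real fix presumably needs the close paths to be serialized with a lock, which is what I would expect the patch for "synchronization problem between pipes" to do.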


-----------------------------------------------------------------------------------------------------------------------------
== The OOPS :
-----------------------------------------------------------------------------------------------------------------------------
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
 printing eip:
f8b332b9
*pde = 32ba1001
*pte = 58e29067
Oops: 0000
netconsole nfs lockd sunrpc streams swrmm parport_pc lp parport autofs4 audit e1000 tg3 ipv6 floppy sg microcode keybdev mousedev hid input usb-ohci usbcore e
CPU:    3
EIP:    0060:[<f8b332b9>]    Tainted: P
EFLAGS: 00010002

EIP is at lis_safe_SAMESTR [streams] 0x41 (2.4.21-20.ELsmp/i686)
eax: 00000000   ebx: cb2852ac   ecx: f8bd4ad4   edx: 00000020
esi: 0000068a   edi: f8bd4ad4   ebp: d6f61ed8   esp: d6f61ebc
ds: 0068   es: 0068   ss: 0068
Process oracle (pid: 1131, stackpage=d6f61000)
Stack: cb2852ac 00000000 cb2852ac f8b32c80 cb2852ac f8bd4ad4 0000068a 00000202
       d6f61f00 f4f272b4 00003a98 f4f2753c f8b291d1 cb2852ac f4f272b4 f8c0c960
       00000286 00000282 f4f2753c d6f61f30 00000000 f4f272b4 f8b29639 f4f272b4
Call Trace:   [<f8b32c80>] lis_qcountstrm [streams] 0x50 (0xd6f61ec8)
[<f8bd4ad4>] .rodata.str1.4 [streams] 0x4584 (0xd6f61ed0)
[<f8b291d1>] close_action [streams] 0x1c9 (0xd6f61eec)
[<f8c0c960>] lis_stdata_sem [streams] 0x0 (0xd6f61ef8)
[<f8b29639>] lis_doclose [streams] 0x2c9 (0xd6f61f14)
[<f8b29850>] lis_strclose [streams] 0x178 (0xd6f61f44)
[<c016472a>] __fput [kernel] 0xea (0xd6f61f78)
[<c01628ae>] filp_close [kernel] 0x8e (0xd6f61f94)
[<c0162956>] sys_close [kernel] 0x66 (0xd6f61fb0)

Code: 8b 40 1c 83 e0 10 83 e2 10 39 c2 0f 94 c0 0f b6 c0 eb d3 55
-----------------------------------------------------------------------------------------------------------------------------
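The Code: bytes are consistent with this reading. "8b 40 1c" decodes to mov 0x1c(%eax),%eax, and eax is 00000000 in the register dump, which gives the faulting address 0000001c. So it looks like q->q_next was in eax and q_flag sits at offset 0x1c in queue_t in this build (the offset is my inference from the oops, not something I checked in the headers). The small sketch below only shows the arithmetic; struct demo_queue32 is a hypothetical layout, not the real LiS structure.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: seven 32-bit slots in front of q_flag, which
 * places q_flag at byte offset 0x1c. The real queue_t has different
 * members; only the offset matters for this illustration.
 */
struct demo_queue32 {
    uint32_t slots_before_flag[7];
    uint32_t q_flag;
};

/* Address touched when q_flag is read through a pointer whose value is base. */
static uintptr_t demo_fault_address(uintptr_t base)
{
    return base + offsetof(struct demo_queue32, q_flag);
}
```

With base set to 0 (a NULL q_next), demo_fault_address returns 0x1c, which matches the address reported in the first line of the oops.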

