On Fri, Apr 12, 2013 at 1:22 PM, Techienote com <techienote....@gmail.com> wrote:

>
>
> On Fri, Apr 12, 2013 at 10:12 PM, Jeff Trawick <traw...@gmail.com> wrote:
>
>>  On Fri, Apr 12, 2013 at 12:10 PM, Techienote com <
>> techienote....@gmail.com> wrote:
>>
>>>
>>>
>>>  On Fri, Apr 12, 2013 at 4:59 PM, Jeff Trawick <traw...@gmail.com>wrote:
>>>
>>>>  On Fri, Apr 12, 2013 at 2:51 AM, Techienote com <
>>>> techienote....@gmail.com> wrote:
>>>>
>>>>>  Hi Folks,
>>>>>
>>>>>
>>>>>
>>>>> Recently we have been seeing core dumps in Oracle HTTP Server, which is
>>>>> built on Apache 1.3.
>>>>>
>>>>>
>>>>>
>>>>> The following is the output of the httpd -V command:
>>>>>
>>>>>
>>>>> -------------------------------------------------------------------------------------------------------
>>>>>
>>>>> Server version: Oracle-Application-Server-10g/10.1.3.1.0
>>>>> Oracle-HTTP-Server
>>>>> Server built:   Sep 22 2006 04:35:27
>>>>> Server's Module Magic Number: 19990320:18
>>>>> Server compiled with....
>>>>>  -D EAPI
>>>>>  -D EAPI_MM
>>>>>  -D EAPI_MM_CORE_PATH="logs/mm"
>>>>>  -D HAVE_MMAP
>>>>>  -D USE_MMAP_SCOREBOARD
>>>>>  -D USE_MMAP_FILES
>>>>>  -D HAVE_FCNTL_SERIALIZED_ACCEPT
>>>>>  -D HAVE_SYSVSEM_SERIALIZED_ACCEPT
>>>>>  -D HAVE_PTHREAD_SERIALIZED_ACCEPT
>>>>>  -D DYNAMIC_MODULE_LIMIT=64
>>>>>  -D HARD_SERVER_LIMIT=8192
>>>>>  -D HTTPD_ROOT="/tmp/apache"
>>>>>  -D SUEXEC_BIN="/tmp/apache/bin/suexec"
>>>>>  -D DEFAULT_PIDLOG="logs/httpd.pid"
>>>>>  -D DEFAULT_SCOREBOARD="logs/httpd.scoreboard"
>>>>>  -D DEFAULT_LOCKFILE="logs/httpd.lock"
>>>>>  -D DEFAULT_ERRORLOG="logs/error_log"
>>>>>  -D TYPES_CONFIG_FILE="conf/mime.types"
>>>>>  -D SERVER_CONFIG_FILE="conf/httpd.conf"
>>>>>  -D ACCESS_CONFIG_FILE="conf/access.conf"
>>>>>  -D RESOURCE_CONFIG_FILE="conf/srm.conf"
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> I ran the pstack command against the core file. The following is its
>>>>> output:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -------------------------------------------------------------------------------------------------------
>>>>>
>>>>> core 'core' of 13893:   /ora10gas/OracleAS/Apache/Apache/bin/httpd -d
>>>>> /ora10gas/OracleAS/Apach
>>>>> -----------------  lwp# 1 / thread# 1  --------------------
>>>>>  ff091a28 memcpy   (ffbfc998, fddb2838, ffbff1e4, ffbff1f4, fe0d3d64,
>>>>> fe0d3d84) + 104c
>>>>>  fe063bbc shmcb_retrieve_session (259f40, fddb2838, ffbff270,
>>>>> 6b63feff, 80808080, 1010101) + 118
>>>>>  fe063044 ssl_scache_shmcb_retrieve (259f40, ffbff2e0, 20683c,
>>>>> ffbff60c, 468740, 468e74) + 7c
>>>>>  fe061430 ssl_scache_retrieve (259f40, ffbff360, 0, 0, 46895c,
>>>>> ffbff3a0) + f4
>>>>>  fe05e8f4 ssl_callback_GetSessionCacheEntry (2000, 33f498, ffffffff,
>>>>> ffffffff, ffbff6f8, 4683c4) + 88
>>>>>
>>>>
>>>> SSLSessionCache none (or something) will avoid this code/crash, but
>>>> you'll likely encounter noticeable performance degradation (client response
>>>> time and/or server CPU).  Unless the crash is happening very frequently
>>>> (i.e., severely affecting service) you probably don't want to do that.
>>>>
>>>> You need to get assistance from Oracle.  This is a proprietary SSL
>>>> toolkit and proprietary patches to old levels of open source.
>>>>
>>> Can you please let me know why you suspect SSLSessionCache?
>>>
>>
>> Because the stack traces I responded to above are for looking up session
>> cache entries...
>>
> I have gone through the page at
> http://publib.boulder.ibm.com/httpserv/ihsdiag/get_backtrace.html
> Following that page, here is the pflags output for the core dump:
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
> core 'core' of 13893: /ora10gas/OracleAS/Apache/Apache/bin/httpd -d
> /ora10gas/OracleAS/Apach
> data model = _ILP32 flags = MSACCT|MSFORK
> /1: flags = STOPPED
> why = PR_SUSPENDED
> lwppend = 0x00000400,0x00000000
> /2: flags = DETACH
> sigmask = 0xfffffefd,0x0000ffff cursig = SIGSEGV
> /3: flags = DETACH|STOPPED lwp_park(0x4,0xfdafbe08,0x0)
> why = PR_SUSPENDED
> sigmask = 0x0000e001,0x00000000
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Note that thread 2 has cursig = SIGSEGV next to it. That is the thread that
> Solaris thinks did the dirty deed. The pstack output for thread 2 is as
> follows:
>
>
> ----------------- lwp# 2 / thread# 2 --------------------
> ff16e298 __pollsys (fdc1be68, 0, fdc1bed0, 0, 0, 0) + 8
> ff109abc pselect (fdc1be68, ff1e6790, ff1e6790, 0, fdc1bed0, 0) + 1c8
> ff109e34 select (0, 0, 0, 0, fdc1bf38, fe002394) + a0
> fe0035d8 swwwcsl_Sleep (ea60, 7, fe012d30, 3645, fdb00200, 1) + 40
> fe00470c wwccuctp_CleanupThreadProc (42bca0, fdc1c000, 0, 0, fe00465c, 1)
> + b0
> ff16a9c8 _lwp_start (0, 0, 0, 0, 0, 0)
>
> So I want to understand why you suspect thread 1.
>
>
>>
>>
>>>
>>>>
>>>>>   fe1730bc nzospGetSession (ffbff448, ffbff450, 4683cc, 468740,
>>>>> 468740, 45b2cb) + 24
>>>>>  f87fdde0 ssl_Hshk_GetSessionID (20, 468909, ffbff534, 4, 468710,
>>>>> 468740) + a8
>>>>>  f8884274 ssl_Hshk_Priv_GetSessionDBRecord (468710, ffbff53f,
>>>>> ffbff534, 468964, 0, 1) + 74
>>>>>  f8883f04 ssl_Hshk_Priv_ProcessClientHello (300, 300, 469190, 468710,
>>>>> 0, ffbff5b0) + 174
>>>>>  f8879d24 STM_ExecuteLine (455238, f906374c, 1001, 469190, 0, 45525c)
>>>>> + 40
>>>>>  f8879a94 STM_DoOneCycle (455238, ffbff6dc, 20683c, ffbff60c, 468740,
>>>>> 468e74) + 148
>>>>>  f88798fc STM_Operate (455238, ffbff6dc, f8882b74, 468710, 46895c,
>>>>> 20683c) + 14
>>>>>  f87f3f90 ssl_Hshk_HandshakeProceed (468710, 0, 0, 4000, ffbff6f8,
>>>>> 4683c4) + b0
>>>>>  f87f2a1c ssl_Handshake (468710, 810a0038, 810d0013, 810d0000,
>>>>> ffbff780, 4683c4) + 30
>>>>>  fe16d80c nzos_Handshake (4683b8, 33f4cc, 434478, fffffff8, 0, 4364b0)
>>>>> + b0
>>>>>  fe05ca90 SSL_new_server_side (fe100818, 33f4cc, fe0ec98c, 2400, 2664,
>>>>> 4314b0) + 13c
>>>>>  fe05c8a0 ssl_hook_NewConnection (431440, 91314, 9b3f0, ffbff8d8,
>>>>> 8c4c0, 902ec) + 14c
>>>>>  00030538 new_connection (a59e0, 933a0, a5a18, ffbff994, ffbff984, 2)
>>>>> + 12c
>>>>>  00031ee4 child_main (90ad4, d8c, e20, a7c, 1800, c00) + 95c
>>>>>  00032278 make_child (933a0, 2, 516551db, 10, 1cf4, ff1e8140) + 16c
>>>>>  00032358 startup_children (5, 14, 869d8, 1b840, 0, 21cc8) + 8c
>>>>>  00032be8 standalone_main (800, 878, c00, d64, 1800, 1a44c) + 28c
>>>>>  00033820 main     (c00, dec, 1800, 1a20, 1800, 19ec) + 568
>>>>>  000193c0 _start   (0, 0, 0, 0, 0, 0) + 108
>>>>> -----------------  lwp# 2 / thread# 2  --------------------
>>>>>  ff16e298 __pollsys (fdc1be68, 0, fdc1bed0, 0, 0, 0) + 8
>>>>>  ff109abc pselect  (fdc1be68, ff1e6790, ff1e6790, 0, fdc1bed0, 0) + 1c8
>>>>>  ff109e34 select   (0, 0, 0, 0, fdc1bf38, fe002394) + a0
>>>>>  fe0035d8 swwwcsl_Sleep (ea60, 7, fe012d30, 3645, fdb00200, 1) + 40
>>>>>  fe00470c wwccuctp_CleanupThreadProc (42bca0, fdc1c000, 0, 0,
>>>>> fe00465c, 1) + b0
>>>>>  ff16a9c8 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>> -----------------  lwp# 3 / thread# 3  --------------------
>>>>>  ff16aa6c __lwp_park (1, 42be50, fdafbe08, 0, 7dc18, 0) + 14
>>>>>  ff164ab0 cond_wait_queue (42bdc8, 42be50, fdafbe08, 0, 0, 0) + 4c
>>>>>  ff164ef4 cond_wait_common (42bdc8, 42be50, fdafbe08, 0, 0, 0) + 294
>>>>>  ff165088 _cond_timedwait (42bdc8, 42be50, fdafbed0, 0, 0, 0) + 34
>>>>>  ff16517c cond_timedwait (42bdc8, 42be50, fdafbed0, 0, 0, fdafbed8) +
>>>>> 14
>>>>>  f8df683c sltspctimewait (21e658, 3f99e0, 3f99e4, 493e0, 42be50,
>>>>> ff163f48) + d4
>>>>>  fe0034ac swwwctwe_TimedWaitForEvent (3f99e0, 493e0, fe0121ec, 8,
>>>>> f4240, 0) + 40
>>>>>  fe000afc wwchmctp_CHMThreadProc (18f4, 1800, 1800, 1800, 1800, 1a3c)
>>>>> + 118
>>>>>  ff16a9c8 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> I need your help to analyze this issue further. I want to understand
>>>>> the cause of the core dump and how we can fix it. I have also raised
>>>>> this with Oracle support but have not received a proper reply.
>>>>>
>>>>
>>>>
Sadly, pflags is not always correct.  (I think it may be related to
sig_coredump() blocking the synchronous signal upon entry, but in my
experience only customers^H^H^H^H...users can duplicate this so I haven't
played.)

A way to double check this is to look at what thread 2 is doing -- a
blocking syscall.  Those don't crash.

A best-effort way to double check that thread 1 really could have crashed
is to look at what it was doing: memcpy, which will fault if it is handed a
bad pointer or length.

What about all the other threads?  Also blocking.
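
If you want to script that eyeball check, here is a rough Python sketch.  It
is only an illustration, not a diagnostic tool: it pulls the innermost frame
of each thread out of pstack text and flags the threads that are not sitting
in a blocking call.  The names top_frames, crash_candidates and
BLOCKING_TOP_FRAMES are made up for this example, and the list of "blocking"
entry points is just the two that show up in this particular trace.

```python
import re

# Assumption: a hand-picked set of Solaris libc entry points that mean
# "this thread is parked in a blocking call". Only the two seen in this
# trace are listed; extend it for your own dumps.
BLOCKING_TOP_FRAMES = {"__pollsys", "__lwp_park"}

# Matches pstack thread separators such as
# "-----------------  lwp# 1 / thread# 1  --------------------"
THREAD_HEADER = re.compile(r"-+\s*lwp#\s*(\d+)\s*/\s*thread#\s*(\d+)\s*-+")
# Matches a frame line: 8 hex digits, a symbol name, then "(".
FRAME = re.compile(r"^\s*([0-9a-f]{8})\s+(\S+)\s*\(")


def top_frames(pstack_text):
    """Return {lwp id: symbol of the innermost frame} for each thread."""
    result = {}
    current = None
    for line in pstack_text.splitlines():
        header = THREAD_HEADER.search(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME.match(line)
        # Only the first frame after a header is the innermost one.
        if frame and current is not None and current not in result:
            result[current] = frame.group(2)
    return result


def crash_candidates(pstack_text):
    """Return lwp ids whose innermost frame is not a known blocking call."""
    tops = top_frames(pstack_text)
    return sorted(lwp for lwp, name in tops.items()
                  if name not in BLOCKING_TOP_FRAMES)


# First two frames of each thread, copied from the pstack output above
# (wrapped lines rejoined).
SAMPLE = """\
-----------------  lwp# 1 / thread# 1  --------------------
 ff091a28 memcpy   (ffbfc998, fddb2838, ffbff1e4, ffbff1f4, fe0d3d64, fe0d3d84) + 104c
 fe063bbc shmcb_retrieve_session (259f40, fddb2838, ffbff270, 6b63feff, 80808080, 1010101) + 118
-----------------  lwp# 2 / thread# 2  --------------------
 ff16e298 __pollsys (fdc1be68, 0, fdc1bed0, 0, 0, 0) + 8
 ff109abc pselect  (fdc1be68, ff1e6790, ff1e6790, 0, fdc1bed0, 0) + 1c8
-----------------  lwp# 3 / thread# 3  --------------------
 ff16aa6c __lwp_park (1, 42be50, fdafbe08, 0, 7dc18, 0) + 14
 ff164ab0 cond_wait_queue (42bdc8, 42be50, fdafbe08, 0, 0, 0) + 4c
"""

if __name__ == "__main__":
    print(top_frames(SAMPLE))
    print(crash_candidates(SAMPLE))
```

On the sample, thread 1 (in memcpy) is the only candidate left.  Treat that
as a hint about where to look, not proof; a thread in user code is merely one
that could have faulted.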




-- 
Born in Roswell... married an alien...
http://emptyhammock.com/
