[jira] [Created] (TS-1032) Assertion when upstream connection is established (with event handled by thread A) and immediately disconnected (handled by thread B)

2011-11-24 Thread Uri Shachar (Created) (JIRA)
Assertion when upstream connection is established (with event handled by thread 
A) and immediately disconnected (handled by thread B)
-

 Key: TS-1032
 URL: https://issues.apache.org/jira/browse/TS-1032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core, HTTP
Affects Versions: 3.1.1
 Environment: Linux 32bit CentOS 5.4. Pre-open source version of ATS.
Reporter: Uri Shachar


This happened twice on a very old version of ATS (pre open-source code), but it 
looks like it can happen in current ATS as well (it's a very rare race 
condition; I haven't been able to reproduce it).

Scenario:
1)  A client request arrives, is handled by TS thread 1, and is reenabled
    by a plugin (inside a continuation called via ContSched).
2)  TS thread 2 starts to connect upstream.
3)  A client disconnection event is placed in thread 1's queue.
4)  A successful connection event is placed in thread 2's queue.
5)  Thread 1 starts to handle pending events (setting cur_time to X).
6)  Thread 2 starts to handle pending events (setting cur_time to Z = X + Y).
7)  Thread 2 handles the connection-established event (setting
    server_first_connect to Z).
8)  Thread 1 handles the client disconnection event, computes a negative
    wait, and asserts (see the sketch after this list).
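A minimal, self-contained sketch of how the negative wait arises when the two
threads use their own cached event-loop times; the variable names below are
illustrative stand-ins, not the actual HttpSM members:

{code}
#include <cassert>
#include <cstdint>
#include <cstdio>

typedef int64_t ink_hrtime; // nanoseconds, as in ATS

int main() {
  // Step 5: thread 1 caches its event-loop time X (hypothetical values).
  ink_hrtime thread1_cached_time = 1000;      // X
  // Step 6: thread 2 caches its event-loop time Z = X + Y slightly later.
  ink_hrtime thread2_cached_time = 1000 + 50; // Z

  // Step 7: thread 2 handles "connection established" and records the
  // connect timestamp using *its* cached time.
  ink_hrtime server_first_connect = thread2_cached_time; // Z

  // Step 8: thread 1 handles the client abort and computes the wait since
  // the upstream connect using *its* (older) cached time.
  ink_hrtime wait = thread1_cached_time - server_first_connect; // X - Z < 0

  printf("wait = %lld ns\n", (long long)wait);
  assert(wait >= 0); // fires here, matching the "failed assert `wait >= 0`"
  return 0;
}
{code}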

Sample stack trace:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xe3131b90 (LWP 14584)]
0xe410 in __kernel_vsyscall ()
#0  0xe410 in __kernel_vsyscall ()
#1  0x007e2df0 in raise () from /lib/libc.so.6
#2  0x007e484e in abort () from /lib/libc.so.6
#3  0x08427612 in ink_die_die_die (retval=1) at 
/usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:45
#4  0x08427778 in ink_fatal_va (return_code=1, message_format=0xe312ee1f 
"/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572: 
failed assert `wait >= 0`", ap=0xe312ee08 "\002") at 
/usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:100
#5  0x084277d3 in ink_fatal (return_code=1, message_format=0xe312ee1f 
"/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572: 
failed assert `wait >= 0`") at 
/usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:111
#6  0x08424508 in _ink_assert (a=0x853db72 "wait >= 0", f=0x853ab3c 
"/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc", 
l=5572) at /usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_assert.cc:27
#7  0x082f2505 in HttpSM::mark_server_down_on_client_abort (this=0xb622ece0) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572
#8  0x082f6080 in HttpSM::state_watch_for_client_abort (this=0xb622ece0, 
event=3, data=0x7e0e2a88) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:1148
#9  0x082fad0f in HttpSM::main_handler (this=0xb622ece0, event=3, 
data=0x7e0e2a88) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:3213
#10 0x0810a07b in Continuation::handleEvent (this=0xb622ece0, event=3, 
data=0x7e0e2a88) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/Continuation.h:85
#11 0x083ab348 in read_signal_and_update (event=3, vc=0x7e0e2a30) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:262
#12 0x083ab3fe in read_signal_done (event=3, nh=0xa339b28, vc=0x7e0e2a30) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:300
#13 0x083ab44f in read_signal_error (nh=0xa339b28, vc=0x7e0e2a30, lerrno=104) 
at /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:324
#14 0x083ae1c5 in read_from_net (nh=0xa339b28, vc=0x7e0e2a30, thread=0xa32e490) 
at /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:783
#15 0x083ae5a7 in UnixNetVConnection::net_read_io (this=0x7e0e2a30, 
nh=0xa339b28, lthread=0xa32e490) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:1059
#16 0x083adced in NetHandler::mainNetEvent (this=0xa339b28, event=5, 
e=0xa1ab810) at /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:1272
#17 0x0810a07b in Continuation::handleEvent (this=0xa339b28, event=5, 
data=0xa1ab810) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/Continuation.h:85
#18 0x083a19ac in EThread::process_event (this=0xa32e490, e=0xa1ab810, 
calling_code=5) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixEThread.cc:132
#19 0x0839f800 in EThread::execute (this=0xa32e490) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixEThread.cc:315
#20 0x083b4f9a in spawn_thread_internal (a=0xa385360) at 
/usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixThread.cc:71
#21 0x009065ab in start_thread () from /lib/libpthread.so.0
#22 0x0088bcfe in clone () from /lib/libc.so.6


[jira] [Updated] (TS-1032) Assertion when upstream connection is established (with event handled by thread A) and immediately disconnected (handled by thread B)

2011-11-24 Thread Uri Shachar (Updated) (JIRA)

 [ https://issues.apache.org/jira/browse/TS-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uri Shachar updated TS-1032:


Attachment: wait_patch.diff
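The attachment itself is not inlined in this archive. As a rough illustration
only (an assumption, not the contents of wait_patch.diff), a guard against the
cross-thread timestamp skew could look like the following:

{code}
#include <algorithm>
#include <cstdint>

typedef int64_t ink_hrtime;

// Hypothetical helper (NOT the contents of wait_patch.diff): if another
// thread recorded server_first_connect with a newer cached time, treat the
// skew as zero wait instead of tripping the assert.
static ink_hrtime client_wait_after_connect(ink_hrtime cur_time,
                                            ink_hrtime server_first_connect) {
  return std::max<ink_hrtime>(0, cur_time - server_first_connect);
}
{code}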

> Assertion when upstream connection is established (with event handled by 
> thread A) and immediately disconnected (handled by thread B)
> -
>
> Key: TS-1032
> URL: https://issues.apache.org/jira/browse/TS-1032
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, HTTP
>Affects Versions: 3.1.1
> Environment: Linux 32bit CentOS 5.4. Pre-open source version of ATS.
>Reporter: Uri Shachar
> Attachments: wait_patch.diff
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

[jira] [Updated] (TS-1006) memory management, cut down memory waste ?

2011-11-24 Thread Zhao Yongming (Updated) (JIRA)

 [ https://issues.apache.org/jira/browse/TS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhao Yongming updated TS-1006:
--

Attachment: memusage.ods

This is the result after mohan_zl's patch that uses Da & rbtree to reduce the 
memory alignment waste, 1.8G for my env.
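For context on where the waste in the dump below comes from: the ioBufAllocator
free lists step in power-of-two sizes (128 B up to 2 MB for
ioBufAllocator[0]..[14] in the table), so each buffer request is rounded up to
the next bucket and the remainder is padding. A minimal sketch of that rounding
with a hypothetical 70 KB payload and an illustrative helper (not the actual
ATS function):

{code}
#include <cstdint>
#include <cstdio>

// Smallest ioBuf bucket index whose size (128 << index) holds `size` bytes.
// Simplified stand-in for the real buffer-size-to-index mapping in ATS.
static int bucket_index_for(int64_t size) {
  int idx = 0;
  while (((int64_t)128 << idx) < size)
    ++idx;
  return idx;
}

int main() {
  const int64_t payload = 70000;              // hypothetical 70 KB object
  const int idx = bucket_index_for(payload);  // -> ioBufAllocator[10]
  const int64_t bucket = (int64_t)128 << idx; // 131072 bytes
  printf("payload %lld B -> ioBufAllocator[%d] (%lld B), %lld B of padding\n",
         (long long)payload, idx, (long long)bucket,
         (long long)(bucket - payload));
  return 0;
}
{code}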

> memory management, cut down memory waste ?
> --
>
> Key: TS-1006
> URL: https://issues.apache.org/jira/browse/TS-1006
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 3.1.1
>Reporter: Zhao Yongming
>Assignee: Zhao Yongming
> Fix For: 3.1.3
>
> Attachments: memusage.ods, memusage.ods
>
>
> When we review the memory usage in production, there is something 
> abnormal; it looks like TS takes much more memory than the index data 
> plus common system waste. Here are some memory dump results obtained by 
> setting "proxy.config.dump_mem_info_frequency":
> 1. The one on a not-so-busy forwarding system:
> physical memory: 32G
> RAM cache: 22G
> DISK: 6140 GB
> average_object_size 64000
> {code}
>      allocated |     in-use | type size | free list name
> ---------------|------------|-----------|------------------------------------
>      671088640 |   37748736 |   2097152 | memory/ioBufAllocator[14]
>     2248146944 | 2135949312 |   1048576 | memory/ioBufAllocator[13]
>     1711276032 | 1705508864 |    524288 | memory/ioBufAllocator[12]
>     1669332992 | 1667760128 |    262144 | memory/ioBufAllocator[11]
>     2214592512 |     221184 |    131072 | memory/ioBufAllocator[10]
>     2325741568 | 2323775488 |     65536 | memory/ioBufAllocator[9]
>     2091909120 | 2089123840 |     32768 | memory/ioBufAllocator[8]
>     1956642816 | 1956478976 |     16384 | memory/ioBufAllocator[7]
>     2094530560 | 2094071808 |      8192 | memory/ioBufAllocator[6]
>      356515840 |  355540992 |      4096 | memory/ioBufAllocator[5]
>        1048576 |      14336 |      2048 | memory/ioBufAllocator[4]
>         131072 |          0 |      1024 | memory/ioBufAllocator[3]
>          65536 |          0 |       512 | memory/ioBufAllocator[2]
>          32768 |          0 |       256 | memory/ioBufAllocator[1]
>          16384 |          0 |       128 | memory/ioBufAllocator[0]
>              0 |          0 |       576 | memory/ICPRequestCont_allocator
>              0 |          0 |       112 | memory/ICPPeerReadContAllocator
>              0 |          0 |       432 | memory/PeerReadDataAllocator
>              0 |          0 |        32 | memory/MIMEFieldSDKHandle
>              0 |          0 |       240 | memory/INKVConnAllocator
>              0 |          0 |        96 | memory/INKContAllocator
>           4096 |          0 |        32 | memory/apiHookAllocator
>              0 |          0 |       288 | memory/FetchSMAllocator
>              0 |          0 |        80 | memory/prefetchLockHandlerAllocator
>              0 |          0 |       176 | memory/PrefetchBlasterAllocator
>              0 |          0 |        80 | memory/prefetchUrlBlaster
>              0 |          0 |        96 | memory/blasterUrlList
>              0 |          0 |        96 | memory/prefetchUrlEntryAllocator
>              0 |          0 |       128 | memory/socksProxyAllocator
>              0 |          0 |       144 | memory/ObjectReloadCont
>        3258368 |     576016 |       592 | memory/httpClientSessionAllocator
>         825344 |     139568 |       208 | memory/httpServerSessionAllocator
>       22597632 |    1284848 |      9808 | memory/httpSMAllocator
>              0 |          0 |        32 | memory/CacheLookupHttpConfigAllocator
>              0 |          0 |      9856 | memory/httpUpdateSMAllocator
>              0 |          0 |       128 | memory/RemapPluginsAlloc
>              0 |          0 |        48 | memory/CongestRequestParamAllocator
>              0 |          0 |       128 | memory/CongestionDBContAllocator
>        5767168 |     704512 |      2048 | memory/hdrStrHeap
>       18350080 |    1153024 |      2048 | memory/hdrHeap
>          53248 |       2912 |       208 | memory/httpCacheAltAllocator
>