[jira] [Commented] (TS-1130) ink_time_t is 64bit on x86_64

2012-04-20 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258130#comment-13258130
 ] 

weijin commented on TS-1130:


Per Leif's advice, this is handled by checking sizeof() on ink_time_t before the 
atomic CAS.
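
A minimal sketch of the idea, not the contents of TS-1130.diff: the width of
the time type is checked at compile time, and the CAS goes through a GCC/Clang
builtin that takes the operand width from the pointer type. The names
ink_time_sketch_t and cas_time_sketch are hypothetical and used only for
illustration.

```cpp
#include <ctime>

// Hypothetical stand-in for ink_time_t; the real typedef lives in Traffic Server.
typedef time_t ink_time_sketch_t;

// Refuse to build if the time type is not one of the widths we expect,
// so a 64-bit value can never be silently treated as a 32-bit one.
static_assert(sizeof(ink_time_sketch_t) == 4 || sizeof(ink_time_sketch_t) == 8,
              "unexpected width for the time type");

// Atomically replace *mem with `desired` if it still equals `expected`.
// __sync_bool_compare_and_swap is a GCC/Clang builtin that is generic over
// the pointed-to integer type, so the operand width follows sizeof(*mem).
inline bool cas_time_sketch(volatile ink_time_sketch_t *mem,
                            ink_time_sketch_t expected,
                            ink_time_sketch_t desired)
{
  return __sync_bool_compare_and_swap(mem, expected, desired);
}
```

A run-time branch on sizeof() that calls a 32-bit or a 64-bit CAS helper would
be another way to do the same check; which form the patch uses is not stated
in this thread.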

> ink_time_t is 64bit on x86_64
> -
>
> Key: TS-1130
> URL: https://issues.apache.org/jira/browse/TS-1130
> Project: Traffic Server
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Zhao Yongming
>Assignee: weijin
> Fix For: 3.1.4
>
> Attachments: TS-1130.diff
>
>
> Weijin: paste your patch here, :D

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (TS-1203) Crash report: HdrHeap::duplicate_str, in host_set

2012-04-19 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257476#comment-13257476
 ] 

weijin commented on TS-1203:


This is similar to TS-996. We will backport it and test again.

> Crash report: HdrHeap::duplicate_str, in host_set
> -
>
> Key: TS-1203
> URL: https://issues.apache.org/jira/browse/TS-1203
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Affects Versions: 3.1.3
> Environment: 3.0.x, new crashes
>Reporter: Zhao Yongming
>Assignee: weijin
>Priority: Critical
> Fix For: 3.1.4
>
>
> we get some new crashes in the production:
> {code}
> warning: no loadable sections found in added symbol-file system-supplied DSO 
> at 0x727fd000
> Core was generated by `/usr/bin/traffic_server -M -A,12:X,13:X'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x003e5b07c24e in memcpy () from /lib64/libc.so.6
> (gdb) bt
> #0  0x003e5b07c24e in memcpy () from /lib64/libc.so.6
> #1  0x005aab68 in HdrHeap::duplicate_str (this=, 
> str=0x2aae474a6ec0 , 
> nbytes=21) at HdrHeap.cc:344
> #2  0x005b3ac3 in mime_str_u16_set (heap=0x2aaabd62be12, 
> s_str=0x2aae474a6ec0 , s_len=21, 
> d_str=0x2aae3656f348, d_len=0x2aae3656f322, must_copy=true) at 
> MIME.cc:3034
> #3  0x005aef28 in host_set (this=0x2aae268f8c18, url=<value optimized out>) at URL.h:541
> #4  HTTPHdr::set_url_target_from_host_field (this=0x2aae268f8c18, url=<value optimized out>) at HTTP.cc:1484
> #5  0x0055dc69 in RemapProcessor::setup_for_remap (this=<value optimized out>, s=0x2aae268f83c8) at RemapProcessor.cc:130
> #6  0x005165d9 in HttpSM::do_remap_request (this=0x2aae268f8360, 
> run_inline=true) at HttpSM.cc:3666
> #7  0x00526cbb in HttpSM::set_next_state (this=0x2aaabd62be12) at 
> HttpSM.cc:6392
> #8  0x005136f0 in HttpSM::call_transact_and_set_next_state 
> (this=0x2aae268f8360, f=) at HttpSM.cc:6345
> #9  0x00526713 in HttpSM::set_next_state (this=0x2aae268f8360) at 
> HttpSM.cc:6553
> #10 0x005136f0 in HttpSM::call_transact_and_set_next_state 
> (this=0x2aae268f8360, f=) at HttpSM.cc:6345
> #11 0x00526713 in HttpSM::set_next_state (this=0x2aae268f8360) at 
> HttpSM.cc:6553
> #12 0x005136f0 in HttpSM::call_transact_and_set_next_state 
> (this=0x2aae268f8360, f=) at HttpSM.cc:6345
> #13 0x00520f21 in HttpSM::state_read_client_request_header 
> (this=0x2aae268f8360, event=100, data=)
> at HttpSM.cc:783
> #14 0x005259b9 in HttpSM::main_handler (this=0x2aae268f8360, 
> event=100, data=0x2aae68aee6e0) at HttpSM.cc:2456
> #15 0x0066d1fb in handleEvent (nh=0x2b105668, vc=0x2aae68aee520, 
> thread=0x2b104010)
> at ../../iocore/eventsystem/I_Continuation.h:146
> #16 read_signal_and_update (nh=0x2b105668, vc=0x2aae68aee520, 
> thread=0x2b104010) at UnixNetVConnection.cc:138
> #17 read_from_net (nh=0x2b105668, vc=0x2aae68aee520, 
> thread=0x2b104010) at UnixNetVConnection.cc:320
> #18 0x00666579 in NetHandler::mainNetEvent (this=0x2b105668, 
> event=, e=0x2b8ed028) at UnixNet.cc:389
> #19 0x00691c8f in EThread::process_event (this=0x2b104010, 
> e=0x35681c0, calling_code=5) at I_Continuation.h:146
> #20 0x0069259c in EThread::execute (this=0x2b104010) at 
> UnixEThread.cc:263
> #21 0x0069115e in spawn_thread_internal (a=0x35621b0) at Thread.cc:88
> #22 0x003e5b80673d in start_thread () from /lib64/libpthread.so.0
> #23 0x003e5b0d44bd in clone () from /lib64/libc.so.6
> (gdb) f 1
> #1  0x005aab68 in HdrHeap::duplicate_str (this=, 
> str=0x2aae474a6ec0 , 
> nbytes=21) at HdrHeap.cc:344
> 344 memcpy(new_str, str, nbytes);
> (gdb) p str
> $1 = 0x2aae474a6ec0 
> (gdb) p nbytes
> $2 = 21
> (gdb) f 2
> #2  0x005b3ac3 in mime_str_u16_set (heap=0x2aaabd62be12, 
> s_str=0x2aae474a6ec0 , s_len=21, 
> d_str=0x2aae3656f348, d_len=0x2aae3656f322, must_copy=true) at 
> MIME.cc:3034
> 3034  s_str = heap->duplicate_str(s_str, s_len);
> (gdb) p s_str
> $3 = 0x2aae474a6ec0 
> (gdb) f 3
> #3  0x005aef28 in host_set (this=0x2aae268f8c18, url=<value optimized out>) at URL.h:541
> 541 url_host_set(m_heap, m_url_impl, value, length, true);
> (gdb) p value
> $4 = 
> (gdb) p length
> $5 = 
> (gdb) f 2
> #2  0x005b3ac3 in mime_str_u16_set (heap=0x2aaabd62be12, 
> s_str=0x2aae474a6ec0 , s_len=21, 
> d_str=0x2aae3656f348, d_len=0x2aae3656f322, must_copy=true) at 
> MIME.cc:3034
> 3034  s_str = heap->duplicate_str(s_str, s_len);
> (gdb) l
> 3029//either NULL or be valid ptr for a string already
> 3030//the string heaps
> 3031heap->free_string(*d_str, *d_len);
> 3032  
> 3033if (must_copy && s_str) {
> 3034

[jira] [Commented] (TS-621) writing 0 bytes to the HTTP cache means only update the header... need a new API: update_header_only() to allow 0 byte files to be cached

2012-04-18 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257302#comment-13257302
 ] 

weijin commented on TS-621:
---

John: the patch was only a temporary solution, and I did not take the
situation you mentioned into account (I was not even aware of it). If you have
any ideas about it, please tell me.

On Thu, 2012-04-19 at 03:51 +, John Plevyak (Commented) (JIRA)





> writing 0 bytes to the HTTP cache means only update the header... need a new 
> API: update_header_only() to allow 0 byte files to be cached
> -
>
> Key: TS-621
> URL: https://issues.apache.org/jira/browse/TS-621
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Cache
>Affects Versions: 2.1.5
>Reporter: John Plevyak
>Assignee: weijin
> Fix For: 3.1.4
>
> Attachments: TS-621_cluster_zero_size_objects.patch, 
> force_empty.diff, ts-621-jp-1.patch, ts-621-jp-2.patch, ts-621-jp-3.patch
>
>






[jira] [Commented] (TS-1130) ink_time_t is 64bit on x86_64

2012-04-10 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250553#comment-13250553
 ] 

weijin commented on TS-1130:


We found that on some 64-bit systems, ink_atomic_cas on a time_t variable can 
lead to very strange problems, such as an endless loop.

> ink_time_t is 64bit on x86_64
> -
>
> Key: TS-1130
> URL: https://issues.apache.org/jira/browse/TS-1130
> Project: Traffic Server
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Zhao Yongming
>Assignee: weijin
> Fix For: 3.1.4
>
> Attachments: TS-1130.diff
>
>
> Weijin: paste your patch here, :D





[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-20 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234171#comment-13234171
 ] 

weijin commented on TS-1158:


I see that the read_from_net and write_to_net_io functions also have such a 
mechanism to prevent the race condition. I have read it again and again, but I 
still cannot figure out how the mutex is switched. Can you explain it in more 
detail? I also want to know what the consequences of the race are. Thanks very 
much.

> Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
> 
>
> Key: TS-1158
> URL: https://issues.apache.org/jira/browse/TS-1158
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 3.0.3
> Environment: ALL
>Reporter: John Plevyak
>Assignee: John Plevyak
> Fix For: 3.1.4
>
> Attachments: ts-1158-jp1.patch
>
>
> Because of the way session management works, the vio.mutex must be 
> re-verified to be identical to the one the lock was taken on after the lock 
> is acquired.  Otherwise there is a race when the mutex is switched, such 
> that the old lock is held while the new lock is not held.
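
The re-verification described above can be sketched roughly as follows. This
is an illustration with standard C++ types, not the code in ts-1158-jp1.patch;
VioSketch and lock_verified_sketch are hypothetical names, and std::mutex
stands in for Traffic Server's own mutex type.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for a VIO whose mutex pointer can be repointed
// by another thread (for example by session management).
struct VioSketch {
  explicit VioSketch(std::mutex *m) : mutex(m) {}
  std::atomic<std::mutex *> mutex;
};

// Try to lock the mutex the VIO points at, then confirm that the VIO still
// points at that same mutex. Returns the locked mutex (the caller must
// unlock it), or nullptr if the lock was busy or the mutex was switched
// between the snapshot and the acquisition.
inline std::mutex *lock_verified_sketch(VioSketch &vio)
{
  std::mutex *snapshot = vio.mutex.load();
  if (!snapshot->try_lock())
    return nullptr; // lock missed, the caller should retry later
  if (vio.mutex.load() != snapshot) {
    // We hold the old lock, but the VIO is now protected by a new one.
    snapshot->unlock();
    return nullptr;
  }
  return snapshot;
}
```

Without the second load, a caller could go on to touch the VIO while holding
only the old mutex, which is the race the issue describes.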





[jira] [Commented] (TS-1154) quick_filter on HEAD does not work

2012-03-20 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233303#comment-13233303
 ] 

weijin commented on TS-1154:


We found that the order of HTTP methods in _hdrtoken_strs does not match the 
order in proxy.config.http.quick_filter.mask, so I changed the method order in 
_hdrtoken_strs to match the order in the config comments. I am sure TS-1140 
fixes this problem; if jMCg will not backport TS-1140 to 3.0.x, this patch may 
be needed as a short-term fix.
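
To illustrate why the table order matters, here is a small sketch. The two
method orders below are made up for the example and are not copied from
_hdrtoken_strs or from the config comments, and mask_selects_sketch is a
hypothetical helper. It only shows that the same mask bit selects a different
method when the lookup table is ordered differently from the documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical method orders, for illustration only.
static const char *const kConfigOrderSketch[] = {"GET", "HEAD", "POST", "PUT"};
static const char *const kTableOrderSketch[]  = {"GET", "POST", "HEAD", "PUT"};
static const std::size_t kMethodCountSketch   = 4;

// Returns true if the mask bit at the method's index in `table` is set.
// The bit position comes from the table order, so the table must list the
// methods in the same order that the mask documentation assumes.
inline bool mask_selects_sketch(const char *const *table, std::uint32_t mask,
                                const char *method)
{
  for (std::size_t i = 0; i < kMethodCountSketch; i++) {
    if (std::strcmp(table[i], method) == 0)
      return (mask & (1u << i)) != 0;
  }
  return false; // unknown method
}
```

With mask 0x2, the documented order selects HEAD, while the differently
ordered table selects POST instead and leaves HEAD unselected.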

> quick_filter on HEAD does not work
> --
>
> Key: TS-1154
> URL: https://issues.apache.org/jira/browse/TS-1154
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Zhao Yongming
>Assignee: weijin
> Attachments: head_method.diff
>
>
> we take quick filter as a good solution for some security concern, but when I 
> set it to 0x0733, it does not allow HEAD in, but setting as 0x0723 does that.
> Weijin have the patch in our tree: 
> https://gitorious.org/trafficserver/taobao/commit/cb23b87d167da4074e047fabc94786003ee94e9a/diffs/db7d0e5be69988b531e8d1e4eea717e6d46df5cd
> I will commit if no one complain in 2 days.





[jira] [Commented] (TS-621) writing 0 bytes to the HTTP cache means only update the header... need a new API: update_header_only() to allow 0 byte files to be cached

2012-02-27 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217116#comment-13217116
 ] 

weijin commented on TS-621:
---

Leif, John, please review the patch and give me your advice.

> writing 0 bytes to the HTTP cache means only update the header... need a new 
> API: update_header_only() to allow 0 byte files to be cached
> -
>
> Key: TS-621
> URL: https://issues.apache.org/jira/browse/TS-621
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Cache
>Affects Versions: 2.1.5
>Reporter: John Plevyak
>Assignee: John Plevyak
> Fix For: 3.1.4
>
> Attachments: TS-621_cluster_zero_size_objects.patch, 
> force_empty.diff, ts-621-jp-1.patch, ts-621-jp-2.patch, ts-621-jp-3.patch
>
>






[jira] [Commented] (TS-621) writing 0 bytes to the HTTP cache means only update the header... need a new API: update_header_only() to allow 0 byte files to be cached

2012-02-27 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217115#comment-13217115
 ] 

weijin commented on TS-621:
---

Based on John's solution, I have also written a patch to cache empty documents. 
It is simple and does not need to modify the tunnel (http_sm). The patch has a 
side effect in cluster mode (3): if force_cache_emtpy_doc is set, it can write 
an empty doc to another machine, but cannot read it back from that machine.

> writing 0 bytes to the HTTP cache means only update the header... need a new 
> API: update_header_only() to allow 0 byte files to be cached
> -
>
> Key: TS-621
> URL: https://issues.apache.org/jira/browse/TS-621
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Cache
>Affects Versions: 2.1.5
>Reporter: John Plevyak
>Assignee: John Plevyak
> Fix For: 3.1.4
>
> Attachments: TS-621_cluster_zero_size_objects.patch, 
> ts-621-jp-1.patch, ts-621-jp-2.patch, ts-621-jp-3.patch
>
>






[jira] [Commented] (TS-1112) traffic_cop may crash at free()

2012-02-26 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217064#comment-13217064
 ] 

weijin commented on TS-1112:


I am not sure why this happens, and it has not occurred again since another 
problem was solved. So maybe we can close it now and reopen it if it occurs 
again.

> traffic_cop may crash at free()
> ---
>
> Key: TS-1112
> URL: https://issues.apache.org/jira/browse/TS-1112
> Project: Traffic Server
>  Issue Type: Bug
> Environment: v3.0.x
>Reporter: Zhao Yongming
>Assignee: weijin
> Fix For: 3.1.4
>
>
> traffic_cop may crash, at memory free, that will leave the manager & server 
> alone, or may die with the manager too, leave a system without service.





[jira] [Commented] (TS-1109) stack dump may crash too

2012-02-26 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217059#comment-13217059
 ] 

weijin commented on TS-1109:


Breakpoint 1, ink_stack_trace_get (stack=0x7fffb4d0, len=100, 
signalhandler_frame=2) at ink_stack_trace.cc:85
85  {
(gdb) n
87if ((btl = backtrace(stack, len)) > 0) {
(gdb) 
88  if (signalhandler_frame)
(gdb) 
89ink_restore_signal_handler_frame(stack, btl, signalhandler_frame 
+ 1);
(gdb) s
ink_restore_signal_handler_frame (stack=0x7fffb4d0, len=, signalhandler_frame=2) at ink_stack_trace.cc:63
63asm volatile ("mov %%rbp,%0":"=r" (fp));
(gdb) info reg rbp
rbp0x2  0x2
(gdb) n
67for (i = 0; i < signalhandler_frame; i++)
(gdb) 
68  fp = (void **) (*fp);
(gdb) 

Program received signal SIGSEGV, Segmentation fault.


> stack dump may crash too
> 
>
> Key: TS-1109
> URL: https://issues.apache.org/jira/browse/TS-1109
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 3.1.2
>Reporter: Zhao Yongming
>Assignee: weijin
>  Labels: crash
> Fix For: 3.1.4
>
>
> the codes doing stack dump may crash, in this case you will not able to get a 
> core file, that will hide most of the rare issues.





[jira] [Commented] (TS-1109) stack dump may crash too

2012-02-26 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217056#comment-13217056
 ] 

weijin commented on TS-1109:


By disassembling ink_restore_signal_handler_frame, I found that in release mode 
%rbp holds the function args rather than the base frame address. The embedded 
assembly code for getting the base frame address is dangerous and has poor 
portability, so I removed it from the sources.
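
For comparison, a stack dump that relies only on glibc's backtrace() and never
reads %rbp could look roughly like this. It is a sketch under the assumption
of a glibc system with <execinfo.h>; the function names are hypothetical and
this is not the actual change made to ink_stack_trace.cc.

```cpp
#include <execinfo.h>
#include <unistd.h>

// Collect up to `len` return addresses with backtrace() and drop the `skip`
// innermost frames by index, instead of walking frame pointers by hand.
// Returns the number of addresses left in `stack`.
inline int stack_trace_get_sketch(void **stack, int len, int skip)
{
  int n = backtrace(stack, len);
  if (n <= skip)
    return 0;
  for (int i = skip; i < n; i++)
    stack[i - skip] = stack[i];
  return n - skip;
}

// Write the symbolized trace to stderr. backtrace_symbols_fd() does not
// call malloc(), unlike backtrace_symbols().
inline void stack_trace_dump_sketch()
{
  void *stack[100];
  int n = stack_trace_get_sketch(stack, 100, 1);
  if (n > 0)
    backtrace_symbols_fd(stack, n, STDERR_FILENO);
}
```

Skipping frames by index keeps the code independent of whether the compiler
uses %rbp as a frame pointer in release builds.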

> stack dump may crash too
> 
>
> Key: TS-1109
> URL: https://issues.apache.org/jira/browse/TS-1109
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 3.1.2
>Reporter: Zhao Yongming
>Assignee: weijin
>  Labels: crash
> Fix For: 3.1.4
>
>
> the codes doing stack dump may crash, in this case you will not able to get a 
> core file, that will hide most of the rare issues.





[jira] [Commented] (TS-1032) Assertion when upstream connection is established (with event handled by thread A) and immediately disconnected (handled by thread B)

2011-12-10 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166876#comment-13166876
 ] 

weijin commented on TS-1032:


cool. 

> Assertion when upstream connection is established (with event handled by 
> thread A) and immediately disconnected (handled by thread B)
> -
>
> Key: TS-1032
> URL: https://issues.apache.org/jira/browse/TS-1032
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, HTTP
>Affects Versions: 3.1.1
> Environment: Linux 32bit CentOS 5.4. Pre-open source version of ATS.
>Reporter: Uri Shachar
>Assignee: Leif Hedstrom
> Fix For: 3.1.2
>
> Attachments: wait_patch.diff
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> This happened twice on a very old version of ATS (pre opensource code), but 
> it looks like it can happen in current ATS as well (it's a very rare race 
> condition, haven't been able to reproduce).
> Scenario:
>   1)  Client request arrives, handled by TS thread 1 and is reenabled 
> by a plugin (Inside a continuation called by ContSched)
>   2)  TS thread 2 starts to connect upstream
>   3)  A client disconnection event is placed in thread 1 queue.
>   4)  A successful connection event is placed in thread 2 queue.
>   5)  Thread 1 starts to handle pending events (setting cur_time to X)
>   6)  Thread 2 starts to handle pending events (setting cur_time to 
> Z=X+Y)
>   7)  Thread 2 handles the connection established event (setting 
> server_first_connect to Z)
>   8)  Thread 1 handles the client disconnection event - Getting a 
> negative wait and asserting...
> Sample stack trace:
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0xe3131b90 (LWP 14584)]
> 0xe410 in __kernel_vsyscall ()
> #0  0xe410 in __kernel_vsyscall ()
> #1  0x007e2df0 in raise () from /lib/libc.so.6
> #2  0x007e484e in abort () from /lib/libc.so.6
> #3  0x08427612 in ink_die_die_die (retval=1) at 
> /usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:45
> #4  0x08427778 in ink_fatal_va (return_code=1, message_format=0xe312ee1f 
> "/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572:
>  failed assert `wait >= 0`", ap=0xe312ee08 "\002") at 
> /usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:100
> #5  0x084277d3 in ink_fatal (return_code=1, message_format=0xe312ee1f 
> "/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572:
>  failed assert `wait >= 0`") at 
> /usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_error.cc:111
> #6  0x08424508 in _ink_assert (a=0x853db72 "wait >= 0", f=0x853ab3c 
> "/tmp/ushachar-rpmbuild/BUILD/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc", 
> l=5572) at /usr/src/debug/wts/proxy/ts/traffic/libwebsense++/ink_assert.cc:27
> #7  0x082f2505 in HttpSM::mark_server_down_on_client_abort (this=0xb622ece0) 
> at /usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:5572
> #8  0x082f6080 in HttpSM::state_watch_for_client_abort (this=0xb622ece0, 
> event=3, data=0x7e0e2a88) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:1148
> #9  0x082fad0f in HttpSM::main_handler (this=0xb622ece0, event=3, 
> data=0x7e0e2a88) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/http2/HttpSM.cc:3213
> #10 0x0810a07b in Continuation::handleEvent (this=0xb622ece0, event=3, 
> data=0x7e0e2a88) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/Continuation.h:85
> #11 0x083ab348 in read_signal_and_update (event=3, vc=0x7e0e2a30) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:262
> #12 0x083ab3fe in read_signal_done (event=3, nh=0xa339b28, vc=0x7e0e2a30) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:300
> #13 0x083ab44f in read_signal_error (nh=0xa339b28, vc=0x7e0e2a30, lerrno=104) 
> at /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:324
> #14 0x083ae1c5 in read_from_net (nh=0xa339b28, vc=0x7e0e2a30, 
> thread=0xa32e490) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:783
> #15 0x083ae5a7 in UnixNetVConnection::net_read_io (this=0x7e0e2a30, 
> nh=0xa339b28, lthread=0xa32e490) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:1059
> #16 0x083adced in NetHandler::mainNetEvent (this=0xa339b28, event=5, 
> e=0xa1ab810) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/UnixNet.cc:1272
> #17 0x0810a07b in Continuation::handleEvent (this=0xa339b28, event=5, 
> data=0xa1ab810) at 
> /usr/src/debug/wts/proxy/ts/traffic/proxy/iocore/Continuation.h:85
> #18 0x083a19ac in EThread::process_event (this=0xa32e490, e=0xa1ab810, 
> calling_code=5) at 
> /usr/src/debu

[jira] [Commented] (TS-1030) hash collation in hdrtoken_hash

2011-11-23 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156480#comment-13156480
 ] 

weijin commented on TS-1030:


The function hdrtoken_tokenize should check that the string contents are equal, 
rather than just checking the hash and the string length. Maybe someone can 
give us a perfect hash for it.
@Leif, @John
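
A small sketch of the check being proposed. The hash below is a deliberately
weak toy (a sum of bytes) chosen so that a collision is easy to show; it is
not the hash used by hdrtoken_hash, and toy_hash_sketch, kTokensSketch and
tokenize_sketch are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Toy hash: sum of bytes modulo 64. Any two anagrams collide.
inline unsigned toy_hash_sketch(const char *s, std::size_t len)
{
  unsigned h = 0;
  for (std::size_t i = 0; i < len; i++)
    h += static_cast<unsigned char>(s[i]);
  return h % 64;
}

static const char *const kTokensSketch[] = {"GET", "PUT", "HEAD", "POST"};
static const std::size_t kTokenCountSketch = 4;

// Returns the index of the matching token, or -1 if there is none.
// A candidate with the same hash and the same length is accepted only
// if its bytes are also equal to the input.
inline int tokenize_sketch(const char *s, std::size_t len)
{
  unsigned h = toy_hash_sketch(s, len);
  for (std::size_t i = 0; i < kTokenCountSketch; i++) {
    std::size_t tlen = std::strlen(kTokensSketch[i]);
    if (toy_hash_sketch(kTokensSketch[i], tlen) == h && tlen == len &&
        std::memcmp(kTokensSketch[i], s, len) == 0)
      return static_cast<int>(i);
  }
  return -1;
}
```

Here "TUP" has the same toy hash and length as "PUT", but the final memcmp
rejects it, so it is not mistaken for a known token.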

> hash collation in hdrtoken_hash
> ---
>
> Key: TS-1030
> URL: https://issues.apache.org/jira/browse/TS-1030
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Affects Versions: 3.1.2, 3.0.1
>Reporter: Zhao Yongming
>Priority: Critical
>
> we have find out a 3 characters collation:
> SPX == PUT
> that will crash TS, we need to take more care of those hash, or bad guys may 
> put some magic headers and crash all TS in your production, that is the most 
> powerful DOS





[jira] [Commented] (TS-1028) Assert when enabling "shared origin connections" with a setting of "2"

2011-11-22 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155643#comment-13155643
 ] 

weijin commented on TS-1028:


Weird, I remember testing it before, and no deadlocks happened.

> Assert when enabling "shared origin connections" with a setting of "2"
> --
>
> Key: TS-1028
> URL: https://issues.apache.org/jira/browse/TS-1028
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Leif Hedstrom
>Assignee: Leif Hedstrom
> Fix For: 3.1.2
>
>
> When you enable the "2" setting for sharing origin connections (which creates 
> a connection pool per net-thread), we trigger an assert in Debug builds. I 
> think it only triggers too with logging enabled, but not certain.





[jira] [Commented] (TS-1028) Assert when enabling "shared origin connections" with a setting of "2"

2011-11-22 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155631#comment-13155631
 ] 

weijin commented on TS-1028:


Hmm, the netvc::reenable function requires the Cont's mutex to be held by the 
current thread, but when the "2" setting for sharing origin connections is 
enabled, no thread takes the sessionbucket's mutex. So, change the 
sessionbucket mutex to the thread mutex in initialize_thread_for_http_sessions.

> Assert when enabling "shared origin connections" with a setting of "2"
> --
>
> Key: TS-1028
> URL: https://issues.apache.org/jira/browse/TS-1028
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Leif Hedstrom
>Assignee: Leif Hedstrom
> Fix For: 3.1.2
>
>
> When you enable the "2" setting for sharing origin connections (which creates 
> a connection pool per net-thread), we trigger an assert in Debug builds. I 
> think it only triggers too with logging enabled, but not certain.





[jira] [Commented] (TS-857) Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close -> UnixNetVConnection::do_io_close

2011-11-05 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144938#comment-13144938
 ] 

weijin commented on TS-857:
---

@Leif, @amc, what is your advice?

> Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close 
> -> UnixNetVConnection::do_io_close
> --
>
> Key: TS-857
> URL: https://issues.apache.org/jira/browse/TS-857
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, Network
>Affects Versions: 3.1.0
> Environment: in my branch that is something same as 3.0.x
>Reporter: Zhao Yongming
>Assignee: weijin
> Fix For: 3.1.2
>
>
> here is the bt from the crash, some of the information is missing due to we 
> have not enable the --enable-debug configure options.
> {code}
> [New process 7532]
> #0  ink_stack_trace_get (stack=, len=<value optimized out>, signalhandler_frame=)
> at ink_stack_trace.cc:68
> 68fp = (void **) (*fp);
> (gdb) bt
> #0  ink_stack_trace_get (stack=, len=<value optimized out>, signalhandler_frame=)
> at ink_stack_trace.cc:68
> #1  0x2ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value optimized out>) at ink_stack_trace.cc:114
> #2  0x004df020 in signal_handler (sig=) at 
> signals.cc:225
> #3  
> #4  0x006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20, 
> alerrno=)
> at ../../iocore/eventsystem/I_Lock.h:297
> #5  0x0051f1d0 in HttpServerSession::do_io_close 
> (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
> #6  0x0056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70, 
> p=0x2aabeeffdf68) at HttpTunnel.cc:1300
> #7  0x005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070, 
> event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
> #8  0x00571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70, 
> event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
> #9  0x00572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70, 
> event=1088608784, data=)
> at HttpTunnel.cc:1456
> #10 0x006a6307 in write_to_net_io (nh=0x2b12d688, vc=0x1cc876e0, 
> thread=)
> at ../../iocore/eventsystem/I_Continuation.h:146
> #11 0x0069ce97 in NetHandler::mainNetEvent (this=0x2b12d688, 
> event=, e=0x171c1ed0) at UnixNet.cc:405
> #12 0x006cddaf in EThread::process_event (this=0x2b12c010, 
> e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
> #13 0x006ce6bc in EThread::execute (this=0x2b12c010) at 
> UnixEThread.cc:262
> #14 0x006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
> #15 0x003c33c064a7 in start_thread () from /lib64/libpthread.so.0
> #16 0x003c330d3c2d in clone () from /lib64/libc.so.6
> (gdb) info f
> Stack level 0, frame at 0x40e2b790:
>  rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int) 
> (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
>  called by frame at 0x40e2bbe0
>  source language c++.
>  Arglist at 0x40e2b770, args: stack=, len=<value optimized out>, signalhandler_frame=
>  Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
>  Saved registers:
>   rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
> (gdb) 
> {code}





[jira] [Commented] (TS-857) Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close -> UnixNetVConnection::do_io_close

2011-11-05 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144936#comment-13144936
 ] 

weijin commented on TS-857:
---

httpsm can be called back from different threads. When terminating the sm, we 
call do_io_close to close the net_vc, but net_vc::do_io_close is not thread 
safe: Inactivity_cop and netHandler can also close the net_vc when 
net_vc::closed is set, without locking the mutex of the net_vc. I hope this can 
also explain TS-934.

I have recently read amc's patch for TS-934 carefully; he contributed a lot to 
solving the problem. I have two questions: 1) should we lock the mutex of the 
net_vc in Inactivity_cop and netHandler? 2) should one thread be able to close 
a net_vc that belongs to a different thread?

I tend to add some code in net_vc::do_io_close and net_vc::mainEvent to make 
it thread safe:
UnixNetVConnection::do_io_close
{
  // Not on the owning thread: hand the close over to it instead of closing here.
  if (thread != this_ethread()) {
    thread->schedule_imm(this, EVENT_VC_TRY_TO_CLOSE);
    return;
  }
  disable_read(this);
  disable_write(this);
  ...
  close_UnixNetVConnection(this, t);
}


UnixNetVConnection::mainEvent(int event, void *e)
{
  // Close request forwarded from another thread; we are on the owning thread now.
  if (event == EVENT_VC_TRY_TO_CLOSE) {
    do_io_close();
    return EVENT_DONE;
  }

  // check the active and inactivity timeout

}
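
The same pattern as a self-contained sketch in standard C++, for anyone who
wants to experiment with it outside the tree. LoopSketch stands in for an
event thread and ConnSketch for the net_vc; both are hypothetical and none of
this is Traffic Server code.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for an event thread: a queue of callbacks that
// only the owning thread runs.
class LoopSketch {
public:
  explicit LoopSketch(std::thread::id owner) : owner_(owner) {}

  bool on_owner_thread() const { return std::this_thread::get_id() == owner_; }

  // May be called from any thread.
  void schedule_imm(std::function<void()> fn)
  {
    std::lock_guard<std::mutex> guard(lock_);
    queue_.push_back(std::move(fn));
  }

  // Called by the owning thread only. Runs everything queued so far and
  // returns how many callbacks were run.
  int run_pending()
  {
    std::deque<std::function<void()>> batch;
    {
      std::lock_guard<std::mutex> guard(lock_);
      batch.swap(queue_);
    }
    for (auto &fn : batch)
      fn();
    return static_cast<int>(batch.size());
  }

private:
  std::thread::id owner_;
  std::mutex lock_;
  std::deque<std::function<void()>> queue_;
};

// Hypothetical stand-in for the connection. `closed` is only ever written
// on the owning thread, so it needs no lock of its own.
struct ConnSketch {
  explicit ConnSketch(LoopSketch *l) : loop(l), closed(false) {}

  void do_io_close()
  {
    if (!loop->on_owner_thread()) {
      // Forward the close to the owner instead of closing from here.
      loop->schedule_imm([this] { do_io_close(); });
      return;
    }
    closed = true;
  }

  LoopSketch *loop;
  bool closed;
};
```

A close requested from a foreign thread only takes effect once the owning
thread drains its queue, so the connection state is never touched from two
threads at once.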


> Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close 
> -> UnixNetVConnection::do_io_close
> --
>
> Key: TS-857
> URL: https://issues.apache.org/jira/browse/TS-857
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, Network
>Affects Versions: 3.1.0
> Environment: in my branch that is something same as 3.0.x
>Reporter: Zhao Yongming
>Assignee: weijin
> Fix For: 3.1.2
>
>
> Here is the backtrace from the crash. Some of the information is missing 
> because we did not build with the --enable-debug configure option.
> {code}
> [New process 7532]
> #0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>)
> at ink_stack_trace.cc:68
> 68        fp = (void **) (*fp);
> (gdb) bt
> #0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>)
> at ink_stack_trace.cc:68
> #1  0x2ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value optimized out>) at ink_stack_trace.cc:114
> #2  0x004df020 in signal_handler (sig=<value optimized out>) at 
> signals.cc:225
> #3  <signal handler called>
> #4  0x006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20, 
> alerrno=<value optimized out>)
> at ../../iocore/eventsystem/I_Lock.h:297
> #5  0x0051f1d0 in HttpServerSession::do_io_close 
> (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
> #6  0x0056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70, 
> p=0x2aabeeffdf68) at HttpTunnel.cc:1300
> #7  0x005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070, 
> event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
> #8  0x00571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70, 
> event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
> #9  0x00572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70, 
> event=1088608784, data=<value optimized out>)
> at HttpTunnel.cc:1456
> #10 0x006a6307 in write_to_net_io (nh=0x2b12d688, vc=0x1cc876e0, 
> thread=<value optimized out>)
> at ../../iocore/eventsystem/I_Continuation.h:146
> #11 0x0069ce97 in NetHandler::mainNetEvent (this=0x2b12d688, 
> event=<value optimized out>, e=0x171c1ed0) at UnixNet.cc:405
> #12 0x006cddaf in EThread::process_event (this=0x2b12c010, 
> e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
> #13 0x006ce6bc in EThread::execute (this=0x2b12c010) at 
> UnixEThread.cc:262
> #14 0x006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
> #15 0x003c33c064a7 in start_thread () from /lib64/libpthread.so.0
> #16 0x003c330d3c2d in clone () from /lib64/libc.so.6
> (gdb) info f
> Stack level 0, frame at 0x40e2b790:
>  rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int) 
> (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
>  called by frame at 0x40e2bbe0
>  source language c++.
>  Arglist at 0x40e2b770, args: stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>
>  Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
>  Saved registers:
>   rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
> (gdb) 
> {code}





[jira] [Commented] (TS-980) change client_session schedule from global to thread local, and reduce the try_locks in UnixNetVConnection::reenable

2011-11-05 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144692#comment-13144692
 ] 

weijin commented on TS-980:
---

I also added some logic to reduce the chance of acquiring expired sessions.

> change client_session schedule from global  to thread local, and reduce the 
> try_locks in UnixNetVConnection::reenable
> -
>
> Key: TS-980
> URL: https://issues.apache.org/jira/browse/TS-980
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Network, Performance
>Affects Versions: 3.1.0, 3.0.0
> Environment: all
>Reporter: weijin
>Assignee: weijin
> Fix For: 3.1.2
>
> Attachments: ts-980.diff
>
>
> I ran some performance tests on ATS over the last few days (cache disabled, 
> share_server session set to 2, pure proxy mode). I did see a significant 
> improvement at low load, but it dropped rapidly at high load. Meanwhile, 
> some stability problems appeared. Through gdb, I found that the 
> client_session's mutex can be acquired by two or more threads; I believe some 
> schedules happen during the sm lifetime. Maybe we need to do some work to 
> find these eventProcessor schedules and change them to thread schedules.
> UnixNetVConnection::reenable {
>   if (nh->mutex->thread_holding == t) {
>     // put into ready_list
>   } else {
>     MUTEX_TRY_LOCK(lock, nh->mutex, t);
>     if (!lock) {
>       // put into enable_list;
>     } else {
>       // put into ready_list;
>     }
>   }
> }
> I propose removing the try_lock operations from 
> UnixNetVConnection::reenable, for 3 reasons:
> 1. Each try_lock means an object allocation and deallocation, and this 
> happens frequently.
> 2. try_lock can hardly ever lock the net-handler's mutex (the net-handler is 
> scheduled as soon as possible).
> 3. try_lock should not acquire the net-handler's mutex. That may lead to more 
> net I/O latency if there is an epoll event that needs to be processed in 
> other threads. If it is not an epoll event (a timed event), I don't think 
> putting the vc in the ready_list has any advantage over the enable_list.
> Maybe we can change the reenable function like this:
> UnixNetVConnection::reenable {
>   if (nh->mutex->thread_holding == t) {
>     // put into ready_list;
>   } else {
>     // put into enable_list;
>   }
> }
> My buddies, any advice?
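
The simplified reenable proposed in the quoted description can be sketched in 
standard C++ as follows. This is an illustration under stated assumptions, not 
the patch attached to the ticket: Handler, VC and process_enabled_list are 
hypothetical names, and a mutex-guarded vector stands in for whatever 
cross-thread list the real net handler uses. The point is that the calling 
thread never tries to lock the handler itself; it either owns the handler 
already or defers the vc.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct VC {
  int id;
};

// Hypothetical, simplified net handler; field names follow the pseudocode.
struct Handler {
  std::thread::id thread_holding;  // thread currently running the handler
  std::vector<VC *> ready_list;    // touched only by the holding thread
  std::vector<VC *> enable_list;   // cross-thread hand-off
  std::mutex enable_mutex;         // guards enable_list only

  // Proposed reenable: no try-lock on the handler's own mutex.
  void reenable(VC *vc) {
    assert(vc != nullptr);
    if (thread_holding == std::this_thread::get_id()) {
      ready_list.push_back(vc);  // we are the handler thread
    } else {
      std::lock_guard<std::mutex> g(enable_mutex);
      enable_list.push_back(vc);  // defer to the handler thread
    }
  }

  // Run by the holding thread on its next pass: move deferred vcs over.
  void process_enabled_list() {
    std::vector<VC *> pending;
    {
      std::lock_guard<std::mutex> g(enable_mutex);
      pending.swap(enable_list);
    }
    ready_list.insert(ready_list.end(), pending.begin(), pending.end());
  }
};
```

A vc reenabled from a foreign thread reaches the ready_list one handler pass 
later than with a successful try-lock, which is the trade-off the description 
argues is acceptable.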





[jira] [Commented] (TS-954) when use raw disks, some blocks is lost when caculate disk usable blocks

2011-10-18 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130319#comment-13130319
 ] 

weijin commented on TS-954:
---

Yes. This bug is not important; maybe we had better commit the fix together 
with the next cache improvement.

> when use raw disks, some blocks is lost when caculate disk usable blocks
> 
>
> Key: TS-954
> URL: https://issues.apache.org/jira/browse/TS-954
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache
>Affects Versions: 3.1.0
> Environment: all when use raw disks
>Reporter: weijin
>Assignee: John Plevyak
> Fix For: 3.1.1
>
> Attachments: calcu_blocks.dff
>
>
> When using raw disks, some blocks may be lost because the skip variable is 
> in bytes, not in blocks.
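
To show why such a unit mismatch loses blocks, here is a small hypothetical 
calculation in C++. It is not the Traffic Server code: the function names and 
the 8192-byte block size are assumptions made only for this illustration. The 
buggy variant subtracts a byte count from a block count; the consistent 
variant subtracts bytes from bytes before converting to blocks.

```cpp
#include <cassert>
#include <cstdint>

// Assumed block size, for illustration only.
constexpr std::int64_t kBlockSize = 8192;

// Buggy variant: disk size is converted to blocks, but skip is still bytes.
std::int64_t usable_blocks_buggy(std::int64_t disk_bytes, std::int64_t skip_bytes) {
  return disk_bytes / kBlockSize - skip_bytes;
}

// Consistent variant: subtract in bytes first, then convert to blocks.
std::int64_t usable_blocks(std::int64_t disk_bytes, std::int64_t skip_bytes) {
  assert(skip_bytes >= 0 && disk_bytes >= skip_bytes);
  return (disk_bytes - skip_bytes) / kBlockSize;
}
```

With a 1 GiB disk and a skip of 8192 bytes, the consistent variant yields 
131071 blocks while the buggy one yields 122880, so 8191 blocks would go 
unused under these assumed numbers.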





[jira] [Commented] (TS-980) change client_session schedule from global to thread local, and reduce the try_locks in UnixNetVConnection::reenable

2011-10-14 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128043#comment-13128043
 ] 

weijin commented on TS-980:
---

An http_sm can be called back by different threads even when using shared 
sessions in thread mode. In most cases the callbacks happen in one thread, but 
there may be an exception when http_sms on different threads do DNS queries at 
the same time. Reading the source, I found that 
HostDBContinuation::remove_trigger_pending_dns can call back an http_sm that 
was created by another thread. This is an extreme condition and has little 
effect on performance. I am also attaching a patch to this ticket, because it 
took me 3 days to figure out the reason.

