[jira] [Created] (TS-3458) AIO caused system hang again
Zhaonanli created TS-3458:

Summary: AIO caused system hang again
Key: TS-3458
URL: https://issues.apache.org/jira/browse/TS-3458
Project: Traffic Server
Issue Type: Bug
Components: Cache
Reporter: Zhaonanli

The problem reported in https://issues.apache.org/jira/browse/TS-2205 has arisen again in version 4.2.2.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3455) Marking the cache STALE in lookup-complete causes abort()
[ https://issues.apache.org/jira/browse/TS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371119#comment-14371119 ]

Luca Bruno commented on TS-3455:

So I've tried the regex_revalidate plugin, which does something similar, and I get exactly the same abort() as with my plugin.

Marking the cache STALE in lookup-complete causes abort()
Key: TS-3455
URL: https://issues.apache.org/jira/browse/TS-3455
Project: Traffic Server
Issue Type: Bug
Reporter: Luca Bruno
Fix For: 6.0.0

I've written a simple test case plugin to demonstrate this problem. I'm not sure whether it's a problem on my side, but if it is, that would mean the regex invalidate plugin would also abort().

What the plugin does: in LOOKUP_COMPLETE, if the cache status is FRESH then set it to STALE.

To reproduce:
1) Send a first cacheable request to ATS, which gets cached.
2) Request the same URL again; the plugin triggers and sets the cache status to STALE. Then ATS calls abort().

Plugin code:
{noformat}
#include <ts/ts.h>
#include <ts/remap.h>
#include <ts/experimental.h>
#include <stdlib.h>
#include <stdio.h>
#include <getopt.h>
#include <string.h>
#include <string>
#include <iterator>
#include <map>

const char PLUGIN_NAME[] = "maybebug";

static int Handler(TSCont cont, TSEvent event, void *edata);

// Owns the continuation that is registered on the cache lookup hook.
struct PluginState {
  PluginState() {
    cont = TSContCreate(Handler, NULL);
    TSContDataSet(cont, this);
  }
  ~PluginState() { TSContDestroy(cont); }
  TSCont cont;
};

static int Handler(TSCont cont, TSEvent event, void* edata) {
  TSHttpTxn txn = (TSHttpTxn)edata;
  if (event == TS_EVENT_HTTP_CACHE_LOOKUP_COMPLETE) {
    int lookup_status;
    if (TS_SUCCESS == TSHttpTxnCacheLookupStatusGet(txn, &lookup_status)) {
      TSDebug(PLUGIN_NAME, "lookup complete: %d", lookup_status);
      // Force a fresh hit to be treated as stale.
      if (lookup_status == TS_CACHE_LOOKUP_HIT_FRESH) {
        TSDebug(PLUGIN_NAME, "set stale");
        TSHttpTxnCacheLookupStatusSet(txn, TS_CACHE_LOOKUP_HIT_STALE);
      }
    }
  }
  TSHttpTxnReenable(txn, TS_EVENT_HTTP_CONTINUE);
  return TS_EVENT_NONE;
}

void TSPluginInit (int argc, const char *argv[]) {
  TSPluginRegistrationInfo info;
  info.plugin_name = strdup("cappello");
  info.vendor_name = strdup("foo");
  info.support_email = strdup("f...@bar.com");
  if (TSPluginRegister(TS_SDK_VERSION_3_0 , &info) != TS_SUCCESS) {
    TSError("Plugin registration failed");
  }
  PluginState* state = new PluginState();
  TSHttpHookAdd(TS_HTTP_CACHE_LOOKUP_COMPLETE_HOOK, state->cont);
}
{noformat}

Output:
{noformat}
[Mar 19 18:40:36.254] Server {0x7f6df0b4f740} DIAG: (maybebug) lookup complete: 0
[Mar 19 18:40:40.854] Server {0x7f6decfee700} DIAG: (maybebug) lookup complete: 2
[Mar 19 18:40:40.854] Server {0x7f6decfee700} DIAG: (maybebug) set stale
FATAL: HttpTransact.cc:433: failed assert `s->pending_work == NULL`
traffic_server - STACK TRACE:
/usr/local/lib/libtsutil.so.5(ink_fatal+0xa3)[0x7f6df072186d]
/usr/local/lib/libtsutil.so.5(_Z12ink_get_randv+0x0)[0x7f6df071f3a0]
traffic_server[0x60d0aa]
traffic_server(_ZN12HttpTransact22HandleCacheOpenReadHitEPNS_5StateE+0xf82)[0x619206]
...
{noformat}

What happens in gdb is that HandleCacheOpenReadHit is called twice in the same request. The first time s->pending_work is NULL, the second time it's not NULL.

The patch below fixes the problem:
{noformat}
diff --git a/proxy/http/HttpTransact.cc b/proxy/http/HttpTransact.cc
index 0078ef1..852f285 100644
--- a/proxy/http/HttpTransact.cc
+++ b/proxy/http/HttpTransact.cc
@@ -2641,11 +2641,6 @@ HttpTransact::HandleCacheOpenReadHit(State* s)
 //ink_release_assert(s->current.request_to == PARENT_PROXY ||
 //s->http_config_param->no_dns_forward_to_parent != 0);
 
-// Set ourselves up to handle pending revalidate issues
-// after the PP DNS lookup
-ink_assert(s->pending_work == NULL);
-s->pending_work = issue_revalidate;
-
 // We must be going a PARENT PROXY since so did
 // origin server DNS lookup right after state Start
 //
@@ -2654,6 +2649,11 @@ HttpTransact::HandleCacheOpenReadHit(State* s)
 // missing ip but we won't take down the system
 //
 if (s->current.request_to == PARENT_PROXY) {
+ // Set ourselves up to handle pending revalidate issues
+ // after the PP DNS lookup
+ ink_assert(s->pending_work == NULL);
+ s->pending_work = issue_revalidate;
+
 TRANSACT_RETURN(SM_ACTION_DNS_LOOKUP, PPDNSLookup);
 } else if (s->current.request_to == ORIGIN_SERVER) {
 TRANSACT_RETURN(SM_ACTION_DNS_LOOKUP, OSDNSLookup);
{noformat}
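The double-entry failure described in this report can be modelled outside ATS. The following is a toy sketch, not ATS code: `State`, `Target` and both `handle_hit_*` functions are hypothetical stand-ins for `HttpTransact::State` and `HandleCacheOpenReadHit`, and a `bool` return value replaces `ink_assert`. It only illustrates why checking and setting `pending_work` on every pass trips on a second entry, while the placement proposed in the patch does not when the request goes to the origin server.

```cpp
// Hypothetical stand-ins for the ATS types; all names are illustrative only.
enum class Target { ParentProxy, OriginServer };

struct State {
  Target request_to = Target::OriginServer;
  void (*pending_work)(State *) = nullptr;
};

void issue_revalidate(State *) {}

// Pre-patch ordering: pending_work is checked and set on every pass.
// Returns false at the point where the real code would fail ink_assert().
bool handle_hit_before_patch(State *s) {
  if (s->pending_work != nullptr) {
    return false; // the assertion would fire here on the second call
  }
  s->pending_work = issue_revalidate;
  return true;
}

// Patched ordering: the check and the assignment happen only on the
// parent-proxy path, so an origin-server request can pass through twice.
bool handle_hit_after_patch(State *s) {
  if (s->request_to == Target::ParentProxy) {
    if (s->pending_work != nullptr) {
      return false;
    }
    s->pending_work = issue_revalidate;
  }
  return true;
}
```

Calling `handle_hit_before_patch` twice on the same `State` fails on the second call, which mirrors the gdb observation above. The sketch says nothing about why the function is entered twice in the first place.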
[jira] [Resolved] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom resolved TS-3459.
Resolution: Fixed

OK, the Geffon explained to me what this means, so it's good.

Create a new config to disallow Post w/ Expect: 100-continue.
Key: TS-3459
URL: https://issues.apache.org/jira/browse/TS-3459
Project: Traffic Server
Issue Type: New Feature
Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
Fix For: 5.3.0

This is something that's been bothering us for a while: we want a way to explicitly disallow POSTs with Expect: 100-continue. I'm going to add a small block of code (configurable, of course) that returns a 405 Method Not Allowed when enabled. This config will default to OFF to maintain backwards compatibility.
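The check described in this issue could look roughly like the sketch below. It is an illustration only: the `Request` struct, the `check_post_expect` function and the `disallow_enabled` flag are hypothetical, since the message names neither the real config variable nor the ATS code that implements it.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical request model; header names are stored lower-case.
struct Request {
  std::string method;
  std::map<std::string, std::string> headers;
};

static std::string to_lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Returns 405 when the request should be rejected, 0 to continue as usual.
// With the flag off (the stated default) nothing changes.
int check_post_expect(const Request &req, bool disallow_enabled) {
  if (!disallow_enabled || req.method != "POST") {
    return 0;
  }
  auto it = req.headers.find("expect");
  if (it == req.headers.end()) {
    return 0;
  }
  return to_lower(it->second) == "100-continue" ? 405 : 0;
}
```

Keeping the flag as an explicit parameter makes the backwards-compatible default easy to see: with `disallow_enabled == false` every request passes through unchanged.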
[jira] [Updated] (TS-3419) Run all code through clang-format tool
[ https://issues.apache.org/jira/browse/TS-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-3419:
Fix Version/s: (was: 6.0.0) 5.3.0

Run all code through clang-format tool
Key: TS-3419
URL: https://issues.apache.org/jira/browse/TS-3419
Project: Traffic Server
Issue Type: Improvement
Components: Core
Reporter: Leif Hedstrom
Assignee: Leif Hedstrom
Fix For: 5.3.0

As discussed on the mailing list.
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372442#comment-14372442 ]

Sudheer Vinukonda commented on TS-3459:

Sorry, please ignore the comment. I misread the setting and assumed it just blocks the sending of {{100 CONT}}. But it looks like this new setting explicitly forbids POSTs that expect {{100 CONT}}, so I have no concerns.
[jira] [Resolved] (TS-3364) Add command line config validation support to traffic_server
[ https://issues.apache.org/jira/browse/TS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheer Vinukonda resolved TS-3364.
Resolution: Fixed

Add command line config validation support to traffic_server
Key: TS-3364
URL: https://issues.apache.org/jira/browse/TS-3364
Project: Traffic Server
Issue Type: Improvement
Components: Configuration
Affects Versions: 5.3.0
Reporter: Sudheer Vinukonda
Assignee: Sudheer Vinukonda
Fix For: 5.3.0

Currently, traffic_server fails to initialize when it encounters fatal errors while loading the config files during startup. During dynamic reloading of config files (e.g. via traffic_line), traffic_server rejects the new config and falls back to the existing config (however, a subsequent traffic_server crash or restart can again result in a failure to initialize).

This jira proposes to make the behavior of traffic_server when it encounters such fatal errors configurable via a new setting {{proxy.config.ignore_fatal_errors}} with the following options:
{code}
0 : All errors are fatal, do not load/reload
1 : Ignore a bad config line, continue with the rest
2 : Ignore a bad config line, stop parsing the file further
..
{code}

Based on the concerns expressed, it has been agreed not to change traffic_server's behavior so that it loads a config with fatal errors. Instead, this jira will be used to add a command line option to traffic_server to load and validate the config.
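For readers comparing the three values that were originally proposed for {{proxy.config.ignore_fatal_errors}} (and, as the issue states, not adopted), here is a minimal sketch of how they differ. Everything in it is hypothetical: `ErrorPolicy`, `load_config` and the "a valid line contains '='" rule are invented for illustration and do not reflect how traffic_server parses its files.

```cpp
#include <string>
#include <vector>

// The three proposed values of the setting, as listed in the issue.
enum class ErrorPolicy { Fatal = 0, SkipLine = 1, StopAtLine = 2 };

// Hypothetical validity rule, for illustration only.
static bool line_is_valid(const std::string &line) {
  return line.find('=') != std::string::npos;
}

// Fills `accepted` with the lines that would be loaded.
// Returns false when the whole load must be rejected (policy 0).
bool load_config(const std::vector<std::string> &lines, ErrorPolicy policy,
                 std::vector<std::string> &accepted) {
  accepted.clear();
  for (const std::string &line : lines) {
    if (line_is_valid(line)) {
      accepted.push_back(line);
      continue;
    }
    if (policy == ErrorPolicy::Fatal) {
      accepted.clear();
      return false; // 0: all errors are fatal, do not load
    }
    if (policy == ErrorPolicy::StopAtLine) {
      return true; // 2: keep what was parsed so far, stop parsing
    }
    // 1: ignore the bad line and continue with the rest
  }
  return true;
}
```

For the input `{"a=1", "bad", "b=2"}`, policy 0 rejects the load, policy 1 keeps two lines and policy 2 keeps only the first.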
[jira] [Updated] (TS-3364) Add command line config validation support to traffic_server
[ https://issues.apache.org/jira/browse/TS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheer Vinukonda updated TS-3364:
Fix Version/s: (was: 6.0.0) 5.3.0
[jira] [Updated] (TS-1775) Cleanup of ink_hrtime.{cc,h}
[ https://issues.apache.org/jira/browse/TS-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bwahn updated TS-1775:
Attachment: TS-1775.patch

A patch for this issue has been attached.

Cleanup of ink_hrtime.{cc,h}
Key: TS-1775
URL: https://issues.apache.org/jira/browse/TS-1775
Project: Traffic Server
Issue Type: Improvement
Reporter: Leif Hedstrom
Labels: newbie
Fix For: 6.0.0
Attachments: TS-1775.patch

A few things come to mind:
1) Why do we have a NEED_HRTIME define? It's always on as far as I can tell, and I can't imagine there's a reason not to have it on (it would break pretty much everything; in fact, it would fail to compile, since gethrtime() doesn't exist?).
2) We should eliminate the USE_TIME_STAMP_COUNTER_HRTIME define, and the code that implements our own TSC code. Modern Unix flavors already implement this in various ways (e.g. glibc's gettimeofday() wrapper has a TSC user space implementation).
3) On FreeBSD, jpeach points out that CLOCK_REALTIME is probably not the optimal way to use clock_gettime(). He suggests using CLOCK_REALTIME_FAST or CLOCK_MONOTONIC_FAST, which are similar to the optimizations done with TSC for gettimeofday() on Linux.
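Point 3 can be sketched as a thin wrapper around clock_gettime() that picks the faster clock where the platform defines one. This is an assumption-laden illustration, not the contents of TS-1775.patch: `hr_nanos`, `kFastRealtime` and `now_ns` are hypothetical names, and the only platform facts relied on are that POSIX provides `clock_gettime()` and that FreeBSD defines `CLOCK_REALTIME_FAST` as a macro.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical nanosecond timestamp type for this sketch.
using hr_nanos = std::int64_t;

// Use the cheaper, lower-precision clock where the platform offers it
// (FreeBSD), otherwise fall back to the portable CLOCK_REALTIME.
#if defined(CLOCK_REALTIME_FAST)
static const clockid_t kFastRealtime = CLOCK_REALTIME_FAST;
#else
static const clockid_t kFastRealtime = CLOCK_REALTIME;
#endif

// Reads the given clock and returns nanoseconds, or -1 on failure.
hr_nanos now_ns(clockid_t clock_id) {
  struct timespec ts;
  if (clock_gettime(clock_id, &ts) != 0) {
    return -1;
  }
  return static_cast<hr_nanos>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

A caller would use `now_ns(kFastRealtime)` for wall-clock timestamps and `now_ns(CLOCK_MONOTONIC)` for measuring intervals; which of the two fits each use in ink_hrtime is a design question the issue leaves open.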
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372437#comment-14372437 ]

Sudheer Vinukonda commented on TS-3459:

[~briang] and [~zwoop]: I am also not sure I understand the need for two separate configs. Is it perhaps that {{proxy.config.http.send_100_continue_response}} only controls the {{100 CONT}} generated internally by ATS and not the {{100 CONT}} received from the origin? Even so, I am not sure that such a config makes sense. Shouldn't we just follow the RFC, which seems to say that a {{100 CONT}} from the origin should or should not be forwarded to the client based on the HTTP version and whether or not the {{Expect}} header was received?
{code}
Requirements for HTTP/1.1 proxies:

- If a proxy receives a request that includes an Expect request-header field with the 100-continue expectation, and the proxy either knows that the next-hop server complies with HTTP/1.1 or higher, or does not know the HTTP version of the next-hop server, it MUST forward the request, including the Expect header field.
- If the proxy knows that the version of the next-hop server is HTTP/1.0 or lower, it MUST NOT forward the request, and it MUST respond with a 417 (Expectation Failed) status.
- Proxies SHOULD maintain a cache recording the HTTP version numbers received from recently-referenced next-hop servers.
- A proxy MUST NOT forward a 100 (Continue) response if the request message was received from an HTTP/1.0 (or earlier) client and did not include an Expect request-header field with the 100-continue expectation. This requirement overrides the general rule for forwarding of 1xx responses (see section 10.1).
{code}
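The proxy requirements quoted in this comment reduce to two small decisions, sketched below. The enum and function names are hypothetical and the code does not describe how ATS handles `Expect`; it only restates the quoted rules in executable form.

```cpp
// What the proxy knows about the next-hop server's HTTP version.
enum class NextHopVersion { Unknown, Http10OrLower, Http11OrHigher };

enum class ExpectAction { ForwardWithExpect, Reject417 };

// Rules 1 and 2: a request carrying "Expect: 100-continue" is forwarded
// unless the next hop is known to be HTTP/1.0 or lower, in which case the
// proxy answers 417 (Expectation Failed) itself.
ExpectAction on_expect_100_request(NextHopVersion next_hop) {
  return next_hop == NextHopVersion::Http10OrLower ? ExpectAction::Reject417
                                                   : ExpectAction::ForwardWithExpect;
}

// Rule 4: a 100 (Continue) response must not be forwarded to an HTTP/1.0
// (or earlier) client that did not send the 100-continue expectation.
bool may_forward_100_response(bool client_is_http10_or_earlier,
                              bool request_had_expect_100) {
  return !(client_is_http10_or_earlier && !request_had_expect_100);
}
```

Rule 3 (caching next-hop versions) is what would supply the `NextHopVersion` value; it is left out here because it is a SHOULD and involves state rather than a decision.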
[jira] [Commented] (TS-3424) SSL error: SSL3_GET_RECORD:decryption failed or bad record mac
[ https://issues.apache.org/jira/browse/TS-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371355#comment-14371355 ]

ASF subversion and git services commented on TS-3424:

Commit 7d2d30ba2c81f9da147b32bd845608430fe7ea0a in trafficserver's branch refs/heads/5.2.x from [~shinrich]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=7d2d30b ]

TS-3424 SSL Failed: decryption failed or bad record mac.

SSL error: SSL3_GET_RECORD:decryption failed or bad record mac
Key: TS-3424
URL: https://issues.apache.org/jira/browse/TS-3424
Project: Traffic Server
Issue Type: Bug
Components: Core, SSL
Reporter: Brian Geffon
Assignee: Brian Geffon
Fix For: 5.2.1, 5.3.0
Attachments: ts-3424-2.diff, ts-3424-3.diff, ts-3424-for-52-2.diff, ts-3424-for-52-final.diff, ts-3424-for-52.diff, ts-3424.diff, undo-handshake-buffer-for-52.diff, undo-handshake-buffer.diff

Starting with 5.2.x we're seeing SSL_ERROR_SSL type errors in {{ssl_read_from_net}}, when calling OpenSSL's {{ERR_error_string_n}} we see the error is {{1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac}}.
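The hexadecimal prefix in that error string is a packed error code. The sketch below splits it into its three fields without linking against OpenSSL. It assumes the packing used by the OpenSSL 1.0.x and 1.1.x series (8 bits library, 12 bits function, 12 bits reason), which later major versions changed; `SslErrorParts` and `unpack_error` are hypothetical names, not OpenSSL API.

```cpp
#include <cstdint>

// Fields of a packed OpenSSL error code (layout assumed as described above).
struct SslErrorParts {
  unsigned lib;    // which library raised the error
  unsigned func;   // which function raised it
  unsigned reason; // why
};

// Splits a packed error code into library, function and reason numbers.
SslErrorParts unpack_error(std::uint32_t code) {
  SslErrorParts parts;
  parts.lib = (code >> 24) & 0xFFu;
  parts.func = (code >> 12) & 0xFFFu;
  parts.reason = code & 0xFFFu;
  return parts;
}
```

For `0x1408F119` this gives library 20, function 143 and reason 281, which the error string renders as "SSL routines", "SSL3_GET_RECORD" and "decryption failed or bad record mac" respectively.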
[jira] [Comment Edited] (TS-3456) SSL blind tunnel sometimes not created
[ https://issues.apache.org/jira/browse/TS-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371237#comment-14371237 ]

Lev Stipakov edited comment on TS-3456 at 3/20/15 2:28 PM:

Found out that {{HttpSM::setup_blind_tunnel}} is called N fewer times than {{SSLNextProtocolAccept:mainEvent}}, where N is the exact number of stalled connections.

was (Author: lstipakov):
Found out that {{SSLNextProtocolAccept:mainEvent}} is called N fewer times than {{HttpSM::setup_blind_tunnel}}, where N is the exact number of stalled connections.

SSL blind tunnel sometimes not created
Key: TS-3456
URL: https://issues.apache.org/jira/browse/TS-3456
Project: Traffic Server
Issue Type: Bug
Components: Plugins, SSL
Reporter: Lev Stipakov
Assignee: Susan Hinrichs
Fix For: 6.0.0
Attachments: ts-tls.cc

Hello, I made a simple plugin that sets up TS_SSL_SNI_HOOK and creates a blind tunnel from a separate thread. With low load everything works fine, but with moderate load (100 simultaneous users, each sending 200 HTTPS requests) I see somewhat strange behavior.

On the client side I use Tsung, which creates users and sends a number of requests per user. For each user Tsung waits for a response before sending a new request, so if a response never arrives, that user (and the whole test) stalls.

So, with the load mentioned above I see a few 'stalled' connections on both client and proxy: netstat shows them as "established", and ATS seems to have data structures for them (I checked the proxy.process.net.connections_currently_open value), but no traffic goes between proxy and client.

Client side (.175):
tcp 0 0 10.133.3.175:40737 10.133.3.250:443 ESTABLISHED 14332/beam.smp
(more similar connections here)

Proxy side (.250 is a server):
tcp 0 0 10.133.3.250:443 10.133.3.175:40737 ESTABLISHED 28117/traffic_serve
(more similar connections here)

I checked the traffic.out log and found that "SSLNextProtocolAccept:mainEvent" does not get called as many times as it should. This can probably be explained by the fact that the client stops sending requests for a given user if the response to the previous request hasn't been received. That, in turn, may indicate that at some point a tunnel was not created.

The interesting thing is that everything works fine if the tunnel is created directly from TS_SSL_SNI_HOOK rather than from the separate thread.

The plugin code is very simple: I set up TS_SSL_SNI_HOOK and start a thread with TSThreadCreate. When the hook is called, I push the TSVConn onto a thread-safe queue. The thread wakes up when an item has been pushed, calls TSVConnTunnel / TSVConnReenable for the given vconn, and then waits for the next item. I have attached the code.
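The hand-off pattern the reporter describes (hook pushes, worker thread wakes and processes) is commonly built on a mutex and a condition variable. The sketch below shows that pattern in generic form. It is not the attached ts-tls.cc, which is not reproduced here, and it calls no ATS functions: `BlockingQueue` is a hypothetical name, and in the reporter's plugin the element type would be `TSVConn` and the consumer would call `TSVConnTunnel` / `TSVConnReenable`.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Minimal thread-safe FIFO: push() never blocks, pop() blocks until an
// item is available.
template <typename T>
class BlockingQueue {
public:
  // Called by the producer (the SNI hook in the report).
  void push(T item) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push(std::move(item));
    }
    ready_.notify_one(); // wake one waiting consumer
  }

  // Called by the consumer (the worker thread in the report).
  T pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    // The predicate guards against spurious wakeups and missed notifications.
    ready_.wait(lock, [this] { return !items_.empty(); });
    T item = std::move(items_.front());
    items_.pop();
    return item;
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<T> items_;
};
```

A queue like this delivers every pushed item exactly once, so if the stall is in the hand-off rather than in ATS, a likely place to look is a wait without a predicate or an item pushed before the worker starts waiting. That is a guess about a common failure mode, not a diagnosis of this issue.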
[jira] [Commented] (TS-3456) SSL blind tunnel sometimes not created
[ https://issues.apache.org/jira/browse/TS-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371237#comment-14371237 ]

Lev Stipakov commented on TS-3456:

Found out that {{SSLNextProtocolAccept:mainEvent}} is called N fewer times than {{HttpSM::setup_blind_tunnel}}, where N is the exact number of stalled connections.
[jira] [Commented] (TS-3455) Marking the cache STALE in lookup-complete causes abort()
[ https://issues.apache.org/jira/browse/TS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371223#comment-14371223 ]

Luca Bruno commented on TS-3455:

[~zwoop] the question is also why OpenReadHit is called twice in the same request. If the content is STALE, why should it open the cache content again?
[jira] [Commented] (TS-3455) Marking the cache STALE in lookup-complete causes abort()
[ https://issues.apache.org/jira/browse/TS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371439#comment-14371439 ]

Luca Bruno commented on TS-3455:

ATS 4.2.3 also aborts in the same way with regex_revalidate. I have the plain default configuration of ATS, only changed the plugin.config and added simple remap rules.
[jira] [Updated] (TS-3460) Traffic Server doesn't count bytes transferred correctly on SSL Sessions
[ https://issues.apache.org/jira/browse/TS-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kunal Gulati updated TS-3460:
Attachment: Archive.zip
Has wireshark captures and stack traces
Traffic Server doesn't count bytes transferred correctly on SSL Sessions
Key: TS-3460
URL: https://issues.apache.org/jira/browse/TS-3460
Project: Traffic Server
Issue Type: Bug
Components: SSL
Reporter: Kunal Gulati
Attachments: Archive.zip
The following API calls return incorrect values for SSL sessions (in cases where the client hasn't yet sent a TCP FIN, or the SSL session is long-running, e.g. a messenger):
TSHttpTxnClientReqHdrBytesGet(txnp);
TSHttpTxnClientReqBodyBytesGet(txnp);
TSHttpTxnClientRespHdrBytesGet(txnp);
TSHttpTxnClientRespBodyBytesGet(txnp);
This issue is reproducible on ATS 4.2, 4.3, 5.2 and tip of tree. (Not sure how to add stack trace and wireshark captures.)
Not working:
Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp);
(gdb) p in_bytes
$8 = 7440486
(gdb) backtrace
#0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
#1 0x72ed6831 in tcp_info_hook (contp=0x1117990, event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274
#2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997
#3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146
#4 0x004f4090 in APIHook::invoke (this=0x11189f0, event=60012, edata=0x70bad9f0) at InkAPI.cc:1216
#5 0x0056903e in HttpSM::state_api_callout (this=0x70bad9f0, event=0, data=0x0) at HttpSM.cc:1410
#6 0x0057559f in HttpSM::do_api_callout_internal (this=0x70bad9f0) at HttpSM.cc:4767
#7 0x00581ebe in HttpSM::do_api_callout (this=0x70bad9f0) at HttpSM.cc:497
#8 0x0057abfc in HttpSM::kill_this (this=0x70bad9f0) at HttpSM.cc:6443
#9 0x0056ccc4 in HttpSM::main_handler (this=0x70bad9f0, event=2301, data=0x70baf5f8) at HttpSM.cc:2545
#10 0x004ea812 in Continuation::handleEvent (this=0x70bad9f0, event=2301, data=0x70baf5f8) at ../iocore/eventsystem/I_Continuation.h:146
#11 0x005b814c in HttpTunnel::main_handler (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at HttpTunnel.cc:1504
#12 0x004ea812 in Continuation::handleEvent (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at ../iocore/eventsystem/I_Continuation.h:146
#13 0x006cd798 in read_signal_and_update (event=105, vc=0x7fffec015dc0) at UnixNetVConnection.cc:138
#14 0x006d0bd2 in UnixNetVConnection::mainEvent (this=0x7fffec015dc0, event=1, e=0x1097e20) at UnixNetVConnection.cc:1063
#15 0x004ea812 in Continuation::handleEvent (this=0x7fffec015dc0, event=1, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146
#16 0x006c7de9 in InactivityCop::check_inactivity (this=0x107ff10, event=2, e=0x1097e20) at UnixNet.cc:67
#17 0x004ea812 in Continuation::handleEvent (this=0x107ff10, event=2, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146
#18 0x006f01b3 in EThread::process_event (this=0x7524c010, e=0x1097e20, calling_code=2) at UnixEThread.cc:145
#19 0x006f058f in EThread::execute (this=0x7524c010) at UnixEThread.cc:224
#20 0x00512e16 in main (argv=0x7fffe718) at Main.cc:1659
(gdb) p in_bytes
$9 = 7440486
(gdb) n
244 in_bytes += TSHttpTxnClientReqBodyBytesGet(txnp);
(gdb) p in_bytes
$10 = 435
(gdb) n
245 out_bytes = TSHttpTxnClientRespHdrBytesGet(txnp);
(gdb) p in_bytes
$11 = 435
(gdb) n
246 out_bytes += TSHttpTxnClientRespBodyBytesGet(txnp);
(gdb) p out_bytes
$12 = 100
(gdb) n
247 total_out_bytes += out_bytes;
(gdb) p out_bytes
$13 = 25122
(gdb) n
248 total_in_bytes += in_bytes;
(gdb) p out_bytes
$14 = 25122
Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp);
(gdb) backtrace
#0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
#1 0x72ed6831 in tcp_info_hook (contp=0x1117990, event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274
#2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997
#3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146
#4 0x004f4090 in APIHook::invoke (this=0x11189f0,
[jira] [Created] (TS-3460) Traffic Server doesn't count bytes transferred correctly on SSL Sessions
Kunal Gulati created TS-3460: Summary: Traffic Server doesn't count bytes transferred correctly on SSL Sessions Key: TS-3460 URL: https://issues.apache.org/jira/browse/TS-3460 Project: Traffic Server Issue Type: Bug Components: SSL Reporter: Kunal Gulati
[jira] [Created] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
Brian Geffon created TS-3459: Summary: Create a new config to disallow Post w/ Expect: 100-continue. Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon This is something that's been bothering us for a while: we want a way to explicitly disallow POSTs w/ Expect: 100-continue. I'm going to add a small block of code (configurable, of course) that will let you return a 405 Method Not Allowed if enabled. This config will default to OFF to maintain backwards compatibility.
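The check described in this ticket can be sketched roughly as follows. This is a minimal illustration, not the committed change: the `Request` struct, the `check_post_expect` function and the `disallow_post_100_continue` flag are hypothetical names standing in for ATS internals and for the new config setting, whose real name is not given in this message.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical request model; real ATS code works on its own header types.
struct Request {
  std::string method;
  std::map<std::string, std::string> headers; // keys stored lower-case
};

// Lower-case an ASCII string so header values compare case-insensitively.
static std::string to_lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Returns the status to send immediately, or 0 to let the request proceed.
// `disallow_post_100_continue` stands in for the new config (default off).
int check_post_expect(const Request &req, bool disallow_post_100_continue) {
  if (!disallow_post_100_continue) return 0; // feature off: old behaviour
  if (req.method != "POST") return 0;        // only POST is affected
  auto it = req.headers.find("expect");
  if (it != req.headers.end() && to_lower(it->second) == "100-continue")
    return 405;                              // Method Not Allowed
  return 0;
}
```

With the flag off, every request passes through unchanged, which matches the stated goal of keeping the default backwards compatible.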
[jira] [Assigned] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Geffon reassigned TS-3459: Assignee: Brian Geffon Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon
[jira] [Updated] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Geffon updated TS-3459: - Fix Version/s: 5.3.0 Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 5.3.0
[jira] [Resolved] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Geffon resolved TS-3459. -- Resolution: Fixed Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 5.3.0
[jira] [Updated] (TS-153) Dynamic keep-alive timeouts
[ https://issues.apache.org/jira/browse/TS-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Call updated TS-153: -- Fix Version/s: (was: 6.0.0) 5.3.0 Dynamic keep-alive timeouts - Key: TS-153 URL: https://issues.apache.org/jira/browse/TS-153 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Leif Hedstrom Assignee: Bryan Call Priority: Minor Labels: A Fix For: 5.3.0 Attachments: ts153.diff
(This is from a Y! Bugzilla ticket 1821593, adding it here. Originally posted by Leif Hedstrom on 2008-03-19):
Currently you have to set static keep-alive idle timeouts in TS, e.g.
CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 8
CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 30
Even with epoll() in 1.17.x, it is difficult to choose an appropriate timeout. The key here is that the settings above need to ensure that you stay below the max configured number of connections, e.g.:
CONFIG proxy.config.net.connections_throttle INT 75000
I'm suggesting that we add one (or two) new configuration options, and appropriate TS code support, so that instead of specifying timeouts we specify connection limits for idle KA connections. For example:
CONFIG proxy.config.http.keep_alive_max_idle_connections_in INT 5
CONFIG proxy.config.http_keep_alive_max_idle_connections_out INT 5000
(One still has to be careful to leave head-room for active connections here; in the example above, 2 connections could be active, which is a lot of traffic.) These would override the idle timeouts, so one could, for instance, use the max_idle connections for incoming (client) connections and the idle timeouts for outgoing (origin) connections. The benefit here is that it makes configuration not only easier, but also a lot safer for many applications.
[jira] [Resolved] (TS-153) Dynamic keep-alive timeouts
[ https://issues.apache.org/jira/browse/TS-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Call resolved TS-153. --- Resolution: Fixed Added LRU for Keep-Alive and the ability to limit the number of overall connections by only closing Keep-Alive connections. I will create another ticket to extend the functionality to limit the number of active connections. Dynamic keep-alive timeouts - Key: TS-153 URL: https://issues.apache.org/jira/browse/TS-153 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Leif Hedstrom Assignee: Bryan Call Priority: Minor Labels: A Fix For: 5.3.0 Attachments: ts153.diff
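The idea of capping idle keep-alive connections with an LRU, as mentioned in the resolution of TS-153, can be sketched like this. It is only an illustration of the technique, not the code in ts153.diff: the class name `IdleKeepAlivePool`, the integer connection ids and the `max_idle` parameter are all assumptions made for the example.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <vector>

// Tracks idle keep-alive connections in least-recently-used order.
// Connection ids are plain ints here; a real server would hold sockets.
class IdleKeepAlivePool {
public:
  explicit IdleKeepAlivePool(std::size_t max_idle) : max_idle_(max_idle) {}

  // Mark a connection idle. Returns the ids evicted (to be closed) so the
  // pool stays within max_idle. Only idle connections are ever closed.
  std::vector<int> mark_idle(int conn_id) {
    mark_active(conn_id); // drop any stale entry for this id first
    lru_.push_front(conn_id);
    pos_[conn_id] = lru_.begin();
    std::vector<int> closed;
    while (lru_.size() > max_idle_) {
      int victim = lru_.back(); // the connection that has been idle longest
      lru_.pop_back();
      pos_.erase(victim);
      closed.push_back(victim);
    }
    return closed;
  }

  // A connection that starts serving a request is no longer idle.
  void mark_active(int conn_id) {
    auto it = pos_.find(conn_id);
    if (it == pos_.end()) return;
    lru_.erase(it->second);
    pos_.erase(it);
  }

  std::size_t idle_count() const { return lru_.size(); }

private:
  std::size_t max_idle_;
  std::list<int> lru_; // front = most recently idle, back = idle the longest
  std::unordered_map<int, std::list<int>::iterator> pos_;
};
```

Compared with a fixed idle timeout, a cap like this bounds the number of idle connections directly, which is the property the original request asks for.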
[jira] [Closed] (TS-2894) Spdy slow start..
[ https://issues.apache.org/jira/browse/TS-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber closed TS-2894. --- Resolution: Won't Fix Closing per [~bcall]'s comment. Spdy slow start.. - Key: TS-2894 URL: https://issues.apache.org/jira/browse/TS-2894 Project: Traffic Server Issue Type: Improvement Components: SPDY Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Labels: yahoo Fix For: 5.3.0 Attachments: TS-2894.diff When production testing with spdy/5.0.0, we ran into an issue on some of our systems where the spdy hosts would flap constantly due to the flood of requests. We further noticed that, whereas hosts running the 4.0.x version or 5.0.0 w/ spdy turned off would recover quickly following a restart, spdy-enabled hosts would continue to receive a flood of requests and continue to flap. During this time, traffic server is generally busy reading from disk and cannot handle too many requests, and spdy's support for multiple concurrent streams makes things worse. To handle such a sudden flood of requests, I'm implementing a simple slow start mechanism for spdy. The idea is to increase max_concurrent_streams_in gradually, based on a configured timer, rather than use the configured value right away. The steps I chose to implement are 1, 25, 50, 75 and 100% of the configured max_concurrent_streams_in. Note that, currently, max_concurrent_streams_in only affects new spdy sessions. Existing sessions (if any) would continue to use their older values. Not too sure if everyone would be interested in this, but I thought I'd still upload my patch in case someone is.
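The stepped ramp described in the ticket (1, 25, 50, 75 and 100% of the configured limit) might look roughly like the sketch below. The function name `effective_max_streams`, the `step_interval_sec` parameter and the rule of advancing one step per interval are assumptions for illustration; the attached patch may drive the steps differently.

```cpp
#include <algorithm>
#include <cstdint>

// Percentages from the ticket: 1, 25, 50, 75 and 100% of the configured limit.
static const int kSlowStartSteps[] = {1, 25, 50, 75, 100};

// Effective stream limit `elapsed_sec` seconds after startup.
// `step_interval_sec` stands in for the "configured timer"; its name and the
// one-step-per-interval rule are assumptions, not taken from the patch.
uint32_t effective_max_streams(uint32_t configured_max, uint64_t elapsed_sec,
                               uint64_t step_interval_sec) {
  const uint64_t last = sizeof(kSlowStartSteps) / sizeof(kSlowStartSteps[0]) - 1;
  // An interval of zero disables the ramp and uses the full limit at once.
  uint64_t step = step_interval_sec == 0
                      ? last
                      : std::min(elapsed_sec / step_interval_sec, last);
  uint64_t limit = static_cast<uint64_t>(configured_max) * kSlowStartSteps[step] / 100;
  // Never advertise zero streams when a non-zero limit is configured.
  if (limit == 0 && configured_max > 0) limit = 1;
  return static_cast<uint32_t>(limit);
}
```

As the ticket notes, a value computed this way would only apply to new spdy sessions; existing sessions keep whatever limit they were given when they started.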
[jira] [Commented] (TS-3455) Marking the cache STALE in lookup-complete causes abort()
[ https://issues.apache.org/jira/browse/TS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371620#comment-14371620 ] Luca Bruno commented on TS-3455: Relevant debug output: hit the cache, call the plugin which sets STALE (see the message set stale), then OS dns lookup with next action HandleCacheOpenReadHit: {noformat} ... [Mar 20 17:23:45.420] Server {0x7fe58c8cf740} DEBUG: (http_match) [..._document_freshness] document is fresh; returning FRESHNESS_FRESH [Mar 20 17:23:45.420] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHitFreshness] Fresh copy [Mar 20 17:23:45.420] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHit] Authentication not needed [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) Next action SM_ACTION_API_CACHE_LOOKUP_COMPLETE; HttpTransact::HandleCacheOpenReadHit [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http) [1] State Transition: SM_ACTION_API_READ_CACHE_HDR - SM_ACTION_API_CACHE_LOOKUP_COMPLETE [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http) [1] calling plugin on hook TS_HTTP_CACHE_LOOKUP_COMPLETE_HOOK at hook 0x7fe580011240 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DIAG: (maybebug) lookup complete: 2 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DIAG: (maybebug) set stale [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http) [1] [HttpSM::state_api_callback, HTTP_API_CONTINUE] [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http) [1] [HttpSM::state_api_callout, HTTP_API_CONTINUE] [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHit] Authentication not needed [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_auth = 0 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_revalidate= 1 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- 
response_returnable = 1 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_cache_auth= 0 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- send_revalidate= 1 [Mar 20 17:23:45.421] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- HIT-STALE [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHit] Revalidate document with server [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) Next action SM_ACTION_DNS_LOOKUP; OSDNSLookup [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http) [1] State Transition: SM_ACTION_API_CACHE_LOOKUP_COMPLETE - SM_ACTION_DNS_LOOKUP [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpSM::do_hostdb_lookup] Doing DNS Lookup [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) [ink_cluster_time] local: 1426868625, highest_delta: 0, cluster: 1426868625 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) [HttpTransact::OSDNSLookup] This was attempt 1 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::OSDNSLookup] DNS Lookup successful [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) [OSDNSLookup] DNS lookup for O.S. 
successful IP: xx.xx.xx.xx [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) Next action SM_ACTION_API_OS_DNS; HandleCacheOpenReadHit [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http) [1] State Transition: SM_ACTION_DNS_LOOKUP - SM_ACTION_API_OS_DNS [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHit] Authentication not needed [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_auth = 0 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_revalidate= 1 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- response_returnable = 1 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- needs_cache_auth= 0 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- send_revalidate= 1 [Mar 20 17:23:45.422] Server {0x7fe58c8cf740} DEBUG: (http_trans) CacheOpenRead --- HIT-STALE [Mar 20 17:23:45.423] Server {0x7fe58c8cf740} DEBUG: (http_seq) [HttpTransact::HandleCacheOpenReadHit] Revalidate document with server ... {noformat} Marking the cache STALE in lookup-complete causes abort() - Key: TS-3455 URL: https://issues.apache.org/jira/browse/TS-3455 Project: Traffic Server Issue Type: Bug Reporter: Luca Bruno Fix For: 6.0.0 I've written a simple test case plugin for demonstrating this problem, not
[jira] [Updated] (TS-3418) Second hash ring for consistently hashed parent selection
[ https://issues.apache.org/jira/browse/TS-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Justin Laue updated TS-3418: Description: It would be incredibly useful if we allowed for an (optional) second hash ring in the consistent hashing in parent selection. Imagine a setup where you have two set of parent proxies. A child would prefer to always use a parent n in ring A for a set of URLs, X. In the case of parent n not being available, instead of rehashing X to the surviving members of ring A, we could now hash the URLs to parent m in ring B. Upon failure there, we'd then go back and rehash on the primary ring again (A). This sounds complicated, but is simple in principle. Instead of immediately rehashing content upon a parent failure, we have a backup pool (potentially remote) of parents, that are likely to have the content. The idea is to minimize origin server traffic at all cost. was: It would be incredibly useful if we allowed for an (optional) second hash ring in the consistent hashing in parent selection. Imagine a setup where you have two set of parent proxies. A child would prefer to always use a parent n in ring A for a set of URLs, X. In the case of parent n not being available, instead of rehashing X to the surviving members of ring A, we could now hash the URLs to parent m in ring B. Upon failure there, we'd then go back and rehash on the primary ring again (A). This sounds complicated, but is simple in principle. Instead of immediately rehashing content upon a parent failure, we have a backup pool (potentially remote) of parents, that are likely to have the content. The idea is to minimize origin server traffic at all cost. 
Second hash ring for consistently hashed parent selection -- Key: TS-3418 URL: https://issues.apache.org/jira/browse/TS-3418 Project: Traffic Server Issue Type: New Feature Components: Parent Proxy Reporter: Leif Hedstrom Assignee: Phil Sorber Fix For: sometime
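The selection order proposed in TS-3418 (primary ring owner, then the secondary ring, then a rehash over the surviving primary members) can be sketched as follows. This is a toy model under stated assumptions: `HashRing`, `select_parent`, the use of `std::hash` and the replica count of 64 are illustrative choices, not how Traffic Server's parent selection is implemented.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// A toy consistent-hash ring: each parent is placed at several points, and a
// URL maps to the first point at or after its own hash (wrapping around).
class HashRing {
public:
  explicit HashRing(const std::vector<std::string> &parents, int replicas = 64) {
    for (const auto &p : parents)
      for (int i = 0; i < replicas; ++i)
        ring_[hash_(p + "#" + std::to_string(i))] = p;
  }

  // Returns the parent for `url`, skipping parents listed in `down`.
  // Returns "" if every parent on the ring is down.
  std::string lookup(const std::string &url, const std::set<std::string> &down) const {
    if (ring_.empty()) return "";
    auto it = ring_.lower_bound(hash_(url));
    for (std::size_t n = 0; n < ring_.size(); ++n) {
      if (it == ring_.end()) it = ring_.begin(); // wrap around the ring
      if (!down.count(it->second)) return it->second;
      ++it;
    }
    return "";
  }

private:
  std::hash<std::string> hash_;
  std::map<std::size_t, std::string> ring_;
};

// Selection order from the ticket: owner on ring A, then owner on ring B,
// then rehash among the surviving members of ring A.
std::string select_parent(const HashRing &a, const HashRing &b,
                          const std::string &url, const std::set<std::string> &down) {
  static const std::set<std::string> none;
  std::string owner = a.lookup(url, none);  // 1. usual owner on the primary ring
  if (!owner.empty() && !down.count(owner)) return owner;
  std::string backup = b.lookup(url, none); // 2. owner on the secondary ring
  if (!backup.empty() && !down.count(backup)) return backup;
  return a.lookup(url, down);               // 3. survivors of the primary ring
}
```

Because step 2 asks ring B for the URL's fixed owner rather than rehashing, the same backup parent sees the same URLs on every primary failure, which is what makes it likely to already hold the content.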
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371601#comment-14371601 ] ASF subversion and git services commented on TS-3459: - Commit a30afc0c8b97d5427797f2a53a0b7e89f186f5f3 in trafficserver's branch refs/heads/master from [~briang] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a30afc0 ] TS-3459: Create a new config to disallow Post w/ Expect: 100-continue Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon
[jira] [Commented] (TS-3364) Add command line config validation support to traffic_server
[ https://issues.apache.org/jira/browse/TS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371597#comment-14371597 ] Bryan Call commented on TS-3364: [~sudheerv] Looks like everything has been committed, can this be marked as resolved? Add command line config validation support to traffic_server Key: TS-3364 URL: https://issues.apache.org/jira/browse/TS-3364 Project: Traffic Server Issue Type: Improvement Components: Configuration Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Fix For: 6.0.0 Currently, traffic_server fails to initialize when it encounters fatal errors while loading the config files during start up. During dynamic reloading of config files (e.g. via traffic_line), traffic_server rejects the new config and falls back to the existing/old config (however, if traffic_server subsequently crashes or restarts, it can again fail to initialize). This jira proposes to make the behavior of traffic_server when it encounters such fatal errors configurable via a new setting {{proxy.config.ignore_fatal_errors}} with the options below:
{code}
0 : All errors are fatal, do not load/reload
1 : Ignore a bad config line, continue with the rest
2 : Ignore a bad config line, stop parsing the file further
..
{code}
Based on concerns expressed, it has been agreed not to change traffic_server's behavior to load with a fatal config. Instead, this jira will be used to add a command line option to traffic_server to load and validate the config.
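To make the "load and validate" idea concrete, here is a small sketch of a validator that reports bad lines without applying anything. It is an assumption-laden illustration: `validate_config` is a hypothetical function, and the line shape it checks (`CONFIG <name> <INT|FLOAT|STRING> <value>`) is modelled on the CONFIG lines quoted in TS-153, not on the checks traffic_server actually performs.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based numbers of lines that fail a simple syntax check.
// Checked shape (an assumption modelled on the CONFIG lines quoted in
// TS-153): CONFIG <name> <INT|FLOAT|STRING> <value>
std::vector<std::size_t> validate_config(const std::string &text) {
  std::vector<std::size_t> bad;
  std::istringstream in(text);
  std::string line;
  std::size_t lineno = 0;
  while (std::getline(in, line)) {
    ++lineno;
    std::istringstream fields(line);
    std::string kw, name, type, value;
    if (!(fields >> kw)) continue; // blank line
    if (kw[0] == '#') continue;    // comment line
    bool ok = kw == "CONFIG" && (fields >> name >> type >> value);
    if (ok && type == "INT") {
      // An INT value must be an optionally signed run of digits.
      std::size_t i = (value[0] == '-') ? 1 : 0;
      ok = i < value.size() &&
           value.find_first_not_of("0123456789", i) == std::string::npos;
    } else if (ok && type != "STRING" && type != "FLOAT") {
      ok = false; // unknown type keyword
    }
    if (!ok) bad.push_back(lineno);
  }
  return bad;
}
```

Reporting every bad line rather than stopping at the first one keeps the running configuration untouched while still telling the operator everything that needs fixing.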
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371602#comment-14371602 ] ASF subversion and git services commented on TS-3459: - Commit bf207f3a361d69c9cf1c0f60aec377ee8ebe195d in trafficserver's branch refs/heads/master from [~briang] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=bf207f3 ] TS-3459: Create a new config to disallow Post w/ Expect: 100-continue: UPDATE DOCS Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371603#comment-14371603 ] ASF subversion and git services commented on TS-3459: - Commit 95cd99da5d161fc2419584a0e40329f48e55e732 in trafficserver's branch refs/heads/master from [~briang] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=95cd99d ] TS-3459: Create a new config to disallow Post w/ Expect: 100-continue: UPDATE CHANGES Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon
[jira] [Commented] (TS-2894) Spdy slow start..
[ https://issues.apache.org/jira/browse/TS-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371551#comment-14371551 ] Bryan Call commented on TS-2894: [~psudaemon] I don't think this feature is needed upstream. If a need is warranted we can apply the patch later. I would close it won't fix. Spdy slow start.. - Key: TS-2894 URL: https://issues.apache.org/jira/browse/TS-2894 Project: Traffic Server Issue Type: Improvement Components: SPDY Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Labels: yahoo Fix For: 5.3.0 Attachments: TS-2894.diff
[jira] [Commented] (TS-2644) TOS (DSCP)
[ https://issues.apache.org/jira/browse/TS-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371629#comment-14371629 ] Phil Sorber commented on TS-2644: - I've got no response on this. I think this is a duplicate feature. I am going to close as won't fix. If this is not the case, please feel free to re-open. TOS (DSCP) --- Key: TS-2644 URL: https://issues.apache.org/jira/browse/TS-2644 Project: Traffic Server Issue Type: New Feature Components: Cache, Network Reporter: Faysal Banna Assignee: Phil Sorber Labels: review Fix For: 5.3.0 Attachments: domain_tos.cc Hi guys, I wonder if it would be possible to have a plugin that assigns TOS/DSCP bits to objects that are a cache HIT, or perhaps to objects of type video/audio. Such a plugin would give us better performance and control over how the cache's output is distributed to clients. Example: suppose I give each client a different bandwidth. On a router on a link somewhere on some rooftop, I can say that this client gets 512 Kbit/s for cache-miss traffic and 1 Mbit/s for cache hits. That way, a cached object is delivered to this client at 1 Mbit/s, while non-cached requests are delivered at 512 Kbit/s. I hope whoever writes this plugin takes the MIME type or URL of the retrieved object into consideration: maybe I want audio/video, cached or not, to get 768 Kbit/s, while Windows updates and Android/iPhone apps should take no more than 512 Kbit/s. Bear in mind that this has nothing to do with the origin-server throttling feature request; this is just a client-side feature. Many regards, Faysal Banna
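For readers unfamiliar with how a DSCP class ends up on the wire, the sketch below shows the arithmetic involved. It is not taken from the attached domain_tos.cc: the `CacheResult` enum, the `dscp_for` policy and the choice of AF21 for hits are hypothetical, picked only to illustrate the request.

```cpp
#include <cstdint>

// DSCP occupies the upper six bits of the IP TOS / Traffic Class byte, so
// the value handed to the network stack is the DSCP shifted left by two.
constexpr std::uint8_t dscp_to_tos(std::uint8_t dscp) {
  return static_cast<std::uint8_t>((dscp & 0x3F) << 2);
}

enum class CacheResult { Hit, Miss };

// Hypothetical policy: mark hits AF21 (DSCP 18) and leave misses at
// best effort (DSCP 0). The choice of classes is illustrative only.
constexpr std::uint8_t dscp_for(CacheResult r) {
  return r == CacheResult::Hit ? 18 : 0;
}
```

On platforms that support it, the resulting byte would typically be applied to the client socket with `setsockopt` and the `IP_TOS` option, after which a downstream router can shape hit and miss traffic separately.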
[jira] [Commented] (TS-3460) Traffic Server doesn't count bytes transferred correctly on SSL Sessions
[ https://issues.apache.org/jira/browse/TS-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372065#comment-14372065 ] Kunal Gulati commented on TS-3460: -- I have setup ATS as a forward proxy Traffic Server doesn't count bytes transferred correctly on SSL Sessions Key: TS-3460 URL: https://issues.apache.org/jira/browse/TS-3460 Project: Traffic Server Issue Type: Bug Components: SSL Reporter: Kunal Gulati Attachments: Archive.zip Following API return incorrect values for SSL sessions (In case client hasn't yet send a TCP FIN or it is long running SSL session[For eg messenger] TSHttpTxnClientReqHdrBytesGet(txnp); TSHttpTxnClientReqBodyBytesGet(txnp); TSHttpTxnClientRespHdrBytesGet(txnp); TSHttpTxnClientRespBodyBytesGet(txnp); This issue is reproducible on ATS 4.2,4.3,5.2 and Tip of Tree (Not sure how to add stack trace and wireshark captures) Not working Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243 243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp); (gdb) p in_bytes $8 = 7440486 (gdb) backtrace #0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243 #1 0x72ed6831 in tcp_info_hook (contp=0x1117990, event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274 #2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997 #3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146 #4 0x004f4090 in APIHook::invoke (this=0x11189f0, event=60012, edata=0x70bad9f0) at InkAPI.cc:1216 #5 0x0056903e in HttpSM::state_api_callout (this=0x70bad9f0, event=0, data=0x0) at HttpSM.cc:1410 #6 0x0057559f in HttpSM::do_api_callout_internal (this=0x70bad9f0) at HttpSM.cc:4767 #7 0x00581ebe in HttpSM::do_api_callout (this=0x70bad9f0) at HttpSM.cc:497 #8 0x0057abfc in HttpSM::kill_this (this=0x70bad9f0) at HttpSM.cc:6443 #9 0x0056ccc4 in HttpSM::main_handler (this=0x70bad9f0, 
event=2301, data=0x70baf5f8) at HttpSM.cc:2545 #10 0x004ea812 in Continuation::handleEvent (this=0x70bad9f0, event=2301, data=0x70baf5f8) at ../iocore/eventsystem/I_Continuation.h:146 #11 0x005b814c in HttpTunnel::main_handler (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at HttpTunnel.cc:1504 #12 0x004ea812 in Continuation::handleEvent (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at ../iocore/eventsystem/I_Continuation.h:146 #13 0x006cd798 in read_signal_and_update (event=105, vc=0x7fffec015dc0) at UnixNetVConnection.cc:138 #14 0x006d0bd2 in UnixNetVConnection::mainEvent (this=0x7fffec015dc0, event=1, e=0x1097e20) at UnixNetVConnection.cc:1063 #15 0x004ea812 in Continuation::handleEvent (this=0x7fffec015dc0, event=1, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146 #16 0x006c7de9 in InactivityCop::check_inactivity (this=0x107ff10, event=2, e=0x1097e20) at UnixNet.cc:67 #17 0x004ea812 in Continuation::handleEvent (this=0x107ff10, event=2, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146 #18 0x006f01b3 in EThread::process_event (this=0x7524c010, e=0x1097e20, calling_code=2) at UnixEThread.cc:145 #19 0x006f058f in EThread::execute (this=0x7524c010) at UnixEThread.cc:224 #20 0x00512e16 in main (argv=0x7fffe718) at Main.cc:1659 (gdb) p in_bytes $9 = 7440486 (gdb) n 244 in_bytes += TSHttpTxnClientReqBodyBytesGet(txnp); (gdb) p in_bytes $10 = 435 (gdb) n 245 out_bytes = TSHttpTxnClientRespHdrBytesGet(txnp); (gdb) p in_bytes $11 = 435 (gdb) n 246 out_bytes += TSHttpTxnClientRespBodyBytesGet(txnp); (gdb) p out_bytes $12 = 100 (gdb) n 247 total_out_bytes += out_bytes; (gdb) p out_bytes $13 = 25122 (gdb) n 248 total_in_bytes += in_bytes; (gdb) p out_bytes $14 = 25122 Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243 243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp); (gdb) backtrace #0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243 #1 0x72ed6831 in tcp_info_hook (contp=0x1117990, 
event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274 #2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997 #3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146 #4 0x004f4090 in APIHook::invoke
[jira] [Updated] (TS-2237) URL encoding wrong in squid.blog
[ https://issues.apache.org/jira/browse/TS-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-2237: Fix Version/s: (was: 5.3.0) 6.0.0 URL encoding wrong in squid.blog Key: TS-2237 URL: https://issues.apache.org/jira/browse/TS-2237 Project: Traffic Server Issue Type: Bug Components: Logging Reporter: David Carlin Assignee: Alan M. Carroll Priority: Minor Labels: yahoo Fix For: 6.0.0 Attachments: TS-2237.diff I was replaying URLs captured from squid.blog and I noticed I was getting 404's for some of them when squid.blog showed a 200 for that request. Turns out there is an issue with URL encoding. For example: Requesting file 'duck%20sports%20authority.gif' via curl will put this in the logs: duck%2520sports%2520authority.gif The % from %20 (space) in the request is being converted to %25 resulting in %2520 I tested both the %cquc and %cquuc log fields - same thing happens. I tested on ATS 3.2.0 and 3.3.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
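The double encoding reported in TS-2237 above (%20 logged as %2520) is what happens when an escaper treats every '%' as a raw character. A minimal sketch of one way to avoid it follows. It is not the ATS logging code or the attached TS-2237.diff; the function name escape_once and the set of characters it escapes are assumptions for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical escaper, not the ATS implementation: percent-encodes unsafe
// bytes, but leaves a '%' alone when it already starts a valid %XX escape.
inline std::string escape_once(const std::string &in)
{
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(in.size());

  for (std::size_t i = 0; i < in.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(in[i]);

    // True when in[i..i+2] looks like "%XX" with two hex digits.
    const bool valid_escape = c == '%' && i + 2 < in.size() &&
                              std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                              std::isxdigit(static_cast<unsigned char>(in[i + 2]));

    // Assumed unsafe set: controls, space, non-ASCII, quotes, and stray '%'.
    const bool needs_escape = c <= 0x20 || c >= 0x7f || c == '"' || (c == '%' && !valid_escape);

    if (needs_escape) {
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0f];
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}
```

With this approach 'duck%20sports%20authority.gif' is passed through unchanged, while a stray '%' that is not followed by two hex digits is still written as %25. Whether a logger should preserve existing escapes at all is a policy decision that this sketch does not settle.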
[jira] [Commented] (TS-2134) SRV lookup does not handle failover correctly
[ https://issues.apache.org/jira/browse/TS-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372004#comment-14372004 ] Phil Sorber commented on TS-2134: - Moving out to 6.0.0. SRV lookup does not handle failover correctly - Key: TS-2134 URL: https://issues.apache.org/jira/browse/TS-2134 Project: Traffic Server Issue Type: Bug Components: DNS, HTTP Reporter: Thach Tran Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: ats.log, ts2134.patch I'm seeing an issue with SRV lookup in ATS in which the proxy doesn't fail over to alternative origins once the first choice is marked as down. To reproduce this, I'm running dnsmasq as a local resolver to serve up the test SRV records. My configuration is as follows. h4. records.config CONFIG proxy.config.dns.nameservers STRING 127.0.0.1 CONFIG proxy.config.dns.resolv_conf STRING NULL CONFIG proxy.config.srv_enabled INT 1 h4. remap.config regex_remap http://.*:8080/ https://noexample.com/ h4. dnsmasq.conf (srv records config) srv-host=_http._tcp.noexample.com,abc.com,443,0,100 srv-host=_http._tcp.noexample.com,google.com,443,1,100 The intention is since the srv lookup for _http._tcp.noexample.com returns abc.com:443 and google.com:443 with abc.com:443 being the one with higher priority, the proxy should try that first and once the connection to abc.com:443 is marked as down (up to 6 retries by default), google.com:443 should be tried next and the connection should succeed then. However, testing with the following curl command multiple times still gives back 502. $ curl -v http://localhost:8080/ Debug log seems to suggest it always attempts abc.com:443. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
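The failover behaviour expected in TS-2134 above can be sketched as follows. This is not the HostDB code or the attached ts2134.patch; the SrvTarget struct and pick_target function are hypothetical. The sketch assumes the RFC 2782 convention that a lower priority value is tried first, and it simplifies weight handling to "highest weight wins" rather than the weighted random selection the RFC describes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record type for illustration only.
struct SrvTarget {
  std::string host;
  uint16_t port;
  uint16_t priority; // lower value is tried first (RFC 2782)
  uint16_t weight;
  bool down;         // set once connection attempts to this target are exhausted
};

// Returns the best target that is not marked down, or nullptr if none is left.
inline const SrvTarget *pick_target(const std::vector<SrvTarget> &targets)
{
  const SrvTarget *best = nullptr;
  for (const SrvTarget &t : targets) {
    if (t.down) {
      continue; // skip origins already marked as down
    }
    const bool better = best == nullptr || t.priority < best->priority ||
                        (t.priority == best->priority && t.weight > best->weight);
    if (better) {
      best = &t;
    }
  }
  return best;
}
```

With the two records from the dnsmasq.conf in the report, this picks abc.com:443 first and google.com:443 once abc.com is marked down, which is the behaviour the reporter expected.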
[jira] [Commented] (TS-2237) URL encoding wrong in squid.blog
[ https://issues.apache.org/jira/browse/TS-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372006#comment-14372006 ] Phil Sorber commented on TS-2237: - Moving out to 6.0.0. URL encoding wrong in squid.blog Key: TS-2237 URL: https://issues.apache.org/jira/browse/TS-2237 Project: Traffic Server Issue Type: Bug Components: Logging Reporter: David Carlin Assignee: Alan M. Carroll Priority: Minor Labels: yahoo Fix For: 6.0.0 Attachments: TS-2237.diff I was replaying URLs captured from squid.blog and I noticed I was getting 404's for some of them when squid.blog showed a 200 for that request. Turns out there is an issue with URL encoding. For example: Requesting file 'duck%20sports%20authority.gif' via curl will put this in the logs: duck%2520sports%2520authority.gif The % from %20 (space) in the request is being converted to %25 resulting in %2520 I tested both the %cquc and %cquuc log fields - same thing happens. I tested on ATS 3.2.0 and 3.3.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3104) traffic_cop can't restart traffic_manager properly
[ https://issues.apache.org/jira/browse/TS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372009#comment-14372009 ] Phil Sorber commented on TS-3104: - Moving out to 6.0.0. traffic_cop can't restart traffic_manager properly -- Key: TS-3104 URL: https://issues.apache.org/jira/browse/TS-3104 Project: Traffic Server Issue Type: Bug Components: Cop Reporter: Victor Assignee: James Peach Fix For: 6.0.0 Attachments: ts-0022-fix-lockfile-killgroup.patch, ts-0023-cop-reinit-mgr-api-on-failure.patch In some cases traffic_cop can't restart traffic_manager properly. We met these issues at Ashmanov and partners (http://en.ashmanov.com/). There are two places in code which in my opinion need corrections: 1) The logic which decides whether to kill process or group. 2) The main traffic_cop loop: it doesn't reinitialize manager API in case of failure and this fact leads to constant attempts to connect to manager using socket id == -1. I have prepared patches for both issues. Please kindly take a look at them and let me know your thoughts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-2411) TS Http byte get functions does not return the true number, for server response body byte get
[ https://issues.apache.org/jira/browse/TS-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-2411: Fix Version/s: (was: 5.3.0) 6.0.0 TS Http byte get functions does not return the true number, for server response body byte get - Key: TS-2411 URL: https://issues.apache.org/jira/browse/TS-2411 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Roee Gil Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: TS-2411.diff When using the null-transform example, adding TS_EVENT_HTTP_TXN_CLOSE to the hooks, and counting bytes, I get: // server -> proxy TSHttpTxnServerRespHdrBytesGet(txnDB); TSHttpTxnServerRespBodyBytesGet(txnDB); // proxy -> client TSHttpTxnClientRespHdrBytesGet(txnDB); TSHttpTxnClientRespBodyBytesGet(txnDB); 1. server side response body = 0 2. client side response body = (payload size) When inspecting this issue, it seems that the VConnection is downloading the content, but this is not counted in the server response body byte count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3252) Don't chunk response body if transform_response_cl is valid
[ https://issues.apache.org/jira/browse/TS-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3252: Fix Version/s: (was: 5.3.0) 6.0.0 Don't chunk response body if transform_response_cl is valid --- Key: TS-3252 URL: https://issues.apache.org/jira/browse/TS-3252 Project: Traffic Server Issue Type: Improvement Components: Core, HTTP Reporter: portl4t Assignee: Leif Hedstrom Labels: Review Fix For: 6.0.0 Attachments: 0001-TS-3252-Don-t-chunk-response-body-if-transform_respo.patch As I see it, the client will get a chunked response from ATS if the origin server issues a chunked response to ATS. I am wondering whether this can be changed if a transform plugin exists and the transform can ensure that transform_response_cl is valid. This can be done by calling TSVConnWrite(...) with a valid nbytes (not INT64_MAX) in the transform handler. Here is an example: I want to respond with abcdefg in the transform handler, no matter what is received from upstream, and I can write code like this in the plugin: {code} static int transform_handler(...) { ... output.buffer = TSIOBufferCreate(); output.reader = TSIOBufferReaderAlloc(output.buffer); output.vio = TSVConnWrite(output_conn, contp, output.reader, sizeof("abcdefg")-1); TSIOBufferWrite(output.buffer, "abcdefg", sizeof("abcdefg")-1); TSVIOReenable(output.vio); ... } {code} However, the response body sent to the client will still be chunked if ATS got a chunked response from the origin server. Maybe we can change this by refining the function HttpTransact::handle_response_keep_alive_headers(...) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
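The decision proposed in TS-3252 above can be sketched in isolation. This is not the logic of HttpTransact::handle_response_keep_alive_headers or of the attached patch; the names Framing, kUndefinedLength and choose_framing are hypothetical, and the sentinel value for "length unknown" is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not ATS identifiers.
enum class Framing { ContentLength, Chunked };

// Assumed sentinel meaning "the transform did not announce a length".
constexpr int64_t kUndefinedLength = -1;

// Picks how the response body to the client is framed.
inline Framing choose_framing(bool has_transform, int64_t transform_response_cl, bool origin_chunked)
{
  // A transform that announced a valid length makes chunking unnecessary,
  // even if the origin response itself was chunked.
  if (has_transform && transform_response_cl != kUndefinedLength) {
    return Framing::ContentLength;
  }
  // Otherwise mirror the origin; this assumes a non-chunked origin response
  // carried its own Content-Length.
  return origin_chunked ? Framing::Chunked : Framing::ContentLength;
}
```

In the abcdefg example from the issue, the transform writes 7 bytes, so a call such as choose_framing(true, 7, true) would select Content-Length framing in this sketch.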
[jira] [Updated] (TS-3104) traffic_cop can't restart traffic_manager properly
[ https://issues.apache.org/jira/browse/TS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3104: Fix Version/s: (was: 5.3.0) 6.0.0 traffic_cop can't restart traffic_manager properly -- Key: TS-3104 URL: https://issues.apache.org/jira/browse/TS-3104 Project: Traffic Server Issue Type: Bug Components: Cop Reporter: Victor Assignee: James Peach Fix For: 6.0.0 Attachments: ts-0022-fix-lockfile-killgroup.patch, ts-0023-cop-reinit-mgr-api-on-failure.patch In some cases traffic_cop can't restart traffic_manager properly. We met these issues at Ashmanov and partners (http://en.ashmanov.com/). There are two places in code which in my opinion need corrections: 1) The logic which decides whether to kill process or group. 2) The main traffic_cop loop: it doesn't reinitialize manager API in case of failure and this fact leads to constant attempts to connect to manager using socket id == -1. I have prepared patches for both issues. Please kindly take a look at them and let me know your thoughts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2411) TS Http byte get functions does not return the true number, for server response body byte get
[ https://issues.apache.org/jira/browse/TS-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372013#comment-14372013 ] Phil Sorber commented on TS-2411: - Moving out to 6.0.0. TS Http byte get functions does not return the true number, for server response body byte get - Key: TS-2411 URL: https://issues.apache.org/jira/browse/TS-2411 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Roee Gil Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: TS-2411.diff When using the null-transform example, adding TS_EVENT_HTTP_TXN_CLOSE to the hooks, and counting bytes, I get: // server -> proxy TSHttpTxnServerRespHdrBytesGet(txnDB); TSHttpTxnServerRespBodyBytesGet(txnDB); // proxy -> client TSHttpTxnClientRespHdrBytesGet(txnDB); TSHttpTxnClientRespBodyBytesGet(txnDB); 1. server side response body = 0 2. client side response body = (payload size) When inspecting this issue, it seems that the VConnection is downloading the content, but this is not counted in the server response body byte count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3448) Add an internal Mod to ControlMatcher (a boolean value)
[ https://issues.apache.org/jira/browse/TS-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372019#comment-14372019 ] Phil Sorber commented on TS-3448: - Moving out to 6.0.0. Add an internal Mod to ControlMatcher (a boolean value) - Key: TS-3448 URL: https://issues.apache.org/jira/browse/TS-3448 Project: Traffic Server Issue Type: New Feature Components: Configuration, Core Reporter: Leif Hedstrom Assignee: Leif Hedstrom Fix For: 6.0.0 This allows, as an example, exclusion of parent.config for requests that are internal. Or, different cache.config rules for internal requests. Example usage could be {code} dest_domain=. parent=proxy1.example.com:8080; proxy2.example.com:8080 internal=false {code} This would allow this rule to only trigger if the request is not an internal (plugin) request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
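The matching rule described in TS-3448 above can be sketched as a tri-state check. This is not the ControlMatcher implementation; the InternalMod enum and internal_mod_matches function are hypothetical names. The sketch assumes that a rule without an internal= modifier matches both internal and external requests.

```cpp
#include <cassert>

// Hypothetical tri-state for the proposed modifier: a rule may omit
// internal= entirely, or set it to true or false.
enum class InternalMod { Unset, False, True };

// Returns whether a rule carrying `mod` applies to a request that is
// (or is not) an internal, plugin-generated request.
inline bool internal_mod_matches(InternalMod mod, bool request_is_internal)
{
  switch (mod) {
  case InternalMod::Unset:
    return true; // no modifier: the rule applies to every request
  case InternalMod::True:
    return request_is_internal;
  case InternalMod::False:
    return !request_is_internal;
  }
  return true; // unreachable for valid enum values
}
```

For the parent.config example in the issue, internal=false corresponds to InternalMod::False, so the rule is skipped for plugin-generated requests.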
[jira] [Commented] (TS-2907) Unix Sockets
[ https://issues.apache.org/jira/browse/TS-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372021#comment-14372021 ] Phil Sorber commented on TS-2907: - Moving out to 6.0.0. Unix Sockets Key: TS-2907 URL: https://issues.apache.org/jira/browse/TS-2907 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Luca Rea Assignee: Brian Geffon Labels: review Fix For: 6.0.0 Attachments: TS-2907.diff, unixsocket-backpost.diff Feature request to support listeners and parents on Unix sockets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3448) Add an internal Mod to ControlMatcher (a boolean value)
[ https://issues.apache.org/jira/browse/TS-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3448: Fix Version/s: (was: 5.3.0) 6.0.0 Add an internal Mod to ControlMatcher (a boolean value) - Key: TS-3448 URL: https://issues.apache.org/jira/browse/TS-3448 Project: Traffic Server Issue Type: New Feature Components: Configuration, Core Reporter: Leif Hedstrom Assignee: Leif Hedstrom Fix For: 6.0.0 This allows, as an example, exclusion of parent.config for requests that are internal. Or, different cache.config rules for internal requests. Example usage could be {code} dest_domain=. parent=proxy1.example.com:8080; proxy2.example.com:8080 internal=false {code} This would allow this rule to only trigger if the request is not an internal (plugin) request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-2848) ATS crash in HttpSM::release_server_session
[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-2848: Fix Version/s: (was: 5.3.0) 6.0.0 ATS crash in HttpSM::release_server_session --- Key: TS-2848 URL: https://issues.apache.org/jira/browse/TS-2848 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Feifei Cai Assignee: Alan M. Carroll Labels: crash, review, yahoo Fix For: 6.0.0 Attachments: TS-2848.diff We deploy ATS on production hosts, and noticed crashes with the following stack trace. This happens not very frequently, about 1 week or even longer. It crashes repeatedly in the last 2 months, however, the root cause is not found and we can not reproduce the crash as wish, only wait for it happens. {noformat} NOTE: Traffic Server received Sig 11: Segmentation fault /home/y/bin/traffic_server - STACK TRACE: /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500] /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e] 
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2] /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93] /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f] /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373] /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d] /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944] /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c36d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] 
/home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828] /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server[0x68606b] /home/y/bin/traffic_server[0x688a14] /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582] /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf] /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3] /home/y/bin/traffic_server[0x6a785a] /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851]
[jira] [Commented] (TS-2848) ATS crash in HttpSM::release_server_session
[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372017#comment-14372017 ] Phil Sorber commented on TS-2848: - Moving out to 6.0.0. ATS crash in HttpSM::release_server_session --- Key: TS-2848 URL: https://issues.apache.org/jira/browse/TS-2848 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Feifei Cai Assignee: Alan M. Carroll Labels: crash, review, yahoo Fix For: 6.0.0 Attachments: TS-2848.diff We deploy ATS on production hosts, and noticed crashes with the following stack trace. This happens not very frequently, about 1 week or even longer. It crashes repeatedly in the last 2 months, however, the root cause is not found and we can not reproduce the crash as wish, only wait for it happens. {noformat} NOTE: Traffic Server received Sig 11: Segmentation fault /home/y/bin/traffic_server - STACK TRACE: /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500] /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] 
/home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e] /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2] /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93] /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f] /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373] /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d] /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944] /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c36d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] 
/home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828] /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server[0x68606b] /home/y/bin/traffic_server[0x688a14] /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582] /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf] /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3] /home/y/bin/traffic_server[0x6a785a] /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851]
[jira] [Commented] (TS-1334) congestion control - observed issues
[ https://issues.apache.org/jira/browse/TS-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372018#comment-14372018 ] Phil Sorber commented on TS-1334: - Moving out to 6.0.0. congestion control - observed issues Key: TS-1334 URL: https://issues.apache.org/jira/browse/TS-1334 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 3.0.2 Reporter: Aidan McGurn Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: TS-1334.diff Hi, I have investigated using ATS congestion control and have some observations. I can split these out if they are bugs that need separate attention. (Queries are against ATS v3.0.2 as test code, assuming not much changed here for v3.2.) • Is it feasible for a new congestion hook to be added to the architecture at some point, i.e. for these events: CONGESTION_EVENT_CONGESTED_ON_F CONGESTION_EVENT_CONGESTED_ON_M It would be desirable to send a hook event upwards to inform any plugins of a congested site. • How is the congestion cache managed? I don’t see it deleting entries. I set breakpoints in the function remove_congested_entry in CongestionDB.cc, then congested and uncongested a site, but I never saw this function called. Does the cache therefore grow and grow with old entries? The reason for checking this is that I would also need to inform plugin land when a site becomes UNCONGESTED, but I don’t even see an HttpSM event for this. (This is the biggest issue with congestion control for me.) • traffic_line -q doesn’t appear to work, i.e. no congested stats are returned. There is a Jira that has been open for a long time on this without further response: https://issues.apache.org/jira/browse/TS-1221 • Some other, less important observations: parameters like live_os_conn_retries live_os_conn_timeout dead_os_conn_timeout dead_os_conn_retries appear to have no effect whatsoever, but this is not as important as the previous points. • It doesn't look like the status response code can be customised. Maybe this is not supported much as an ATS feature? Any pointers on any of these are appreciated, even just to let me know whether the observations are correct and won't be fixed in coming releases… Thanks, /aidan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-2907) Unix Sockets
[ https://issues.apache.org/jira/browse/TS-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-2907: Fix Version/s: (was: 5.3.0) 6.0.0 Unix Sockets Key: TS-2907 URL: https://issues.apache.org/jira/browse/TS-2907 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Luca Rea Assignee: Brian Geffon Labels: review Fix For: 6.0.0 Attachments: TS-2907.diff, unixsocket-backpost.diff Feature request to support listeners and parents on Unix sockets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3384) Add stats for OCSP Stapling errors
[ https://issues.apache.org/jira/browse/TS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3384: Fix Version/s: (was: 5.3.0) 6.0.0 Add stats for OCSP Stapling errors -- Key: TS-3384 URL: https://issues.apache.org/jira/browse/TS-3384 Project: Traffic Server Issue Type: Improvement Components: SSL Reporter: Feifei Cai Assignee: Bryan Call Labels: review Fix For: 6.0.0 Attachments: TS-3384.diff # Add stats for bad OCSP response status: revoked or unknown. {noformat} $ traffic_line -m proxy.process.ssl.ssl_ocsp proxy.process.ssl.ssl_ocsp_revoked_cert_stat 0 proxy.process.ssl.ssl_ocsp_unknown_cert_stat 0 {noformat} {noformat} OCSP_resp_find_status(bs, cinf->cid, &status, &reason, &rev, &thisupd, &nextupd); switch (status) { case V_OCSP_CERTSTATUS_GOOD: break; case V_OCSP_CERTSTATUS_REVOKED: SSL_INCREMENT_DYN_STAT(ssl_ocsp_revoked_cert_stat); break; case V_OCSP_CERTSTATUS_UNKNOWN: SSL_INCREMENT_DYN_STAT(ssl_ocsp_unknown_cert_stat); break; default: break; } {noformat} # Change the debug tag in OCSP Stapling to ssl_ocsp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-1334) congestion control - observed issues
[ https://issues.apache.org/jira/browse/TS-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-1334: Fix Version/s: (was: 5.3.0) 6.0.0 congestion control - observed issues Key: TS-1334 URL: https://issues.apache.org/jira/browse/TS-1334 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 3.0.2 Reporter: Aidan McGurn Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: TS-1334.diff Hi, I have investigated using ATS congestion control and have some observations. I can split these out if they are bugs that need separate attention. (Queries are against ATS v3.0.2 as test code, assuming not much changed here for v3.2.) • Is it feasible for a new congestion hook to be added to the architecture at some point, i.e. for these events: CONGESTION_EVENT_CONGESTED_ON_F CONGESTION_EVENT_CONGESTED_ON_M It would be desirable to send a hook event upwards to inform any plugins of a congested site. • How is the congestion cache managed? I don’t see it deleting entries. I set breakpoints in the function remove_congested_entry in CongestionDB.cc, then congested and uncongested a site, but I never saw this function called. Does the cache therefore grow and grow with old entries? The reason for checking this is that I would also need to inform plugin land when a site becomes UNCONGESTED, but I don’t even see an HttpSM event for this. (This is the biggest issue with congestion control for me.) • traffic_line -q doesn’t appear to work, i.e. no congested stats are returned. There is a Jira that has been open for a long time on this without further response: https://issues.apache.org/jira/browse/TS-1221 • Some other, less important observations: parameters like live_os_conn_retries live_os_conn_timeout dead_os_conn_timeout dead_os_conn_retries appear to have no effect whatsoever, but this is not as important as the previous points. • It doesn't look like the status response code can be customised. Maybe this is not supported much as an ATS feature? Any pointers on any of these are appreciated, even just to let me know whether the observations are correct and won't be fixed in coming releases… Thanks, /aidan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3384) Add stats for OCSP Stapling errors
[ https://issues.apache.org/jira/browse/TS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372015#comment-14372015 ] Phil Sorber commented on TS-3384: - Moving out to 6.0.0. Add stats for OCSP Stapling errors -- Key: TS-3384 URL: https://issues.apache.org/jira/browse/TS-3384 Project: Traffic Server Issue Type: Improvement Components: SSL Reporter: Feifei Cai Assignee: Bryan Call Labels: review Fix For: 6.0.0 Attachments: TS-3384.diff
# Add stats for bad OCSP response status: revoked or unknown.
{noformat}
$ traffic_line -m proxy.process.ssl.ssl_ocsp
proxy.process.ssl.ssl_ocsp_revoked_cert_stat 0
proxy.process.ssl.ssl_ocsp_unknown_cert_stat 0
{noformat}
{noformat}
OCSP_resp_find_status(bs, cinf->cid, &status, &reason, &rev, &thisupd, &nextupd);
switch (status) {
case V_OCSP_CERTSTATUS_GOOD:
  break;
case V_OCSP_CERTSTATUS_REVOKED:
  SSL_INCREMENT_DYN_STAT(ssl_ocsp_revoked_cert_stat);
  break;
case V_OCSP_CERTSTATUS_UNKNOWN:
  SSL_INCREMENT_DYN_STAT(ssl_ocsp_unknown_cert_stat);
  break;
default:
  break;
}
{noformat}
# Change debug tag in OCSP Stapling to ssl_ocsp.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-2134) SRV lookup does not handle failover correctly
[ https://issues.apache.org/jira/browse/TS-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-2134: Fix Version/s: (was: 5.3.0) 6.0.0 SRV lookup does not handle failover correctly - Key: TS-2134 URL: https://issues.apache.org/jira/browse/TS-2134 Project: Traffic Server Issue Type: Bug Components: DNS, HTTP Reporter: Thach Tran Assignee: Alan M. Carroll Labels: review Fix For: 6.0.0 Attachments: ats.log, ts2134.patch
I'm seeing an issue with SRV lookup in ATS in which the proxy doesn't fail over to alternative origins once the first choice is marked as down. To reproduce this, I'm running dnsmasq as a local resolver to serve up the test SRV records. My configuration is as follows.
h4. records.config
CONFIG proxy.config.dns.nameservers STRING 127.0.0.1
CONFIG proxy.config.dns.resolv_conf STRING NULL
CONFIG proxy.config.srv_enabled INT 1
h4. remap.config
regex_remap http://.*:8080/ https://noexample.com/
h4. dnsmasq.conf (srv records config)
srv-host=_http._tcp.noexample.com,abc.com,443,0,100
srv-host=_http._tcp.noexample.com,google.com,443,1,100
The intention is that, since the SRV lookup for _http._tcp.noexample.com returns abc.com:443 and google.com:443, with abc.com:443 having the higher priority, the proxy should try abc.com:443 first. Once the connection to abc.com:443 is marked as down (up to 6 retries by default), google.com:443 should be tried next, and that connection should succeed. However, testing with the following curl command multiple times still gives back 502.
$ curl -v http://localhost:8080/
The debug log seems to suggest it always attempts abc.com:443.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
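For readers following TS-2134, the failover behaviour the reporter expects can be sketched roughly as below. This is only an illustration under stated assumptions, not the ATS implementation: SrvTarget and pick_target are hypothetical names, a lower priority value is treated as preferred (as in RFC 2782), and the weight field is carried but ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type for illustration; not an ATS structure.
struct SrvTarget {
  std::string host;
  int port;
  int priority; // lower value is preferred (RFC 2782)
  int weight;   // carried along but ignored in this sketch
  bool down;    // set once retries against this target are exhausted
};

// Return the most preferred target that is not marked down,
// or a null pointer when every target is down.
const SrvTarget *
pick_target(const std::vector<SrvTarget> &targets)
{
  const SrvTarget *best = NULL;
  for (std::size_t i = 0; i < targets.size(); ++i) {
    const SrvTarget &t = targets[i];
    if (t.down) {
      continue; // skip origins already marked as down
    }
    if (best == NULL || t.priority < best->priority) {
      best = &t;
    }
  }
  return best;
}
```

With the two dnsmasq records from the report, pick_target would return abc.com:443 until it is marked down and google.com:443 afterwards, which is the behaviour the reporter says is not observed.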
[jira] [Commented] (TS-3456) SSL blind tunnel sometimes not created
[ https://issues.apache.org/jira/browse/TS-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372190#comment-14372190 ] Susan Hinrichs commented on TS-3456: Got the Tsung environment running. I think I'm seeing the error now. In the tsung.log do you get counts for error_unknown? Thought I had a solution. It seems like the SSLNetVC version of reenable ought to reset the reenable flag as well like the UnixNetVConnection version does. However, once I tidied up my solution the error_unknown's returned. I must head out until later this weekend. Hopefully, we can get you a solution to try then. Alan also mentioned that another person had to patch the core to get their SSL plugins working. He's reaching out to that person for details. SSL blind tunnel sometimes not created --- Key: TS-3456 URL: https://issues.apache.org/jira/browse/TS-3456 Project: Traffic Server Issue Type: Bug Components: Plugins, SSL Reporter: Lev Stipakov Assignee: Susan Hinrichs Fix For: 6.0.0 Attachments: ts-tls.cc Hello, I made a simple plugin that sets up TS_SSL_SNI_HOOK and creates a blind tunnel from a separate thread. With low load everything works fine, but with moderate load (100 simultaneous users, each user sends 200 HTTPS requests) I see somewhat strange behavior. On a client side I use Tsung, which creates users and sends number of requests per user. For each user Tsung waits for a response before sending a new request, so if response never arrives, a particular user (and the whole test) stalls. So, with load mentioned above I see few 'stalled' connections on both client and proxy – netstat shows them as ”established”, ATS seems to have data structures for those (checked proxy.process.net.connections_currently_open value), but no traffic goes between proxy and client. 
Client side (.175): tcp 0 0 10.133.3.175:40737 10.133.3.250:443 ESTABLISHED 14332/beam.smp (more similar connections here) Proxy side (.250 is a server): tcp 0 0 10.133.3.250:443 10.133.3.175:40737 ESTABLISHED 28117/traffic_serve (more similar connections here) I checked traffic.out log and found out that ”SSLNextProtocolAccept:mainEvent” does not get called as many times as it should. This can probably be explained by the fact that client does not send requests for given user anymore if response to previous request hasn't been received. Which, in turn, may indicate that at some point tunnel has not been created. The interesting thing is that everything works fine if a tunnel is created directly from TS_SSL_SNI_HOOK but not from the separate thread. The plugin code is very simple – I set up TS_SSL_SNI_HOOK and start a thread with TSThreadCreate. When hook got called, I push TSVConn to a thread-safe queue. The thread wakes up when item has been pushed, calls TSVConnTunnel / TSVConnReenable for given vconn and then waits for the next item. I have attached the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
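The hand-off described in the report (the SNI hook pushes a TSVConn onto a thread-safe queue and a separate thread pops it) might look roughly like the sketch below. It is not the attached ts-tls.cc; WorkQueue is a hypothetical name, and the ATS calls are only mentioned in comments so the fragment stays self-contained.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

// Hypothetical blocking queue; T would be TSVConn in the plugin.
template <typename T> class WorkQueue
{
public:
  // Called from the hook: enqueue an item and wake one waiting worker.
  void
  push(T item)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push_back(item);
    }
    cond_.notify_one();
  }

  // Called from the worker thread: block until an item is available.
  T
  pop()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this] { return !items_.empty(); });
    T item = items_.front();
    items_.pop_front();
    return item;
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::deque<T> items_;
};
```

In the plugin as described, the worker loop would call pop() and then TSVConnTunnel / TSVConnReenable on the returned vconn. The comment above suggests the stall is related to how reenable is handled when it is called from outside the hook, so the queue itself is probably not the culprit.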
[jira] [Updated] (TS-3456) SSL blind tunnel sometimes not created
[ https://issues.apache.org/jira/browse/TS-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs updated TS-3456: --- Attachment: ts-3456.diff Attaching the diff of what I was working with reverted to its less tidy form. Still not working though. SSL blind tunnel sometimes not created --- Key: TS-3456 URL: https://issues.apache.org/jira/browse/TS-3456 Project: Traffic Server Issue Type: Bug Components: Plugins, SSL Reporter: Lev Stipakov Assignee: Susan Hinrichs Fix For: 6.0.0 Attachments: ts-3456.diff, ts-tls.cc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3460) Traffic Server doesn't count bytes transferred correctly on SSL Sessions
[ https://issues.apache.org/jira/browse/TS-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom updated TS-3460: -- Fix Version/s: sometime Traffic Server doesn't count bytes transferred correctly on SSL Sessions Key: TS-3460 URL: https://issues.apache.org/jira/browse/TS-3460 Project: Traffic Server Issue Type: Bug Components: SSL Reporter: Kunal Gulati Fix For: sometime Attachments: Archive.zip
The following APIs return incorrect values for SSL sessions (in case the client hasn't yet sent a TCP FIN, or it is a long-running SSL session, e.g. a messenger):
TSHttpTxnClientReqHdrBytesGet(txnp);
TSHttpTxnClientReqBodyBytesGet(txnp);
TSHttpTxnClientRespHdrBytesGet(txnp);
TSHttpTxnClientRespBodyBytesGet(txnp);
This issue is reproducible on ATS 4.2, 4.3, 5.2 and Tip of Tree. (Not sure how to add stack trace and wireshark captures.)
Not working
Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp);
(gdb) p in_bytes
$8 = 7440486
(gdb) backtrace
#0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
#1 0x72ed6831 in tcp_info_hook (contp=0x1117990, event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274
#2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997
#3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146
#4 0x004f4090 in APIHook::invoke (this=0x11189f0, event=60012, edata=0x70bad9f0) at InkAPI.cc:1216
#5 0x0056903e in HttpSM::state_api_callout (this=0x70bad9f0, event=0, data=0x0) at HttpSM.cc:1410
#6 0x0057559f in HttpSM::do_api_callout_internal (this=0x70bad9f0) at HttpSM.cc:4767
#7 0x00581ebe in HttpSM::do_api_callout (this=0x70bad9f0) at HttpSM.cc:497
#8 0x0057abfc in HttpSM::kill_this (this=0x70bad9f0) at HttpSM.cc:6443
#9 0x0056ccc4 in HttpSM::main_handler (this=0x70bad9f0, event=2301, data=0x70baf5f8) at HttpSM.cc:2545
#10 0x004ea812 in Continuation::handleEvent (this=0x70bad9f0, event=2301, data=0x70baf5f8) at ../iocore/eventsystem/I_Continuation.h:146
#11 0x005b814c in HttpTunnel::main_handler (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at HttpTunnel.cc:1504
#12 0x004ea812 in Continuation::handleEvent (this=0x70baf5f8, event=105, data=0x7fffec015ed0) at ../iocore/eventsystem/I_Continuation.h:146
#13 0x006cd798 in read_signal_and_update (event=105, vc=0x7fffec015dc0) at UnixNetVConnection.cc:138
#14 0x006d0bd2 in UnixNetVConnection::mainEvent (this=0x7fffec015dc0, event=1, e=0x1097e20) at UnixNetVConnection.cc:1063
#15 0x004ea812 in Continuation::handleEvent (this=0x7fffec015dc0, event=1, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146
#16 0x006c7de9 in InactivityCop::check_inactivity (this=0x107ff10, event=2, e=0x1097e20) at UnixNet.cc:67
#17 0x004ea812 in Continuation::handleEvent (this=0x107ff10, event=2, data=0x1097e20) at ../iocore/eventsystem/I_Continuation.h:146
#18 0x006f01b3 in EThread::process_event (this=0x7524c010, e=0x1097e20, calling_code=2) at UnixEThread.cc:145
#19 0x006f058f in EThread::execute (this=0x7524c010) at UnixEThread.cc:224
#20 0x00512e16 in main (argv=0x7fffe718) at Main.cc:1659
(gdb) p in_bytes
$9 = 7440486
(gdb) n
244 in_bytes += TSHttpTxnClientReqBodyBytesGet(txnp);
(gdb) p in_bytes
$10 = 435
(gdb) n
245 out_bytes = TSHttpTxnClientRespHdrBytesGet(txnp);
(gdb) p in_bytes
$11 = 435
(gdb) n
246 out_bytes += TSHttpTxnClientRespBodyBytesGet(txnp);
(gdb) p out_bytes
$12 = 100
(gdb) n
247 total_out_bytes += out_bytes;
(gdb) p out_bytes
$13 = 25122
(gdb) n
248 total_in_bytes += in_bytes;
(gdb) p out_bytes
$14 = 25122
Breakpoint 1, debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
243 in_bytes = TSHttpTxnClientReqHdrBytesGet(txnp);
(gdb) backtrace
#0 debug_print_session (txnp=0x70bad9f0) at ./tcpinfo.cc:243
#1 0x72ed6831 in tcp_info_hook (contp=0x1117990, event=TS_EVENT_HTTP_TXN_CLOSE, edata=0x70bad9f0) at ./tcpinfo.cc:274
#2 0x004f3771 in INKContInternal::handle_event (this=0x1117990, event=60012, edata=0x70bad9f0) at InkAPI.cc:997
#3 0x004ea812 in Continuation::handleEvent (this=0x1117990, event=60012, data=0x70bad9f0) at ../iocore/eventsystem/I_Continuation.h:146
#4 0x004f4090 in APIHook::invoke (this=0x11189f0, event=60012,
[jira] [Updated] (TS-2894) Spdy slow start..
[ https://issues.apache.org/jira/browse/TS-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom updated TS-2894: -- Fix Version/s: (was: 5.3.0) Spdy slow start.. - Key: TS-2894 URL: https://issues.apache.org/jira/browse/TS-2894 Project: Traffic Server Issue Type: Improvement Components: SPDY Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Labels: yahoo Attachments: TS-2894.diff
When production testing with spdy/5.0.0, we ran into an issue on some of our systems where the spdy hosts would flap constantly due to the flood of requests. We further noticed that, whereas hosts running the 4.0.x version, or 5.0.0 with spdy turned off, would recover quickly following a restart, spdy-enabled hosts would continue to receive a flood of requests and continue to flap. During this time, traffic server is generally busy reading from the disk and cannot handle too many requests, and is made miserable by spdy's support of multiple concurrent streams.
To handle such a sudden flood of requests, I'm implementing a simple slow start mechanism with spdy. The idea is to increase max_concurrent_streams_in gradually based on a configured timer, rather than use the configured value right away. The steps I chose to implement are 1, 25, 50, 75 and 100% of the configured max_concurrent_streams_in. Note that, currently, max_concurrent_streams_in only affects new spdy sessions. Existing sessions (if any) would continue to use their older values.
Not too sure if everyone would be interested in this, but I thought I'd still upload my patch in case someone is interested.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
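The stepped ramp described in TS-2894 can be illustrated with a small sketch. This is not the attached TS-2894.diff: slow_start_limit is a hypothetical helper, elapsed_steps is assumed to be the number of timer intervals that have passed since restart, and clamping the result to at least one stream is an assumption added here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: the stream limit to apply to new sessions after
// `elapsed_steps` timer intervals, ramping through 1, 25, 50, 75 and 100%
// of the configured max_concurrent_streams_in.
int
slow_start_limit(int configured_max, std::size_t elapsed_steps)
{
  static const int percent[] = {1, 25, 50, 75, 100};
  const std::size_t last = sizeof(percent) / sizeof(percent[0]) - 1;
  const std::size_t idx  = std::min(elapsed_steps, last);
  // Assumption: never advertise fewer than one concurrent stream.
  return std::max(1, configured_max * percent[idx] / 100);
}
```

For a configured value of 100, the limit would step through 1, 25, 50, 75 and 100 and then stay at 100 for all later intervals.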
[jira] [Commented] (TS-3364) Add command line config validation support to traffic_server
[ https://issues.apache.org/jira/browse/TS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372248#comment-14372248 ] Leif Hedstrom commented on TS-3364: --- If so, please also change the fix version. Add command line config validation support to traffic_server Key: TS-3364 URL: https://issues.apache.org/jira/browse/TS-3364 Project: Traffic Server Issue Type: Improvement Components: Configuration Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Fix For: 6.0.0
Currently, traffic_server fails to initialize when it encounters fatal errors in loading the config files during start up. During dynamic reloading of config files (e.g. via traffic_line), traffic_server rejects the new config and falls back to the existing/old config (however, if traffic_server subsequently crashes or restarts, it can again fail to initialize). This jira proposes to make the behavior of traffic_server when it encounters such fatal errors configurable via a new setting {{proxy.config.ignore_fatal_errors}} with the below options:
{code}
0 : All errors are fatal, do not load/reload
1 : Ignore a bad config line, continue with the rest
2 : Ignore a bad config line, stop parsing the file further
..
{code}
Based on concerns expressed, it has been agreed not to change traffic_server's behavior to load with a fatally broken config. Instead, this jira will be used to add a command line option to traffic_server to load and validate the config.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3455) Marking the cache STALE in lookup-complete causes abort()
[ https://issues.apache.org/jira/browse/TS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372253#comment-14372253 ] Leif Hedstrom commented on TS-3455: --- I tested this on a fresh install, with minimal changes and the plugin above. I could not get it to assert(). I see this in the diags:
{code}
root@loki 67/1 # ./bin/traffic_server -T maybebug
traffic_server: using root directory '/opt/ats'
[Mar 20 10:05:09.360] Server {0x7f6d14d4c880} DIAG: (maybebug) lookup complete: 0
[Mar 20 10:05:12.115] Server {0x7f6d13272700} DIAG: (maybebug) lookup complete: 2
[Mar 20 10:05:12.115] Server {0x7f6d13272700} DIAG: (maybebug) set stale
[Mar 20 10:05:14.309] Server {0x7f6d13171700} DIAG: (maybebug) lookup complete: 2
[Mar 20 10:05:14.309] Server {0x7f6d13171700} DIAG: (maybebug) set stale
{code}
Marking the cache STALE in lookup-complete causes abort() - Key: TS-3455 URL: https://issues.apache.org/jira/browse/TS-3455 Project: Traffic Server Issue Type: Bug Reporter: Luca Bruno Fix For: 6.0.0
I've written a simple test case plugin for demonstrating this problem, not sure if it's a problem on my side, but that would also mean that the regex invalidate plugin would also abort(). What the plugin does: in LOOKUP_COMPLETE, if the cache status is FRESH then set it to STALE. To reproduce:
1) Send a first cacheable request to ATS, which gets cached.
2) Request again the same url, the plugin triggers and set the cache to STALE. Then ATS does abort().
Plugin code:
{noformat}
#include <ts/ts.h>
#include <ts/remap.h>
#include <ts/experimental.h>
#include <stdlib.h>
#include <stdio.h>
#include <getopt.h>
#include <string.h>
#include <string>
#include <iterator>
#include <map>

const char PLUGIN_NAME[] = "maybebug";

static int Handler(TSCont cont, TSEvent event, void *edata);

struct PluginState {
  PluginState()
  {
    cont = TSContCreate(Handler, NULL);
    TSContDataSet(cont, this);
  }
  ~PluginState() { TSContDestroy(cont); }
  TSCont cont;
};

static int
Handler(TSCont cont, TSEvent event, void *edata)
{
  TSHttpTxn txn = (TSHttpTxn)edata;
  if (event == TS_EVENT_HTTP_CACHE_LOOKUP_COMPLETE) {
    int lookup_status;
    if (TS_SUCCESS == TSHttpTxnCacheLookupStatusGet(txn, &lookup_status)) {
      TSDebug(PLUGIN_NAME, "lookup complete: %d", lookup_status);
      // Downgrade a fresh hit to stale so that ATS revalidates it.
      if (lookup_status == TS_CACHE_LOOKUP_HIT_FRESH) {
        TSDebug(PLUGIN_NAME, "set stale");
        TSHttpTxnCacheLookupStatusSet(txn, TS_CACHE_LOOKUP_HIT_STALE);
      }
    }
  }
  TSHttpTxnReenable(txn, TS_EVENT_HTTP_CONTINUE);
  return TS_EVENT_NONE;
}

void
TSPluginInit(int argc, const char *argv[])
{
  TSPluginRegistrationInfo info;
  info.plugin_name = strdup("cappello");
  info.vendor_name = strdup("foo");
  info.support_email = strdup("f...@bar.com");
  if (TSPluginRegister(TS_SDK_VERSION_3_0, &info) != TS_SUCCESS) {
    TSError("Plugin registration failed");
  }
  PluginState *state = new PluginState();
  TSHttpHookAdd(TS_HTTP_CACHE_LOOKUP_COMPLETE_HOOK, state->cont);
}
{noformat}
Output:
{noformat}
[Mar 19 18:40:36.254] Server {0x7f6df0b4f740} DIAG: (maybebug) lookup complete: 0
[Mar 19 18:40:40.854] Server {0x7f6decfee700} DIAG: (maybebug) lookup complete: 2
[Mar 19 18:40:40.854] Server {0x7f6decfee700} DIAG: (maybebug) set stale
FATAL: HttpTransact.cc:433: failed assert `s->pending_work == NULL`
traffic_server - STACK TRACE:
/usr/local/lib/libtsutil.so.5(ink_fatal+0xa3)[0x7f6df072186d]
/usr/local/lib/libtsutil.so.5(_Z12ink_get_randv+0x0)[0x7f6df071f3a0]
traffic_server[0x60d0aa]
traffic_server(_ZN12HttpTransact22HandleCacheOpenReadHitEPNS_5StateE+0xf82)[0x619206]
...
{noformat}
What happens in gdb is that HandleCacheOpenReadHit is called twice in the same request. The first time s->pending_work is NULL, the second time it's not NULL. The patch below fixes the problem:
{noformat}
diff --git a/proxy/http/HttpTransact.cc b/proxy/http/HttpTransact.cc
index 0078ef1..852f285 100644
--- a/proxy/http/HttpTransact.cc
+++ b/proxy/http/HttpTransact.cc
@@ -2641,11 +2641,6 @@ HttpTransact::HandleCacheOpenReadHit(State* s)
 //ink_release_assert(s->current.request_to == PARENT_PROXY ||
 //s->http_config_param->no_dns_forward_to_parent != 0);
 
-// Set ourselves up to handle pending revalidate issues
-// after the PP DNS lookup
-ink_assert(s->pending_work == NULL);
-s->pending_work = issue_revalidate;
-
 // We must be going a PARENT PROXY since so did
 // origin server DNS lookup right after state Start
 //
@@ -2654,6 +2649,11 @@ HttpTransact::HandleCacheOpenReadHit(State* s)
 // missing ip but we won't take down the system
 //
 if (s->current.request_to ==
[jira] [Commented] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372264#comment-14372264 ] Leif Hedstrom commented on TS-3459: --- Do we really need a new config for this? Why not overlap it on the existing one? Now we have
{code}
{RECT_CONFIG, "proxy.config.http.send_100_continue_response", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_NULL, NULL, RECA_NULL}
{RECT_CONFIG, "proxy.config.http.disallow_post_100_continue", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_NULL, NULL, RECA_NULL}
{code}
But why not have the first take multiple values? I don't feel strongly about this, but it's a pretty common pattern to have configurations be levels, and not booleans. So, we could have
{code}
0 - Don't send 100 Cont
1 - Send 100 Cont
2 - Send 100 Cont as long as method is not POST
{code}
Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 5.3.0
This is something that's been bothering us for a while, we want a way to explicitly disallow Posts w/ Expect: 100-continue. I'm going to add a small block of code (configurable of course) that will allow you to return a 405 Method Not Allowed if enabled. This config will default to OFF to maintain backwards compatibility.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
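The "levels, not booleans" suggestion in the comment above could be read as something like the following sketch. It is purely illustrative: should_send_100_continue is a hypothetical function, not ATS code, and treating any unknown level as "don't send" is an assumption made here.

```cpp
#include <string>

// Hypothetical decision helper for a multi-valued
// proxy.config.http.send_100_continue_response setting:
//   0 - don't send 100 Continue
//   1 - send 100 Continue
//   2 - send 100 Continue as long as the method is not POST
bool
should_send_100_continue(int level, const std::string &method)
{
  switch (level) {
  case 1:
    return true;
  case 2:
    return method != "POST";
  default:
    // Level 0, and (by assumption) any unrecognised value.
    return false;
  }
}
```

Note that this covers only whether a 100 Continue is sent. The 405 Method Not Allowed response proposed in the issue description is a separate behaviour that this sketch does not model.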
[jira] [Reopened] (TS-3459) Create a new config to disallow Post w/ Expect: 100-continue.
[ https://issues.apache.org/jira/browse/TS-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom reopened TS-3459: --- Create a new config to disallow Post w/ Expect: 100-continue. - Key: TS-3459 URL: https://issues.apache.org/jira/browse/TS-3459 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 5.3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)