[jira] [Commented] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST method
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518906#comment-14518906 ] Feifei Cai commented on TS-3487:

In my understanding of the current inactivity mechanism implementation, *transaction_no_activity_timeout_in* only applies to connections to the client; it is not related to connections to the origin server. That is to say, it only counts toward a timeout when a transaction stalls on the client-side connection. When we disable I/O with the client, the timer is set to 0, as follows:

{noformat}
read_disable/write_disable(NetHandler *nh, UnixNetVConnection *vc) {
  …
  vc->next_inactivity_timeout_at = 0;
  …
}
{noformat}

Indeed, the inactivity mechanism is designed to ensure connections/resources are not held for too long and to prevent possible leaks. But we need both *transaction_no_activity_timeout_in* and *transaction_no_activity_timeout_out* to control the two directions.

In your redirection example, I think it is fine for the client to be oblivious to the 3xx follows that ATS performs with the origin server, because the *transaction_no_activity_timeout_out* timer is responsible for the 3xx processing. If a transaction stalls on the *ATS <=> origin server* side, we should not abort the connection on the *client <=> ATS* side directly; the *ATS <=> origin server* side timeout should fire first and abort that side's connection, and only afterwards, possibly, abort the other side's connection according to some other logic.

We also use *accept_no_activity_timeout* to limit accept time and *connect_attempts_timeout* to limit connect time. These timeouts apply earlier than *transaction_no_activity_timeout_in/out*. Once a connection is established *and* we start to read/write data on it, we update the timer with the *transaction_no_activity_timeout_in/out* configuration. When we stop reading/writing data on a connection, we clear the timer.

BTW, things changed slightly when we added the *default_inactivity_timeout* setting.
As [~bcall] explained in his GET request example, "The inactivity timeout is cleared because it is a GET and then reset to the default because there is none set by inactivity cop." So it should be possible to rely solely on the *default_inactivity_timeout* timer to control the connections on both sides. However, I see it as long-term insurance rather than short-term control. For example, the default settings are:

{noformat}
proxy.config.http.transaction_no_activity_timeout_in 30
proxy.config.http.transaction_no_activity_timeout_out 30
proxy.config.net.default_inactivity_timeout 86400
{noformat}

Ideally, with *transaction_no_activity_timeout_in* and *transaction_no_activity_timeout_out* together, we should be able to bound connection time on both sides and prevent leaks. I think we added the long *default_inactivity_timeout* setting because we have some unknown potential bugs; with it, we can kill connections that have been idle for too long.
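The two-direction argument above can be condensed into a tiny decision function. This is a purely illustrative Python sketch with made-up names (not an ATS API): a stall on the ATS <=> origin side is owned by the `_out` timer, and the client-side timer only exists while client I/O is enabled.

```python
def governing_timer(stall_side, client_io_active):
    """Which timeout setting is responsible for a stall on the given side.

    stall_side: 'client' or 'origin'
    client_io_active: False once ATS has disabled client I/O
      (read_disable/write_disable zeroes the client-side deadline).
    Illustrative model of the comment above, not real ATS code.
    """
    if stall_side == 'origin':
        # e.g. while ATS follows 3xx redirects with the origin server
        return 'transaction_no_activity_timeout_out'
    if client_io_active:
        return 'transaction_no_activity_timeout_in'
    # Client I/O disabled: no client-side inactivity timer is running at all.
    return None
```

This mirrors the claim that a stall on the origin side should never abort the client connection directly: the client-side timer is simply not armed at that point.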
[jira] [Commented] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST method
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496051#comment-14496051 ] Feifei Cai commented on TS-3487:

Well, let's go through some code of the inactivity timeout mechanism, assuming ATS is compiled the default way, i.e. using InactivityCop with a timeout timestamp, as [~bcall] explained.

https://github.com/apache/trafficserver/blob/master/iocore/net/UnixNetVConnection.cc
{code}
UnixNetVConnection::reenable/reenable_re(VIO *vio) {
  …
  UnixNetVConnection::set_enabled(VIO *vio) {
    …
    if (!next_inactivity_timeout_at && inactivity_timeout_in) {
      next_inactivity_timeout_at = ink_get_hrtime() + inactivity_timeout_in;
    }
    …
  }
  …
}
{code}

https://github.com/apache/trafficserver/blob/master/iocore/net/P_UnixNet.h
{code}
read_disable/write_disable(NetHandler *nh, UnixNetVConnection *vc) {
  …
  vc->next_inactivity_timeout_at = 0;
  …
}
{code}

As this shows, the timer's start operation is bound to UnixNetVConnection::set_enabled in UnixNetVConnection::reenable/reenable_re, the stop operation is bound to read_disable/write_disable, and the reset operation is bound to net_activity in read_from_net/write_to_net. So we do not need to care about the implementation details of net I/O; what we need to do is call set_inactivity_timeout at the appropriate points and cancel_inactivity_timeout when necessary.

For GET requests, the inactivity timer on the client vc is cancelled right before the origin connection is opened:
{code}
HttpSM::set_next_state() {
  ...
  case HttpTransact::SM_ACTION_ORIGIN_SERVER_OPEN:
    ua_session->get_netvc()->cancel_inactivity_timeout();
  ...
}
{code}
which happens inside call_transact_and_set_next_state():
{code}
HttpSM::state_read_client_request_header() {
  ...
  case PARSE_DONE:
    ua_entry->read_vio->nbytes = ua_entry->read_vio->ndone;
    call_transact_and_set_next_state(HttpTransact::ModifyRequest);
    return 0;
}
{code}

Be aware that before entering call_transact_and_set_next_state(HttpTransact::ModifyRequest), ATS is before the TS_HTTP_READ_REQUEST_HDR_HOOK hook point; after exiting it, ATS is before the TS_HTTP_SEND_REQUEST_HDR_HOOK hook point. In this time window, ATS is inside HttpSM functions and does not switch to I/O read/write on the client vc. It is running inside read_signal_and_update, which is in NetHandler::mainNetEvent, and does not switch to InactivityCop::check_inactivity, so the inactivity timeout event cannot fire in this window. There are several hook points at which the configuration can be overridden; however, check_inactivity does not take effect here even if ATS is blocked in this window (which should not happen in the normal case, since ATS is not doing I/O with the client here), so we do not need to call set_inactivity_timeout with the new value.

What's more, I think we can call cancel_inactivity_timeout a bit earlier, right after we finish reading and parsing the client request headers and disable further I/O on the client vc. Once we stop client I/O, we no longer need inactivity timeout control on the client vc; for GET requests, we only need to set it again when sending the response back to the client. The current implementation is just okay.

For POST requests, I agree we should call set_inactivity_timeout somewhat earlier. How about we go through GET requests first, and then I update the patch for POST requests, because in my understanding we do not need to change anything in the GET request logic.
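Stitching together the three bindings described above (start in set_enabled, stop in read_disable/write_disable, reset in net_activity), the deadline bookkeeping can be sketched as a toy Python model. The names mirror the C++ quoted above, but this is an illustrative sketch, not the ATS implementation.

```python
class VConnTimer:
    """Toy model of next_inactivity_timeout_at handling (illustrative only)."""

    def __init__(self, inactivity_timeout_in):
        self.inactivity_timeout_in = inactivity_timeout_in
        self.next_inactivity_timeout_at = 0  # 0 == timer cleared

    def set_enabled(self, now):
        # reenable/reenable_re -> set_enabled: arm only if not already armed.
        if not self.next_inactivity_timeout_at and self.inactivity_timeout_in:
            self.next_inactivity_timeout_at = now + self.inactivity_timeout_in

    def read_disable(self):
        # read_disable/write_disable: clear the deadline entirely.
        self.next_inactivity_timeout_at = 0

    def net_activity(self, now):
        # read_from_net/write_to_net: any I/O pushes the deadline forward.
        if self.inactivity_timeout_in:
            self.next_inactivity_timeout_at = now + self.inactivity_timeout_in


def check_inactivity(vcs, now):
    # InactivityCop-style sweep: report vcs whose deadline has passed.
    return [vc for vc in vcs if vc.next_inactivity_timeout_at
            and now >= vc.next_inactivity_timeout_at]
```

Note how a cleared deadline (0) makes a vc invisible to the sweep, which is exactly why a vc whose I/O is disabled can never hit an inactivity timeout.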
[jira] [Comment Edited] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST method
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485383#comment-14485383 ] Feifei Cai edited comment on TS-3487 at 4/8/15 3:39 PM:

Well, I think it makes sense that ATS starts a timer inside {{HttpSM::state_read_client_request_header}}, because the transaction may stall there. In my test case, I make the client send the URL and headers with sleep calls in between, which stalls {{HttpSM::state_read_client_request_header}}:

{noformat}
# before remap
s.send(url1)
time.sleep(2)  # < global config
s.send(header1)
time.sleep(3)  # < global config
s.send(header2)
{noformat}

If we did not start a timer here (before remap), there would be no timeout control when ATS stalls on reading the client request headers.

My patch adds a timer setup at the POST start point. Because the configuration may have been overridden, we have to reset the timer with the newest configuration. As you figured out, remap is just one case; the configuration may be overridden at any hook point.

As for the GET method, there is no client read/write between "read client request headers" and "send back response to client", so we do not need to change anything. Even if the configuration is overridden at some hook point, we do not need to reset the timer, since there is no timer until ATS starts to "send back response to client".
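The fix described here, re-reading the possibly overridden configuration when the POST tunnel starts, can be sketched as follows. The class and function names are hypothetical, not the real conf_remap or HttpSM API; the values (5 and 15) come from the test configuration in this issue.

```python
class TxnConfig:
    """Per-transaction snapshot of the configuration, overridable by hooks."""
    def __init__(self):
        # records.config global default in the reporter's test
        self.transaction_no_activity_timeout_in = 5


class Txn:
    def __init__(self):
        self.config = TxnConfig()
        self.timer_value = None

    def arm_inactivity_timer(self):
        # Always read the *current* config value: a hook (e.g. remap) may
        # have overridden it since the timer was last armed.
        self.timer_value = self.config.transaction_no_activity_timeout_in


def remap_hook(txn):
    # conf_remap-style per-rule override, as in the /test2 remap rule.
    txn.config.transaction_no_activity_timeout_in = 15


txn = Txn()
txn.arm_inactivity_timer()   # while reading request headers: 5s
remap_hook(txn)              # per-remap override to 15s
txn.arm_inactivity_timer()   # at the POST tunnel start: must re-arm -> 15s
```

The bug is the equivalent of never calling `arm_inactivity_timer` again after the override: the POST body phase would then still run under the stale 5-second value.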
[jira] [Commented] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST method

[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485383#comment-14485383 ]

> cannot override proxy.config.http.transaction_no_activity_timeout_in per
> remap rule for POST method
>
> Key: TS-3487
> URL: https://issues.apache.org/jira/browse/TS-3487
> Project: Traffic Server
> Issue Type: Bug
> Components: HTTP
> Affects Versions: 5.2.1
> Reporter: Feifei Cai
> Assignee: Bryan Call
> Labels: review
> Fix For: 6.0.0
>
> Attachments: TS-3487.diff
>
> The configuration and test are as follows:
> remap.config:
> {noformat}
> map /test1 http://httpbin.org
> map /test2 http://httpbin.org @plugin=conf_remap.so @pparam=proxy.config.http.transaction_no_activity_timeout_in=15
> {noformat}
> records.config:
> {noformat}
> CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 5
> CONFIG proxy.config.diags.debug.enabled INT 1
> CONFIG proxy.config.diags.debug.tags STRING http_cs|http_ss|inactivity.*|socket
> {noformat}
> {code:title=test.py}
> import time
> import logging
> import socket
>
> log = logging.getLogger(__name__)
> logging.basicConfig(level=logging.INFO)
>
> url1 = 'POST /test1/post HTTP/1.1\r\n'
> url2 = 'POST /test2/post HTTP/1.1\r\n'
> header1 = 'Host: 127.0.0.1\r\n'
> # last header needs an additional '\r\n'
> header2 = 'Content-Length: 10\r\n\r\n'
> body1 = '12345'
> body2 = '67890'
>
> def get_socket():
>     s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>     s.connect(('127.0.0.1', 8080))
>     return s
>
> def test_global_config():
>     s = get_socket()
>     log.info('start test global config...')
>     try:
>         # before remap
>         s.send(url1)
>         time.sleep(2)  # < global config
>         s.send(header1)
>         time.sleep(3)  # < global config
>         s.send(header2)
>         # after remap
>         time.sleep(2)  # < global config
>         s.send(body1)
>         time.sleep(4)  # < global config
>         s.send(body2)
>         log.info('test global config: pass!')
>     except IOError:
>         log.info('test global config: fail!')
>     response = s.recv(4096)
>     print response
>
> def test_per_remap_config():
>     s = get_socket()
>     log.info('start test per remap config...')
>     try:
>         # before remap
>         s.send(url2)
>         time.sleep(2)  # < global config
>         s.send(header1)
>         time.sleep(3)  # < global config
>         s.send(header2)
>         # after remap
>         time.sleep(11)  # < per remap config
>         s.send(body1)
>         time.sleep(13)  # < per remap config
>         s.send(body2)
>         log.info('test per remap config: pass!')
>     except IOError:
>         log.info('test per remap config: fail!')
>     response = s.recv(4096)
>     print response
>
> if __name__ == '__main__':
>     test_global_config()
>     test_per_remap_config()
> {code}
> {{test_global_config()}} would pass, but {{test_per_remap_config()}} fails.
> {{proxy.config.http.transaction_no_activity_timeout_in}} in a per remap rule does not work.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST method
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485371#comment-14485371 ] Feifei Cai commented on TS-3487: {{transaction_no_activity_timeout_in}} specifies how long ATS keeps a client connection open if a transaction stalls. In my understanding, each time before ATS starts to read from or write to the client (e.g. reading the client request headers, reading the client request body, writing the response to the client...), we set up a {{transaction_no_activity_timeout_in}} timer, on the assumption that the transaction is in a no-activity state. Once ATS reads data from or writes data to the client, the transaction becomes active and we reset the {{transaction_no_activity_timeout_in}} timer. The timer only runs between the last read/write and the next one; every read/write resets it, since a read/write means the transaction has switched from the no-activity state back to the active state. In the current implementation, we set up the {{transaction_no_activity_timeout_in}} timer at 2 points:
# {{HttpSM::state_read_client_request_header}}
# {{HttpTransact::SM_ACTION_API_SEND_RESPONSE_HDR}}
This works fine for the GET method, because ATS reads from or writes to the client only at these 2 points. However, for the POST method, the point {{HttpSM::do_setup_post_tunnel}} is missed. We need to set up the timer there with the newest configuration, because {{transaction_no_activity_timeout_in}} may have been overridden at some hook point between {{HttpSM::state_read_client_request_header}} and {{HttpSM::do_setup_post_tunnel}}.
Please take a look at the following debug output, from my local Python test:
{noformat}
$ traffic_server -T '(http_cs|http_ss|inactivity|socket)'
traffic_server: using root directory '/tmp/tsqa/base_envs/tmp9hwEUd'
[Apr 8 14:57:01.119] Server {0x7fa8e276c800} NOTE: Traffic Server is running unprivileged, not switching to user 'nobody'
[Apr 8 14:57:01.128] Server {0x7fa8e276c800} DEBUG: (inactivity_cop) default inactivity timeout is set to: 86400
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (socket) net_activity updating inactivity 0, NetVC=0x7fa8d8016600
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (http_cs) [0] session born, netvc 0x7fa8d8016600
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (http_cs) [0] Starting transaction 1 using sm [0]
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (socket) UnixNetVConnection::remove_from_keep_alive_lru NetVC=0x7fa8d8016600
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (socket) Set inactive timeout=1200, for NetVC=0x7fa8d8016600
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (socket) Set active timeout=9000, NetVC=0x7fa8d8016600
[Apr 8 14:57:04.683] Server {0x7fa8e276c800} DEBUG: (socket) Set inactive timeout=50, for NetVC=0x7fa8d8016600
[Apr 8 14:57:05.131] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505025131303369 timeout at: 1428505029 timeout in: 5
[Apr 8 14:57:06.141] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505026141133710 timeout at: 1428505029 timeout in: 5
[Apr 8 14:57:06.702] Server {0x7fa8e276c800} DEBUG: (socket) net_activity updating inactivity 50, NetVC=0x7fa8d8016600
[Apr 8 14:57:07.138] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505027138484319 timeout at: 1428505031 timeout in: 5
[Apr 8 14:57:08.137] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505028137647712 timeout at: 1428505031 timeout in: 5
[Apr 8 14:57:09.140] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505029140409681 timeout at: 1428505031 timeout in: 5
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (socket) net_activity updating inactivity 50, NetVC=0x7fa8d8016600
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (socket) ::open: setsockopt() TCP_NODELAY on socket
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (http_ss) [0] session born, netvc 0x7fa8d8016340
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (socket) Set inactive timeout=18000, for NetVC=0x7fa8d8016340
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (socket) Set active timeout=0, NetVC=0x7fa8d8016340
[Apr 8 14:57:09.687] Server {0x7fa8e276c800} DEBUG: (socket) read_disable updating inactivity_at 0, NetVC=0x7fa8d8016600
[Apr 8 14:57:09.917] Server {0x7fa8e276c800} DEBUG: (socket) net_activity updating inactivity 18000, NetVC=0x7fa8d8016340
[Apr 8 14:57:10.139] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016340 now: 1428505030139478064 timeout at: 1428506829 timeout in: 1800
[Apr 8 14:57:10.139] Server {0x7fa8e276c800} DEBUG: (inactivity_cop_verbose) vc: 0x7fa8d8016600 now: 1428505030139478064 timeout at: 1428505034 ti
{noformat}
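The arm/clear behavior visible in this log ({{Set inactive timeout=...}} when I/O is enabled or data moves, {{read_disable updating inactivity_at 0}} when reads are disabled) can be modeled with a small sketch. This is a simplified illustration of the mechanism only, not the actual {{UnixNetVConnection}} code; the class and method names are hypothetical.

```python
import time

class InactivityTimer:
    """Toy model of a per-connection no-activity timer; a sketch of the
    mechanism, not the real UnixNetVConnection implementation."""

    def __init__(self):
        self.timeout_at = 0.0  # 0 means "no timer armed"

    def arm(self, timeout_s, now=None):
        # Called each time read/write is (re)enabled or data moves:
        # restart the window using the *currently configured* timeout,
        # so later overrides (e.g. per-remap) can take effect.
        now = time.time() if now is None else now
        self.timeout_at = now + timeout_s

    def clear(self):
        # Analogous to read_disable/write_disable setting
        # next_inactivity_timeout_at = 0: the timer stops running.
        self.timeout_at = 0.0

    def expired(self, now=None):
        now = time.time() if now is None else now
        return self.timeout_at != 0.0 and now >= self.timeout_at
```

The key property, matching the comment above: the timer only runs between the last read/write and the next one, and every arm() restarts the window.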
[jira] [Commented] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST methold
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394264#comment-14394264 ] Feifei Cai commented on TS-3487: For the per-remap use case, transaction_no_activity_timeout_in does not work because {{set_inactivity_timeout(transaction_no_activity_timeout_in)}} is called in {{HttpSM::state_read_client_request_header}}, which runs before the remap process, but we override the setting only after the remap rule is applied. So we need to reset it with the new value after remap. What's more, per-remap is only one of the use cases. Since an overridable setting may be overridden per transaction at any hook point, we need to call {{set_inactivity_timeout(transaction_no_activity_timeout_in)}} each time before starting a read/write with the client, and {{set_inactivity_timeout(transaction_no_activity_timeout_out)}} with the origin server. We have done this for most points in the [HTTP Transaction State Diagram|https://docs.trafficserver.apache.org/en/latest/sdk/http-hooks-and-transactions.en.html#http-transaction-state-diagram]; however, we seem to have missed the point {{HttpSM::do_setup_post_tunnel}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
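The override resolution described in the comment above can be sketched as follows. The dict-based config is a hypothetical stand-in for ATS's overridable configuration (it is not the real {{OverridableHttpConfigParams}} API), shown only to illustrate why the timer must be re-armed with the freshest value after remap:

```python
# Stand-in for records.config (hypothetical representation).
GLOBAL_CONFIG = {"transaction_no_activity_timeout_in": 5}

def effective_timeout(txn_overrides, key="transaction_no_activity_timeout_in"):
    """Resolve an overridable setting: a per-transaction override
    (e.g. set by conf_remap or a plugin hook) wins over the global value."""
    return txn_overrides.get(key, GLOBAL_CONFIG[key])

# Before remap there is no override yet, so a timer armed in
# state_read_client_request_header necessarily uses the global 5s.
before_remap = effective_timeout({})

# After remap applies @pparam=...timeout_in=15, any later arming point
# (e.g. do_setup_post_tunnel for a POST body) must re-read the
# transaction config to pick up 15s -- the step the patch adds.
after_remap = effective_timeout({"transaction_no_activity_timeout_in": 15})
```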
[jira] [Updated] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST methold
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3487: --- Attachment: TS-3487.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST methold
[ https://issues.apache.org/jira/browse/TS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3487: --- Affects Version/s: 5.2.1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3487) cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST methold
Feifei Cai created TS-3487:
--
Summary: cannot override proxy.config.http.transaction_no_activity_timeout_in per remap rule for POST methold
Key: TS-3487
URL: https://issues.apache.org/jira/browse/TS-3487
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Reporter: Feifei Cai

The configuration and test are as follows:
remap.config:
{noformat}
map /test1 http://httpbin.org
map /test2 http://httpbin.org @plugin=conf_remap.so @pparam=proxy.config.http.transaction_no_activity_timeout_in=15
{noformat}
records.config:
{noformat}
CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 5
CONFIG proxy.config.diags.debug.enabled INT 1
CONFIG proxy.config.diags.debug.tags STRING http_cs|http_ss|inactivity.*|socket
{noformat}
{code:title=test.py}
import time
import logging
import socket

log = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

url1 = 'POST /test1/post HTTP/1.1\r\n'
url2 = 'POST /test2/post HTTP/1.1\r\n'
header1 = 'Host: 127.0.0.1\r\n'
# the last header needs an additional '\r\n'
header2 = 'Content-Length: 10\r\n\r\n'
body1 = '12345'
body2 = '67890'

def get_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('127.0.0.1', 8080))
    return s

def test_global_config():
    s = get_socket()
    log.info('start test global config...')
    try:
        # before remap
        s.send(url1)
        time.sleep(2)   # < global config
        s.send(header1)
        time.sleep(3)   # < global config
        s.send(header2)
        # after remap
        time.sleep(2)   # < global config
        s.send(body1)
        time.sleep(4)   # < global config
        s.send(body2)
        log.info('test global config: pass!')
    except IOError:
        log.info('test global config: fail!')
    response = s.recv(4096)
    print response

def test_per_remap_config():
    s = get_socket()
    log.info('start test per remap config...')
    try:
        # before remap
        s.send(url2)
        time.sleep(2)   # < global config
        s.send(header1)
        time.sleep(3)   # < global config
        s.send(header2)
        # after remap
        time.sleep(11)  # < per remap config
        s.send(body1)
        time.sleep(13)  # < per remap config
        s.send(body2)
        log.info('test per remap config: pass!')
    except IOError:
        log.info('test per remap config: fail!')
    response = s.recv(4096)
    print response

if __name__ == '__main__':
    test_global_config()
    test_per_remap_config()
{code}
{{test_global_config()}} passes, but {{test_per_remap_config()}} fails: {{proxy.config.http.transaction_no_activity_timeout_in}} set in a per-remap rule does not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
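Whether each test above passes reduces to a single question: does any idle gap between sends reach the timeout the timer is actually armed with? The helper below is hypothetical (not part of the test script) and just makes the expected outcomes explicit:

```python
def trips_timeout(gaps_s, timeout_s):
    """True if any single idle gap reaches the no-activity timeout."""
    return any(g >= timeout_s for g in gaps_s)

# test_global_config: every gap (2, 3, 2, 4) stays under the global 5s,
# so the connection survives.
global_ok = not trips_timeout([2, 3, 2, 4], 5)

# test_per_remap_config: the body gaps of 11s and 13s are fine against
# the per-remap 15s, but trip a timer still armed with the global 5s --
# which is exactly the reported POST bug.
per_remap_ok_if_overridden = not trips_timeout([11, 13], 15)
per_remap_fails_with_global = trips_timeout([11, 13], 5)
```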
[jira] [Closed] (TS-3362) Do not staple negative OCSP response
[ https://issues.apache.org/jira/browse/TS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai closed TS-3362. -- Resolution: Won't Fix Fix Version/s: (was: sometime)
> Do not staple negative OCSP response
>
> Key: TS-3362
> URL: https://issues.apache.org/jira/browse/TS-3362
> Project: Traffic Server
> Issue Type: Improvement
> Components: SSL
> Reporter: Feifei Cai
> Labels: review
> Attachments: TS-3362.diff
>
> When we get an OCSP response, we check it before caching/stapling it. If it is negative, I think we'd better discard it instead of sending it back to the user agent. This would not increase security risk: the user agent will query the CA for an OCSP response itself if ATS does not staple one with the certificate.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3362) Do not staple negative OCSP response
[ https://issues.apache.org/jira/browse/TS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352446#comment-14352446 ] Feifei Cai commented on TS-3362: Thanks [~zwoop]. I'll close this ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344988#comment-14344988 ] Feifei Cai commented on TS-2497: For the *redirect GET* use case, the test is as follows:
{noformat}
$ curl http://httpbin.org/redirect-to?url=http://httpbin.org/get -L -v
{noformat}
However, the *redirect POST* request is not supported:
{noformat}
$ curl http://httpbin.org/redirect-to?url=http://httpbin.org/post -d xxx -L -v
{noformat}
References:
# http://httpbin.org/
# https://github.com/Runscope/httpbin/issues/156
> Failed post results in tunnel buffers being returned to freelist prematurely
>
> Key: TS-2497
> URL: https://issues.apache.org/jira/browse/TS-2497
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Brian Geffon
> Assignee: Brian Geffon
> Fix For: 4.2.0
> Attachments: TS-2497.patch, client.js, origin-server.js, repro.js
>
> When a post fails to an origin server, either because the server died or because the server returned a response without reading all of the post data, TS will destroy buffers too early. This normally does not result in a crash because the MIOBuffers are returned to the freelist, and only with sufficient load will the race happen, causing a crash. Additionally, even if a crash doesn't happen, you might have data corruption across post requests from the buffers being used after being returned to the freelist.
> Thanks to Thomas Jackson for help reproducing and resolving this bug.
> An example stack trace, while we've seen other crashes in write_avail too.
> #0 0x004eff14 in IOBufferBlock::read_avail (this=0x0) at ../iocore/eventsystem/I_IOBuffer.h:362
> #1 0x0050d151 in MIOBuffer::append_block_internal (this=0x2aab38001130, b=0x2aab0c037200) at ../iocore/eventsystem/P_IOBuffer.h:946
> #2 0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
> #3 0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:994
> #4 0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1002
> #5 0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1048
> #6 0x006c18f3 in read_from_net (nh=0x2aaafca0d208, vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
> #7 0x006c37bf in UnixNetVConnection::net_read_io (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at UnixNetVConnection.cc:816
> #8 0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, event=5, e=0x271d8e0) at UnixNet.cc:380
> #9 0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
> #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
> #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at UnixEThread.cc:264
> #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
> #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
> #14 0x0033728e68ed in clone () from /lib64/libc.so.6
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344875#comment-14344875 ] Feifei Cai commented on TS-2497: I'm not quite clear about the POST/PUT redirection use case. For GET methods, I have a simple test case; however, I have not set up a test case for POST redirection yet. BTW, I tried reverting the patch, i.e. deallocating memory in both the no-origin-server-response and received-origin-server-response cases, and it has run stably for ~3 days on our production server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3384) Add stats for OCSP Stapling errors
[ https://issues.apache.org/jira/browse/TS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3384: --- Attachment: TS-3384.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3384) Add stats for OCSP Stapling errors
Feifei Cai created TS-3384:
--
Summary: Add stats for OCSP Stapling errors
Key: TS-3384
URL: https://issues.apache.org/jira/browse/TS-3384
Project: Traffic Server
Issue Type: Improvement
Components: SSL
Reporter: Feifei Cai

# Add stats for bad OCSP response status: revoked or unknown.
{noformat}
$ traffic_line -m proxy.process.ssl.ssl_ocsp
proxy.process.ssl.ssl_ocsp_revoked_cert_stat 0
proxy.process.ssl.ssl_ocsp_unknown_cert_stat 0
{noformat}
{noformat}
OCSP_resp_find_status(bs, cinf->cid, &status, &reason, &rev, &thisupd, &nextupd);
switch (status) {
case V_OCSP_CERTSTATUS_GOOD:
  break;
case V_OCSP_CERTSTATUS_REVOKED:
  SSL_INCREMENT_DYN_STAT(ssl_ocsp_revoked_cert_stat);
  break;
case V_OCSP_CERTSTATUS_UNKNOWN:
  SSL_INCREMENT_DYN_STAT(ssl_ocsp_unknown_cert_stat);
  break;
default:
  break;
}
{noformat}
# change the debug tag in OCSP Stapling to ssl_ocsp.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
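The switch above maps each bad certificate status to a counter. As a hedged Python sketch of the same logic (the numeric {{V_OCSP_CERTSTATUS_*}} values follow OpenSSL's ocsp.h; the plain stats dict is a hypothetical stand-in for {{SSL_INCREMENT_DYN_STAT}}):

```python
# OpenSSL's certificate status codes, as defined in ocsp.h.
V_OCSP_CERTSTATUS_GOOD = 0
V_OCSP_CERTSTATUS_REVOKED = 1
V_OCSP_CERTSTATUS_UNKNOWN = 2

def record_ocsp_status(stats, status):
    """Bump the matching stat for a bad OCSP response status.
    GOOD (and any other value) is left uncounted, as in the switch."""
    if status == V_OCSP_CERTSTATUS_REVOKED:
        stats["proxy.process.ssl.ssl_ocsp_revoked_cert_stat"] += 1
    elif status == V_OCSP_CERTSTATUS_UNKNOWN:
        stats["proxy.process.ssl.ssl_ocsp_unknown_cert_stat"] += 1
    return stats
```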
[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306497#comment-14306497 ] Feifei Cai edited comment on TS-2497 at 2/5/15 2:13 AM: I have not tested redirects yet. Will try it. Hi [~briang], do you remember the crash case? Did it happen on redirects or not? was (Author: ffcai): I have not test redirects yet. Will try it. Hi [~briang]], do you remember the crash case? Is it happened on redirects or not? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306497#comment-14306497 ] Feifei Cai commented on TS-2497: I have not test redirects yet. Will try it. Hi [~briang]], do you remember the crash case? Is it happened on redirects or not? > Failed post results in tunnel buffers being returned to freelist prematurely > > > Key: TS-2497 > URL: https://issues.apache.org/jira/browse/TS-2497 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Brian Geffon >Assignee: Brian Geffon > Fix For: 4.2.0 > > Attachments: TS-2497.patch, client.js, origin-server.js, repro.js > > > When a post fails to an origin server either the server died or the server > returned a response without reading all of the post data, in either case, TS > will destroy buffers too early. This normally does not result in a crash > because the MIOBuffers are returned to the freelist and only with sufficient > load will the race happen causing a crash. Additionally, even if a crash > doesn't happen you might have data corruption across post requests from the > buffers being used after being returned to the freelist. > Thanks to Thomas Jackson for help reproducing and resolving this bug. > An example stack trace, while we've seen other crashes in write_avail too. 
> #0  0x004eff14 in IOBufferBlock::read_avail (this=0x0) at ../iocore/eventsystem/I_IOBuffer.h:362
> #1  0x0050d151 in MIOBuffer::append_block_internal (this=0x2aab38001130, b=0x2aab0c037200) at ../iocore/eventsystem/P_IOBuffer.h:946
> #2  0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
> #3  0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:994
> #4  0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1002
> #5  0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1048
> #6  0x006c18f3 in read_from_net (nh=0x2aaafca0d208, vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
> #7  0x006c37bf in UnixNetVConnection::net_read_io (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at UnixNetVConnection.cc:816
> #8  0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, event=5, e=0x271d8e0) at UnixNet.cc:380
> #9  0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
> #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
> #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at UnixEThread.cc:264
> #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
> #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
> #14 0x0033728e68ed in clone () from /lib64/libc.so.6
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TS-3362) Do not staple negative OCSP response
[ https://issues.apache.org/jira/browse/TS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302982#comment-14302982 ] Feifei Cai edited comment on TS-3362 at 2/3/15 9:13 AM: Oh, yes, you're right. The fetch and check of the OCSP response run in an independent thread, not in the SSL handshake. I should report it in some new metrics, e.g. {{proxy.process.ssl.ocsp_revoked_certstatus}}, {{proxy.process.ssl.ocsp_unknown_certstatus}}... Also, I'll extend the {{ssl}} debug tag to {{ssl_ocsp}}. I will attach a new patch soon.
was (Author: ffcai): Oh, yes, you're right. The fetch and check of OCSP response is an independent thread, not in ssl handshake. I should report it in some new metrics, e.g. proxy.process.ssl.ocsp_revoked_certstatus, proxy.process.ssl.ocsp_unknown_certstatus... And, I'll extend ssl debug tag to ssl_ocsp. Will attach a new patch as soon.
> Do not staple negative OCSP response
>
> Key: TS-3362
> URL: https://issues.apache.org/jira/browse/TS-3362
> Project: Traffic Server
> Issue Type: Improvement
> Components: SSL
> Reporter: Feifei Cai
> Attachments: TS-3362.diff
>
> When we get an OCSP response, we check it before we cache/staple it. If it's negative, I think we'd better discard it instead of sending it back to the user agent. This would not increase security risk: the user agent would query the CA for the OCSP response if ATS does not staple it with the certificate.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3362) Do not staple negative OCSP response
[ https://issues.apache.org/jira/browse/TS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3362: --- Attachment: TS-3362.diff
> Do not staple negative OCSP response
>
> Key: TS-3362
> URL: https://issues.apache.org/jira/browse/TS-3362
> Project: Traffic Server
> Issue Type: Improvement
> Components: SSL
> Reporter: Feifei Cai
> Attachments: TS-3362.diff
>
> When get OCSP response, we check it before cache/staple it. If it's negative, I think we'd better discard it instead of sending back to user agent. This would not increase security risk: User agent would query CA for OCSP response if ATS does not staple it with certificate.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3362) Do not staple negative OCSP response
Feifei Cai created TS-3362:
--
Summary: Do not staple negative OCSP response
Key: TS-3362
URL: https://issues.apache.org/jira/browse/TS-3362
Project: Traffic Server
Issue Type: Improvement
Components: SSL
Reporter: Feifei Cai

When get OCSP response, we check it before cache/staple it. If it's negative, I think we'd better discard it instead of sending back to user agent. This would not increase security risk: User agent would query CA for OCSP response if ATS does not staple it with certificate.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296866#comment-14296866 ] Feifei Cai commented on TS-2497: Hi [~shinrich], the read on server_session would be scheduled while transferring the server response to the client, and it would use a new MIOBuffer instead of post_buffer, so it would not cause the issue here. I think TS-1425 may have fixed this issue, or may not have gone far enough yet.
> Failed post results in tunnel buffers being returned to freelist prematurely
>
> Key: TS-2497
> URL: https://issues.apache.org/jira/browse/TS-2497
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Brian Geffon
> Assignee: Brian Geffon
> Fix For: 4.2.0
>
> Attachments: TS-2497.patch, client.js, origin-server.js, repro.js
>
> When a post fails to an origin server, either the server died or the server returned a response without reading all of the post data; in either case, TS will destroy buffers too early. This normally does not result in a crash because the MIOBuffers are returned to the freelist, and only with sufficient load will the race happen, causing a crash. Additionally, even if a crash doesn't happen, you might have data corruption across post requests from the buffers being used after being returned to the freelist.
> Thanks to Thomas Jackson for help reproducing and resolving this bug.
> An example stack trace, while we've seen other crashes in write_avail too.
[jira] [Issue Comment Deleted] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2497: --- Comment: was deleted (was: Yes, now I have the same question: TS-1425's patch disables read from client side - cancel read_from_net() call using MIOBuffer *post_buffer. This should have fixed the premature free issue. Why we still try to delay free time to prevent conflict with client side read?)
> Failed post results in tunnel buffers being returned to freelist prematurely
>
> Key: TS-2497
> URL: https://issues.apache.org/jira/browse/TS-2497
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Brian Geffon
> Assignee: Brian Geffon
> Fix For: 4.2.0
>
> Attachments: TS-2497.patch, client.js, origin-server.js, repro.js
>
> When a post fails to an origin server, either the server died or the server returned a response without reading all of the post data; in either case, TS will destroy buffers too early. This normally does not result in a crash because the MIOBuffers are returned to the freelist, and only with sufficient load will the race happen, causing a crash. Additionally, even if a crash doesn't happen, you might have data corruption across post requests from the buffers being used after being returned to the freelist.
> Thanks to Thomas Jackson for help reproducing and resolving this bug.
> An example stack trace, while we've seen other crashes in write_avail too.
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296848#comment-14296848 ] Feifei Cai commented on TS-2497: Yes, now I have the same question: TS-1425's patch disables the read from the client side - it cancels the read_from_net() call that uses MIOBuffer *post_buffer. This should have fixed the premature-free issue. Why do we still try to delay the free to prevent a conflict with the client-side read?
> Failed post results in tunnel buffers being returned to freelist prematurely
>
> Key: TS-2497
> URL: https://issues.apache.org/jira/browse/TS-2497
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Brian Geffon
> Assignee: Brian Geffon
> Fix For: 4.2.0
>
> Attachments: TS-2497.patch, client.js, origin-server.js, repro.js
>
> When a post fails to an origin server, either the server died or the server returned a response without reading all of the post data; in either case, TS will destroy buffers too early. This normally does not result in a crash because the MIOBuffers are returned to the freelist, and only with sufficient load will the race happen, causing a crash. Additionally, even if a crash doesn't happen, you might have data corruption across post requests from the buffers being used after being returned to the freelist.
> Thanks to Thomas Jackson for help reproducing and resolving this bug.
> An example stack trace, while we've seen other crashes in write_avail too.
[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289206#comment-14289206 ] Feifei Cai edited comment on TS-2497 at 1/23/15 12:54 PM: -- A memory leak was noticed on our production hosts. It appears to be related to handling a 5xx response from the origin server. The dump info below is from 1 host with ~70% POST requests. I enabled the memory dump ({{proxy.config.dump_mem_info_frequency}}) and tracking ({{proxy.config.res_track_memory}}).
*traffic.out:*
{noformat}
 allocated  | in-use     | type size | free list name
------------|------------|-----------|---------------------------
          0 |          0 |   2097152 | memory/ioBufAllocator[14]
          0 |          0 |   1048576 | memory/ioBufAllocator[13]
          0 |          0 |    524288 | memory/ioBufAllocator[12]
          0 |          0 |    262144 | memory/ioBufAllocator[11]
          0 |          0 |    131072 | memory/ioBufAllocator[10]
          0 |          0 |     65536 | memory/ioBufAllocator[9]
 1266679808 | 1262354432 |     32768 | memory/ioBufAllocator[8]
  600309760 |  599703552 |     16384 | memory/ioBufAllocator[7]
  395051008 |  391086080 |      8192 | memory/ioBufAllocator[6]
  229113856 |  224432128 |      4096 | memory/ioBufAllocator[5]
  342622208 |  342503424 |      2048 | memory/ioBufAllocator[4]
  245104640 |  245042176 |      1024 | memory/ioBufAllocator[3]
    2228224 |    2176512 |       512 | memory/ioBufAllocator[2]
     622592 |     607232 |       256 | memory/ioBufAllocator[1]
    2375680 |    2370176 |       128 | memory/ioBufAllocator[0]

Location                                         | Size In-use
-------------------------------------------------+------------
memory/IOBuffer/ProtocolProbeSessionAccept.cc:39 |    66768896
memory/IOBuffer/HttpClientSession.cc:230         |           0
memory/IOBuffer/HttpSM.cc:3314                   |           0
memory/IOBuffer/HttpSM.cc:5349                   |  3003506816
memory/IOBuffer/HttpSM.cc:5668                   |           0
memory/IOBuffer/HttpSM.cc:5874                   |           0
memory/IOBuffer/HttpSM.cc:5976                   |           0
memory/IOBuffer/HttpSM.cc:6267                   |           0
memory/IOBuffer/HttpServerSession.cc:87          |           0
memory/IOBuffer/HttpTunnel.cc:95                 |           0
memory/IOBuffer/HttpTunnel.cc:100                |           0
TOTAL                                            |  3070275712
{noformat}
I adapted [~shaunmcginnity]'s node.js scripts with some changes and reproduced the memory leak in my local environment.
# [origin-server.js|https://issues.apache.org/jira/secure/attachment/12694145/origin-server.js] The origin server responds with a 503 when it receives more than a single byte, so the post does not complete in most cases. I changed [~shaunmcginnity]'s code to make the origin server send a response back to ATS, which makes ATS hit another code path.
# [client.js|https://issues.apache.org/jira/secure/attachment/12694146/client.js] We create a new client every second, and each client tries to post 32K bytes of data.
# ats
*remap.config*: remap all to local port 5000
{quote}map / http://127.0.0.1:5000{quote}
*records.config*: listen on 80
{quote}CONFIG proxy.config.http.server_ports STRING 80{quote}
Then we get the dump info below, and the in-use count of the MIOBuffer with index=8 (size=32K) increases by 1 per second.
{noformat}
 allocated | in-use | type size | free list name
-----------|--------|-----------|--------------------------
   1048576 |  32768 |     32768 | memory/ioBufAllocator[8]
{noformat}
We can also change the "Content-Length" in client.js to a smaller size, and the MIOBuffer with the corresponding index (0-7) increases as well.
I added the simple patch below to prevent the memory leak in the case above, just like the last commit, and it has been verified on 1 test host.
free.diff
{code}
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 932ef97..123b97a 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -5074,6 +5074,7 @@ HttpSM::handle_post_failure()
   t_state.current.server->keep_alive = HTTP_NO_KEEPALIVE;
   if (server_buffer_reader->read_avail() > 0) {
+    tunnel.deallocate_buffers();
     tunnel.reset();
     // There's data from the server so try to read the header
     setup_server_read_response
[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289206#comment-14289206 ] Feifei Cai commented on TS-2497: A memory leak was noticed on our production hosts. It appears to be related to handling 5xx responses from the origin server. The dump info below is from one host with ~70% POST requests; I enabled the memory dump ({{proxy.config.dump_mem_info_frequency}}) and resource tracking ({{proxy.config.res_track_memory}}).
*traffic.out:*
{noformat}
 allocated  | in-use     | type size | free list name
------------+------------+-----------+---------------------------
 0          | 0          | 2097152   | memory/ioBufAllocator[14]
 0          | 0          | 1048576   | memory/ioBufAllocator[13]
 0          | 0          | 524288    | memory/ioBufAllocator[12]
 0          | 0          | 262144    | memory/ioBufAllocator[11]
 0          | 0          | 131072    | memory/ioBufAllocator[10]
 0          | 0          | 65536     | memory/ioBufAllocator[9]
 1266679808 | 1262354432 | 32768     | memory/ioBufAllocator[8]
 600309760  | 599703552  | 16384     | memory/ioBufAllocator[7]
 395051008  | 391086080  | 8192      | memory/ioBufAllocator[6]
 229113856  | 224432128  | 4096      | memory/ioBufAllocator[5]
 342622208  | 342503424  | 2048      | memory/ioBufAllocator[4]
 245104640  | 245042176  | 1024      | memory/ioBufAllocator[3]
 2228224    | 2176512    | 512       | memory/ioBufAllocator[2]
 622592     | 607232     | 256       | memory/ioBufAllocator[1]
 2375680    | 2370176    | 128       | memory/ioBufAllocator[0]

 Location                                         | Size In-use
--------------------------------------------------+------------
 memory/IOBuffer/ProtocolProbeSessionAccept.cc:39 | 66768896
 memory/IOBuffer/HttpClientSession.cc:230         | 0
 memory/IOBuffer/HttpSM.cc:3314                   | 0
 memory/IOBuffer/HttpSM.cc:5349                   | 3003506816
 memory/IOBuffer/HttpSM.cc:5668                   | 0
 memory/IOBuffer/HttpSM.cc:5874                   | 0
 memory/IOBuffer/HttpSM.cc:5976                   | 0
 memory/IOBuffer/HttpSM.cc:6267                   | 0
 memory/IOBuffer/HttpServerSession.cc:87          | 0
 memory/IOBuffer/HttpTunnel.cc:95                 | 0
 memory/IOBuffer/HttpTunnel.cc:100                | 0
 TOTAL                                            | 3070275712
{noformat}
I adapted [~shaunmcginnity]'s node.js scripts with some changes, and reproduced the memory leak in my local environment.
# origin-server.js This origin server responds with a 503 once it has received more than a single byte, so in most cases the POST does not complete. I changed [~shaunmcginnity]'s code so that the origin server sends a response back to ATS, which makes ATS hit a different code path.
# client.js This script creates a new client every second, and each client tries to POST 32 KB of data.
# ATS
*remap.config*: remap all to local port 5000
{quote}map / http://127.0.0.1:5000{quote}
*records.config*: listen on 80
{quote}CONFIG proxy.config.http.server_ports STRING 80{quote}
Then we get dump info as follows; the in-use count of the MIOBuffer with index=8 (size 32 KB) increases by 1 per second.
{noformat}
 allocated | in-use | type size | free list name
-----------+--------+-----------+--------------------------
 1048576   | 32768  | 32768     | memory/ioBufAllocator[8]
{noformat}
We can also change the "Content-Length" in client.js to a smaller size; the MIOBuffer with the corresponding index (0-7) then increases instead. I added this simple patch, in the same spirit as the last commit, to prevent the memory leak in the case above; it is verified on one test host.
free.diff
{code}
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 932ef97..123b97a 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -5074,6 +5074,7 @@ HttpSM::handle_post_failure()
   t_state.current.server->keep_alive = HTTP_NO_KEEPALIVE;
   if (server_buffer_reader->read_avail() > 0) {
+    tunnel.deallocate_buffers();
     tunnel.reset();
     // There's data from the server so try to read the header
     setup_server_read_response_header();
{code}
*traffic.out*
{noformat}
 allocated | in-use | type size | free list name
[jira] [Updated] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely
[ https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2497: --- Attachment: client.js origin-server.js > Failed post results in tunnel buffers being returned to freelist prematurely > > > Key: TS-2497 > URL: https://issues.apache.org/jira/browse/TS-2497 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Brian Geffon >Assignee: Brian Geffon > Fix For: 4.2.0 > > Attachments: TS-2497.patch, client.js, origin-server.js, repro.js > > > When a post fails to an origin server either the server died or the server > returned a response without reading all of the post data, in either case, TS > will destroy buffers too early. This normally does not result in a crash > because the MIOBuffers are returned to the freelist and only with sufficient > load will the race happen causing a crash. Additionally, even if a crash > doesn't happen you might have data corruption across post requests from the > buffers being used after being returned to the freelist. > Thanks to Thomas Jackson for help reproducing and resolving this bug. > An example stack trace, while we've seen other crashes in write_avail too. 
> #0  0x004eff14 in IOBufferBlock::read_avail (this=0x0) at ../iocore/eventsystem/I_IOBuffer.h:362
> #1  0x0050d151 in MIOBuffer::append_block_internal (this=0x2aab38001130, b=0x2aab0c037200) at ../iocore/eventsystem/P_IOBuffer.h:946
> #2  0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
> #3  0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:994
> #4  0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1002
> #5  0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at ../iocore/eventsystem/P_IOBuffer.h:1048
> #6  0x006c18f3 in read_from_net (nh=0x2aaafca0d208, vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
> #7  0x006c37bf in UnixNetVConnection::net_read_io (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at UnixNetVConnection.cc:816
> #8  0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, event=5, e=0x271d8e0) at UnixNet.cc:380
> #9  0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
> #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
> #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at UnixEThread.cc:264
> #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
> #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
> #14 0x0033728e68ed in clone () from /lib64/libc.so.6
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
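The patch in the comments above adds {{tunnel.deallocate_buffers()}} before {{tunnel.reset()}}; the leak arises because a reset that merely forgets the buffer pointers can never return them to the freelist. A reduced, self-contained sketch of that ownership pattern, using hypothetical {{Tunnel}} and {{MIOBuffer}} stand-ins (not the real ATS classes) and a live-buffer counter in place of the traffic.out "in-use" column:

```cpp
#include <cassert>

// Stand-in for the "in-use" column in traffic.out: how many buffers are live.
static int g_live_buffers = 0;

// Hypothetical stand-in for ATS's MIOBuffer, reduced to lifetime tracking.
struct MIOBuffer {
  MIOBuffer() { ++g_live_buffers; }
  ~MIOBuffer() { --g_live_buffers; }
};

// Hypothetical stand-in for HttpTunnel, reduced to the ownership question.
struct Tunnel {
  static const int kProducers = 2;
  MIOBuffer *bufs[kProducers] = {nullptr, nullptr};

  void allocate_buffers() {
    for (int i = 0; i < kProducers; ++i) bufs[i] = new MIOBuffer;
  }
  void deallocate_buffers() {
    for (int i = 0; i < kProducers; ++i) { delete bufs[i]; bufs[i] = nullptr; }
  }
  // Like the reset in the patch's context: forget the producers/consumers,
  // but do NOT free anything.
  void reset() {
    for (int i = 0; i < kProducers; ++i) bufs[i] = nullptr;
  }
};

// One failed POST, as in handle_post_failure(); returns live buffers after.
int failed_post(bool fixed) {
  Tunnel t;
  t.allocate_buffers();
  if (fixed) t.deallocate_buffers();  // the one-line patch
  t.reset();  // without the patch, the buffers are now unreachable: a leak
  return g_live_buffers;
}
```

Releasing in the opposite direction is the original TS-2497 crash: returning the blocks to the freelist while a reader still holds them is a use-after-free, so the fix has to sequence deallocation exactly between "no one reads this anymore" and "forget the pointers".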
[jira] [Comment Edited] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189897#comment-14189897 ] Feifei Cai edited comment on TS-3161 at 10/30/14 10:27 AM: --- traffic_top-2: After the fix, it shows the correct values. I've tried a private fix in my local environment, and will send a pull request later. was (Author: ffcai): traffic_top-2: After the fix, it shows the correct values. > traffic_top shows some values incorrectly > - > > Key: TS-3161 > URL: https://issues.apache.org/jira/browse/TS-3161 > Project: Traffic Server > Issue Type: Bug > Components: Tools >Reporter: Feifei Cai > Attachments: traffic_top-1.png, traffic_top-2.png > > > The average time values in "Cache Information" do not show correctly, such as > "Fresh (ms)", "Reval (ms)"... > We use TSRecordGetInt to get all statistics. Some statistics are float values, so we need > to use TSRecordGetFloat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: traffic_top-2.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: (was: traffic_top-2.png) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: traffic_top-2.png traffic_top-2: After the fix, it shows the correct values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189888#comment-14189888 ] Feifei Cai edited comment on TS-3161 at 10/30/14 10:09 AM: --- traffic_top-1: The average time of "Fresh, Reval, Cold, Change, Not, No (ms)" do not show correctly. was (Author: ffcai): The average time of "Fresh, Reval, Cold, Change, Not, No (ms)" do not show correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: (was: traffic_top-1.png) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: traffic_top-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3161) traffic_top shows some values incorrectly
[ https://issues.apache.org/jira/browse/TS-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3161: --- Attachment: traffic_top-1.png The average time of "Fresh, Reval, Cold, Change, Not, No (ms)" do not show correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3161) traffic_top shows some values incorrectly
Feifei Cai created TS-3161: -- Summary: traffic_top shows some values incorrectly Key: TS-3161 URL: https://issues.apache.org/jira/browse/TS-3161 Project: Traffic Server Issue Type: Bug Components: Tools Reporter: Feifei Cai The average time values in "Cache Information" do not show correctly, such as "Fresh (ms)", "Reval (ms)"... We use TSRecordGetInt to get all statistics. Some statistics are float values, so we need to use TSRecordGetFloat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
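Why an integer getter produces garbage for a float-valued statistic can be shown without ATS. Below is a minimal sketch under stated assumptions: a toy record store that keeps each value as tagged raw bytes (hypothetical types and names, not the real librecords or {{TSRecordGet*}} API). Reading a float-typed record through the integer path reinterprets the bits instead of converting the value:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// Toy record store: every record is 8 raw bytes plus a type tag.
enum class RecType { INT, FLOAT };

struct Record {
  RecType type;
  uint8_t raw[8];
};

static std::map<std::string, Record> g_records;

void set_float(const std::string &name, float v) {
  Record r{RecType::FLOAT, {}};
  std::memcpy(r.raw, &v, sizeof v);  // store the float's bit pattern
  g_records[name] = r;
}

// What the buggy traffic_top effectively did: read every record as an int.
// For a float record this reinterprets bits, not the numeric value.
int64_t get_as_int(const std::string &name) {
  int64_t v = 0;
  std::memcpy(&v, g_records[name].raw, sizeof v);
  return v;
}

// The fix: check the record's type and use the matching accessor.
float get_as_float(const std::string &name) {
  assert(g_records[name].type == RecType::FLOAT);
  float v = 0;
  std::memcpy(&v, g_records[name].raw, sizeof v);
  return v;
}
```

The traffic_top fix is the analogous type-aware read: consult the record's data type and fetch float-valued statistics with TSRecordGetFloat instead of treating everything as an integer.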
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Attachment: TS-3120_fix_regex.diff > Overlapping rank in config.remap when using .include directives > --- > > Key: TS-3120 > URL: https://issues.apache.org/jira/browse/TS-3120 > Project: Traffic Server > Issue Type: Bug > Components: Configuration, Core >Affects Versions: 5.0.0 >Reporter: Feifei Cai >Assignee: James Peach > Fix For: 5.2.0 > > Attachments: TS-3120.diff, TS-3120_fix_regex.diff > > > As described in > [docs|https://docs.trafficserver.apache.org/en/latest/reference/configuration/remap.config.en.html#precedence] > {quote} > Once these rules are executed we pick the lowest line number as the match > (which replicates first-match-wins). > {quote} > In fact, we use line number for rank for now. > ([Github|https://github.com/apache/trafficserver/blob/master/proxy/http/remap/RemapConfig.cc#L987]) > However, when we use .include directives to include some other remap config > files, there will be overlapping and conflict with the line numbers in each > other file. > *Examples* > remap.config > {noformat} > .include remap1.config > .include remap2.config > {noformat} > remap1.config > {noformat} > map /foo/ https://www.yahoo.com > {noformat} > remap2.config > {noformat} > map /foo/bar1 https://www.yahoo.com > map /foo/bar2 https://www.yahoo.com > {noformat} > *Test Results* > {noformat} > $ curl "`hostname`/foo/" -I | grep HTTP > HTTP/1.1 200 OK > $ curl "`hostname`/foo/bar1" -I | grep HTTP > HTTP/1.1 200 OK > $ curl "`hostname`/foo/bar2" -I | grep HTTP > HTTP/1.1 404 Not Found > {noformat} > When parsing remap1.config, first entry in remap1.config is inserted with > rank 0, second with rank 1. Then parsing remap2.config, the single entry is > inserted with rank 0 again. So the entry in remap2.config is overlapped with > first entry in remap1.config and takes precedence with second entry. 
This would confuse customers. > I'm considering using a global line number to count each entry, and following > the "first-match-wins" rule. I tested this in my local environment and it works. > However, we are using the line number for debugging > output ([Github|https://github.com/apache/trafficserver/blob/master/proxy/http/remap/RemapConfig.cc#L938]), > so the current count variable _cln_ is still needed. This would mess up the > code. Any suggestions for this issue are appreciated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
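The global-counter proposal in the description can be sketched as follows (hypothetical types, not the real RemapConfig/UrlRewrite code). With per-file numbering, rules from different .include'd files collide on rank, so the "lowest rank wins" tie-break becomes ambiguous; one counter carried across files keeps ranks strictly increasing in parse order:

```cpp
#include <cassert>
#include <string>
#include <vector>

// One remap rule with its assigned rank (stand-in for url_mapping).
struct Rule {
  std::string line;
  int rank;
};

// Parse several "files" of rules, either restarting the rank per file
// (the current per-file line-number behavior, i.e. the bug) or carrying
// one global counter across all .include'd files (the proposed fix).
std::vector<Rule> load(const std::vector<std::vector<std::string>> &files,
                       bool global_rank) {
  std::vector<Rule> rules;
  int rank = 0;
  for (const auto &file : files) {
    if (!global_rank)
      rank = 0;  // rank restarts at 0 in every included file: overlap
    for (const auto &line : file)
      rules.push_back({line, rank++});
  }
  return rules;
}

// True iff every rule has a distinct rank, which is a precondition for the
// documented "pick the lowest line number" first-match-wins tie-break.
bool ranks_unique(const std::vector<Rule> &rules) {
  for (size_t i = 0; i < rules.size(); ++i)
    for (size_t j = i + 1; j < rules.size(); ++j)
      if (rules[i].rank == rules[j].rank) return false;
  return true;
}
```

With the report's example (remap1.config holding one rule, remap2.config holding two), the per-file scheme assigns rank 0 twice, which is exactly the overlap described above; the global counter assigns 0, 1, 2.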
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Attachment: (was: TS-3120_fix_regex.diff) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Attachment: TS-3120_fix_regex.diff We should align the rank with count for both simple mappings and regex mappings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai reopened TS-3120: Backport to Version: 5.2.0 The last patch did not count the rank of regex mappings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164974#comment-14164974 ] Feifei Cai edited comment on TS-3120 at 10/10/14 2:07 AM: -- I use {{count}} in {{UrlRewrite::_addToStore}} for the rank. It counts the number of each type of map rule (map, reverse_map, map_with_referer, ...). When we insert a new {{url_mapping}}, it is stored in a separate {{MappingsStore}}, so there is no conflict between each type's count. was (Author: ffcai): I use count in UrlRewrite::_addToStore for rank. It is used to count the number of each type of map rules(map, reverse_map, map_with_referer...). When we insert a new url_mapping, it is stored in a separate MappingsStore. So there would be no conflict with each type's count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164974#comment-14164974 ] Feifei Cai commented on TS-3120: I use count in UrlRewrite::_addToStore for the rank. It counts the number of each type of map rule (map, reverse_map, map_with_referer, ...). When we insert a new url_mapping, it is stored in a separate MappingsStore, so there is no conflict between each type's count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Attachment: TS-3120.diff > Overlapping rank in config.remap when using .include directives > --- > > Key: TS-3120 > URL: https://issues.apache.org/jira/browse/TS-3120 > Project: Traffic Server > Issue Type: Bug > Components: Configuration, Core >Affects Versions: 5.0.0 >Reporter: Feifei Cai > Fix For: 5.2.0 > > Attachments: TS-3120.diff > > > As described in > [docs|https://docs.trafficserver.apache.org/en/latest/reference/configuration/remap.config.en.html#precedence] > {quote} > Once these rules are executed we pick the lowest line number as the match > (which replicates first-match-wins). > {quote} > In fact, we use line number for rank for now. > ([Github|https://github.com/apache/trafficserver/blob/master/proxy/http/remap/RemapConfig.cc#L987]) > However, when we use .include directives to include some other remap config > files, there will be overlapping and conflict with the line numbers in each > other file. > *Examples* > remap.config > {noformat} > .include remap1.config > .include remap2.config > {noformat} > remap1.config > {noformat} > map /foo/ https://www.yahoo.com > {noformat} > remap2.config > {noformat} > map /foo/bar1 https://www.yahoo.com > map /foo/bar2 https://www.yahoo.com > {noformat} > *Test Results* > {noformat} > $ curl "`hostname`/foo/" -I | grep HTTP > HTTP/1.1 200 OK > $ curl "`hostname`/foo/bar1" -I | grep HTTP > HTTP/1.1 200 OK > $ curl "`hostname`/foo/bar2" -I | grep HTTP > HTTP/1.1 404 Not Found > {noformat} > When parsing remap1.config, first entry in remap1.config is inserted with > rank 0, second with rank 1. Then parsing remap2.config, the single entry is > inserted with rank 0 again. So the entry in remap2.config is overlapped with > first entry in remap1.config and takes precedence with second entry. This > would confuse customers. 
> I'm considering using a global line number to count each entry, which > preserves the "first-match-wins" rule. I tested this in my local environment > and it works. However, we use the line number for debugging > output ([Github|https://github.com/apache/trafficserver/blob/master/proxy/http/remap/RemapConfig.cc#L938]), > so the current count variable _cln_ is still needed. This would clutter the > code. I'd appreciate any suggestions on this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
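The global-counter idea above can be sketched in a few lines. This is a hypothetical illustration, not the real RemapConfig.cc code: each entry keeps its per-file line number for debug output, while its rank comes from a single counter shared across every .include'd file, so first-match-wins holds globally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch (illustrative names): rank is global, line number stays per-file.
struct RemapEntry {
  std::string rule;
  int file_line;    // line number within its own file (for debug output)
  int global_rank;  // insertion order across all included files
};

struct RemapParser {
  int next_rank = 0;  // global counter shared by every parsed file
  std::vector<RemapEntry> entries;

  void parse_file(const std::vector<std::string> &lines) {
    for (size_t i = 0; i < lines.size(); ++i) {
      entries.push_back({lines[i], static_cast<int>(i) + 1, next_rank++});
    }
  }
};
```

With this scheme, parsing remap1.config (one rule) and then remap2.config (two rules) yields ranks 0, 1, 2, so no two entries collide at rank 0 and the debug line numbers are still available.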
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Affects Version/s: (was: 5.0.1) 5.0.0
[jira] [Updated] (TS-3120) Overlapping rank in config.remap when using .include directives
[ https://issues.apache.org/jira/browse/TS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-3120: --- Affects Version/s: 5.0.1
[jira] [Created] (TS-3120) Overlapping rank in config.remap when using .include directives
Feifei Cai created TS-3120: -- Summary: Overlapping rank in config.remap when using .include directives Key: TS-3120 URL: https://issues.apache.org/jira/browse/TS-3120 Project: Traffic Server Issue Type: Bug Components: Configuration, Core Reporter: Feifei Cai
[jira] [Commented] (TS-2902) Allow POST requests without a Content-Length header
[ https://issues.apache.org/jira/browse/TS-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100379#comment-14100379 ] Feifei Cai commented on TS-2902: Sorry for the late reply. I'm a little confused by the user agent's logic: it is a POST request, but it sends no body data. POST is defined in RFC 7231 as follows: [http://tools.ietf.org/html/rfc7231#section-4.3.3] {quote} The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. {quote} It does not explicitly require a message body. I also see some discussion here: [http://w3-org.9356.n7.nabble.com/POST-with-empty-body-td103965.html] However, when a user agent sends a POST request without a message body, the origin server should be able to handle this special case. If the origin server behind ATS supports this use case, I agree that we should consider allowing the request to pass; but if not, ATS would relay the request and then send back the origin server's 400/411 response. How about testing the POST request against the origin server first? We need to ensure both ATS and the origin server accept the request. > Allow POST requests without a Content-Length header > --- > > Key: TS-2902 > URL: https://issues.apache.org/jira/browse/TS-2902 > Project: Traffic Server > Issue Type: Improvement >Reporter: Masakazu Kitajo >Assignee: Bryan Call > Labels: review > Fix For: 5.1.0 > > Attachments: make_it_configuarable.patch > > > I get "*400* Content Length Required" when user agents send a POST request > that doesn't contain any body data and has no Content-Length header. > (The header is omitted because the length is zero, I think) > According to RFC 7230 Section 3.3.2, presence of Content-Length is not a MUST. 
> http://tools.ietf.org/html/rfc7230#section-3.3.2 > {quote} > A user agent SHOULD send a Content-Length in a request message when >no Transfer-Encoding is sent and the request method defines a meaning >for an enclosed payload body. > {quote} > Also, according to section 3.3.3, a server is allowed to reject such a > request with 411 Length Required, but not *400*. > http://tools.ietf.org/html/rfc7230#section-3.3.3 > {quote} > A server MAY reject a request that contains a message body but not a >Content-Length by responding with 411 (Length Required). > {quote} > Traffic Server should accept such requests (no body data, no > Content-Length header) or reject them with *411*. I think the former is > better for interoperability. -- This message was sent by Atlassian JIRA (v6.2#6252)
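The policy being debated above can be captured in a small decision function. This is a hedged sketch under assumed names, not the actual ATS code: a POST carrying neither Content-Length nor Transfer-Encoding is either accepted as an empty-body request or rejected with a configurable 400/411 status.

```cpp
#include <map>
#include <string>

// Sketch (hypothetical names): Headers is a simplified, case-sensitive
// stand-in for a real header table.
using Headers = std::map<std::string, std::string>;

// Returns the status to act on: 0 means "accept and forward the request".
int check_post_length(const Headers &req, bool allow_empty_post, int reject_status) {
  if (req.count("Content-Length") || req.count("Transfer-Encoding"))
    return 0;  // body length is signaled, per RFC 7230 section 3.3
  // Neither header present: accept as empty body, or reject with 400/411.
  return allow_empty_post ? 0 : reject_status;
}
```

Making both `allow_empty_post` and `reject_status` configurable matches the suggestion in the thread: keep the conservative default while letting deployments opt into the lenient behavior.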
[jira] [Commented] (TS-2902) Allow POST requests without a Content-Length header
[ https://issues.apache.org/jira/browse/TS-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098383#comment-14098383 ] Feifei Cai commented on TS-2902: If the server only handles HTTP/1.0 requests, the user agent MUST send a valid "Content-Length" header, as specified in RFC 7230; if the server handles HTTP/1.1 (or later) requests, the "Content-Length" header is no longer a MUST, because HTTP/1.1 introduced "Transfer-Encoding". For HTTP/1.1 requests, if neither of these two headers is included, the server should also respond with 400/411. [http://tools.ietf.org/html/rfc7230#section-3.3] {quote} The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field. {quote} We can add a configuration option for this change and disable it by default, so other use cases are not broken. However, I'd prefer to be conservative and keep the handling logic aligned with the RFC. Maybe I still misunderstand RFC 7230, so I'd like us to discuss this more thoroughly before we start patching ATS.
[jira] [Commented] (TS-2902) Allow POST requests without a Content-Length header
[ https://issues.apache.org/jira/browse/TS-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098281#comment-14098281 ] Feifei Cai commented on TS-2902: My understanding is: user agents SHOULD send a Content-Length header when no Transfer-Encoding is sent; even for empty body data, a "Content-Length: 0" is needed. If the header is not sent, the server sends back a 400 response; it MAY send a 411 instead, to give more explicit information. Either way, the server would reject a user agent's request that has neither Content-Length nor Transfer-Encoding.
[jira] [Commented] (TS-2902) Allow POST requests without a Content-Length header
[ https://issues.apache.org/jira/browse/TS-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098110#comment-14098110 ] Feifei Cai commented on TS-2902: Hi Masakazu, Strictly speaking, "Content-Length" is a SHOULD but not a MUST. However, I think we'd better follow the SHOULD. [http://tools.ietf.org/html/rfc7230#section-3.3.2] {quote} A user agent SHOULD send a Content-Length in a request message when no Transfer-Encoding is sent and the request method defines a meaning for an enclosed payload body. For example, a Content-Length header field is normally sent in a POST request even when the value is 0 (indicating an empty payload body). A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body. {quote} As to the response, I think we can make it configurable, since it is a MAY in RFC 7230. So, how about keeping the "Content-Length" requirement and adding a configurable option for the response status code (400/411)? The patch makes requiring "Content-Length" configurable and changes the response status code from 400 to 411; I'm afraid this is somewhat aggressive and may break others' use cases.
[jira] [Commented] (TS-3001) GlobalSign responds 403 when OCSP request posted without Host header
[ https://issues.apache.org/jira/browse/TS-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093814#comment-14093814 ] Feifei Cai commented on TS-3001: I have tested this patch; it works fine in my local environment. Thanks to [~rokubo]! Hi [~bcall], would you please help commit this patch? > GlobalSign responds 403 when OCSP request posted without Host header > > > Key: TS-3001 > URL: https://issues.apache.org/jira/browse/TS-3001 > Project: Traffic Server > Issue Type: Bug > Components: SSL >Reporter: Ryo Okubo >Assignee: Bryan Call > Fix For: 5.1.0 > > Attachments: ocsp.diff > > > The OCSP request for GlobalSign requires a Host header, but OCSP_sendreq_new() > doesn't attach one, so the OCSP request fails with a 403 status code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: (was: TS-2367.diff) > Add OCSP (Online Certificate Status Protocol) Stapling Support > --- > > Key: TS-2367 > URL: https://issues.apache.org/jira/browse/TS-2367 > Project: Traffic Server > Issue Type: New Feature > Components: HTTP, SSL >Reporter: Bryan Call >Assignee: Bryan Call > Labels: review > Fix For: 5.1.0 > > Attachments: TS-2367.diff, TS-2367.diff > > > RFC: > http://tools.ietf.org/html/rfc6066 > Overview: > https://wiki.mozilla.org/Security/Server_Side_TLS#OCSP_Stapling > http://en.wikipedia.org/wiki/OCSP_stapling > There is support for this added into openssl 0.9.8g. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: TS-2367.diff
[jira] [Commented] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079268#comment-14079268 ] Feifei Cai commented on TS-2367: Thanks to [~jpe...@apache.org] and [~bcall]! I made the following updates: # Use schedule_every() to check/update the response. # Use spawn_event_threads() to spawn a dedicated OCSP thread. # All functions that return 1 or 0 should be declared bool: Done. # d2i_OCSP_RESPONSE is a heavy conversion, so I copy first, then release the lock on cinf as soon as possible. I removed stapling_get_cached_response from the callback function and leave the conversion/check to the OCSP update thread. # Change configuration options with stapling in the name to ocsp: Done. # I changed query_responder() to use OpenSSL's OCSP_sendreq_nbio API to implement a non-blocking query with a timeout option.
[jira] [Commented] (TS-2915) SEGV occurs when POST request was posted without Expect: 100-continue header
[ https://issues.apache.org/jira/browse/TS-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061858#comment-14061858 ] Feifei Cai commented on TS-2915: I have reproduced this issue and verified the patch. Many thanks to [~rokubo]! Hi [~bcall], could you please help commit this patch? Thanks! > SEGV occurs when POST request was posted without Expect: 100-continue header > > > Key: TS-2915 > URL: https://issues.apache.org/jira/browse/TS-2915 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 5.0.0 >Reporter: Ryo Okubo > Fix For: 5.1.0 > > Attachments: bugfix.diff > > > The patch merged in TS-1125 has a bug that causes a segmentation fault. > It's reproduced when the send_100_continue_response option is enabled and a POST > request is posted without an Expect: 100-continue header. > Please check the following links. > https://github.com/apache/trafficserver/blob/master/proxy/http/HttpSM.cc#L1898 > https://github.com/apache/trafficserver/blob/master/proxy/http/HttpSM.cc#L3290 -- This message was sent by Atlassian JIRA (v6.2#6252)
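The crash condition above suggests a simple guard. This is a simplified, hypothetical sketch (not the HttpSM.cc code): only take the "100 Continue" path when the option is enabled AND the client actually sent "Expect: 100-continue", rather than assuming the Expect state exists unconditionally.

```cpp
#include <map>
#include <string>

// Simplified stand-in for a request header table (hypothetical).
using Headers = std::map<std::string, std::string>;

// Guard sketch: both conditions must hold before entering the 100-continue
// response path; a plain POST without the Expect header skips it entirely.
bool should_send_100_continue(bool option_enabled, const Headers &req) {
  auto it = req.find("Expect");
  return option_enabled && it != req.end() && it->second == "100-continue";
}
```

The ticket's SEGV arose exactly in the case this guard excludes: option enabled, but no Expect header on the POST.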
[jira] [Commented] (TS-2848) ATS crash in HttpSM::release_server_session
[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061580#comment-14061580 ] Feifei Cai commented on TS-2848: This crash may be happening pretty frequently on some Yahoo hosts. We applied this patch and it works fine. The root cause has not been found yet; we still need to figure out when t_state.current.server is set to NULL and whether anything else needs to be handled in this condition. > ATS crash in HttpSM::release_server_session > --- > > Key: TS-2848 > URL: https://issues.apache.org/jira/browse/TS-2848 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Reporter: Feifei Cai > Labels: crash, yahoo > Fix For: sometime > > Attachments: TS-2848.diff > > > We deploy ATS on production hosts and noticed crashes with the following > stack trace. This happens infrequently, about once a week or even less often. > It has crashed repeatedly over the last 2 months; however, the root cause has > not been found and we cannot reproduce the crash on demand, only wait for it > to happen. 
> {noformat} > NOTE: Traffic Server received Sig 11: Segmentation fault > /home/y/bin/traffic_server - STACK TRACE: > /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500] > /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e] > /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] > /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2] > /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93] > /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f] > /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373] > /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d] > /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944] > 
/home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c36d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828] > /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] > /home/y/bin/traffic_server[0x68606b] > /home/y/bin/traffic_server[0x688a14] > /home/y/bin/traffic_server(_ZN10NetHandle
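The shape of the attached workaround ("add NULL check for t_state.current.server") can be illustrated as follows. The surrounding types here are stand-ins, not the real HttpSM structures; the point is only the defensive check: bail out when the server pointer is NULL instead of dereferencing it.

```cpp
// Stand-in types (hypothetical, simplified from the ticket's field names).
struct ServerInfo {
  bool keep_alive = false;
};
struct CurrentInfo {
  ServerInfo *server = nullptr;  // may be NULL in the crash condition
};

// Sketch of the guarded release path: the unpatched code dereferenced
// current.server unconditionally, which is where the SEGV occurred.
bool release_server_session(const CurrentInfo &current) {
  if (current.server == nullptr)
    return false;  // server state already gone; skip release, avoid SEGV
  return current.server->keep_alive;  // safe to dereference past the guard
}
```

As the comment notes, this only suppresses the crash; the open question of *why* the pointer becomes NULL at this point remains.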
[jira] [Updated] (TS-2848) ATS crash in HttpSM::release_server_session
[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2848: --- Attachment: TS-2848.diff add NULL check for t_state.current.server
[jira] [Updated] (TS-2916) combo_handler does not set the response headers properly
[ https://issues.apache.org/jira/browse/TS-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2916: --- Attachment: TS-2916.diff > combo_handler does not set the response headers properly > > > Key: TS-2916 > URL: https://issues.apache.org/jira/browse/TS-2916 > Project: Traffic Server > Issue Type: Bug > Components: Plugins >Reporter: Feifei Cai > Labels: yahoo > Attachments: TS-2916.diff > > > # "Cache-Control: max-age=xxx" > combo_handler plugin should parse each url's max-age value in "Cache-Control" > header, and use the minimal value in the response header. The hard-code "10 > years" max-age prevents cache from refresh, even though we have parsed the > value in "Expires" headers. > See > [rfc2616-sec14.9.3|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3]: > "If a response includes both an Expires header and a max-age directive, the > max-age directive overrides the Expires header, even if the Expires header is > more restrictive." > # Duplicated headers > We add support for whitelist of headers in a recent > [commit|https://github.com/apache/trafficserver/commit/f61b1b416f4bb99854c6b6c77b12f742b5af9ca8] > When we have added headers specified in whitelist, we need to check for > duplicated headers in the following response write actions. > # Multiple values > Some headers has multiple values, e.g. "Cache-Control: max-age=3600, public". > It has 2 values: "max-age=3600" and "public". > We need to parse all the values for each header specified in whitelist. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-2916) combo_handler does not set the response headers properly
[ https://issues.apache.org/jira/browse/TS-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2916: --- Labels: yahoo (was: ) > combo_handler does not set the response headers properly > > > Key: TS-2916 > URL: https://issues.apache.org/jira/browse/TS-2916 > Project: Traffic Server > Issue Type: Bug > Components: Plugins >Reporter: Feifei Cai > Labels: yahoo > > # "Cache-Control: max-age=xxx" > combo_handler plugin should parse each url's max-age value in "Cache-Control" > header, and use the minimal value in the response header. The hard-code "10 > years" max-age prevents cache from refresh, even though we have parsed the > value in "Expires" headers. > See > [rfc2616-sec14.9.3|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3]: > "If a response includes both an Expires header and a max-age directive, the > max-age directive overrides the Expires header, even if the Expires header is > more restrictive." > # Duplicated headers > We add support for whitelist of headers in a recent > [commit|https://github.com/apache/trafficserver/commit/f61b1b416f4bb99854c6b6c77b12f742b5af9ca8] > When we have added headers specified in whitelist, we need to check for > duplicated headers in the following response write actions. > # Multiple values > Some headers has multiple values, e.g. "Cache-Control: max-age=3600, public". > It has 2 values: "max-age=3600" and "public". > We need to parse all the values for each header specified in whitelist. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (TS-2916) combo_handler does not set the response headers properly
Feifei Cai created TS-2916: -- Summary: combo_handler does not set the response headers properly Key: TS-2916 URL: https://issues.apache.org/jira/browse/TS-2916 Project: Traffic Server Issue Type: Bug Components: Plugins Reporter: Feifei Cai
# "Cache-Control: max-age=xxx" The combo_handler plugin should parse each URL's max-age value from its "Cache-Control" header and use the minimum value in the response header. The hard-coded "10 years" max-age prevents the cache from refreshing, even though we have parsed the value in the "Expires" headers. See [rfc2616-sec14.9.3|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3]: "If a response includes both an Expires header and a max-age directive, the max-age directive overrides the Expires header, even if the Expires header is more restrictive."
# Duplicated headers We added support for a whitelist of headers in a recent [commit|https://github.com/apache/trafficserver/commit/f61b1b416f4bb99854c6b6c77b12f742b5af9ca8]. Once the headers specified in the whitelist have been added, we need to check for duplicate headers in the subsequent response write actions.
# Multiple values Some headers have multiple values, e.g. "Cache-Control: max-age=3600, public" has 2 values: "max-age=3600" and "public". We need to parse all the values for each header specified in the whitelist. -- This message was sent by Atlassian JIRA (v6.2#6252)
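The minimum-max-age rule above can be sketched in plain C++. This is a hypothetical helper (`min_max_age` is not the plugin's actual API), it assumes well-formed directives, and a production parser would also match the directive name case-insensitively:

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Given one "Cache-Control" header value per combined document, return the
// smallest max-age found across all of them, or -1 if no max-age directive
// is present. The combined response should then carry that minimum value.
long min_max_age(const std::vector<std::string> &cache_controls)
{
  long min_age = -1;
  for (const std::string &value : cache_controls) {
    std::stringstream ss(value);
    std::string directive;
    while (std::getline(ss, directive, ',')) { // split "max-age=3600, public"
      size_t start = directive.find_first_not_of(" \t");
      if (start == std::string::npos)
        continue; // skip empty pieces
      directive = directive.substr(start);
      if (directive.compare(0, 8, "max-age=") == 0) {
        long age = std::strtol(directive.c_str() + 8, nullptr, 10);
        if (min_age < 0 || age < min_age) // keep the smallest value seen
          min_age = age;
      }
    }
  }
  return min_age;
}
```

For example, combining one document served with "max-age=3600, public" and another with "max-age=600" should yield a combined response carrying max-age=600.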
[jira] [Updated] (TS-2848) ATS crash in HttpSM::release_server_session
[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2848: --- Labels: yahoo (was: ) > ATS crash in HttpSM::release_server_session > --- > > Key: TS-2848 > URL: https://issues.apache.org/jira/browse/TS-2848 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Reporter: Feifei Cai > Labels: yahoo > > We deploy ATS on production hosts, and noticed crashes with the following > stack trace. This happens not very frequently, about 1 week or even longer. > It crashes repeatedly in the last 2 months, however, the root cause is not > found and we can not reproduce the crash as wish, only wait for it happens. > {noformat} > NOTE: Traffic Server received Sig 11: Segmentation fault > /home/y/bin/traffic_server - STACK TRACE: > /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500] > /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] > /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e] > 
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] > /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2] > /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93] > /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f] > /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373] > /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d] > /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944] > /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c36d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] > 
/home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] > /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] > /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828] > /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] > /home/y/bin/traffic_server[0x68606b] > /home/y/bin/traffic_server[0x688a14] > /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582] > /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf] > /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3] > /home/y/bin/traffic_server[0x6a785a] > /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851] > /lib64/libc.so.6(clone+0x6d)[0x321e2e890d] > {noformat} > gdb back trace: > {noformat} > (gdb) bt > #0 0x000
[jira] [Created] (TS-2848) ATS crash in HttpSM::release_server_session
Feifei Cai created TS-2848: -- Summary: ATS crash in HttpSM::release_server_session Key: TS-2848 URL: https://issues.apache.org/jira/browse/TS-2848 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Feifei Cai
We deploy ATS on production hosts and have noticed crashes with the following stack trace. They happen infrequently, roughly once a week or less often. The crash has recurred over the last 2 months; however, the root cause has not been found, and we cannot reproduce the crash at will; we can only wait for it to happen.
{noformat} NOTE: Traffic Server received Sig 11: Segmentation fault /home/y/bin/traffic_server - STACK TRACE: /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500] /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2] /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e] /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2] /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93] 
/home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f] /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373] /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d] /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944] /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d] /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c36d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b] /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14] /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d] /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34] /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828] 
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098] /home/y/bin/traffic_server[0x68606b] /home/y/bin/traffic_server[0x688a14] /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582] /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf] /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3] /home/y/bin/traffic_server[0x6a785a] /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851] /lib64/libc.so.6(clone+0x6d)[0x321e2e890d] {noformat} gdb back trace: {noformat} (gdb) bt #0 0x00529eb5 in HttpSM::release_server_session (this=0x2b12bc107bd0, serve_from_cache=true) at HttpSM.cc:4892 #1 0x005362bb in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:7010 #2 0x0053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at HttpSM.cc:1557 #3 0x0052dbd0 in HttpSM::state_api_callo
[jira] [Comment Edited] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008742#comment-14008742 ] Feifei Cai edited comment on TS-2367 at 5/27/14 1:59 AM: - Hi [~jamespeach], thanks for review! I made the following changes according to your comments: 1. separate the ocsp stapling's code from iocore/net/SSLUtils.cc, into iocore/net/P_OCSPStapling.h and iocore/net/OCSPStapling.cc 2. move struct certinfo into OCSPStapling.cc, since it's not needed in header file 3. add the new settings' description in mgmt/RecordsConfig.cc HAVE_OPENSSL_OCSP_STAPLING is defined when SSL_CTX_set_tlsext_status_cb is defined in openssl library, in case an old version openssl is used. MAX_STAPLING_DER is 10K, so each certificate will use ~10K size to store ocsp's info. I think it's not too large for common use cases (in my test case, it's about 2K). Do we need to take large amount of certificates into consideration? was (Author: ffcai): Hi [~jamespeach], thanks for review! I made the following changes according to your comments: 1. separate the ocsp stapling's code from iocore/net/SSLUtils.cc, into iocore/net/P_OCSPStapling.h and iocore/net/OCSPStapling.cc 2. move struct certinfo into OCSPStapling.cc, since it's not needed in header file 3. add the new settings' description in mgmt/RecordsConfig.cc HAVE_OPENSSL_OCSP_STAPLING is defined when SSL_CTX_set_tlsext_status_cb is defined in openssl library, in case an old version openssl is used. MAX_STAPLING_DER is 10K, so each certificate will use ~10K size to store ocsp's info. I think it's not too large for common use cases (in my test case, it's about 2K). Do we need to take large amount of certificates into account? 
> Add OCSP (Online Certificate Status Protocol) Stapling Support > --- > > Key: TS-2367 > URL: https://issues.apache.org/jira/browse/TS-2367 > Project: Traffic Server > Issue Type: New Feature > Components: HTTP, SSL >Reporter: Bryan Call >Assignee: Bryan Call > Labels: review > Fix For: 5.1.0 > > Attachments: TS-2367.diff, TS-2367.diff > > > RFC: > http://tools.ietf.org/html/rfc6066 > Overview: > https://wiki.mozilla.org/Security/Server_Side_TLS#OCSP_Stapling > http://en.wikipedia.org/wiki/OCSP_stapling > There is support for this added into openssl 0.9.8g. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: TS-2367.diff Hi [~jamespeach], thanks for review! I made the following changes according to your comments: 1. separate the ocsp stapling's code from iocore/net/SSLUtils.cc, into iocore/net/P_OCSPStapling.h and iocore/net/OCSPStapling.cc 2. move struct certinfo into OCSPStapling.cc, since it's not needed in header file 3. add the new settings' description in mgmt/RecordsConfig.cc HAVE_OPENSSL_OCSP_STAPLING is defined when SSL_CTX_set_tlsext_status_cb is defined in openssl library, in case an old version openssl is used. MAX_STAPLING_DER is 10K, so each certificate will use ~10K size to store ocsp's info. I think it's not too large for common use cases (in my test case, it's about 2K). Do we need to take large amount of certificates into account? > Add OCSP (Online Certificate Status Protocol) Stapling Support > --- > > Key: TS-2367 > URL: https://issues.apache.org/jira/browse/TS-2367 > Project: Traffic Server > Issue Type: New Feature > Components: HTTP, SSL >Reporter: Bryan Call >Assignee: Bryan Call > Labels: review > Fix For: 5.1.0 > > Attachments: TS-2367.diff, TS-2367.diff > > > RFC: > http://tools.ietf.org/html/rfc6066 > Overview: > https://wiki.mozilla.org/Security/Server_Side_TLS#OCSP_Stapling > http://en.wikipedia.org/wiki/OCSP_stapling > There is support for this added into openssl 0.9.8g. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-1125) POST's with Expect: 100-continue are slowed by delayed 100 response.
[ https://issues.apache.org/jira/browse/TS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-1125: --- Attachment: TS-1125.diff Some changes of the new patch: 1. Change the configuration option for 100 continue response, not using overridable configuration. 2. Use normal VIO APIs, instead of the initial fix of "raw" write. > POST's with Expect: 100-continue are slowed by delayed 100 response. > > > Key: TS-1125 > URL: https://issues.apache.org/jira/browse/TS-1125 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 3.0.2 > Environment: TS 3.0.2 going to Apache 2.2 web server >Reporter: William Bardwell >Assignee: Bryan Call >Priority: Minor > Labels: yahoo > Fix For: 5.0.0 > > Attachments: TS-1125.diff, TS-1125.diff, ts1125.diff, ts1125.diff, > ts1125.diff > > > Sending a post like: > POST / HTTP/1.1 > Host: www.example.com > Content-Length: 10 > Expect: 100-continue > directly to the web server immediately sends back: > HTTP/1.1 100 Continue > And then when the post data is sent, a status 200 response comes back. > But when going through ATS the "HTTP/1.1 100 Continue" is not sent > immediately, and instead is sent after the POST data has been received. This > is legal, but it makes clients that are hoping for a 100 continue to wait a > little while hoping to get that, ATS should forward that response through > immediately. > Note: I see curl using "Expect: 100-continue" with > 1024 bytes of post data, > but web searching indicates that some Microsoft products also use it. -- This message was sent by Atlassian JIRA (v6.2#6252)
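The expectation check involved here can be illustrated with a minimal sketch. This is not the actual patch logic; `should_send_100_continue` is a hypothetical helper, and a real parser would also treat the "100-continue" token case-insensitively:

```cpp
#include <string>

// Decide whether a proxy should immediately forward an interim
// "HTTP/1.1 100 Continue" response. Per RFC 2616 section 8.2.3, the
// expectation mechanism applies to HTTP/1.1 requests that carry an
// "Expect: 100-continue" header.
bool should_send_100_continue(const std::string &http_version,
                              const std::string &expect_header)
{
  if (http_version != "HTTP/1.1")
    return false; // HTTP/1.0 clients must not receive 1xx responses
  return expect_header == "100-continue";
}
```
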
[jira] [Commented] (TS-2782) ats crash in master in HdrHeap::inherit_string_heaps
[ https://issues.apache.org/jira/browse/TS-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993422#comment-13993422 ] Feifei Cai commented on TS-2782: Hi [~briang], Thank you for the commit; it works! I built a package from the latest master and tested upgrading from an earlier version (prior to your first TS-2766 commit) to this new package, and there are no crash issues now. > ats crash in master in HdrHeap::inherit_string_heaps > > > Key: TS-2782 > URL: https://issues.apache.org/jira/browse/TS-2782 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Sudheer Vinukonda > Labels: spdy, yahoo > > When testing master on production hosts, noticed the below crash occurring > repeatedly every time ats version is changed. This crash stops happening > after clearing the cache. This needs further investigation, but, I remember a > discussion between briang and zwoop about duplicate string fields in HdrHeap. > Not sure if this core is related to that. Would appreciate if briang or zwoop > can comment. Thank you. > {code} > [example_prep.sh] Checking/Moving old cores... 
> [TrafficServer] using root directory '/home/y' > NOTE: Traffic Server received Sig 11: Segmentation fault > /home/y/bin/traffic_server - STACK TRACE: > /lib64/libpthread.so.0(+0x30d3c0f500)[0x2aae27e2d500] > /home/y/bin/traffic_server(_ZN7HdrHeap20inherit_string_heapsEPKS_+0x271)[0x61caa1] > /home/y/bin/traffic_server(_Z14http_hdr_cloneP11HTTPHdrImplP7HdrHeapS2_+0x93)[0x619f83] > /home/y/bin/traffic_server(_ZN19HttpTransactHeaders18copy_header_fieldsEP7HTTPHdrS1_bl+0x1be)[0x5c201e] > /home/y/bin/traffic_server(_ZN12HttpTransact14build_responseEPNS_5StateEP7HTTPHdrS3_11HTTPVersion10HTTPStatusPKc+0x3ed)[0x5a287d] > /home/y/bin/traffic_server(_ZN12HttpTransact25build_response_from_cacheEPNS_5StateE15HTTPWarningCode+0x354)[0x5b67f4] > /home/y/bin/traffic_server(_ZN12HttpTransact22HandleCacheOpenReadHitEPNS_5StateE+0x448)[0x5b84c8] > /home/y/bin/traffic_server(_ZN6HttpSM32call_transact_and_set_next_stateEPFvPN12HttpTransact5StateEE+0x66)[0x573816] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x2d2)[0x58ba72] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x583070] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1ea)[0x58d06a] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x2d2)[0x58ba72] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x583070] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1ea)[0x58d06a] > /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x58533e] > /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x587de8] > /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x566e12] > /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x653453] > /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x6be9af] > /home/y/bin/traffic_server(_ZN7CacheVC14handleReadDoneEiP5Event+0x1ed)[0x69d40d] > /home/y/bin/traffic_server(_ZN19AIOCallbackInternal11io_completeEiPv+0x35)[0x6539c5] > 
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x714aef] > /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x61b)[0x71561b] > /home/y/bin/traffic_server[0x713e9a] > /lib64/libpthread.so.0(+0x30d3c07851)[0x2aae27e25851] > /lib64/libc.so.6(clone+0x6d)[0x30d38e890d] > {code} > gdb output below: > {code} > (gdb) bt > #0 ink_atomic_increment (this=0x2afb60113010, > inherit_from=0x2afaa824e688) at ../../lib/ts/ink_atomic.h:162 > #1 refcount_inc (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > ../../lib/ts/Ptr.h:279 > #2 operator= (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > ../../lib/ts/Ptr.h:408 > #3 attach_str_heap (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > HdrHeap.cc:1000 > #4 HdrHeap::inherit_string_heaps (this=0x2afb60113010, > inherit_from=0x2afaa824e688) at HdrHeap.cc:1081 > #5 0x00619f83 in http_hdr_clone (s_hh=0x2afaa824e710, > s_heap=0x2afaa824e688, d_heap=0x2afb60113010) at HTTP.cc:375 > #6 0x005c201e in copy (src_hdr=0x2afaa824e0b8, > new_hdr=0x2afac4058c50, retain_proxy_auth_hdrs=false, date=0) at > ../../proxy/hdrs/HTTP.h:867 > #7 HttpTransactHeaders::copy_header_fields (src_hdr=0x2afaa824e0b8, > new_hdr=0x2afac4058c50, retain_proxy_auth_hdrs=false, date=0) at > HttpTransactHeaders.cc:201 > #8 0x005a287d in HttpTransact::build_response (s=0x2afac4058570, > base_response=0x2afaa824e0b8, outgoing_response=0x2afac4058c50, > outgoing_version=, > status_code=HTTP_STATUS_NONE, reason_phrase=0x7323ac "None") at > HttpTransact.cc:7926 > #9 0x005b67f4 in HttpTransact::build_response_from_cache > (s=0x2afac4058570, warning_
[jira] [Commented] (TS-2782) ats crash in master in HdrHeap::inherit_string_heaps
[ https://issues.apache.org/jira/browse/TS-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992603#comment-13992603 ] Feifei Cai commented on TS-2782: Hi [~sudheerv], I think it's caused by [TS-2766|https://issues.apache.org/jira/browse/TS-2766]. The [commit|https://github.com/apache/trafficserver/commit/1630af7f76643c86751398838fbdf8a1363ce292] in [TS-2766|https://issues.apache.org/jira/browse/TS-2766] changes/corrects HdrHeap::coalesce_str_heaps(), which is called by HdrHeap::inherit_string_heaps(), so HdrHeap data written after this commit is not compatible with previous versions. I tested several versions after this commit, and there are no such crash issues during upgrades and downgrades. > ats crash in master in HdrHeap::inherit_string_heaps > > > Key: TS-2782 > URL: https://issues.apache.org/jira/browse/TS-2782 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Sudheer Vinukonda > Labels: spdy, yahoo > > When testing master on production hosts, noticed the below crash occurring > repeatedly every time ats version is changed. This crash stops happening > after clearing the cache. This needs further investigation, but, I remember a > discussion between briang and zwoop about duplicate string fields in HdrHeap. > Not sure if this core is related to that. Would appreciate if briang or zwoop > can comment. Thank you. > {code} > [example_prep.sh] Checking/Moving old cores... 
> [TrafficServer] using root directory '/home/y' > NOTE: Traffic Server received Sig 11: Segmentation fault > /home/y/bin/traffic_server - STACK TRACE: > /lib64/libpthread.so.0(+0x30d3c0f500)[0x2aae27e2d500] > /home/y/bin/traffic_server(_ZN7HdrHeap20inherit_string_heapsEPKS_+0x271)[0x61caa1] > /home/y/bin/traffic_server(_Z14http_hdr_cloneP11HTTPHdrImplP7HdrHeapS2_+0x93)[0x619f83] > /home/y/bin/traffic_server(_ZN19HttpTransactHeaders18copy_header_fieldsEP7HTTPHdrS1_bl+0x1be)[0x5c201e] > /home/y/bin/traffic_server(_ZN12HttpTransact14build_responseEPNS_5StateEP7HTTPHdrS3_11HTTPVersion10HTTPStatusPKc+0x3ed)[0x5a287d] > /home/y/bin/traffic_server(_ZN12HttpTransact25build_response_from_cacheEPNS_5StateE15HTTPWarningCode+0x354)[0x5b67f4] > /home/y/bin/traffic_server(_ZN12HttpTransact22HandleCacheOpenReadHitEPNS_5StateE+0x448)[0x5b84c8] > /home/y/bin/traffic_server(_ZN6HttpSM32call_transact_and_set_next_stateEPFvPN12HttpTransact5StateEE+0x66)[0x573816] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x2d2)[0x58ba72] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x583070] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1ea)[0x58d06a] > /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x2d2)[0x58ba72] > /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x583070] > /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1ea)[0x58d06a] > /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x58533e] > /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x587de8] > /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x566e12] > /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x653453] > /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x6be9af] > /home/y/bin/traffic_server(_ZN7CacheVC14handleReadDoneEiP5Event+0x1ed)[0x69d40d] > /home/y/bin/traffic_server(_ZN19AIOCallbackInternal11io_completeEiPv+0x35)[0x6539c5] > 
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x714aef] > /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x61b)[0x71561b] > /home/y/bin/traffic_server[0x713e9a] > /lib64/libpthread.so.0(+0x30d3c07851)[0x2aae27e25851] > /lib64/libc.so.6(clone+0x6d)[0x30d38e890d] > {code} > gdb output below: > {code} > (gdb) bt > #0 ink_atomic_increment (this=0x2afb60113010, > inherit_from=0x2afaa824e688) at ../../lib/ts/ink_atomic.h:162 > #1 refcount_inc (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > ../../lib/ts/Ptr.h:279 > #2 operator= (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > ../../lib/ts/Ptr.h:408 > #3 attach_str_heap (this=0x2afb60113010, inherit_from=0x2afaa824e688) at > HdrHeap.cc:1000 > #4 HdrHeap::inherit_string_heaps (this=0x2afb60113010, > inherit_from=0x2afaa824e688) at HdrHeap.cc:1081 > #5 0x00619f83 in http_hdr_clone (s_hh=0x2afaa824e710, > s_heap=0x2afaa824e688, d_heap=0x2afb60113010) at HTTP.cc:375 > #6 0x005c201e in copy (src_hdr=0x2afaa824e0b8, > new_hdr=0x2afac4058c50, retain_proxy_auth_hdrs=false, date=0) at > ../../proxy/hdrs/HTTP.h:867 > #7 HttpTransactHeaders::copy_header_fields (src_hdr=0x2afaa824e0b8, > new_hdr=0x2afac4058c50, retain_proxy_auth_hdrs=false, date=0) at > HttpTransactHeaders.cc:201 > #8 0x005a287d in HttpTransact::b
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: (was: TS-2367.diff) > Add OCSP (Online Certificate Status Protocol) Stapling Support > --- > > Key: TS-2367 > URL: https://issues.apache.org/jira/browse/TS-2367 > Project: Traffic Server > Issue Type: New Feature > Components: HTTP, SSL >Reporter: Bryan Call >Assignee: Bryan Call > Fix For: 5.0.0 > > Attachments: TS-2367.diff > > > RFC: > http://tools.ietf.org/html/rfc6066 > Overview: > https://wiki.mozilla.org/Security/Server_Side_TLS#OCSP_Stapling > http://en.wikipedia.org/wiki/OCSP_stapling > There is support for this added into openssl 0.9.8g. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: TS-2367.diff
proxy.config.ssl.stapling.enabled: Enable stapling of OCSP responses. Disabled by default.
proxy.config.ssl.stapling.cache_timeout: Number of seconds before an OCSP response expires in the stapling cache. 3600s (1 hour) by default.
proxy.config.ssl.stapling.request_timeout: Timeout for queries to OCSP responders. 10s by default.
proxy.config.ssl.stapling.update_period: Update period for the stapling cache. 60s (1 min) by default.
When OCSP stapling is enabled, ATS spawns a new thread to send OCSP requests and fetch OCSP responses. Each response is cached in the server for 3600s (1 hour) by default. Specifically, the request has a timeout when connecting to the CA's OCSP responder, 10s by default, and ATS refreshes the cached responses at the configured update period. In the SSL module, ATS does not send OCSP requests from OpenSSL's callback functions; it only reads responses from the stapling cache, so a connection cannot hang the ATS event system.
> Add OCSP (Online Certificate Status Protocol) Stapling Support > --- > > Key: TS-2367 > URL: https://issues.apache.org/jira/browse/TS-2367 > Project: Traffic Server > Issue Type: New Feature > Components: HTTP, SSL >Reporter: Bryan Call >Assignee: Bryan Call > Fix For: 5.0.0 > > Attachments: TS-2367.diff, TS-2367.diff > > > RFC: > http://tools.ietf.org/html/rfc6066 > Overview: > https://wiki.mozilla.org/Security/Server_Side_TLS#OCSP_Stapling > http://en.wikipedia.org/wiki/OCSP_stapling > There is support for this added into openssl 0.9.8g. -- This message was sent by Atlassian JIRA (v6.2#6252)
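Collected in one place, the four settings described above might look like this in records.config. This fragment is a sketch assuming the standard CONFIG/INT record syntax; the values are the stated defaults, except that stapling itself is switched on:

```
CONFIG proxy.config.ssl.stapling.enabled INT 1
CONFIG proxy.config.ssl.stapling.cache_timeout INT 3600
CONFIG proxy.config.ssl.stapling.request_timeout INT 10
CONFIG proxy.config.ssl.stapling.update_period INT 60
```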
[jira] [Updated] (TS-2367) Add OCSP (Online Certificate Status Protocol) Stapling Support
[ https://issues.apache.org/jira/browse/TS-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2367: --- Attachment: TS-2367.diff
[jira] [Commented] (TS-1125) POST's with Expect: 100-continue are slowed by delayed 100 response.
[ https://issues.apache.org/jira/browse/TS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976960#comment-13976960 ] Feifei Cai commented on TS-1125: Hi [~zwoop], Thanks for your comments. 1. Overridable configuration: after further checking and referring to the [docs|https://trafficserver.readthedocs.org/en/latest/reference/api/TSHttpOverridableConfig.en.html#description], I found that an overridable configuration is not needed, so I'll move it out of the overridable struct. Thanks for your advice. 2. Append the new WKS to the end of the list: done in the latest patch, thanks to [~bcall]'s help. 3. Use the VIO APIs: the patch with the "raw" write has been verified on Yahoo's servers. I'll try the normal VIO APIs and upload a new patch once it is verified. > POST's with Expect: 100-continue are slowed by delayed 100 response. > > > Key: TS-1125 > URL: https://issues.apache.org/jira/browse/TS-1125 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 3.0.2 > Environment: TS 3.0.2 going to Apache 2.2 web server >Reporter: William Bardwell >Assignee: Bryan Call >Priority: Minor > Labels: yahoo > Fix For: 5.0.0 > > Attachments: TS-1125.diff, ts1125.diff, ts1125.diff, ts1125.diff > > > Sending a post like: > POST / HTTP/1.1 > Host: www.example.com > Content-Length: 10 > Expect: 100-continue > directly to the web server immediately sends back: > HTTP/1.1 100 Continue > And then when the post data is sent, a status 200 response comes back. > But when going through ATS the "HTTP/1.1 100 Continue" is not sent > immediately, and instead is sent after the POST data has been received. This > is legal, but it makes clients that are hoping for a 100 continue to wait a > little while hoping to get that, ATS should forward that response through > immediately. > Note: I see curl using "Expect: 100-continue" with > 1024 bytes of post data, > but web searching indicates that some Microsoft products also use it.
[jira] [Updated] (TS-1125) POST's with Expect: 100-continue are slowed by delayed 100 response.
[ https://issues.apache.org/jira/browse/TS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-1125: --- Attachment: ts1125.diff Correct a small comma mistake.
[jira] [Updated] (TS-1125) POST's with Expect: 100-continue are slowed by delayed 100 response.
[ https://issues.apache.org/jira/browse/TS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-1125: --- Attachment: TS-1125.diff Force ATS to return a "100 Continue" response to the client as soon as it receives the "Expect: 100-continue" header from the client, and remove this header instead of relaying it to the origin server. This is a quick fix; it does not strictly follow RFC 2616's requirements. > POST's with Expect: 100-continue are slowed by delayed 100 response. > > > Key: TS-1125 > URL: https://issues.apache.org/jira/browse/TS-1125 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 3.0.2 > Environment: TS 3.0.2 going to Apache 2.2 web server >Reporter: William Bardwell >Priority: Minor > Fix For: 5.2.0 > > Attachments: TS-1125.diff > > > Sending a post like: > POST / HTTP/1.1 > Host: www.example.com > Content-Length: 10 > Expect: 100-continue > directly to the web server immediately sends back: > HTTP/1.1 100 Continue > And then when the post data is sent, a status 200 response comes back. > But when going through ATS the "HTTP/1.1 100 Continue" is not sent > immediately, and instead is sent after the POST data has been received. This > is legal, but it makes clients that are hoping for a 100 continue to wait a > little while hoping to get that, ATS should forward that response through > immediately. > Note: I see curl using "Expect: 100-continue" with > 1024 bytes of post data, > but web searching indicates that some Microsoft products also use it. -- This message was sent by Atlassian JIRA (v6.2#6252)
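The quick fix described above can be sketched as a toy model (NOT Traffic Server code; `handle_request_headers` is a hypothetical helper name): on seeing "Expect: 100-continue" in the request headers, emit an interim 100 response for the client right away and strip the header so it is not relayed to the origin server.

```python
def handle_request_headers(raw_headers: bytes):
    """Return (interim_response_for_client, headers_to_relay_upstream)."""
    interim = b""
    relay_lines = []
    for line in raw_headers.split(b"\r\n"):
        if line.lower().startswith(b"expect:") and b"100-continue" in line.lower():
            # Answer the client immediately instead of waiting for the origin.
            interim = b"HTTP/1.1 100 Continue\r\n\r\n"
            continue  # drop the Expect header from the relayed request
        relay_lines.append(line)
    return interim, b"\r\n".join(relay_lines)
```

With the example request from the issue description, this returns the interim 100 at once and a relayed request with no Expect header, which is exactly why the fix does not strictly follow RFC 2616 (the origin never sees the expectation).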
[jira] [Updated] (TS-2031) Two SSL certs with overlapping CNs stomps over each other without warnings
[ https://issues.apache.org/jira/browse/TS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2031: --- Attachment: (was: TS-2031.diff) > Two SSL certs with overlapping CNs stomps over each other without warnings > -- > > Key: TS-2031 > URL: https://issues.apache.org/jira/browse/TS-2031 > Project: Traffic Server > Issue Type: Bug > Components: SSL >Reporter: Leif Hedstrom >Assignee: James Peach >Priority: Minor > Fix For: 5.1.0 > > Attachments: TS-2031.diff > > > If you have two certs that has the same CNs, the last one wins in the SNI > negotiation. This even takes precedence over "assigned" IPs (SNI trumps IP). > We should at least warn on this. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (TS-2031) Two SSL certs with overlapping CNs stomps over each other without warnings
[ https://issues.apache.org/jira/browse/TS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2031: --- Attachment: TS-2031.diff
[jira] [Commented] (TS-2031) Two SSL certs with overlapping CNs stomps over each other without warnings
[ https://issues.apache.org/jira/browse/TS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883683#comment-13883683 ] Feifei Cai commented on TS-2031: Hi James, xptr is used to control memory: we allocate and free the memory ourselves. The release() call just sets the stored pointer to NULL, which actually prevents the pointer from being freed, so I think there will be no double-free issue. https://github.com/apache/trafficserver/blob/master/lib/ts/ink_resource.h#L110 However, I reconsidered it and found that there is no need to use xptr here, because I did not do any memory allocation for found_entry. So I changed found_entry's type back to SSLEntry *, just like here: https://github.com/apache/trafficserver/blob/master/iocore/net/SSLCertLookup.cc#L269 I updated the attached patch; please help review it. Thank you!
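The release() behavior discussed above can be illustrated with a toy model (the real class is C++, in lib/ts/ink_resource.h; this Python `XPtr` is only a hypothetical stand-in). release() detaches ownership by clearing the stored pointer, so the "destructor" no longer frees it, hence no double free.

```python
class XPtr:
    def __init__(self, value):
        self.value = value
        self.freed = False  # stands in for "the free routine was called"

    def release(self):
        """Give up ownership: return the value and clear the stored pointer."""
        value, self.value = self.value, None
        return value

    def destroy(self):
        """Stands in for the C++ destructor: free only if still owned."""
        if self.value is not None:
            self.freed = True
            self.value = None
```

After release(), destroy() is a no-op, which models why releasing before handing the pointer elsewhere avoids a double free.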
[jira] [Updated] (TS-2031) Two SSL certs with overlapping CNs stomps over each other without warnings
[ https://issues.apache.org/jira/browse/TS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Cai updated TS-2031: --- Attachment: TS-2031.diff > Two SSL certs with overlapping CNs stomps over each other without warnings > -- > > Key: TS-2031 > URL: https://issues.apache.org/jira/browse/TS-2031 > Project: Traffic Server > Issue Type: Bug > Components: SSL >Reporter: Leif Hedstrom >Priority: Minor > Fix For: 5.1.0 > > Attachments: TS-2031.diff > > > If you have two certs that has the same CNs, the last one wins in the SNI > negotiation. This even takes precedence over "assigned" IPs (SNI trumps IP). > We should at least warn on this. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (TS-2031) Two SSL certs with overlapping CNs stomps over each other without warnings
[ https://issues.apache.org/jira/browse/TS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878448#comment-13878448 ] Feifei Cai commented on TS-2031: Hi Leif, I attached a small patch for this issue. iocore/net/SSLCertLookup.cc: I check the insert result, or look the hostname up in advance; if the same hostname is found, I add a warning to the log. iocore/net/SSLUtils.cc: Update a comment; we actually deal with wildcards now. Although this patch can warn on duplicate hostnames, I have a question about the logic in SSLCertLookup.cc. When we store the subject CN and the subjectAltName DNS entries, we use 2 data structures: 1. For hostnames without a wildcard (e.g. www.example.com), we insert into a hash table. https://github.com/apache/trafficserver/blob/master/lib/ts/ink_hash_table.cc#L312 2. For hostnames that contain a wildcard (e.g. *.example.com), we insert into a trie. https://github.com/apache/trafficserver/blob/master/lib/ts/Trie.h#L144 When we insert the same hostname multiple times, the insert logic differs: the hash table replaces any previous binding for the key, while the trie refuses to insert a duplicate value, prints a Debug message to the log file, and returns false. I think we should keep a uniform logic here, and I prefer the hash table's behavior. Because of these 2 different behaviors, my patch gives 2 different warnings, which may confuse our customers. If we keep a uniform logic, we can give a uniform warning. What's more, we could move the warnings out of SSLCertLookup.cc into SSLUtils.cc, so we can print warnings with certfile names instead of SSL_CTX pointers; then customers can see which certfile wins and which is overwritten. https://github.com/apache/trafficserver/blob/master/iocore/net/SSLUtils.cc#L617 We should also keep in mind that most certs include the CN among the subjectAltName DNS entries, e.g. CN: *.example.com with subjectAltName: DNS:*.example.com, DNS:www.example.com. So multiple insertions almost always occur when parsing certfiles with subjectAltName extensions, and we should not warn in that case. Would you please help confirm the insert logic? If you agree to keep a uniform logic, which one do you prefer? I'll propose another patch according to your advice; if not, please review the attached patch. Thank you!
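The two insert semantics described above can be sketched as follows. `HashStore` and `TrieStore` are hypothetical toy stand-ins for ATS's ink_hash_table and Trie; only the duplicate-key behavior is modeled, not the actual hashing or prefix matching.

```python
class HashStore:
    """Hash-table semantics: a later insert silently replaces the earlier one."""
    def __init__(self):
        self.table = {}

    def insert(self, hostname, ctx):
        self.table[hostname] = ctx  # any previous binding for the key is replaced
        return True


class TrieStore:
    """Trie semantics: a duplicate insert is rejected and returns false."""
    def __init__(self):
        self.entries = {}

    def insert(self, hostname, ctx):
        if hostname in self.entries:
            return False  # duplicate rejected, like the Debug-and-return-false path
        self.entries[hostname] = ctx
        return True
```

With two certs for the same hostname, the hash table ends up holding the second cert while the trie keeps the first, which is the inconsistency that makes the two warnings differ.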