[ https://issues.apache.org/jira/browse/TS-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dimitry Andric updated TS-2616: ------------------------------- Description: I encountered an origin server which sometimes sends multiple copies of the same header. Obviously, this origin server is buggy, but I cannot touch it, or modify it in any way. An example of such a request: {noformat} CLIENT> GET /example/request?token=12345678 HTTP/1.1 CLIENT> Host: 192.168.42.42:8080 CLIENT> Accept-Encoding: deflate, gzip CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 CLIENT> Accept: application/json CLIENT> Referer: http://example.com/ CLIENT> Client-ip: 127.0.0.1 CLIENT> X-Forwarded-For: 127.0.0.1 CLIENT> Via: http/1.1 trafficserver.localdomain[FF000000000000000000000000000011] (ApacheTrafficServer/4.0.2) CLIENT> SERVER> HTTP/1.1 200 OK SERVER> Access-Control-Allow-Origin: * SERVER> Access-Control-Allow-Headers : Content-Type SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE SERVER> Server: Apache-Coyote/1.1 SERVER> Transfer-Encoding: chunked SERVER> Content-Encoding: gzip SERVER> Vary: Accept-Encoding SERVER> Date: Tue, 04 Feb 2014 13:55:56 GMT SERVER> Transfer-Encoding: chunked SERVER> SERVER> 14 SERVER> .................... SERVER> 0 {noformat} When this request is initially passed through by TrafficServer, the client sees at least one "Transfer-Encoding: chunked" header, and assumes a chunked transfer. Everything works fine. After TrafficServer has cached the request, subsequent requests from the client will receive a non-chunked transfer. However, while TrafficServer has coalesced multiple headers into one, it removes only *one* instance of the "chunked" value: {noformat} CLIENT> GET http://192.168.42.42:8080/example/request?token=12345678 HTTP/1.1 CLIENT> Host: 192.168.42.42 CLIENT> Accept-Encoding: deflate, gzip CLIENT> Proxy-Connection: Keep-Alive CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 CLIENT> Accept: application/json CLIENT> Referer: http://example.com/ CLIENT> SERVER> HTTP/1.1 200 OK SERVER> Access-Control-Allow-Origin: * SERVER> Access-Control-Allow-Headers : Content-Type SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE SERVER> Server: ATS/4.0.2 SERVER> Content-Encoding: gzip SERVER> Vary: Accept-Encoding SERVER> Date: Tue, 04 Feb 2014 13:57:36 GMT SERVER> Transfer-Encoding: chunked SERVER> Age: 1546 SERVER> Content-Length: 20 SERVER> Proxy-Connection: keep-alive SERVER> SERVER> .................... {noformat} Now the client sees both a "Content-Length: 20" and "Transfer-Encoding: chunked" header, which should not happen. When e.g. curl is used as a client, it will assume the transfer is chunked, and it attempts to parse the initial bytes as a hexadecimal size line. Of course this fails, since there is no real chunked transfer, and curl will abort the transfer with an error, after reading approximately 256 kiB. The problem is due to a piece of code in HttpTransact::initialize_state_variables_from_response(), which scans the Transfer-Encoding: headers, coalesces them, and removes any "chunked" value from them. However, it does not handle multiple "chunked" values, as explicitly noticed in a comment on line 5795: {noformat} 5720 void 5721 HttpTransact::initialize_state_variables_from_response(State* s, HTTPHdr* incoming_response) 5722 { ... 5785 if (incoming_response->presence(MIME_PRESENCE_TRANSFER_ENCODING)) { 5786 MIMEField *field = incoming_response->field_find(MIME_FIELD_TRANSFER_ENCODING, MIME_LEN_TRANSFER_ENCODING); ... 5792 while (enc_value) { 5793 const char *wks_value = hdrtoken_string_to_wks(enc_value, enc_val_len); 5794 5795 // FIX ME: What is chunked appears more than once? Old 5796 // code didn't deal with this so I don't either 5797 if (wks_value == HTTP_VALUE_CHUNKED) { ... 5813 // Loop over the all the values in existing Trans-enc header and 5814 // copy the ones that aren't our chunked value to a new field 5815 while (new_enc_val) { 5816 if (new_enc_val != enc_value) { 5817 if (new_enc_field) { 5818 new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true); 5819 } else { 5820 new_enc_field = incoming_response->field_create(); 5821 incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len); 5822 } 5823 } 5824 5825 new_enc_val = new_enc_iter.get_next(&new_enc_len); 5826 } {noformat} The comparison on line 5816 compares the char pointers new_enc_val and enc_value, to pluck out any non-"chunked" values, while it should really compare the *contents* of the strings. Although it is probably better to just re-use hdrtoken_string_to_wks(), and compare the result with the special HTTP_VALUE_CHUNKED value, e.g.: {noformat} 5813 // Loop over the all the values in existing Trans-enc header and 5814 // copy the ones that aren't our chunked value to a new field 5815 while (new_enc_val) { 5816 const char *new_wks_val = hdrtoken_string_to_wks(new_enc_val, new_enc_len); 5817 if (new_wks_val != HTTP_VALUE_CHUNKED) { 5818 if (new_enc_field) { 5819 new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true); 5820 } else { 5821 new_enc_field = incoming_response->field_create(); 5822 incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len); 5823 } 5824 } 5825 5826 new_enc_val = new_enc_iter.get_next(&new_enc_len); 5827 } {noformat} After locally modifying my copy of TrafficServer 4.0.2 in the above fashion, I have successfully used it to sanitize the multiple Transfer-Encoding: headers from the origin server. Please review the patch I will attach to this issue, which modifies proxy/http/HttpTransact.cc as shown above. was: I encountered an origin server which sometimes sends multiple copies of the same header. Obviously, this origin server is buggy, but I cannot touch it, or modify it in any way. An example of such a request: {noformat} CLIENT> GET /example/request?token=12345678 HTTP/1.1 CLIENT> Host: 192.168.42.42:8080 CLIENT> Accept-Encoding: deflate, gzip CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 CLIENT> Accept: application/json CLIENT> Referer: http://example.com/ CLIENT> Client-ip: 127.0.0.1 CLIENT> X-Forwarded-For: 127.0.0.1 CLIENT> Via: http/1.1 trafficserver.localdomain[FF000000000000000000000000000011] (ApacheTrafficServer/4.0.2) CLIENT> SERVER> HTTP/1.1 200 OK SERVER> Access-Control-Allow-Origin: * SERVER> Access-Control-Allow-Headers : Content-Type SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE SERVER> Server: Apache-Coyote/1.1 SERVER> Transfer-Encoding: chunked SERVER> Content-Encoding: gzip SERVER> Vary: Accept-Encoding SERVER> Date: Tue, 04 Feb 2014 13:55:56 GMT SERVER> Transfer-Encoding: chunked SERVER> SERVER> 14 SERVER> .................... SERVER> 0 {noformat} When this request is initially passed through by TrafficServer, the client sees at least one "Transfer-Encoding: chunked" header, and assumes a chunked transfer. Everything works fine. After TrafficServer has cached the request, subsequent requests from the client will receive a non-chunked transfer. However, while TrafficServer has coalesced multiple headers into one, it removes only *one* instance of the "chunked" value: {noformat} CLIENT> GET http://192.168.42.42:8080/example/request?token=12345678 HTTP/1.1 CLIENT> Host: 192.168.42.42 CLIENT> Accept-Encoding: deflate, gzip CLIENT> Proxy-Connection: Keep-Alive CLIENT> User-Agent: User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 CLIENT> Accept: application/json CLIENT> Referer: http://example.com/ CLIENT> SERVER> HTTP/1.1 200 OK SERVER> Access-Control-Allow-Origin: * SERVER> Access-Control-Allow-Headers : Content-Type SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE SERVER> Server: ATS/4.0.2 SERVER> Content-Encoding: gzip SERVER> Vary: Accept-Encoding SERVER> Date: Tue, 04 Feb 2014 13:57:36 GMT SERVER> Transfer-Encoding: chunked SERVER> Age: 1546 SERVER> Content-Length: 20 SERVER> Proxy-Connection: keep-alive SERVER> SERVER> .................... {noformat} Now the client sees both a "Content-Length: 20" and "Transfer-Encoding: chunked" header, which should not happen. When e.g. curl is used as a client, it will assume the transfer is chunked, and it attempts to parse the initial bytes as a hexadecimal size line. Of course this fails, since there is no real chunked transfer, and curl will abort the transfer with an error, after reading approximately 256 kiB. The problem is due to a piece of code in HttpTransact::initialize_state_variables_from_response(), which scans the Transfer-Encoding: headers, coalesces them, and removes any "chunked" value from them. However, it does not handle multiple "chunked" values, as explicitly noticed in a comment on line 5795: {noformat} 5720 void 5721 HttpTransact::initialize_state_variables_from_response(State* s, HTTPHdr* incoming_response) 5722 { ... 5785 if (incoming_response->presence(MIME_PRESENCE_TRANSFER_ENCODING)) { 5786 MIMEField *field = incoming_response->field_find(MIME_FIELD_TRANSFER_ENCODING, MIME_LEN_TRANSFER_ENCODING); ... 5792 while (enc_value) { 5793 const char *wks_value = hdrtoken_string_to_wks(enc_value, enc_val_len); 5794 5795 // FIX ME: What is chunked appears more than once? Old 5796 // code didn't deal with this so I don't either 5797 if (wks_value == HTTP_VALUE_CHUNKED) { ... 5813 // Loop over the all the values in existing Trans-enc header and 5814 // copy the ones that aren't our chunked value to a new field 5815 while (new_enc_val) { 5816 if (new_enc_val != enc_value) { 5817 if (new_enc_field) { 5818 new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true); 5819 } else { 5820 new_enc_field = incoming_response->field_create(); 5821 incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len); 5822 } 5823 } 5824 5825 new_enc_val = new_enc_iter.get_next(&new_enc_len); 5826 } {noformat} The comparison on line 5816 compares the char pointers new_enc_val and enc_value, to pluck out any non-"chunked" values, while it should really compare the *contents* of the strings. Although it is probably better to just re-use hdrtoken_string_to_wks(), and compare the result with the special HTTP_VALUE_CHUNKED value, e.g.: {noformat} 5813 // Loop over the all the values in existing Trans-enc header and 5814 // copy the ones that aren't our chunked value to a new field 5815 while (new_enc_val) { 5816 const char *new_wks_val = hdrtoken_string_to_wks(new_enc_val, new_enc_len); 5817 if (new_wks_val != HTTP_VALUE_CHUNKED) { 5818 if (new_enc_field) { 5819 new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true); 5820 } else { 5821 new_enc_field = incoming_response->field_create(); 5822 incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len); 5823 } 5824 } 5825 5826 new_enc_val = new_enc_iter.get_next(&new_enc_len); 5827 } {noformat} After locally modifying my copy of TrafficServer 4.0.2 in the above fashion, I have successfully used it to sanitize the multiple Transfer-Encoding: headers from the origin server. Please review the patch I will attach to this issue, which modifies proxy/http/HttpTransact.cc as shown above. > Multiple Transfer-Encoding: chunked headers are not sanitized > ------------------------------------------------------------- > > Key: TS-2616 > URL: https://issues.apache.org/jira/browse/TS-2616 > Project: Traffic Server > Issue Type: Bug > Components: HTTP > Reporter: Dimitry Andric > Attachments: fix-ts-2616-1.diff > > > I encountered an origin server which sometimes sends multiple copies of the > same header. Obviously, this origin server is buggy, but I cannot touch it, > or modify it in any way. An example of such a request: > {noformat} > CLIENT> GET /example/request?token=12345678 HTTP/1.1 > CLIENT> Host: 192.168.42.42:8080 > CLIENT> Accept-Encoding: deflate, gzip > CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, > like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 > CLIENT> Accept: application/json > CLIENT> Referer: http://example.com/ > CLIENT> Client-ip: 127.0.0.1 > CLIENT> X-Forwarded-For: 127.0.0.1 > CLIENT> Via: http/1.1 > trafficserver.localdomain[FF000000000000000000000000000011] > (ApacheTrafficServer/4.0.2) > CLIENT> > SERVER> HTTP/1.1 200 OK > SERVER> Access-Control-Allow-Origin: * > SERVER> Access-Control-Allow-Headers : Content-Type > SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE > SERVER> Server: Apache-Coyote/1.1 > SERVER> Transfer-Encoding: chunked > SERVER> Content-Encoding: gzip > SERVER> Vary: Accept-Encoding > SERVER> Date: Tue, 04 Feb 2014 13:55:56 GMT > SERVER> Transfer-Encoding: chunked > SERVER> > SERVER> 14 > SERVER> .................... > SERVER> 0 > {noformat} > When this request is initially passed through by TrafficServer, the client > sees at least one "Transfer-Encoding: chunked" header, and assumes a chunked > transfer. Everything works fine. > After TrafficServer has cached the request, subsequent requests from the > client will receive a non-chunked transfer. However, while TrafficServer has > coalesced multiple headers into one, it removes only *one* instance of the > "chunked" value: > {noformat} > CLIENT> GET http://192.168.42.42:8080/example/request?token=12345678 HTTP/1.1 > CLIENT> Host: 192.168.42.42 > CLIENT> Accept-Encoding: deflate, gzip > CLIENT> Proxy-Connection: Keep-Alive > CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, > like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0 > CLIENT> Accept: application/json > CLIENT> Referer: http://example.com/ > CLIENT> > SERVER> HTTP/1.1 200 OK > SERVER> Access-Control-Allow-Origin: * > SERVER> Access-Control-Allow-Headers : Content-Type > SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE > SERVER> Server: ATS/4.0.2 > SERVER> Content-Encoding: gzip > SERVER> Vary: Accept-Encoding > SERVER> Date: Tue, 04 Feb 2014 13:57:36 GMT > SERVER> Transfer-Encoding: chunked > SERVER> Age: 1546 > SERVER> Content-Length: 20 > SERVER> Proxy-Connection: keep-alive > SERVER> > SERVER> .................... > {noformat} > Now the client sees both a "Content-Length: 20" and "Transfer-Encoding: > chunked" header, which should not happen. When e.g. curl is used as a > client, it will assume the transfer is chunked, and it attempts to parse the > initial bytes as a hexadecimal size line. Of course this fails, since there > is no real chunked transfer, and curl will abort the transfer with an error, > after reading approximately 256 kiB. > The problem is due to a piece of code in > HttpTransact::initialize_state_variables_from_response(), which scans the > Transfer-Encoding: headers, coalesces them, and removes any "chunked" value > from them. However, it does not handle multiple "chunked" values, as > explicitly noticed in a comment on line 5795: > {noformat} > 5720 void > 5721 HttpTransact::initialize_state_variables_from_response(State* s, > HTTPHdr* incoming_response) > 5722 { > ... > 5785 if (incoming_response->presence(MIME_PRESENCE_TRANSFER_ENCODING)) { > 5786 MIMEField *field = > incoming_response->field_find(MIME_FIELD_TRANSFER_ENCODING, > MIME_LEN_TRANSFER_ENCODING); > ... > 5792 while (enc_value) { > 5793 const char *wks_value = hdrtoken_string_to_wks(enc_value, > enc_val_len); > 5794 > 5795 // FIX ME: What is chunked appears more than once? Old > 5796 // code didn't deal with this so I don't either > 5797 if (wks_value == HTTP_VALUE_CHUNKED) { > ... > 5813 // Loop over the all the values in existing Trans-enc header > and > 5814 // copy the ones that aren't our chunked value to a new > field > 5815 while (new_enc_val) { > 5816 if (new_enc_val != enc_value) { > 5817 if (new_enc_field) { > 5818 new_enc_field->value_append(incoming_response->m_heap, > incoming_response->m_mime, new_enc_val, new_enc_len, true); > 5819 } else { > 5820 new_enc_field = incoming_response->field_create(); > 5821 incoming_response->field_value_set(new_enc_field, > new_enc_val, new_enc_len); > 5822 } > 5823 } > 5824 > 5825 new_enc_val = new_enc_iter.get_next(&new_enc_len); > 5826 } > {noformat} > The comparison on line 5816 compares the char pointers new_enc_val and > enc_value, to pluck out any non-"chunked" values, while it should really > compare the *contents* of the strings. Although it is probably better to > just re-use hdrtoken_string_to_wks(), and compare the result with the special > HTTP_VALUE_CHUNKED value, e.g.: > {noformat} > 5813 // Loop over the all the values in existing Trans-enc header > and > 5814 // copy the ones that aren't our chunked value to a new > field > 5815 while (new_enc_val) { > 5816 const char *new_wks_val = > hdrtoken_string_to_wks(new_enc_val, new_enc_len); > 5817 if (new_wks_val != HTTP_VALUE_CHUNKED) { > 5818 if (new_enc_field) { > 5819 new_enc_field->value_append(incoming_response->m_heap, > incoming_response->m_mime, new_enc_val, new_enc_len, true); > 5820 } else { > 5821 new_enc_field = incoming_response->field_create(); > 5822 incoming_response->field_value_set(new_enc_field, > new_enc_val, new_enc_len); > 5823 } > 5824 } > 5825 > 5826 new_enc_val = new_enc_iter.get_next(&new_enc_len); > 5827 } > {noformat} > After locally modifying my copy of TrafficServer 4.0.2 in the above fashion, > I have successfully used it to sanitize the multiple Transfer-Encoding: > headers from the origin server. > Please review the patch I will attach to this issue, which modifies > proxy/http/HttpTransact.cc as shown above. -- This message was sent by Atlassian JIRA (v6.2#6252)