On 16/03/2016 5:52 a.m., Alex Rousskov wrote:
> On 03/15/2016 09:36 AM, Amos Jeffries wrote:
>> Squid already contains AnyP::PROTO_UNKNOWN support for unknown protocols
>> but currently does not preserve the actual string value received for them.
>>
>> This adds a textual representation ('image') to the UriScheme object to
>> fill that gap and ensure that all URL representations (ie cache keys,
>> logs and outgoing messages) are generated with the scheme string as it
>> was received.
>
> Preserving textual representation is the right thing to do, but the
> implementation itself needs a lot more work IMO.
>
>
>> /// convert the URL scheme to that given
>> void setScheme(const AnyP::ProtocolType &p) {scheme_=p; touch();}
>> + void setSchemeImage(const char *str) {scheme_.setImage(str); touch();}
>
>
> The parameter type should be changed instead. Yes, that will require
> more changes, but those changes are essential to properly support
> foreign protocol names. Without those changes, you are creating an API
> bug in addition to adding a useful feature.
>
> We (mostly you) have already done similar work for foreign HTTP request
> methods. This is similar -- UriScheme should be used nearly everywhere
> (instead of ProtocolType that does not carry enough information). It
> could be argued that the UriScheme class itself should be renamed to
> something like Protocol, but I do not want to argue about that.
>

There are two distinct layers involved. I'm sure you know this, but I'll
reiterate for the sake of clarity.

 - ProtocolType is the transfer or transport protocol type.
 - UriScheme is the URL/URI/URN scheme label.

They have been conflated in the past and are still in the difficult
process of de-tangling. The URI scheme implies a protocol, but is not
one in and of itself. For example HTTP ProtocolType can be used to
transfer ftp:// scheme URLs, etc.

Untangling them is happening, but not part of the scope of this patch
except to the point that the URL image() is being made to no longer
generate directly from the ProtocolType value in most cases.

>
>> + /// Sets the string representation of this scheme.
>> + /// Only needed if the scheme type is PROTO_UNKNOWN.
>> + void setImage(const char *str) {assert(theScheme_ ==
>> AnyP::PROTO_UNKNOWN); schemeImage_ = str;}
>
> This method should be a constructor.

Done.

Note that this has side effects on HttpRequest construction and requires
move semantics to be added to UriScheme. Not a problem exactly, but
scope creep.

>
>> char const *
>> AnyP::UriScheme::c_str() const
>> {
>
> This method should be replaced with AnyP::UriScheme::image() returning SBuf.
>

Done.

>
>> + if (!schemeImage_.isEmpty())
>> + return schemeImage_.c_str();
>> +
>
> Why are we not lowering the case of a foreign protocol image if we are
> lowering the case of standard protocol images below? Either both should
> be lowered or there should be a source code comment explaining the
> discrepancy.

Transfer Protocol type and Scheme labels are both case-sensitive (and
differently cased). So when using an externally supplied scheme label we
must take what was given.

But, since our registered ProtocolType and its ProtocolType_str is
actually the message transfer protocol label (upper case) we need to
lower the case of ProtocolType_str when using it as implicit URI scheme
for one of the registered/known scheme names.
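
To make that rule concrete, here is a minimal standalone sketch. It is
not Squid code: the enum, the label table and the UriSchemeSketch class
are hypothetical stand-ins for AnyP::ProtocolType, ProtocolType_str and
AnyP::UriScheme, and std::string stands in for SBuf. It only shows the
intended behaviour, assuming the image is fixed once at construction:
an explicitly supplied label is kept verbatim, and a registered
protocol label is down-cased.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-ins for AnyP::ProtocolType and AnyP::ProtocolType_str.
enum ProtocolType { PROTO_NONE = 0, PROTO_HTTP, PROTO_FTP, PROTO_UNKNOWN, PROTO_MAX };
static const char *ProtocolType_str[] = { "NONE", "HTTP", "FTP", "UNKNOWN", "MAX" };

// Illustrative only; the real class is AnyP::UriScheme and stores an SBuf.
class UriSchemeSketch
{
public:
    explicit UriSchemeSketch(ProtocolType type, const char *img = nullptr) : type_(type) {
        if (img) {
            // externally supplied label: take exactly what was given
            image_ = img;
        } else if (type_ == PROTO_UNKNOWN) {
            // unknown scheme and no label provided
            image_ = "(unknown)";
        } else if (type_ > PROTO_NONE && type_ < PROTO_MAX) {
            // registered protocol labels are upper case; down-case once, here
            image_ = ProtocolType_str[type_];
            std::transform(image_.begin(), image_.end(), image_.begin(),
                           [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        }
        // else: PROTO_NONE, the image stays an empty string
    }

    const std::string &image() const { return image_; }

private:
    ProtocolType type_;
    std::string image_;
};

// Small self-check of the two cases discussed above.
static void demo()
{
    assert(UriSchemeSketch(PROTO_UNKNOWN, "Foo+Bar").image() == "Foo+Bar");
    assert(UriSchemeSketch(PROTO_HTTP).image() == "http");
}
```

Calling demo() exercises both paths. Since the down-casing happens in
the constructor, later image() calls just return the stored string.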
>
> If both should be lowered, the lowering should be done in the
> constructor to avoid lowering the same string multiple times.
>

Trunk is what lowers the string multiple times (on each display of the
URL, including debugs). These patches make that happen only once and
store the lowered result in the scheme image_ to re-use it for any
followup calls after the initial down-case.

Initial patch did it on first display. After the ctor changes it now
happens once on construction regardless of whether display ever happens.

>
>> if (theScheme_ > AnyP::PROTO_NONE && theScheme_ < AnyP::PROTO_MAX) {
>> - const char *in = AnyP::ProtocolType_str[theScheme_];
>> - for (; p < (BUFSIZ-1) && in[p] != '\0'; ++p)
>> - out[p] = xtolower(in[p]);
>> + schemeImage_ = AnyP::ProtocolType_str[theScheme_];
>> + schemeImage_.toLower();
>> }
>
> This slow code may not be needed at all if the rest of the changes are
> implemented. If this code stays, please add an "XXX: Slow." or a similar
> comment after toLower(). There are many ways to fix/optimize this, but
> the best way probably depends on the rest of the changes so I am not
> going to suggest any specific TODO here.

AFAICT the best optimization we can do is just take the image string
from the urlParse logic. Which is what this patch is doing now after the
ctor changes. The slow logic path is now only done for URI which are
generated internally by Squid from just a ProtocolType enum.

>
>> @@ -157,6 +158,9 @@
>> if (strncasecmp(b, "whois", len) == 0)
>> return AnyP::PROTO_WHOIS;
>>
>> + if (len > 0)
>> + return AnyP::PROTO_UNKNOWN;
>> +
>> return AnyP::PROTO_NONE;
>> }
>
> This function should return a UriScheme object instead. Ideally, using
> created-once constants for common protocols (so that we do not have to
> deal with to-lower-case conversion every time a scheme image is needed).

This function is used to determine what outgoing ProtocolType module
should be used when forwarding the message upstream.
In an ideal world this and the parse itself would be methods of the
URL object. But that is a much bigger change and I intend to do it in a
later patch, not this one.

For now returning PROTO_UNKNOWN instead of PROTO_NONE fixes the input
validation that follows this function in the parsing sequence.

By simply returning PROTO_UNKNOWN here on this len>0 condition we do
trigger the external code path which can copy the scheme name from the
parse and avoid all that to-lower business.

>
>> @@ -27,7 +28,7 @@
>> ~UriScheme() {}
> ...
>> + /// the string representation to use for theScheme_
>> + mutable SBuf schemeImage_;
>
> Please do not name Foo members FooX. One Foo is enough!
>

Done.

>
>> + /// the string representation to use for theScheme_
>
> s/to use for theScheme_//
> or
> s/to use for theScheme_/of the scheme//
>

Done.

> If you can make such a guarantee after all other changes, please add
> something like "; always in lower case".

Can't. It is what it is - case sensitive. See above.

>
>
>> + mutable SBuf schemeImage_;
>
> Mutable may not be needed after all other changes.
>

Yes. Gone.

>
> It is difficult for me to review 4-line no-function-names diff, so
> forgive me if I missed any context-dependent caveats.
>

Sorry. This one has 20 lines of context. Can you remind me what the
.bazaar.conf setting was to make this the default? I lost the old
bazaar.conf with a HDD earlier this year.

Amos

=== modified file 'src/HttpRequest.cc' --- src/HttpRequest.cc 2016-03-25 20:11:29 +0000 +++ src/HttpRequest.cc 2016-04-15 18:04:33 +0000 @@ -26,60 +26,60 @@ #include "log/Config.h" #include "MemBuf.h" #include "sbuf/StringConvert.h" #include "SquidConfig.h" #include "Store.h" #include "URL.h" #if USE_AUTH #include "auth/UserRequest.h" #endif #if ICAP_CLIENT #include "adaptation/icap/icap_log.h" #endif HttpRequest::HttpRequest() : HttpMsg(hoRequest) { init(); } -HttpRequest::HttpRequest(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aUrlpath) : +HttpRequest::HttpRequest(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aSchemeImg, const char *aUrlpath) : HttpMsg(hoRequest) { static unsigned int id = 1; debugs(93,7, HERE << "constructed, this=" << this << " id=" << ++id); init(); - initHTTP(aMethod, aProtocol, aUrlpath); + initHTTP(aMethod, aProtocol, aSchemeImg, aUrlpath); } HttpRequest::~HttpRequest() { clean(); debugs(93,7, HERE << "destructed, this=" << this); } void -HttpRequest::initHTTP(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aUrlpath) +HttpRequest::initHTTP(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aSchemeImg, const char *aUrlpath) { method = aMethod; - url.setScheme(aProtocol); + url.setScheme(aProtocol, aSchemeImg); url.path(aUrlpath); } void HttpRequest::init() { method = Http::METHOD_NONE; url.clear(); #if USE_AUTH auth_user_request = NULL; #endif memset(&flags, '\0', sizeof(flags)); range = NULL; ims = -1; imslen = 0; lastmod = -1; client_addr.setEmpty(); my_addr.setEmpty(); body_pipe = NULL; // hier @@ -162,45 +162,41 @@ void HttpRequest::reset() { clean(); init(); } HttpRequest * HttpRequest::clone() const { HttpRequest *copy = new HttpRequest(); copy->method = method; // TODO: move common cloning clone to Msg::copyTo() or copy ctor copy->header.append(&header); copy->hdrCacheInit(); copy->hdr_sz = hdr_sz; copy->http_ver = 
http_ver; copy->pstate = pstate; // TODO: should we assert a specific state here? copy->body_pipe = body_pipe; - copy->url.setScheme(url.getScheme()); - copy->url.userInfo(url.userInfo()); - copy->url.host(url.host()); - copy->url.port(url.port()); - copy->url.path(url.path()); + copy->url = url; // range handled in hdrCacheInit() copy->ims = ims; copy->imslen = imslen; copy->hier = hier; // Is it safe to copy? Should we? copy->errType = errType; // XXX: what to do with copy->peer_login? copy->lastmod = lastmod; copy->etag = etag; copy->vary_headers = vary_headers; // XXX: what to do with copy->peer_domain? copy->tag = tag; copy->extacl_log = extacl_log; copy->extacl_message = extacl_message; const bool inheritWorked = copy->inheritProperties(this); === modified file 'src/HttpRequest.h' --- src/HttpRequest.h 2016-03-25 20:11:29 +0000 +++ src/HttpRequest.h 2016-04-15 18:04:34 +0000 @@ -31,45 +31,45 @@ #if USE_SQUID_EUI #include "eui/Eui48.h" #include "eui/Eui64.h" #endif class ConnStateData; /* Http Request */ void httpRequestPack(void *obj, Packable *p); class HttpHdrRange; class HttpRequest: public HttpMsg { MEMPROXY_CLASS(HttpRequest); public: typedef RefCount<HttpRequest> Pointer; HttpRequest(); - HttpRequest(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aUrlpath); + HttpRequest(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *schemeImage, const char *aUrlpath); ~HttpRequest(); virtual void reset(); - void initHTTP(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *aUrlpath); + void initHTTP(const HttpRequestMethod& aMethod, AnyP::ProtocolType aProtocol, const char *schemeImage, const char *aUrlpath); virtual HttpRequest *clone() const; /// Whether response to this request is potentially cachable /// \retval false Not cacheable. /// \retval true Possibly cacheable. Response factors will determine. 
bool maybeCacheable(); bool conditional() const; ///< has at least one recognized If-* header /// whether the client is likely to be able to handle a 1xx reply bool canHandle1xx() const; #if USE_ADAPTATION /// Returns possibly nil history, creating it if adapt. logging is enabled Adaptation::History::Pointer adaptLogHistory() const; /// Returns possibly nil history, creating it if requested Adaptation::History::Pointer adaptHistory(bool createIfNone = false) const; /// Makes their history ours, throwing on conflicts void adaptHistoryImport(const HttpRequest &them); === modified file 'src/PeerPoolMgr.cc' --- src/PeerPoolMgr.cc 2016-02-02 15:39:23 +0000 +++ src/PeerPoolMgr.cc 2016-06-15 11:32:59 +0000 @@ -45,41 +45,41 @@ request(), opener(), securer(), closer(), addrUsed(0) { } PeerPoolMgr::~PeerPoolMgr() { cbdataReferenceDone(peer); } void PeerPoolMgr::start() { AsyncJob::start(); // ErrorState, getOutgoingAddress(), and other APIs may require a request. // We fake one. TODO: Optionally send this request to peers? 
- request = new HttpRequest(Http::METHOD_OPTIONS, AnyP::PROTO_HTTP, "*"); + request = new HttpRequest(Http::METHOD_OPTIONS, AnyP::PROTO_HTTP, "http", "*"); request->url.host(peer->host); checkpoint("peer initialized"); } void PeerPoolMgr::swanSong() { AsyncJob::swanSong(); } bool PeerPoolMgr::validPeer() const { return peer && cbdataReferenceValid(peer) && peer->standby.pool; } bool PeerPoolMgr::doneAll() const { === modified file 'src/URL.h' --- src/URL.h 2016-02-23 08:51:22 +0000 +++ src/URL.h 2016-06-15 11:36:47 +0000 @@ -11,55 +11,69 @@ #include "anyp/UriScheme.h" #include "ip/Address.h" #include "rfc2181.h" #include "sbuf/SBuf.h" #include <iosfwd> /** * The URL class represents a Uniform Resource Location * * Governed by RFC 3986 */ class URL { MEMPROXY_CLASS(URL); public: URL() : hostIsNumeric_(false), port_(0) {*host_=0;} URL(AnyP::UriScheme const &aScheme); + URL(const URL &other) { + this->operator =(other); + } + URL &operator =(const URL &o) { + scheme_ = o.scheme_; + userInfo_ = o.userInfo_; + memcpy(host_, o.host_, sizeof(host_)); + hostIsNumeric_ = o.hostIsNumeric_; + hostAddr_ = o.hostAddr_; + port_ = o.port_; + path_ = o.path_; + touch(); + return *this; + } void clear() { scheme_=AnyP::PROTO_NONE; hostIsNumeric_ = false; *host_ = 0; hostAddr_.setEmpty(); port_ = 0; touch(); } void touch(); ///< clear the cached URI display forms AnyP::UriScheme const & getScheme() const {return scheme_;} /// convert the URL scheme to that given - void setScheme(const AnyP::ProtocolType &p) {scheme_=p; touch();} + void setScheme(const AnyP::ProtocolType &p, const char *str) {scheme_ = AnyP::UriScheme(p, str); touch();} void userInfo(const SBuf &s) {userInfo_=s; touch();} const SBuf &userInfo() const {return userInfo_;} void host(const char *src); const char *host(void) const {return host_;} int hostIsNumeric(void) const {return hostIsNumeric_;} Ip::Address const & hostIP(void) const {return hostAddr_;} void port(unsigned short p) {port_=p; touch();} unsigned short 
port() const {return port_;} void path(const char *p) {path_=p; touch();} void path(const SBuf &p) {path_=p; touch();} const SBuf &path() const; /// the static '/' default URL-path static const SBuf &SlashPath(); /// the static '*' pseudo-URL @@ -114,43 +128,51 @@ // XXX: uses char[] instead of SBUf to reduce performance regressions // from c_str() since most code using this is not yet using SBuf char host_[SQUIDHOSTNAMELEN]; ///< string representation of the URI authority name or IP bool hostIsNumeric_; ///< whether the authority 'host' is a raw-IP Ip::Address hostAddr_; ///< binary representation of the URI authority if it is a raw-IP unsigned short port_; ///< URL port // XXX: for now includes query-string. SBuf path_; ///< URL path segment // pre-assembled URL forms mutable SBuf authorityHttp_; ///< RFC 7230 section 5.3.3 authority, maybe without default-port mutable SBuf authorityWithPort_; ///< RFC 7230 section 5.3.3 authority with explicit port mutable SBuf absolute_; ///< RFC 7230 section 5.3.2 absolute-URI }; inline std::ostream & operator <<(std::ostream &os, const URL &url) { - if (const char *sc = url.getScheme().c_str()) - os << sc << ":"; - os << "//" << url.authority() << url.path(); + // none means explicit empty string for scheme. 
+ if (url.getScheme() != AnyP::PROTO_NONE) + os << url.getScheme().image(); + os << ":"; + + // no authority section on URN + if (url.getScheme() != AnyP::PROTO_URN) + os << "//" << url.authority(); + + // path is what it is - including absent + os << url.path(); return os; } class HttpRequest; class HttpRequestMethod; void urlInitialize(void); HttpRequest *urlParse(const HttpRequestMethod&, char *, HttpRequest *request = NULL); char *urlCanonicalClean(const HttpRequest *); const char *urlCanonicalFakeHttps(const HttpRequest * request); bool urlIsRelative(const char *); char *urlMakeAbsolute(const HttpRequest *, const char *); char *urlRInternal(const char *host, unsigned short port, const char *dir, const char *name); char *urlInternal(const char *dir, const char *name); /** * matchDomainName() compares a hostname (usually extracted from traffic) * with a domainname (usually from an ACL) according to the following rules: * * HOST | DOMAIN | MATCH? === modified file 'src/anyp/UriScheme.cc' --- src/anyp/UriScheme.cc 2016-01-01 00:12:18 +0000 +++ src/anyp/UriScheme.cc 2016-04-15 18:09:43 +0000 @@ -1,49 +1,51 @@ /* * Copyright (C) 1996-2016 The Squid Software Foundation and contributors * * Squid software is distributed under GPLv2+ license and includes * contributions from numerous individuals and organizations. * Please see the COPYING and CONTRIBUTORS files for details. 
*/ /* DEBUG: section 23 URL Scheme parsing */ #include "squid.h" #include "anyp/UriScheme.h" -char const * -AnyP::UriScheme::c_str() const +AnyP::UriScheme::UriScheme(AnyP::ProtocolType const aScheme, const char *img) : + theScheme_(aScheme) { - if (theScheme_ == AnyP::PROTO_UNKNOWN) - return "(unknown)"; - - static char out[BUFSIZ]; - int p = 0; - - if (theScheme_ > AnyP::PROTO_NONE && theScheme_ < AnyP::PROTO_MAX) { - const char *in = AnyP::ProtocolType_str[theScheme_]; - for (; p < (BUFSIZ-1) && in[p] != '\0'; ++p) - out[p] = xtolower(in[p]); + if (img) + // image could be provided explicitly (case-sensitive) + image_ = img; + + else if (theScheme_ == AnyP::PROTO_UNKNOWN) + // image could be actually unknown and not provided + image_ = "(unknown)"; + + else if (theScheme_ > AnyP::PROTO_NONE && theScheme_ < AnyP::PROTO_MAX) { + // image could be implied by a registered transfer protocol + // which use upper-case labels, so down-case for scheme image + image_ = AnyP::ProtocolType_str[theScheme_]; + image_.toLower(); } - out[p] = '\0'; - return out; + // else, image is an empty string ("://example.com/") } unsigned short AnyP::UriScheme::defaultPort() const { switch (theScheme_) { case AnyP::PROTO_HTTP: return 80; case AnyP::PROTO_HTTPS: return 443; case AnyP::PROTO_FTP: return 21; case AnyP::PROTO_COAP: case AnyP::PROTO_COAPS: // coaps:// default is TBA as of draft-ietf-core-coap-08. // Assuming IANA policy of allocating same port for base and TLS protocol versions will occur. === modified file 'src/anyp/UriScheme.h' --- src/anyp/UriScheme.h 2016-01-01 00:12:18 +0000 +++ src/anyp/UriScheme.h 2016-06-15 11:06:36 +0000 @@ -1,59 +1,68 @@ /* * Copyright (C) 1996-2016 The Squid Software Foundation and contributors * * Squid software is distributed under GPLv2+ license and includes * contributions from numerous individuals and organizations. * Please see the COPYING and CONTRIBUTORS files for details. 
*/ #ifndef SQUID_ANYP_URISCHEME_H #define SQUID_ANYP_URISCHEME_H #include "anyp/ProtocolType.h" +#include "sbuf/SBuf.h" #include <iosfwd> namespace AnyP { /** This class represents a URI Scheme such as http:// https://, wais://, urn: etc. * It does not represent the PROTOCOL that such schemes refer to. */ class UriScheme { public: UriScheme() : theScheme_(AnyP::PROTO_NONE) {} - UriScheme(AnyP::ProtocolType const aScheme) : theScheme_(aScheme) {} + UriScheme(AnyP::ProtocolType const aScheme, const char *img = nullptr); + UriScheme(const AnyP::UriScheme &o) : theScheme_(o.theScheme_), image_(o.image_) {} + UriScheme(AnyP::UriScheme &&) = default; ~UriScheme() {} - operator AnyP::ProtocolType() const { return theScheme_; } + AnyP::UriScheme& operator=(const AnyP::UriScheme &o) { + theScheme_ = o.theScheme_; + image_ = o.image_; + return *this; + } + AnyP::UriScheme& operator=(AnyP::UriScheme &&) = default; + operator AnyP::ProtocolType() const { return theScheme_; } + // XXX: does not account for comparison of unknown schemes (by image) bool operator != (AnyP::ProtocolType const & aProtocol) const { return theScheme_ != aProtocol; } /** Get a char string representation of the scheme. - * Does not include the ':' or '://" terminators. - * - * An upper bound length of BUFSIZ bytes converted. Remainder will be truncated. - * The result of this call will remain usable only until any subsequest call - * and must be copied if persistence is needed. + * Does not include the ':' or "://" terminators. */ - char const *c_str() const; + SBuf image() const {return image_;} unsigned short defaultPort() const; private: /// This is a typecode pointer into the enum/registry of protocols handled. 
AnyP::ProtocolType theScheme_; + + /// the string representation + SBuf image_; }; } // namespace AnyP inline std::ostream & operator << (std::ostream &os, AnyP::UriScheme const &scheme) { - os << scheme.c_str(); + os << scheme.image(); return os; } #endif /* SQUID_ANYP_URISCHEME_H */ === modified file 'src/cache_cf.cc' --- src/cache_cf.cc 2016-04-03 23:41:58 +0000 +++ src/cache_cf.cc 2016-04-15 18:12:34 +0000 @@ -3329,41 +3329,41 @@ check_null_IpAddress_list(const Ip::Address_list * s) { return NULL == s; } #endif /* CURRENTLY_UNUSED */ #endif /* USE_WCCPv2 */ static void parsePortSpecification(const AnyP::PortCfgPointer &s, char *token) { char *host = NULL; unsigned short port = 0; char *t = NULL; char *junk = NULL; s->disable_pmtu_discovery = DISABLE_PMTU_OFF; s->name = xstrdup(token); s->connection_auth_disabled = false; - const char *portType = AnyP::UriScheme(s->transport.protocol).c_str(); + const SBuf &portType = AnyP::UriScheme(s->transport.protocol).image(); if (*token == '[') { /* [ipv6]:port */ host = token + 1; t = strchr(host, ']'); if (!t) { debugs(3, DBG_CRITICAL, "FATAL: " << portType << "_port: missing ']' on IPv6 address: " << token); self_destruct(); } *t = '\0'; ++t; if (*t != ':') { debugs(3, DBG_CRITICAL, "FATAL: " << portType << "_port: missing Port in: " << token); self_destruct(); } if (!Ip::EnableIpv6) { debugs(3, DBG_CRITICAL, "FATAL: " << portType << "_port: IPv6 is not available."); self_destruct(); } port = xatos(t + 1); @@ -3705,41 +3705,41 @@ debugs(3,DBG_CRITICAL, "FATAL: https_port: require-proxy-header option is not supported on HTTPS ports."); self_destruct(); } } else if (protoName.cmp("FTP") == 0) { /* ftp_port does not support ssl-bump */ if (s->flags.tunnelSslBumping) { debugs(3, DBG_CRITICAL, "FATAL: ssl-bump is not supported for ftp_port."); self_destruct(); } if (s->flags.proxySurrogate) { // Passive FTP data channel does not work without deep protocol inspection in the frontend. 
debugs(3,DBG_CRITICAL, "FATAL: require-proxy-header option is not supported on ftp_port."); self_destruct(); } } if (Ip::EnableIpv6&IPV6_SPECIAL_SPLITSTACK && s->s.isAnyAddr()) { // clone the port options from *s to *(s->next) s->next = s->clone(); s->next->s.setIPv4(); - debugs(3, 3, AnyP::UriScheme(s->transport.protocol).c_str() << "_port: clone wildcard address for split-stack: " << s->s << " and " << s->next->s); + debugs(3, 3, AnyP::UriScheme(s->transport.protocol).image() << "_port: clone wildcard address for split-stack: " << s->s << " and " << s->next->s); } while (*head != NULL) head = &((*head)->next); *head = s; } static void dump_generic_port(StoreEntry * e, const char *n, const AnyP::PortCfgPointer &s) { char buf[MAX_IPSTRLEN]; storeAppendPrintf(e, "%s %s", n, s->s.toUrl(buf,MAX_IPSTRLEN)); // MODES and specific sub-options. if (s->flags.natIntercept) storeAppendPrintf(e, " intercept"); @@ -3749,41 +3749,41 @@ else if (s->flags.proxySurrogate) storeAppendPrintf(e, " require-proxy-header"); else if (s->flags.accelSurrogate) { storeAppendPrintf(e, " accel"); if (s->vhost) storeAppendPrintf(e, " vhost"); if (s->vport < 0) storeAppendPrintf(e, " vport"); else if (s->vport > 0) storeAppendPrintf(e, " vport=%d", s->vport); if (s->defaultsite) storeAppendPrintf(e, " defaultsite=%s", s->defaultsite); // TODO: compare against prefix of 'n' instead of assuming http_port if (s->transport.protocol != AnyP::PROTO_HTTP) - storeAppendPrintf(e, " protocol=%s", AnyP::UriScheme(s->transport.protocol).c_str()); + storeAppendPrintf(e, " protocol=%s", AnyP::ProtocolType_str[s->transport.protocol]); if (s->allow_direct) storeAppendPrintf(e, " allow-direct"); if (s->ignore_cc) storeAppendPrintf(e, " ignore-cc"); } // Generic independent options if (s->name) storeAppendPrintf(e, " name=%s", s->name); #if USE_HTTP_VIOLATIONS if (!s->flags.accelSurrogate && s->ignore_cc) storeAppendPrintf(e, " ignore-cc"); #endif if (s->connection_auth_disabled) === modified file 'src/carp.cc' 
--- src/carp.cc 2016-01-01 00:12:18 +0000 +++ src/carp.cc 2016-04-15 18:12:37 +0000 @@ -150,41 +150,41 @@ CachePeer *tp; unsigned int user_hash = 0; unsigned int combined_hash; double score; double high_score = 0; if (n_carp_peers == 0) return NULL; /* calculate hash key */ debugs(39, 2, "carpSelectParent: Calculating hash for " << request->effectiveRequestUri()); /* select CachePeer */ for (k = 0; k < n_carp_peers; ++k) { SBuf key; tp = carp_peers[k]; if (tp->options.carp_key.set) { // this code follows URI syntax pattern. // corner cases should use the full effective request URI if (tp->options.carp_key.scheme) { - key.append(request->url.getScheme().c_str()); + key.append(request->url.getScheme().image()); if (key.length()) //if the scheme is not empty key.append("://"); } if (tp->options.carp_key.host) { key.append(request->url.host()); } if (tp->options.carp_key.port) { key.appendf(":%u", request->url.port()); } if (tp->options.carp_key.path) { // XXX: fix when path and query are separate key.append(request->url.path().substr(0,request->url.path().find('?'))); // 0..N } if (tp->options.carp_key.params) { // XXX: fix when path and query are separate SBuf::size_type pos; if ((pos=request->url.path().find('?')) != SBuf::npos) key.append(request->url.path().substr(pos)); // N..npos } } === modified file 'src/client_side.cc' --- src/client_side.cc 2016-05-20 13:20:27 +0000 +++ src/client_side.cc 2016-06-15 11:32:08 +0000 @@ -1202,92 +1202,96 @@ const bool switchedToHttps = conn->switchedToHttps(); const bool tryHostHeader = vhost || switchedToHttps; char *host = NULL; if (tryHostHeader && (host = hp->getHeaderField("Host"))) { debugs(33, 5, "ACCEL VHOST REWRITE: vhost=" << host << " + vport=" << vport); char thost[256]; if (vport > 0) { thost[0] = '\0'; char *t = NULL; if (host[strlen(host)] != ']' && (t = strrchr(host,':')) != NULL) { strncpy(thost, host, (t-host)); snprintf(thost+(t-host), sizeof(thost)-(t-host), ":%d", vport); host = thost; } else if (!t) { 
snprintf(thost, sizeof(thost), "%s:%d",host, vport); host = thost; } } // else nothing to alter port-wise. const int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen + strlen(host); http->uri = (char *)xcalloc(url_sz, 1); - snprintf(http->uri, url_sz, "%s://%s" SQUIDSBUFPH, AnyP::UriScheme(conn->transferProtocol.protocol).c_str(), host, SQUIDSBUFPRINT(url)); + const SBuf &scheme = AnyP::UriScheme(conn->transferProtocol.protocol).image(); + snprintf(http->uri, url_sz, SQUIDSBUFPH "://%s" SQUIDSBUFPH, SQUIDSBUFPRINT(scheme), host, SQUIDSBUFPRINT(url)); debugs(33, 5, "ACCEL VHOST REWRITE: " << http->uri); } else if (conn->port->defaultsite /* && !vhost */) { debugs(33, 5, "ACCEL DEFAULTSITE REWRITE: defaultsite=" << conn->port->defaultsite << " + vport=" << vport); const int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen + strlen(conn->port->defaultsite); http->uri = (char *)xcalloc(url_sz, 1); char vportStr[32]; vportStr[0] = '\0'; if (vport > 0) { snprintf(vportStr, sizeof(vportStr),":%d",vport); } - snprintf(http->uri, url_sz, "%s://%s%s" SQUIDSBUFPH, - AnyP::UriScheme(conn->transferProtocol.protocol).c_str(), conn->port->defaultsite, vportStr, SQUIDSBUFPRINT(url)); + const SBuf &scheme = AnyP::UriScheme(conn->transferProtocol.protocol).image(); + snprintf(http->uri, url_sz, SQUIDSBUFPH "://%s%s" SQUIDSBUFPH, + SQUIDSBUFPRINT(scheme), conn->port->defaultsite, vportStr, SQUIDSBUFPRINT(url)); debugs(33, 5, "ACCEL DEFAULTSITE REWRITE: " << http->uri); } else if (vport > 0 /* && (!vhost || no Host:) */) { debugs(33, 5, "ACCEL VPORT REWRITE: *_port IP + vport=" << vport); /* Put the local socket IP address as the hostname, with whatever vport we found */ const int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen; http->uri = (char *)xcalloc(url_sz, 1); http->getConn()->clientConnection->local.toHostStr(ipbuf,MAX_IPSTRLEN); - snprintf(http->uri, url_sz, "%s://%s:%d" SQUIDSBUFPH, - 
AnyP::UriScheme(conn->transferProtocol.protocol).c_str(), - ipbuf, vport, SQUIDSBUFPRINT(url)); + const SBuf &scheme = AnyP::UriScheme(conn->transferProtocol.protocol).image(); + snprintf(http->uri, url_sz, SQUIDSBUFPH "://%s:%d" SQUIDSBUFPH, + SQUIDSBUFPRINT(scheme), ipbuf, vport, SQUIDSBUFPRINT(url)); debugs(33, 5, "ACCEL VPORT REWRITE: " << http->uri); } } static void prepareTransparentURL(ConnStateData * conn, ClientHttpRequest *http, const Http1::RequestParserPointer &hp) { // TODO Must() on URI !empty when the parser supports throw. For now avoid assert(). if (!hp->requestUri().isEmpty() && hp->requestUri()[0] != '/') return; /* already in good shape */ /* BUG: Squid cannot deal with '*' URLs (RFC2616 5.1.2) */ if (const char *host = hp->getHeaderField("Host")) { const int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen + strlen(host); http->uri = (char *)xcalloc(url_sz, 1); - snprintf(http->uri, url_sz, "%s://%s" SQUIDSBUFPH, - AnyP::UriScheme(conn->transferProtocol.protocol).c_str(), host, SQUIDSBUFPRINT(hp->requestUri())); + const SBuf &scheme = AnyP::UriScheme(conn->transferProtocol.protocol).image(); + snprintf(http->uri, url_sz, SQUIDSBUFPH "://%s" SQUIDSBUFPH, + SQUIDSBUFPRINT(scheme), host, SQUIDSBUFPRINT(hp->requestUri())); debugs(33, 5, "TRANSPARENT HOST REWRITE: " << http->uri); } else { /* Put the local socket IP address as the hostname. 
*/ const int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen; http->uri = (char *)xcalloc(url_sz, 1); static char ipbuf[MAX_IPSTRLEN]; http->getConn()->clientConnection->local.toHostStr(ipbuf,MAX_IPSTRLEN); - snprintf(http->uri, url_sz, "%s://%s:%d" SQUIDSBUFPH, - AnyP::UriScheme(http->getConn()->transferProtocol.protocol).c_str(), + const SBuf &scheme = AnyP::UriScheme(http->getConn()->transferProtocol.protocol).image(); + snprintf(http->uri, url_sz, SQUIDSBUFPH "://%s:%d" SQUIDSBUFPH, + SQUIDSBUFPRINT(scheme), ipbuf, http->getConn()->clientConnection->local.port(), SQUIDSBUFPRINT(hp->requestUri())); debugs(33, 5, "TRANSPARENT REWRITE: " << http->uri); } } /** Parse an HTTP request * * \note Sets result->flags.parsed_ok to 0 if failed to parse the request, * to 1 if the request was correctly parsed. * \param[in] csd a ConnStateData. The caller must make sure it is not null * \param[in] hp an Http1::RequestParser * \param[out] mehtod_p will be set as a side-effect of the parsing. * Pointed-to value will be set to Http::METHOD_NONE in case of * parsing failure * \param[out] http_ver will be set as a side-effect of the parsing * \return NULL on incomplete requests, * a Http::Stream on success or failure. */ Http::Stream * parseHttpRequest(ConnStateData *csd, const Http1::RequestParserPointer &hp) @@ -1660,41 +1664,41 @@ request->flags.intercepted = ((http->clientConnection->flags & COMM_INTERCEPTION) != 0); request->flags.interceptTproxy = ((http->clientConnection->flags & COMM_TRANSPARENT) != 0 ) ; static const bool proxyProtocolPort = (conn->port != NULL) ? 
conn->port->flags.proxySurrogate : false; if (request->flags.interceptTproxy && !proxyProtocolPort) { if (Config.accessList.spoof_client_ip) { ACLFilledChecklist *checklist = clientAclChecklistCreate(Config.accessList.spoof_client_ip, http); request->flags.spoofClientIp = (checklist->fastCheck() == ACCESS_ALLOWED); delete checklist; } else request->flags.spoofClientIp = true; } else request->flags.spoofClientIp = false; } if (internalCheck(request->url.path())) { if (internalHostnameIs(request->url.host()) && request->url.port() == getMyPort()) { debugs(33, 2, "internal URL found: " << request->url.getScheme() << "://" << request->url.authority(true)); http->flags.internal = true; } else if (Config.onoff.global_internal_static && internalStaticCheck(request->url.path())) { debugs(33, 2, "internal URL found: " << request->url.getScheme() << "://" << request->url.authority(true) << " (global_internal_static on)"); - request->url.setScheme(AnyP::PROTO_HTTP); + request->url.setScheme(AnyP::PROTO_HTTP, "http"); request->url.host(internalHostname()); request->url.port(getMyPort()); http->flags.internal = true; } else debugs(33, 2, "internal URL found: " << request->url.getScheme() << "://" << request->url.authority(true) << " (not this proxy)"); } request->flags.internal = http->flags.internal; setLogUri (http, urlCanonicalClean(request.getRaw())); request->client_addr = conn->clientConnection->remote; // XXX: remove reuest->client_addr member. 
#if FOLLOW_X_FORWARDED_FOR // indirect client gets stored here because it is an HTTP header result (from X-Forwarded-For:) // not a details about teh TCP connection itself request->indirect_client_addr = conn->clientConnection->remote; #endif /* FOLLOW_X_FORWARDED_FOR */ request->my_addr = conn->clientConnection->local; request->myportname = conn->port->name; if (!isFtp) { // XXX: for non-HTTP messages instantiate a different HttpMsg child type @@ -3431,41 +3435,41 @@ } return true; } /// find any unused HttpSockets[] slot and store fd there or return false static bool AddOpenedHttpSocket(const Comm::ConnectionPointer &conn) { bool found = false; for (int i = 0; i < NHttpSockets && !found; ++i) { if ((found = HttpSockets[i] < 0)) HttpSockets[i] = conn->fd; } return found; } static void clientHttpConnectionsOpen(void) { for (AnyP::PortCfgPointer s = HttpPortList; s != NULL; s = s->next) { - const char *scheme = AnyP::UriScheme(s->transport.protocol).c_str(); + const SBuf &scheme = AnyP::UriScheme(s->transport.protocol).image(); if (MAXTCPLISTENPORTS == NHttpSockets) { debugs(1, DBG_IMPORTANT, "WARNING: You have too many '" << scheme << "_port' lines."); debugs(1, DBG_IMPORTANT, " The limit is " << MAXTCPLISTENPORTS << " HTTP ports."); continue; } #if USE_OPENSSL if (s->flags.tunnelSslBumping) { if (!Config.accessList.ssl_bump) { debugs(33, DBG_IMPORTANT, "WARNING: No ssl_bump configured. 
Disabling ssl-bump on " << scheme << "_port " << s->s); s->flags.tunnelSslBumping = false; } if (!s->secure.staticContext && !s->generateHostCertificates) { debugs(1, DBG_IMPORTANT, "Will not bump SSL at " << scheme << "_port " << s->s << " due to TLS initialization failure."); s->flags.tunnelSslBumping = false; if (s->transport.protocol == AnyP::PROTO_HTTP) s->secure.encryptTransport = false; } if (s->flags.tunnelSslBumping) { === modified file 'src/client_side_reply.cc' --- src/client_side_reply.cc 2016-06-02 09:49:19 +0000 +++ src/client_side_reply.cc 2016-06-15 11:28:54 +0000 @@ -2210,41 +2210,41 @@ } } holdingBuffer = result; processReplyAccess(); return; } /* Using this breaks the client layering just a little! */ void clientReplyContext::createStoreEntry(const HttpRequestMethod& m, RequestFlags reqFlags) { assert(http != NULL); /* * For erroneous requests, we might not have a h->request, * so make a fake one. */ if (http->request == NULL) { - http->request = new HttpRequest(m, AnyP::PROTO_NONE, null_string); + http->request = new HttpRequest(m, AnyP::PROTO_NONE, "http", null_string); HTTPMSGLOCK(http->request); } StoreEntry *e = storeCreateEntry(storeId(), http->log_uri, reqFlags, m); // Make entry collapsable ASAP, to increase collapsing chances for others, // TODO: every must-revalidate and similar request MUST reach the origin, // but do we have to prohibit others from collapsing on that request? 
if (Config.onoff.collapsed_forwarding && reqFlags.cachable && !reqFlags.needValidation && (m == Http::METHOD_GET || m == Http::METHOD_HEAD)) { // make the entry available for future requests now Store::Root().allowCollapsing(e, reqFlags, m); } sc = storeClientListAdd(e, this); #if USE_DELAY_POOLS sc->setDelayId(DelayId::DelayClient(http)); #endif === modified file 'src/errorpage.cc' --- src/errorpage.cc 2016-04-17 11:49:54 +0000 +++ src/errorpage.cc 2016-04-19 11:04:53 +0000 @@ -929,41 +929,42 @@ case 'O': if (!building_deny_info_url) do_quote = 0; case 'o': p = request ? request->extacl_message.termedBuf() : external_acl_message; if (!p && !building_deny_info_url) p = "[not available]"; break; case 'p': if (request) { mb.appendf("%u", request->url.port()); } else if (!building_deny_info_url) { p = "[unknown port]"; } break; case 'P': if (request) { - p = request->url.getScheme().c_str(); + const SBuf &m = request->url.getScheme().image(); + mb.append(m.rawContent(), m.length()); } else if (!building_deny_info_url) { p = "[unknown protocol]"; } break; case 'R': if (building_deny_info_url) { if (request != NULL) { SBuf tmp = request->url.path(); p = tmp.c_str(); no_urlescape = 1; } else p = "[no request]"; break; } if (request != NULL) { mb.appendf(SQUIDSBUFPH " " SQUIDSBUFPH " %s/%d.%d\n", SQUIDSBUFPRINT(request->method.image()), SQUIDSBUFPRINT(request->url.path()), AnyP::ProtocolType_str[request->http_ver.protocol], === modified file 'src/format/Format.cc' --- src/format/Format.cc 2016-03-25 13:03:30 +0000 +++ src/format/Format.cc 2016-04-15 18:14:34 +0000 @@ -979,41 +979,43 @@ if (al->request) { const SBuf &s = al->request->method.image(); sb.append(s.rawContent(), s.length()); out = sb.termedBuf(); quote = 1; } break; case LFT_CLIENT_REQ_URI: // original client URI if (al->request) { const SBuf &s = al->request->effectiveRequestUri(); sb.append(s.rawContent(), s.length()); out = sb.termedBuf(); quote = 1; } break; case LFT_CLIENT_REQ_URLSCHEME: if (al->request) 
{ - out = al->request->url.getScheme().c_str(); + const SBuf s(al->request->url.getScheme().image()); + sb.append(s.rawContent(), s.length()); + out = sb.termedBuf(); quote = 1; } break; case LFT_CLIENT_REQ_URLDOMAIN: if (al->request) { out = al->request->url.host(); quote = 1; } break; case LFT_CLIENT_REQ_URLPORT: if (al->request) { outint = al->request->url.port(); doint = 1; } break; case LFT_REQUEST_URLPATH_OLD_31: case LFT_CLIENT_REQ_URLPATH: @@ -1058,41 +1060,43 @@ if (al->adapted_request) { const SBuf &s = al->adapted_request->method.image(); sb.append(s.rawContent(), s.length()); out = sb.termedBuf(); quote = 1; } break; case LFT_SERVER_REQ_URI: // adapted request URI sent to server/peer if (al->adapted_request) { const SBuf &s = al->adapted_request->effectiveRequestUri(); sb.append(s.rawContent(), s.length()); out = sb.termedBuf(); quote = 1; } break; case LFT_SERVER_REQ_URLSCHEME: if (al->adapted_request) { - out = al->adapted_request->url.getScheme().c_str(); + const SBuf s(al->adapted_request->url.getScheme().image()); + sb.append(s.rawContent(), s.length()); + out = sb.termedBuf(); quote = 1; } break; case LFT_SERVER_REQ_URLDOMAIN: if (al->adapted_request) { out = al->adapted_request->url.host(); quote = 1; } break; case LFT_SERVER_REQ_URLPORT: if (al->adapted_request) { outint = al->adapted_request->url.port(); doint = 1; } break; case LFT_SERVER_REQ_URLPATH: if (al->adapted_request) { === modified file 'src/tests/stub_HttpRequest.cc' --- src/tests/stub_HttpRequest.cc 2016-03-17 03:28:14 +0000 +++ src/tests/stub_HttpRequest.cc 2016-04-15 18:22:14 +0000 @@ -1,42 +1,42 @@ /* * Copyright (C) 1996-2016 The Squid Software Foundation and contributors * * Squid software is distributed under GPLv2+ license and includes * contributions from numerous individuals and organizations. * Please see the COPYING and CONTRIBUTORS files for details. 
*/ #include "squid.h" #include "AccessLogEntry.h" #include "HttpRequest.h" #define STUB_API "HttpRequest.cc" #include "tests/STUB.h" // void httpRequestPack(void *obj, Packable *p); HttpRequest::HttpRequest() : HttpMsg(hoRequest) {STUB} -HttpRequest::HttpRequest(const HttpRequestMethod &, AnyP::ProtocolType, const char *) : HttpMsg(hoRequest) {STUB} +HttpRequest::HttpRequest(const HttpRequestMethod &, AnyP::ProtocolType, const char *, const char *) : HttpMsg(hoRequest) {STUB} HttpRequest::~HttpRequest() STUB void HttpRequest::reset() STUB -void HttpRequest::initHTTP(const HttpRequestMethod &, AnyP::ProtocolType, const char *) STUB +void HttpRequest::initHTTP(const HttpRequestMethod &, AnyP::ProtocolType, const char *, const char *) STUB HttpRequest * HttpRequest::clone() const STUB_RETVAL(NULL) bool HttpRequest::maybeCacheable() STUB_RETVAL(false) bool HttpRequest::conditional() const STUB_RETVAL(false) bool HttpRequest::canHandle1xx() const STUB_RETVAL(false) #if USE_ADAPTATION Adaptation::History::Pointer HttpRequest::adaptLogHistory() const STUB_RETVAL(Adaptation::History::Pointer()) Adaptation::History::Pointer HttpRequest::adaptHistory(bool) const STUB_RETVAL(Adaptation::History::Pointer()) void HttpRequest::adaptHistoryImport(const HttpRequest &) STUB #endif #if ICAP_CLIENT Adaptation::Icap::History::Pointer HttpRequest::icapHistory() const STUB_RETVAL(Adaptation::Icap::History::Pointer()) #endif void HttpRequest::recordLookup(const Dns::LookupDetails &) STUB void HttpRequest::detailError(err_type, int) STUB void HttpRequest::clearError() STUB void HttpRequest::clean() STUB void HttpRequest::init() STUB static const SBuf nilSBuf; const SBuf &HttpRequest::effectiveRequestUri() const STUB_RETVAL(nilSBuf) bool HttpRequest::multipartRangeRequest() const STUB_RETVAL(false) === modified file 'src/tests/testUriScheme.cc' --- src/tests/testUriScheme.cc 2016-01-01 00:12:18 +0000 +++ src/tests/testUriScheme.cc 2016-04-15 18:24:08 +0000 @@ -95,43 +95,43 @@ /* * we 
should be able to construct a AnyP::UriScheme from the old 'protocol_t' enum. */ void testUriScheme::testConstructprotocol_t() { AnyP::UriScheme lhs_none(AnyP::PROTO_NONE), rhs_none(AnyP::PROTO_NONE); CPPUNIT_ASSERT_EQUAL(lhs_none, rhs_none); AnyP::UriScheme lhs_cacheobj(AnyP::PROTO_CACHE_OBJECT), rhs_cacheobj(AnyP::PROTO_CACHE_OBJECT); CPPUNIT_ASSERT_EQUAL(lhs_cacheobj, rhs_cacheobj); CPPUNIT_ASSERT(lhs_none != rhs_cacheobj); } /* * we should be able to get a char const * version of the method. */ void testUriScheme::testC_str() { - String lhs("wais"); + SBuf lhs("wais"); AnyP::UriScheme wais(AnyP::PROTO_WAIS); - String rhs(wais.c_str()); + SBuf rhs(wais.image()); CPPUNIT_ASSERT_EQUAL(lhs, rhs); } /* * a AnyP::UriScheme replaces protocol_t, so we should be able to test for equality on * either the left or right hand side seamlessly. */ void testUriScheme::testEqualprotocol_t() { CPPUNIT_ASSERT(AnyP::UriScheme() == AnyP::PROTO_NONE); CPPUNIT_ASSERT(not (AnyP::UriScheme(AnyP::PROTO_WAIS) == AnyP::PROTO_HTTP)); CPPUNIT_ASSERT(AnyP::PROTO_HTTP == AnyP::UriScheme(AnyP::PROTO_HTTP)); CPPUNIT_ASSERT(not (AnyP::PROTO_CACHE_OBJECT == AnyP::UriScheme(AnyP::PROTO_HTTP))); } /* * a AnyP::UriScheme should testable for inequality with a protocol_t. */ void === modified file 'src/url.cc' --- src/url.cc 2016-05-02 15:18:33 +0000 +++ src/url.cc 2016-06-15 11:31:37 +0000 @@ -1,40 +1,41 @@ /* * Copyright (C) 1996-2016 The Squid Software Foundation and contributors * * Squid software is distributed under GPLv2+ license and includes * contributions from numerous individuals and organizations. * Please see the COPYING and CONTRIBUTORS files for details. 
*/ /* DEBUG: section 23 URL Parsing */ #include "squid.h" #include "globals.h" #include "HttpRequest.h" #include "rfc1738.h" #include "SquidConfig.h" #include "SquidString.h" #include "URL.h" static HttpRequest *urlParseFinish(const HttpRequestMethod& method, const AnyP::ProtocolType protocol, + const char *const protoStr, const char *const urlpath, const char *const host, const SBuf &login, const int port, HttpRequest *request); static HttpRequest *urnParse(const HttpRequestMethod& method, char *urn, HttpRequest *request); static const char valid_hostname_chars_u[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789-._" "[:]" ; static const char valid_hostname_chars[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789-." "[:]" ; const SBuf & @@ -140,40 +141,43 @@ return AnyP::PROTO_COAP; if (strncasecmp(b, "coaps", len) == 0) return AnyP::PROTO_COAPS; if (strncasecmp(b, "gopher", len) == 0) return AnyP::PROTO_GOPHER; if (strncasecmp(b, "wais", len) == 0) return AnyP::PROTO_WAIS; if (strncasecmp(b, "cache_object", len) == 0) return AnyP::PROTO_CACHE_OBJECT; if (strncasecmp(b, "urn", len) == 0) return AnyP::PROTO_URN; if (strncasecmp(b, "whois", len) == 0) return AnyP::PROTO_WHOIS; + if (len > 0) + return AnyP::PROTO_UNKNOWN; + return AnyP::PROTO_NONE; } /* * Parse a URI/URL. * * If the 'request' arg is non-NULL, put parsed values there instead * of allocating a new HttpRequest. * * This abuses HttpRequest as a way of representing the parsed url * and its components. * method is used to switch parsers and to init the HttpRequest. * If method is Http::METHOD_CONNECT, then rather than a URL a hostname:port is * looked for. * The url is non const so that if its too long we can NULL-terminate it in place. */ /* * This routine parses a URL. Its assumed that the URL is complete - * ie, the end of the string is the end of the URL. 
Don't pass a partial @@ -197,42 +201,42 @@ const char *src; char *dst; proto[0] = host[0] = urlpath[0] = login[0] = '\0'; if ((l = strlen(url)) + Config.appendDomainLen > (MAX_URL - 1)) { /* terminate so it doesn't overflow other buffers */ *(url + (MAX_URL >> 1)) = '\0'; debugs(23, DBG_IMPORTANT, "urlParse: URL too large (" << l << " bytes)"); return NULL; } if (method == Http::METHOD_CONNECT) { port = CONNECT_PORT; if (sscanf(url, "[%[^]]]:%d", host, &port) < 1) if (sscanf(url, "%[^:]:%d", host, &port) < 1) return NULL; } else if ((method == Http::METHOD_OPTIONS || method == Http::METHOD_TRACE) && URL::Asterisk().cmp(url) == 0) { protocol = AnyP::PROTO_HTTP; - port = AnyP::UriScheme(protocol).defaultPort(); - return urlParseFinish(method, protocol, url, host, SBuf(), port, request); + port = 80; // or the slow way ... AnyP::UriScheme(protocol,"http").defaultPort(); + return urlParseFinish(method, protocol, "http", url, host, SBuf(), port, request); } else if (!strncmp(url, "urn:", 4)) { return urnParse(method, url, request); } else { /* Parse the URL: */ src = url; i = 0; /* Find first : - everything before is protocol */ for (i = 0, dst = proto; i < l && *src != ':'; ++i, ++src, ++dst) { *dst = *src; } if (i >= l) return NULL; *dst = '\0'; /* Then its :// */ if ((i+3) > l || *src != ':' || *(src + 1) != '/' || *(src + 2) != '/') return NULL; i += 3; src += 3; @@ -405,115 +409,117 @@ break; case URI_WHITESPACE_CHOP: *(urlpath + strcspn(urlpath, w_space)) = '\0'; break; case URI_WHITESPACE_STRIP: default: t = q = urlpath; while (*t) { if (!xisspace(*t)) { *q = *t; ++q; } ++t; } *q = '\0'; } } - return urlParseFinish(method, protocol, urlpath, host, SBuf(login), port, request); + return urlParseFinish(method, protocol, proto, urlpath, host, SBuf(login), port, request); } /** * Update request with parsed URI data. If the request arg is * non-NULL, put parsed values there instead of allocating a new * HttpRequest. 
*/ static HttpRequest * urlParseFinish(const HttpRequestMethod& method, const AnyP::ProtocolType protocol, + const char *const protoStr, // for unknown protocols const char *const urlpath, const char *const host, const SBuf &login, const int port, HttpRequest *request) { if (NULL == request) - request = new HttpRequest(method, protocol, urlpath); + request = new HttpRequest(method, protocol, protoStr, urlpath); else { - request->initHTTP(method, protocol, urlpath); + request->initHTTP(method, protocol, protoStr, urlpath); } request->url.host(host); request->url.userInfo(login); request->url.port(port); return request; } static HttpRequest * urnParse(const HttpRequestMethod& method, char *urn, HttpRequest *request) { debugs(50, 5, "urnParse: " << urn); if (request) { - request->initHTTP(method, AnyP::PROTO_URN, urn + 4); + request->initHTTP(method, AnyP::PROTO_URN, "urn", urn + 4); return request; } - return new HttpRequest(method, AnyP::PROTO_URN, urn + 4); + return new HttpRequest(method, AnyP::PROTO_URN, "urn", urn + 4); } void URL::touch() { absolute_.clear(); authorityHttp_.clear(); authorityWithPort_.clear(); } SBuf & URL::authority(bool requirePort) const { if (authorityHttp_.isEmpty()) { // both formats contain Host/IP authorityWithPort_.append(host()); authorityHttp_ = authorityWithPort_; // authorityForm_ only has :port if it is non-default authorityWithPort_.appendf(":%u",port()); if (port() != getScheme().defaultPort()) authorityHttp_ = authorityWithPort_; } return requirePort ? 
authorityWithPort_ : authorityHttp_; } SBuf & URL::absolute() const { if (absolute_.isEmpty()) { // TODO: most URL will be much shorter, avoid allocating this much absolute_.reserveCapacity(MAX_URL); - absolute_.appendf("%s:", getScheme().c_str()); + absolute_.append(getScheme().image()); + absolute_.append(":",1); if (getScheme() != AnyP::PROTO_URN) { absolute_.append("//", 2); const bool omitUserInfo = getScheme() == AnyP::PROTO_HTTP || getScheme() != AnyP::PROTO_HTTPS || userInfo().isEmpty(); if (!omitUserInfo) { absolute_.append(userInfo()); absolute_.append("@", 1); } absolute_.append(authority()); } absolute_.append(path()); } return absolute_; } /** \todo AYJ: Performance: This is an *almost* duplicate of HttpRequest::effectiveRequestUri(). But elides the query-string. * After copying it on in the first place! Would be less code to merge the two with a flag parameter. * and never copy the query-string part in the first place @@ -603,42 +609,43 @@ */ char * urlMakeAbsolute(const HttpRequest * req, const char *relUrl) { if (req->method.id() == Http::METHOD_CONNECT) { return (NULL); } char *urlbuf = (char *)xmalloc(MAX_URL * sizeof(char)); if (req->url.getScheme() == AnyP::PROTO_URN) { // XXX: this is what the original code did, but it seems to break the // intended behaviour of this function. It returns the stored URN path, // not converting the given one into a URN... snprintf(urlbuf, MAX_URL, SQUIDSBUFPH, SQUIDSBUFPRINT(req->url.absolute())); return (urlbuf); } SBuf authorityForm = req->url.authority(); // host[:port] - size_t urllen = snprintf(urlbuf, MAX_URL, "%s://" SQUIDSBUFPH "%s" SQUIDSBUFPH, - req->url.getScheme().c_str(), + const SBuf &scheme = req->url.getScheme().image(); + size_t urllen = snprintf(urlbuf, MAX_URL, SQUIDSBUFPH "://" SQUIDSBUFPH "%s" SQUIDSBUFPH, + SQUIDSBUFPRINT(scheme), SQUIDSBUFPRINT(req->url.userInfo()), !req->url.userInfo().isEmpty() ? 
"@" : "", SQUIDSBUFPRINT(authorityForm)); // if the first char is '/' assume its a relative path // XXX: this breaks on scheme-relative URLs, // but we should not see those outside ESI, and rarely there. // XXX: also breaks on any URL containing a '/' in the query-string portion if (relUrl[0] == '/') { xstrncpy(&urlbuf[urllen], relUrl, MAX_URL - urllen - 1); } else { SBuf path = req->url.path(); SBuf::size_type lastSlashPos = path.rfind('/'); if (lastSlashPos == SBuf::npos) { // replace the whole path with the given bit(s) urlbuf[urllen] = '/'; ++urllen; xstrncpy(&urlbuf[urllen], relUrl, MAX_URL - urllen - 1); } else {
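For anyone skimming the patch, the image() rule discussed above (lower-case the registered ProtocolType label when it is used as an implicit scheme, but keep an externally supplied scheme label exactly as received) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch code: SchemeSketch, the ProtocolType enum and protocolTypeStr() are hypothetical stand-ins for AnyP::UriScheme, AnyP::ProtocolType and AnyP::ProtocolType_str, and std::string is used in place of SBuf so the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for AnyP::ProtocolType (only a few values shown).
enum class ProtocolType { None, Http, Ftp, Wais, Unknown };

// Hypothetical stand-in for AnyP::ProtocolType_str: the upper-case labels
// of the registered transfer protocols.
inline const char *protocolTypeStr(ProtocolType p)
{
    switch (p) {
    case ProtocolType::Http:    return "HTTP";
    case ProtocolType::Ftp:     return "FTP";
    case ProtocolType::Wais:    return "WAIS";
    case ProtocolType::Unknown: return "UNKNOWN";
    case ProtocolType::None:    break;
    }
    return "NONE";
}

// Illustrative only: mirrors the idea of a scheme object that carries a
// textual image, built in the constructor rather than through a setter.
class SchemeSketch
{
public:
    explicit SchemeSketch(ProtocolType type, const char *received = nullptr) : type_(type)
    {
        if (type == ProtocolType::Unknown) {
            // A foreign scheme has no registered label, so the caller must
            // supply the string that was received; it is stored untouched.
            assert(received && *received);
            image_ = received;
            return;
        }
        // A registered scheme derives its image from the upper-case transfer
        // protocol label, lowered for use as a URI scheme. Any caller-supplied
        // string is ignored in this sketch.
        image_ = protocolTypeStr(type);
        std::transform(image_.begin(), image_.end(), image_.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    }

    ProtocolType type() const { return type_; }
    const std::string &image() const { return image_; }

private:
    ProtocolType type_;
    std::string image_;
};
```

With this shape, SchemeSketch(ProtocolType::Wais).image() yields "wais", matching what testC_str() checks, while SchemeSketch(ProtocolType::Unknown, "Foo+Bar").image() keeps the mixed-case label as it arrived. Whether a registered type should also accept and validate a caller-supplied string is left open here.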