Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
RTT wrote:
> Why has this fix not been added?

Thanks, I just checked in your change untested, rev. #866.
Available via SVN now or included in the next nightly snapshot ZIP:
http://wiki.overbyte.be/wiki/index.php/ICS_Download

--
Arno Garrels

--
To unsubscribe or change your settings for TWSocket mailing list
please goto http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
RTT wrote:
> Why has this fix not been added?

AFAIR Francois wasn't convinced. If there are no side effects and Francois doesn't refuse it explicitly, I'll add it.

--
Arno Garrels
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
Why has this fix not been added?

The first change is not really important, even though it completes the logic that ParseURL already applies to URLs missing the protocol. The second change, used by THttpCli when parsing redirection URLs, is IMO mandatory, because URLs beginning with '//' are valid relative URLs.

On 27-10-2011 18:33, Arno Garrels wrote:
> I'd be happy to see more contributions from the ICS users in general.
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
> RFC2616 references RFC2396 as a source of information about URI in
> the description of a general syntax (Chapter 3.2.1). Then the next
> chapter (3.2.2) specifies the http URL which /must/ start with the
> "http:" scheme.

RFC2396 also doesn't say anything about a default protocol for such situations. It's a relative URI, so it's the responsibility of the application to accept it or not.

> In my opinion, an HTTP client should not handle URLs not starting
> with the HTTP scheme, although a browser may do so because a browser
> supports many schemes and probably defaults to the http: scheme if
> none is given (check that with a network sniffer).

An HTTP client is a client for HTTP, so I don't see why it can't assume HTTP as the default protocol. But yes, it is probably better if the application sets the default protocol. On the other hand, the OverbyteIcsUrl ParseURL procedure already assumes HTTP in similar situations, and this may be wrong because it's a generic function, not at application level. It would probably be better to pass in the default protocol to use when none is specified in the URL being parsed.

> By the way, in which real situation are you confronted with URL
> without scheme ?

This appeared in a relocation of a URL provided by the TwitPic API, so it is perfectly valid, because a relocation can be specified by a relative URL. THttpCli was failing to handle such a relocation. Then I thought to extend the handling of such URLs to the start URL too, the same way "www.someting.com" or "://www.someting.com" is already handled. This is currently done in ParseURL, so I added the code there too. But, as I said above, this may be wrong for all these situations. The application should pass the default protocol to assume to ParseURL. Or ParseURL should return proto='', and the calling code should handle it that way. But this will probably break existing code.
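The suggestion above, that the caller rather than the parser should decide which protocol to assume, could be sketched roughly as follows. This is only an illustration: SchemeOrDefault and its DefaultProto parameter are hypothetical names, not part of ICS, and the scheme detection is deliberately simplified to the "scheme://" form.

```pascal
program DefaultProtoDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

{ Hypothetical helper, not part of ICS: returns the scheme found in Url,
  or DefaultProto (chosen by the caller) when the URL carries none. }
function SchemeOrDefault(const Url, DefaultProto: string): string;
var
  P, I: Integer;
begin
  Result := DefaultProto;
  P := Pos('://', Url);
  if P <= 1 then
    Exit;                          { no "://" or nothing before it }
  { Only accept characters that are legal in a scheme name, so a "://"
    buried in a path or query string is not mistaken for a scheme. }
  for I := 1 to P - 1 do
    if not (Url[I] in ['A'..'Z', 'a'..'z', '0'..'9', '+', '-', '.']) then
      Exit;
  Result := LowerCase(Copy(Url, 1, P - 1));
end;

begin
  WriteLn(SchemeOrDefault('https://www.overbyte.be', 'http'));
  WriteLn(SchemeOrDefault('//www.google.com', 'http'));
  WriteLn(SchemeOrDefault('www.someting.com', 'http'));
end.
```

With this shape an HTTP component would pass 'http' (or its current protocol) while another caller could pass something else, and the generic parser itself would no longer hard-code a default.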
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
This kind of change would be interesting if implemented in such a way that a derived class can implement its own parser. The base class in ICS would then parse the URL as it currently does, and the developer would be able to create a derived class (component) which overrides the standard behaviour with whatever is required by the specific context.

--
francois.pie...@overbyte.be
The author of the freeware multi-tier middleware MidWare
The author of the freeware Internet Component Suite (ICS)
http://www.overbyte.be
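One possible shape for such an overridable hook is sketched below. The classes and the PrepareLocation method are hypothetical stand-ins invented for this illustration; they are not the real THttpCli interface, and the actual component may expose its parsing quite differently.

```pascal
program OverrideDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

type
  { Hypothetical stand-in for a client component; not the ICS THttpCli. }
  TBaseClient = class
  public
    Protocol: string;
    { Hook that a descendant can override to pre-process a Location value. }
    function PrepareLocation(const Location: string): string; virtual;
  end;

  { Descendant that adds support for "//host/path" redirects. }
  TNetPathClient = class(TBaseClient)
  public
    function PrepareLocation(const Location: string): string; override;
  end;

function TBaseClient.PrepareLocation(const Location: string): string;
begin
  Result := Location;              { default behaviour: leave it untouched }
end;

function TNetPathClient.PrepareLocation(const Location: string): string;
begin
  if Copy(Location, 1, 2) = '//' then
    Result := Protocol + ':' + Location
  else
    Result := inherited PrepareLocation(Location);
end;

var
  Cli: TBaseClient;
begin
  Cli := TNetPathClient.Create;
  try
    Cli.Protocol := 'http';
    WriteLn(Cli.PrepareLocation('//www.google.com'));
  finally
    Cli.Free;
  end;
end.
```

The base class keeps today's behaviour, so existing code is unaffected, and only applications that opt into the descendant see the new handling.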
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
RFC2616 references RFC2396 as a source of information about URI in the description of a general syntax (Chapter 3.2.1). Then the next chapter (3.2.2) specifies the http URL which /must/ start with the "http:" scheme.

In my opinion, an HTTP client should not handle URLs not starting with the HTTP scheme, although a browser may do so because a browser supports many schemes and probably defaults to the http: scheme if none is given (check that with a network sniffer).

By the way, in which real situation are you confronted with URL without scheme ?

--
francois.pie...@overbyte.be
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
On 27-10-2011 18:33, Arno Garrels wrote:
> I'd be happy to see more contributions from the ICS users in general.

1st change

In unit OverbyteIcsUrl, procedure ParseURL(..., replace

    if (url[1] = '/') then begin
        { Relative path without protocol specified }
        proto := 'http';
        p := 1;
        if (Length(url) > 1) and (url[2] <> '/') then begin
            { Relative path }
            Path := Copy(url, 1, Length(url));
            Exit;
        end;
    end

by

    if (url[1] = '/') then begin
        { Relative path without protocol specified }
        proto := 'http';
        if (Length(url) > 1) then begin
            if (url[2] <> '/') then begin
                { Relative path }
                Path := Copy(url, 1, Length(url));
                Exit;
            end else
                p := 2;
        end
        else begin
            Path := '/';
            Exit;
        end;
    end

2nd change

In unit OverbyteIcsHttpProt, procedure THttpCli.GetHeaderLineNext, replace

    if Field = 'location' then begin { Change the URL ! }

by

    if Field = 'location' then begin { Change the URL ! }
        if Copy(Data, 1, 2) = '//' then
            Data := FProtocol + ':' + Data;
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
On 27-10-2011 19:28, Francois PIETTE wrote:
> If I'm wrong, please point me to the exact text in the /HTTP/
> standard (RFC2616).

From RFC2396, which merges RFC1808 with two others and is referenced in RFC2616:

    absoluteURI   = scheme ":" ( hier_part | opaque_part )

    URI that are hierarchical in nature use the slash "/" character for
    separating hierarchical components. For some file systems, a "/"
    character (used to denote the hierarchical structure of a URI) is the
    delimiter used to construct a file name hierarchy, and thus the URI
    path will look similar to a file pathname. This does NOT imply that
    the resource is a file or that the URI maps to an actual filesystem
    pathname.

    hier_part     = ( net_path | abs_path ) [ "?" query ]
    net_path      = "//" authority [ abs_path ]
    abs_path      = "/" path_segments

    The syntax for relative URI takes advantage of the <absoluteURI>
    syntax (Section 3) in order to express a reference that is relative
    to the namespace of another hierarchical URI.

    relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

    A relative reference beginning with two slash characters is termed a
    network-path reference, as defined by <net_path> in Section 3. Such
    references are rarely used.

So the URL causing this particular error seems to be a relative URI in the "net_path" form. I haven't investigated whether this form of relative URI (quasi absolute, except for the unknown scheme) is valid as a start URI, but browsers accept it without problem. It does seem to be valid in a relocation, where relative URIs have to be handled.
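To make the net_path case concrete, here is a minimal sketch of how a network-path reference can be completed with the scheme of the URL that produced the redirect. ResolveNetPath is a hypothetical helper written for this illustration, not an ICS routine; it covers only the "//" form and leaves every other kind of reference unchanged.

```pascal
program NetPathDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

{ Hypothetical helper, not part of ICS: turns a network-path reference
  ("//authority/path") into an absolute URL by borrowing the scheme of
  the base URL. Any other reference is returned unchanged. }
function ResolveNetPath(const BaseScheme, Ref: string): string;
begin
  if Copy(Ref, 1, 2) = '//' then
    Result := BaseScheme + ':' + Ref
  else
    Result := Ref;
end;

begin
  WriteLn(ResolveNetPath('http', '//www.google.com'));
  WriteLn(ResolveNetPath('https', '//example.com/a?b=1'));
  WriteLn(ResolveNetPath('http', '/index.html'));
end.
```

The three calls print http://www.google.com, https://example.com/a?b=1 and /index.html. Taking the scheme from the base URL, rather than assuming "http", keeps an https request on https after the redirect.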
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
> According to the rfc1808.txt, an URL can start with //
> The ParseURL function will fail in such cases. Also, during a
> relocation, the THTTPCli fail to parse a "location" field with such
> URLs. Here is an example, that result in a relocation with these
> characteristics
> http://twitpic.com/show/thumb/73aj3p

I have not read RFC1808 in detail; at least I have not found where this kind of URL is valid in the context of an HTTP client. Maybe there is a confusion between a URL which can be found within an HTML document and a URL which is valid within an HTTP request?

If I'm wrong, please point me to the exact text in the /HTTP/ standard (RFC2616).

--
francois.pie...@overbyte.be
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
RTT wrote:
> The example provided below no longer produces a relocation to such a
> "begins with //" address, but this doesn't invalidate the fact that
> THTTPCli can't resolve a URL such as:
>
> //www.google.com
>
> nor handle a relocation with the "location" field set to such a URL.

I haven't looked into it. But is it really so difficult to change the very simple ICS parser to support that kind of URL as well? I'd be happy to see more contributions from the ICS users in general.

--
Arno Garrels
Re: [twsocket] THTTPCli fail to resolve URLs that start with "//"
The example provided below no longer produces a relocation to such a "begins with //" address, but this doesn't invalidate the fact that THTTPCli can't resolve a URL such as:

//www.google.com

nor handle a relocation with the "location" field set to such a URL.

> According to rfc1808.txt, a URL can start with //
> The ParseURL function will fail in such cases. Also, during a
> relocation, THTTPCli fails to parse a "location" field with such
> URLs. Here is an example that results in a relocation with these
> characteristics:
> http://twitpic.com/show/thumb/73aj3p