Re: [Bug-wget] Segmentation fault with current development version of wget
On Thu, 2 May 2013, Giuseppe Scrivano wrote: RFC 2606 doesn't seem very clear about it, and I can't find anywhere that PUT/OPTIONS/ANYTHING should be handled differently than POST wrt redirections. I don't see why suspending a PUT request would be incorrect. Darshit, do you have any pointer? Please use the httpbis documents instead of solely relying on RFC2616: http://datatracker.ietf.org/doc/draft-ietf-httpbis-p2-semantics/?include_text=1 they're much updated and leaves less to interpret and is now based on many years of real-world HTTP use. While on the subject of redirects, you may also like this: http://tools.ietf.org/html/draft-reschke-http-status-308-07 -- / daniel.haxx.se
Re: [Bug-wget] Segmentation fault with current development version of wget
Gijs van Tulder writes: > Hi Darshit, > > Darshit Shah wrote: >> Unless we explicitly check for opt.method = POST, this will cause a lot >> of issues. Since the macro is being called on every redirect, even >> HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is >> not the behaviour we want. > > Ah, I see it's more complicated. If I understand RFC 2616 correctly > (section 10.3), suspending a PUT request in response to a redirect is > incorrect. But so is changing a POST request to a GET in response to > anything but a 303. But that's expected (and the current behaviour of > Wget): Thanks you both for the feedback. Indeed, my first patch doesn't consider this case. RFC 2606 doesn't seem very clear about it, and I can't find anywhere that PUT/OPTIONS/ANYTHING should be handled differently than POST wrt redirections. I don't see why suspending a PUT request would be incorrect. Darshit, do you have any pointer? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Darshit, Darshit Shah wrote: > Unless we explicitly check for opt.method = POST, this will cause a lot > of issues. Since the macro is being called on every redirect, even > HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is > not the behaviour we want. Ah, I see it's more complicated. If I understand RFC 2616 correctly (section 10.3), suspending a PUT request in response to a redirect is incorrect. But so is changing a POST request to a GET in response to anything but a 303. But that's expected (and the current behaviour of Wget): Note: When automatically redirecting a POST request after receiving a 301 status code, some existing HTTP/1.0 user agents will erroneously change it into a GET request. I give up. Regards, Gijs
Re: [Bug-wget] Segmentation fault with current development version of wget
> > Still, even if the sanitization is removed: I think it would be better if > RESTORE_POST_DATA restores the previous value of opt.method, instead of > overwriting it with a hardcoded "POST". Isn't it? > > As double safety yes. Maybe we should do that. > A related question: how is a redirect response to a PUT request handled? > How should it be handled? > I'd rather that Giuseppe or someone else answers this in more detail. I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA > macro is called (by retrieve_url in retr.c). If that's true, then later on > the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA. > > In case, we wish to use the SUSPEND_POST_DATA macro for other commands too, then we must make it store method in a backup string and retrieve it. In any case, I think storing and retrieving opt.method is probably a good, safe option that we must implement. > Regards, > > Gijs > > -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Phobia. B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Giuseppe, Dropping the bit that sanitizes the opt.method is probably a good idea. (Perhaps I shouldn't have replied to your patch directly.) Still, even if the sanitization is removed: I think it would be better if RESTORE_POST_DATA restores the previous value of opt.method, instead of overwriting it with a hardcoded "POST". Isn't it? A related question: how is a redirect response to a PUT request handled? How should it be handled? I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA macro is called (by retrieve_url in retr.c). If that's true, then later on the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA. Regards, Gijs Op 01-05-13 22:16 schreef Giuseppe Scrivano: hi Gijs, Gijs van Tulder writes: Giuseppe Scrivano wrote: what about this patch? Any comment? Another suggestion: why not save the original opt.method, set opt.method to NULL and put the original opt.method back later? thanks for your suggestion but I think we should drop the code that modifies opt.method, since we have to sanitize it only when it is specified as argument. Objections?
Re: [Bug-wget] Segmentation fault with current development version of wget
On Thu, May 2, 2013 at 12:56 AM, Gijs van Tulder wrote: > Giuseppe Scrivano wrote: > > what about this patch? Any comment? > > Another suggestion: why not save the original opt.method, set opt.method > to NULL and put the original opt.method back later? > > Gijs > Unless we explicitly check for opt.method = POST, this will cause a lot of issues. Since the macro is being called on every redirect, even HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is not the behaviour we want. And if we are explicitly writing the above test, then why not simply allow the xstrdup to copy the string? (Although, I believe using a const *string may be more efficient. ) @Tim: The various tests using -r are all in spider mode. This part of the code is not reached when in spider mode. opt.method && strcasecmp(opt.method,"POST") == 0 provides for more readable code. Maybe we should use that. @Giuseppe: Thanks for that patch! I didn't think along those lines, adding a new command. It also cleans up the code from http.c. It shouldn't have been there in the first place.
Re: [Bug-wget] Segmentation fault with current development version of wget
hi Gijs, Gijs van Tulder writes: > Giuseppe Scrivano wrote: >> what about this patch? Any comment? > > Another suggestion: why not save the original opt.method, set > opt.method to NULL and put the original opt.method back later? thanks for your suggestion but I think we should drop the code that modifies opt.method, since we have to sanitize it only when it is specified as argument. Objections? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Giuseppe Scrivano wrote: > what about this patch? Any comment? Another suggestion: why not save the original opt.method, set opt.method to NULL and put the original opt.method back later? Gijs diff --git a/src/retr.c b/src/retr.c index d51b7e7..2aee578 100644 --- a/src/retr.c +++ b/src/retr.c @@ -679,22 +679,23 @@ calc_rate (wgint bytes, double secs, int *units) #define SUSPEND_POST_DATA do { \ post_data_suspended = true; \ saved_post_data = opt.body_data; \ saved_post_file_name = opt.body_file; \ + saved_method = opt.method;\ opt.body_data = NULL; \ opt.body_file = NULL; \ opt.method = NULL;\ } while (0) #define RESTORE_POST_DATA do { \ if (post_data_suspended) \ { \ opt.body_data = saved_post_data; \ opt.body_file = saved_post_file_name; \ + opt.method = saved_method;\ post_data_suspended = false; \ - opt.method = "POST"; \ } \ } while (0) static char *getproxy (struct url *); @@ -721,10 +722,11 @@ retrieve_url (struct url * orig_parsed, const char *origurl, char **file, int redirection_count = 0; bool post_data_suspended = false; char *saved_post_data = NULL; char *saved_post_file_name = NULL; + char *saved_method = NULL; /* If dt is NULL, use local storage. */ if (!dt) { dt = &dummy;
Re: [Bug-wget] Segmentation fault with current development version of wget
Am Mittwoch, 1. Mai 2013 schrieb Darshit Shah: > First, sorry for the quick and dirty hack which was the perfect example of > how NOT to do things. Than it was a good example ;-) > Secondly, it lies upon me that this feature wasn't tested before submitting > the patch. I had however relied on the Test Environment and since it passed > everything there, I thought it was working correctly. Guess, we should add > a test for this soon. --recursive is a commonly used switch with Wget and > not having a test to prevent regressions on it is very bad. There are several tests using -r. The question is why the problem doesn't come out. > I am fixing this issue, but it is a terribly ugly hack. If someone could > help improve it I'd be most truly grateful. > I have a couple of ideas, but I will need to work them out and implement > them when I have the time. > > The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp > on a NULL String, and we cannot initialize opt.method since that would > break some sanity checks which are in place so that --post-* and --body-* > commands don't conflict with each other. Your test isn't really ugly. I (and most C programmers) favor opt.method && strcasecmp(opt.method,"POST") == 0 instead of opt.method ? strcasecmp(opt.method,"POST") == 0 : false But that is not really important. It is pretty common, that one or both args to strcmp may be NULL. The really ugly thing is, that there are no string function alternatives that handle NULL pointers. I regularly use my own versions like strcmp_null() to avoid extra checks. Regards, Tim
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: > I am fixing this issue, but it is a terribly ugly hack. If someone could > help improve it I'd be most truly grateful. > I have a couple of ideas, but I will need to work them out and implement > them when I have the time. > > The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp > on a NULL String, and we cannot initialize opt.method since that would > break some sanity checks which are in place so that --post-* and --body-* > commands don't conflict with each other. what about this patch? Any comment? diff --git a/src/http.c b/src/http.c index 25ad474..6b23382 100644 --- a/src/http.c +++ b/src/http.c @@ -1766,12 +1766,8 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy, if (head_only) meth = "HEAD"; else if (opt.method) - { -char *q; -for (q = opt.method; *q; ++q) - *q = c_toupper (*q); -meth = opt.method; - } + meth = opt.method; /* Use the full path, i.e. one that includes the leading slash and the query string. E.g. if u->path is "foo/bar" and u->query is "param=value", full_path will be "/foo/bar?param=value". */ diff --git a/src/init.c b/src/init.c index 813781f..370506f 100644 --- a/src/init.c +++ b/src/init.c @@ -87,6 +87,7 @@ CMD_DECLARE (cmd_directory_vector); CMD_DECLARE (cmd_number); CMD_DECLARE (cmd_number_inf); CMD_DECLARE (cmd_string); +CMD_DECLARE (cmd_string_uppercase); CMD_DECLARE (cmd_file); CMD_DECLARE (cmd_directory); CMD_DECLARE (cmd_time); @@ -212,7 +213,7 @@ static const struct { { "logfile", &opt.lfilename, cmd_file }, { "login",&opt.ftp_user, cmd_string },/* deprecated*/ { "maxredirect", &opt.max_redirect, cmd_number }, - { "method", &opt.method,cmd_string }, + { "method", &opt.method,cmd_string_uppercase }, { "mirror", NULL, cmd_spec_mirror }, { "netrc",&opt.netrc, cmd_boolean }, { "noclobber",&opt.noclobber, cmd_boolean }, @@ -959,8 +960,24 @@ cmd_string (const char *com, const char *val, void *place) return true; } +/* Like cmd_string but ensure the string is upper case. */ +static bool +cmd_string_uppercase (const char *com, const char *val, void *place) +{ + char *q; + bool ret = cmd_string (com, val, place); + q = *((char **) place); + if (!ret || q == NULL) +return false; + + while (*q) +*q++ = c_toupper (*q); + + return true; +} + -/* Like the above, but handles tilde-expansion when reading a user's +/* Like cmd_string, but handles tilde-expansion when reading a user's `.wgetrc'. In that case, and if VAL begins with `~', the tilde gets expanded to the user's home directory. */ static bool -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Couple of things. First, sorry for the quick and dirty hack which was the perfect example of how NOT to do things. Secondly, it lies upon me that this feature wasn't tested before submitting the patch. I had however relied on the Test Environment and since it passed everything there, I thought it was working correctly. Guess, we should add a test for this soon. --recursive is a commonly used switch with Wget and not having a test to prevent regressions on it is very bad. Lastly, I have spent some time on this issue today and figured out the issue. The SUSPEND_POST_DATA logic is so written that the macro is called on every 301/302 redirect. This did not pose any issues till now since when NOT using --post-file or --post-data, the respective variables would be holding a NULL value which is backed up and then restored without any problems. However, with the addition of the --method option, I am explicitly setting opt.method to POST on RESUME_POST_DATA since the macro is supposed to be called only during POST requests. This is the source of the error. The actual Segmentation Fault was being caused since I, in all my foolishness did not strdup, but instead directly assigned the string to a pointer. I am fixing this issue, but it is a terribly ugly hack. If someone could help improve it I'd be most truly grateful. I have a couple of ideas, but I will need to work them out and implement them when I have the time. The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp on a NULL String, and we cannot initialize opt.method since that would break some sanity checks which are in place so that --post-* and --body-* commands don't conflict with each other. 0001-Fix-bug-in-program-logic-and-Seg-Fault.patch Description: Binary data