Re: [Bug-wget] Segmentation fault with current development version of wget
On Mon, Jun 17, 2013 at 2:05 AM, Giuseppe Scrivano wrote: > Thanks, the code seems correct but since you are going to write a lot of > code during your Summer of Code project, I will be very pedantic. This > time I have just amended the changes before push your patch. > > > Darshit Shah writes: > > > Please be aware that Wget needs to know the size of the POST data in > > -advance. Therefore the argument to @code{--post-file} must be a regular > > we put two empty spaces after a sentence. I have fixed this change in > other places too. > > > > + switch (statcode) > > + { > > I have indented the '{' correctly. Two empty spaces. Same for the > closing curly bracket. > > > > + if ((!(*dt & RETROKF) && !opt.content_on_error) || head_only || > (opt.method && strcasecmp (opt.method, "GET") != 0)) > > long line, I have splitted it. > Thanks! And sorry for the goof ups in the formatting. I'm getting rusty. Although, I must admit I wasn't aware of the convention to use two spaces after each sentence and thought it was a mistake by someone. -- Thanking You, Darshit Shah
Re: [Bug-wget] Segmentation fault with current development version of wget
Thanks, the code seems correct but since you are going to write a lot of code during your Summer of Code project, I will be very pedantic. This time I have just amended the changes before push your patch. Darshit Shah writes: > Please be aware that Wget needs to know the size of the POST data in > -advance. Therefore the argument to @code{--post-file} must be a regular we put two empty spaces after a sentence. I have fixed this change in other places too. > + switch (statcode) > + { I have indented the '{' correctly. Two empty spaces. Same for the closing curly bracket. > + if ((!(*dt & RETROKF) && !opt.content_on_error) || head_only || > (opt.method && strcasecmp (opt.method, "GET") != 0)) long line, I have splitted it. -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
> please follow the GNU Coding standards here. Empty space between > function name and '(' and also leave spaces around the operator !=. > > Sorry for missing these. Fixed in all the relevant locations in the patch. > Could you please also add some comments in the documentation? > Added the changes in wget.texi. Also fixed formatting where two whitespaces were used between sentences and a couple of typos. > We can skip tests for now, since you are going to rewrite the unit tests > for your Summer of Code, but we will probably want something to stress > these features too. > Yes, I intend to write rigorous tests for these features. -- Thanking You, Darshit Shah 0001-Follow-RFC-2616-and-httpbis-specifications-when-hand.patch Description: Binary data
Re: [Bug-wget] Segmentation fault with current development version of wget
thanks for your contribution, some comments inline: Darshit Shah writes: > - if (statcode == HTTP_STATUS_TEMPORARY_REDIRECT) > -return NEWLOCATION_KEEP_POST; > + switch (statcode) > + { > +case HTTP_STATUS_TEMPORARY_REDIRECT: > + return NEWLOCATION_KEEP_POST; > + break; > +case HTTP_STATUS_MOVED_PERMANENTLY: > + if(opt.method && strcasecmp(opt.method,"post")!=0) > +return NEWLOCATION_KEEP_POST; > + break; > +case HTTP_STATUS_MOVED_TEMPORARILY: > + if(opt.method && strcasecmp(opt.method,"post")!=0) > +return NEWLOCATION_KEEP_POST; > + break; please follow the GNU Coding standards here. Empty space between function name and '(' and also leave spaces around the operator !=. > + if (opt.method && strcasecmp(opt.method,"HEAD")==0) > +setoptval("spider","1","spider"); same here. Also leave a space after the commas. I have seen these errors in other places too, please double check the patch. Could you please also add some comments in the documentation? We can skip tests for now, since you are going to rewrite the unit tests for your Summer of Code, but we will probably want something to stress these features too. -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Attaching a patch that should (hopefully) fix all the problems listed here. I have tested it and it looks good to me. On Thu, May 9, 2013 at 9:55 AM, Darshit Shah wrote: > > > We currently suspend any method if the response code is a non 307 >> > redirect. That I believe is wrong. We should only be suspending the >> > POST method and not others. >> >> It sounds good, but we must be careful also with 303. >> >> Do you agree that the matrix for the method to use on the next request >> should look this way? >> >> original >> method -> POST FOO HEAD >> response >> status >> | >> \/ >> 301 GETFOO(terminate) >> 302 GETFOO(terminate) >> 303 GETGET(terminate) >> 307 POST FOO(terminate) >> >> Looks good to me. This will require some work so that the code remains > maintainable later. > I implemented setting opt.spider when invoked with --method head. But > spider mode currently follows all redirects till it receives a 200 OK. This > does not match with the intended behaviour. > > >> > Well yeah. It has opened a whole lot of possibilities that we must >> > keep considering. I will handle it separately for HEAD and send in a >> > patch soon. However, I was thinking, that since the new --method >> > command needs a lot of work specific to itself and that it may only >> > increase in the future, we should handle that code separately? Maybe a >> > set_method_opts() function? >> >> we can refactor if/when we really need it, no need to overengineer the >> code now. > > Let's keep it for later then. > > Do you already have any of such "opts" in mind? >> > Just the code that handles post-body/data and now for HEAD. > > > > -- > Thanking You, > Darshit Shah > > -- Thanking You, Darshit Shah 0001-Follow-RFC-2616-and-httpbis-specifications-when-hand.patch Description: Binary data
Re: [Bug-wget] Segmentation fault with current development version of wget
> > We currently suspend any method if the response code is a non 307 > > redirect. That I believe is wrong. We should only be suspending the > > POST method and not others. > > It sounds good, but we must be careful also with 303. > > Do you agree that the matrix for the method to use on the next request > should look this way? > > original > method -> POST FOO HEAD > response > status > | > \/ > 301 GETFOO(terminate) > 302 GETFOO(terminate) > 303 GETGET(terminate) > 307 POST FOO(terminate) > > Looks good to me. This will require some work so that the code remains maintainable later. I implemented setting opt.spider when invoked with --method head. But spider mode currently follows all redirects till it receives a 200 OK. This does not match with the intended behaviour. > > Well yeah. It has opened a whole lot of possibilities that we must > > keep considering. I will handle it separately for HEAD and send in a > > patch soon. However, I was thinking, that since the new --method > > command needs a lot of work specific to itself and that it may only > > increase in the future, we should handle that code separately? Maybe a > > set_method_opts() function? > > we can refactor if/when we really need it, no need to overengineer the > code now. Let's keep it for later then. Do you already have any of such "opts" in mind? > Just the code that handles post-body/data and now for HEAD. -- Thanking You, Darshit Shah
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: > > We suspend the post data only when we receive a 307, or do you > mean we > shouldn't suspend in this case too? > > It's the other way round. A 307 response code is used when the server > wishes to explicitly ask the client to not suspend. And that is also > the current behaviour. true, I made some confusion with the terms :-) > We currently suspend any method if the response code is a non 307 > redirect. That I believe is wrong. We should only be suspending the > POST method and not others. It sounds good, but we must be careful also with 303. Do you agree that the matrix for the method to use on the next request should look this way? original method -> POST FOO HEAD response status | \/ 301 GETFOO(terminate) 302 GETFOO(terminate) 303 GETGET(terminate) 307 POST FOO(terminate) > Well yeah. It has opened a whole lot of possibilities that we must > keep considering. I will handle it separately for HEAD and send in a > patch soon. However, I was thinking, that since the new --method > command needs a lot of work specific to itself and that it may only > increase in the future, we should handle that code separately? Maybe a > set_method_opts() function? we can refactor if/when we really need it, no need to overengineer the code now. Do you already have any of such "opts" in mind? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
> > We suspend the post data only when we receive a 307, or do you mean we > shouldn't suspend in this case too? > > It's the other way round. A 307 response code is used when the server wishes to explicitly ask the client to not suspend. And that is also the current behaviour. We currently suspend any method if the response code is a non 307 redirect. That I believe is wrong. We should only be suspending the POST method and not others. > The --method patch has opened a can of worms and now we have to consider > literally any HTTP method and how it make more sense for the users. > I think that very few people are going to abuse --method with > TRACE/OPTIONS and we can consider these cases later, if they really hurt > someone. On the other hand, it will be nice to handle separately HEAD. > > Well yeah. It has opened a whole lot of possibilities that we must keep considering. I will handle it separately for HEAD and send in a patch soon. However, I was thinking, that since the new --method command needs a lot of work specific to itself and that it may only increase in the future, we should handle that code separately? Maybe a set_method_opts() function? -- Thanking You, Darshit Shah
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: > While DELETE *may* be redirected, the document did not mention > anything explicitly about it. From what I gauge, we should not suspend > a DELETE command upon a redirect. We suspend the post data only when we receive a 307, or do you mean we shouldn't suspend in this case too? > Also, this reminds me, when a user specifies --method HEAD, that > implies s/he wishes to see the Response provided by the server and > hence in this case the headers should be printed in ALL outputs > (quiet, verbose and debug). Same with TRACE/OPTIONS commands since > these are primarily used when the user wishes to view the Response > generated. The --method patch has opened a can of worms and now we have to consider literally any HTTP method and how it make more sense for the users. I think that very few people are going to abuse --method with TRACE/OPTIONS and we can consider these cases later, if they really hurt someone. On the other hand, it will be nice to handle separately HEAD. -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
> This paragraph: > > "The 307 (Temporary Redirect) status code indicates that the target > resource resides temporarily under a different URI and the user agent > MUST NOT change the request method if it performs an automatic > redirection to that URI. Since the redirection can change over time, > the client ought to continue using the original effective request URI > for future requests." > > seems to confirm that the patch from Gijs, that keeps the original > method, is correct. > > I am going to clean it up and if nobody complains in a few hours, I will > push it. > I spent some time reading the HTTPbis draft for the update to RFC 2616. According to HTTPbis, ONLY the POST request should be suspended (which again is in violation of RFC 2616 but allowed due to popular implementation). The PUT method should almost never be redirected, and if it is redirected the new URI should point to the location where the resource is to be created/updated. As has been discussed earlier, HEAD should not even be following the redirects. OPTIONS/CONNECT/TRACE should not have a 3xx Response. While DELETE *may* be redirected, the document did not mention anything explicitly about it. From what I gauge, we should not suspend a DELETE command upon a redirect. Also, this reminds me, when a user specifies --method HEAD, that implies s/he wishes to see the Response provided by the server and hence in this case the headers should be printed in ALL outputs (quiet, verbose and debug). Same with TRACE/OPTIONS commands since these are primarily used when the user wishes to view the Response generated.
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: >while I agree that your patch makes the code more readable and it is > > good to apply it, I don't see how the previous version was broken. > What > compiler have you used? > >I see you pushed the patch. But have we gotten to the bottom of the >cause? If the result changes with the GCC version, we should probably >report it upstream. no, my mistake. I remembered that it was correct to change the lvalue on the left side of the assignment but it seems not the case, and that is an undefined behaviour; good that you hit this problem and fixed it. > no, it doesn't, same for GET. > >True. No sense suspending and setting the same method again. But adding >code to test for the same may add enough overhead to negate most gains >obtained by preventing it. >I feel we should instead ignore processing the --method option if it's >argument is "GET". >Also, I am not aware, but is there any use case where a user may wish >to send body-data with a GET request? maybe testing the web server :-) but let's keep in mind the normal usage and it seems correct to use "--method GET" or "--method HEAD", in this case we should raise a warning if the user also specifies a body-data saying that the body-data is ignored. Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
> > while I agree that your patch makes the code more readable and it is > good to apply it, I don't see how the previous version was broken. What > compiler have you used? > > I see you pushed the patch. But have we gotten to the bottom of the cause? If the result changes with the GCC version, we should probably report it upstream. no, it doesn't, same for GET. True. No sense suspending and setting the same method again. But adding code to test for the same may add enough overhead to negate most gains obtained by preventing it. I feel we should instead ignore processing the --method option if it's argument is "GET". Also, I am not aware, but is there any use case where a user may wish to send body-data with a GET request?
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: > I also think some more explicit sanity checks are needed in the code. > Like sending data with a HEAD request makes absolutely no sense (does > it?). no, it doesn't, same for GET. -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Giuseppe, while I agree that your patch makes the code more readable and it is > good to apply it, I don't see how the previous version was broken. What > compiler have you used? > This occurs when compiling on gcc 4.8.0. The following output ensues: Setting --method (method) to head DEBUG output created by Wget 1.14.43-cf74-dirty on linux-gnu. URI encoding = ‘UTF-8’ --2013-05-05 07:56:19-- http://www.example.com/ Resolving www.example.com (www.example.com)... 192.0.43.10, 2001:500:88:200::10 Caching www.example.com => 192.0.43.10 2001:500:88:200::10 Connecting to www.example.com (www.example.com)|192.0.43.10|:80... connected. Created socket 3. Releasing 0x01271410 (new refcount 1). ---request begin--- EAD / HTTP/1.1 User-Agent: Wget/1.14.43-cf74-dirty (linux-gnu) Accept: */* Host: www.example.com Connection: Keep-Alive See how the request has the first character missing? And since --post-data is also handled internally through this, it breaks all scripts that use wget to send POST headers. While on the topic of compilation issues, the linker flag, D_FORTIFY_SOURCE throws a couple of warnings. Would we be interested in looking into those and sorting them out? > > Good catch! Maybe we should handle --method=HEAD as opt.spider? > That makes sense. I also think some more explicit sanity checks are needed in the code. Like sending data with a HEAD request makes absolutely no sense (does it?).
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Darshit, Darshit Shah writes: > The patch supplied by you adding cmd_uppercase_String seems to break > the header generation since (*q++) causes the pointer to increment > before the c_toupper() method returns. > > We should instead increment it only after the statement successfully > returns. > I have also converted the loop to a for loop to reduce number of > lines. while I agree that your patch makes the code more readable and it is good to apply it, I don't see how the previous version was broken. What compiler have you used? > One issue I see with the latest commit that suspends the method on > redirection is when using HEAD. > When I use --method=HEAD, I have no intentions of it getting converted > to GET upon redirection. In fact, I don't think wget should even > follow the redirection when opt.method is "HEAD". Good catch! Maybe we should handle --method=HEAD as opt.spider? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Giuseppe, The patch supplied by you adding cmd_uppercase_String seems to break the header generation since (*q++) causes the pointer to increment before the c_toupper() method returns. We should instead increment it only after the statement successfully returns. I have also converted the loop to a for loop to reduce number of lines. One issue I see with the latest commit that suspends the method on redirection is when using HEAD. When I use --method=HEAD, I have no intentions of it getting converted to GET upon redirection. In fact, I don't think wget should even follow the redirection when opt.method is "HEAD". On Fri, May 3, 2013 at 12:04 AM, Giuseppe Scrivano wrote: > Daniel Stenberg writes: > > > On Thu, 2 May 2013, Giuseppe Scrivano wrote: > > > >> RFC 2606 doesn't seem very clear about it, and I can't find anywhere > >> that PUT/OPTIONS/ANYTHING should be handled differently than POST > >> wrt redirections. I don't see why suspending a PUT request would be > >> incorrect. Darshit, do you have any pointer? > > > > Please use the httpbis documents instead of solely relying on RFC2616: > > > http://datatracker.ietf.org/doc/draft-ietf-httpbis-p2-semantics/?include_text=1 > > they're much updated and leaves less to interpret and is now based on > > many years of real-world HTTP use. > > thanks for the additional info. > > This paragraph: > > "The 307 (Temporary Redirect) status code indicates that the target > resource resides temporarily under a different URI and the user agent > MUST NOT change the request method if it performs an automatic > redirection to that URI. Since the redirection can change over time, > the client ought to continue using the original effective request URI > for future requests." > > seems to confirm that the patch from Gijs, that keeps the original > method, is correct. > > I am going to clean it up and if nobody complains in a few hours, I will > push it. > > -- > Giuseppe > > -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Phobia. B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani 0001-Fix-issue-when-converting-string-to-uppercase.patch Description: Binary data
Re: [Bug-wget] Segmentation fault with current development version of wget
Daniel Stenberg writes: > On Thu, 2 May 2013, Giuseppe Scrivano wrote: > >> RFC 2606 doesn't seem very clear about it, and I can't find anywhere >> that PUT/OPTIONS/ANYTHING should be handled differently than POST >> wrt redirections. I don't see why suspending a PUT request would be >> incorrect. Darshit, do you have any pointer? > > Please use the httpbis documents instead of solely relying on RFC2616: > http://datatracker.ietf.org/doc/draft-ietf-httpbis-p2-semantics/?include_text=1 > they're much updated and leaves less to interpret and is now based on > many years of real-world HTTP use. thanks for the additional info. This paragraph: "The 307 (Temporary Redirect) status code indicates that the target resource resides temporarily under a different URI and the user agent MUST NOT change the request method if it performs an automatic redirection to that URI. Since the redirection can change over time, the client ought to continue using the original effective request URI for future requests." seems to confirm that the patch from Gijs, that keeps the original method, is correct. I am going to clean it up and if nobody complains in a few hours, I will push it. -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
On Thu, 2 May 2013, Giuseppe Scrivano wrote: RFC 2606 doesn't seem very clear about it, and I can't find anywhere that PUT/OPTIONS/ANYTHING should be handled differently than POST wrt redirections. I don't see why suspending a PUT request would be incorrect. Darshit, do you have any pointer? Please use the httpbis documents instead of solely relying on RFC2616: http://datatracker.ietf.org/doc/draft-ietf-httpbis-p2-semantics/?include_text=1 they're much updated and leaves less to interpret and is now based on many years of real-world HTTP use. While on the subject of redirects, you may also like this: http://tools.ietf.org/html/draft-reschke-http-status-308-07 -- / daniel.haxx.se
Re: [Bug-wget] Segmentation fault with current development version of wget
Gijs van Tulder writes: > Hi Darshit, > > Darshit Shah wrote: >> Unless we explicitly check for opt.method = POST, this will cause a lot >> of issues. Since the macro is being called on every redirect, even >> HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is >> not the behaviour we want. > > Ah, I see it's more complicated. If I understand RFC 2616 correctly > (section 10.3), suspending a PUT request in response to a redirect is > incorrect. But so is changing a POST request to a GET in response to > anything but a 303. But that's expected (and the current behaviour of > Wget): Thanks you both for the feedback. Indeed, my first patch doesn't consider this case. RFC 2606 doesn't seem very clear about it, and I can't find anywhere that PUT/OPTIONS/ANYTHING should be handled differently than POST wrt redirections. I don't see why suspending a PUT request would be incorrect. Darshit, do you have any pointer? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Darshit, Darshit Shah wrote: > Unless we explicitly check for opt.method = POST, this will cause a lot > of issues. Since the macro is being called on every redirect, even > HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is > not the behaviour we want. Ah, I see it's more complicated. If I understand RFC 2616 correctly (section 10.3), suspending a PUT request in response to a redirect is incorrect. But so is changing a POST request to a GET in response to anything but a 303. But that's expected (and the current behaviour of Wget): Note: When automatically redirecting a POST request after receiving a 301 status code, some existing HTTP/1.0 user agents will erroneously change it into a GET request. I give up. Regards, Gijs
Re: [Bug-wget] Segmentation fault with current development version of wget
> > Still, even if the sanitization is removed: I think it would be better if > RESTORE_POST_DATA restores the previous value of opt.method, instead of > overwriting it with a hardcoded "POST". Isn't it? > > As double safety yes. Maybe we should do that. > A related question: how is a redirect response to a PUT request handled? > How should it be handled? > I'd rather that Giuseppe or someone else answers this in more detail. I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA > macro is called (by retrieve_url in retr.c). If that's true, then later on > the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA. > > In case, we wish to use the SUSPEND_POST_DATA macro for other commands too, then we must make it store method in a backup string and retrieve it. In any case, I think storing and retrieving opt.method is probably a good, safe option that we must implement. > Regards, > > Gijs > > -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Phobia. B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Giuseppe, Dropping the bit that sanitizes the opt.method is probably a good idea. (Perhaps I shouldn't have replied to your patch directly.) Still, even if the sanitization is removed: I think it would be better if RESTORE_POST_DATA restores the previous value of opt.method, instead of overwriting it with a hardcoded "POST". Isn't it? A related question: how is a redirect response to a PUT request handled? How should it be handled? I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA macro is called (by retrieve_url in retr.c). If that's true, then later on the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA. Regards, Gijs Op 01-05-13 22:16 schreef Giuseppe Scrivano: hi Gijs, Gijs van Tulder writes: Giuseppe Scrivano wrote: what about this patch? Any comment? Another suggestion: why not save the original opt.method, set opt.method to NULL and put the original opt.method back later? thanks for your suggestion but I think we should drop the code that modifies opt.method, since we have to sanitize it only when it is specified as argument. Objections?
Re: [Bug-wget] Segmentation fault with current development version of wget
On Thu, May 2, 2013 at 12:56 AM, Gijs van Tulder wrote: > Giuseppe Scrivano wrote: > > what about this patch? Any comment? > > Another suggestion: why not save the original opt.method, set opt.method > to NULL and put the original opt.method back later? > > Gijs > Unless we explicitly check for opt.method = POST, this will cause a lot of issues. Since the macro is being called on every redirect, even HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is not the behaviour we want. And if we are explicitly writing the above test, then why not simply allow the xstrdup to copy the string? (Although, I believe using a const *string may be more efficient. ) @Tim: The various tests using -r are all in spider mode. This part of the code is not reached when in spider mode. opt.method && strcasecmp(opt.method,"POST") == 0 provides for more readable code. Maybe we should use that. @Giuseppe: Thanks for that patch! I didn't think along those lines, adding a new command. It also cleans up the code from http.c. It shouldn't have been there in the first place.
Re: [Bug-wget] Segmentation fault with current development version of wget
hi Gijs, Gijs van Tulder writes: > Giuseppe Scrivano wrote: >> what about this patch? Any comment? > > Another suggestion: why not save the original opt.method, set > opt.method to NULL and put the original opt.method back later? thanks for your suggestion but I think we should drop the code that modifies opt.method, since we have to sanitize it only when it is specified as argument. Objections? -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Giuseppe Scrivano wrote: > what about this patch? Any comment? Another suggestion: why not save the original opt.method, set opt.method to NULL and put the original opt.method back later? Gijs diff --git a/src/retr.c b/src/retr.c index d51b7e7..2aee578 100644 --- a/src/retr.c +++ b/src/retr.c @@ -679,22 +679,23 @@ calc_rate (wgint bytes, double secs, int *units) #define SUSPEND_POST_DATA do { \ post_data_suspended = true; \ saved_post_data = opt.body_data; \ saved_post_file_name = opt.body_file; \ + saved_method = opt.method;\ opt.body_data = NULL; \ opt.body_file = NULL; \ opt.method = NULL;\ } while (0) #define RESTORE_POST_DATA do { \ if (post_data_suspended) \ { \ opt.body_data = saved_post_data; \ opt.body_file = saved_post_file_name; \ + opt.method = saved_method;\ post_data_suspended = false; \ - opt.method = "POST"; \ } \ } while (0) static char *getproxy (struct url *); @@ -721,10 +722,11 @@ retrieve_url (struct url * orig_parsed, const char *origurl, char **file, int redirection_count = 0; bool post_data_suspended = false; char *saved_post_data = NULL; char *saved_post_file_name = NULL; + char *saved_method = NULL; /* If dt is NULL, use local storage. */ if (!dt) { dt = &dummy;
Re: [Bug-wget] Segmentation fault with current development version of wget
Am Mittwoch, 1. Mai 2013 schrieb Darshit Shah: > First, sorry for the quick and dirty hack which was the perfect example of > how NOT to do things. Than it was a good example ;-) > Secondly, it lies upon me that this feature wasn't tested before submitting > the patch. I had however relied on the Test Environment and since it passed > everything there, I thought it was working correctly. Guess, we should add > a test for this soon. --recursive is a commonly used switch with Wget and > not having a test to prevent regressions on it is very bad. There are several tests using -r. The question is why the problem doesn't come out. > I am fixing this issue, but it is a terribly ugly hack. If someone could > help improve it I'd be most truly grateful. > I have a couple of ideas, but I will need to work them out and implement > them when I have the time. > > The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp > on a NULL String, and we cannot initialize opt.method since that would > break some sanity checks which are in place so that --post-* and --body-* > commands don't conflict with each other. Your test isn't really ugly. I (and most C programmers) favor opt.method && strcasecmp(opt.method,"POST") == 0 instead of opt.method ? strcasecmp(opt.method,"POST") == 0 : false But that is not really important. It is pretty common, that one or both args to strcmp may be NULL. The really ugly thing is, that there are no string function alternatives that handle NULL pointers. I regularly use my own versions like strcmp_null() to avoid extra checks. Regards, Tim
Re: [Bug-wget] Segmentation fault with current development version of wget
Darshit Shah writes: > I am fixing this issue, but it is a terribly ugly hack. If someone could > help improve it I'd be most truly grateful. > I have a couple of ideas, but I will need to work them out and implement > them when I have the time. > > The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp > on a NULL String, and we cannot initialize opt.method since that would > break some sanity checks which are in place so that --post-* and --body-* > commands don't conflict with each other. what about this patch? Any comment? diff --git a/src/http.c b/src/http.c index 25ad474..6b23382 100644 --- a/src/http.c +++ b/src/http.c @@ -1766,12 +1766,8 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy, if (head_only) meth = "HEAD"; else if (opt.method) - { -char *q; -for (q = opt.method; *q; ++q) - *q = c_toupper (*q); -meth = opt.method; - } + meth = opt.method; /* Use the full path, i.e. one that includes the leading slash and the query string. E.g. if u->path is "foo/bar" and u->query is "param=value", full_path will be "/foo/bar?param=value". */ diff --git a/src/init.c b/src/init.c index 813781f..370506f 100644 --- a/src/init.c +++ b/src/init.c @@ -87,6 +87,7 @@ CMD_DECLARE (cmd_directory_vector); CMD_DECLARE (cmd_number); CMD_DECLARE (cmd_number_inf); CMD_DECLARE (cmd_string); +CMD_DECLARE (cmd_string_uppercase); CMD_DECLARE (cmd_file); CMD_DECLARE (cmd_directory); CMD_DECLARE (cmd_time); @@ -212,7 +213,7 @@ static const struct { { "logfile", &opt.lfilename, cmd_file }, { "login",&opt.ftp_user, cmd_string },/* deprecated*/ { "maxredirect", &opt.max_redirect, cmd_number }, - { "method", &opt.method,cmd_string }, + { "method", &opt.method,cmd_string_uppercase }, { "mirror", NULL, cmd_spec_mirror }, { "netrc",&opt.netrc, cmd_boolean }, { "noclobber",&opt.noclobber, cmd_boolean }, @@ -959,8 +960,24 @@ cmd_string (const char *com, const char *val, void *place) return true; } +/* Like cmd_string but ensure the string is upper case. */ +static bool +cmd_string_uppercase (const char *com, const char *val, void *place) +{ + char *q; + bool ret = cmd_string (com, val, place); + q = *((char **) place); + if (!ret || q == NULL) +return false; + + while (*q) +*q++ = c_toupper (*q); + + return true; +} + -/* Like the above, but handles tilde-expansion when reading a user's +/* Like cmd_string, but handles tilde-expansion when reading a user's `.wgetrc'. In that case, and if VAL begins with `~', the tilde gets expanded to the user's home directory. */ static bool -- Giuseppe
Re: [Bug-wget] Segmentation fault with current development version of wget
Couple of things. First, sorry for the quick and dirty hack which was the perfect example of how NOT to do things. Secondly, it lies upon me that this feature wasn't tested before submitting the patch. I had however relied on the Test Environment and since it passed everything there, I thought it was working correctly. Guess, we should add a test for this soon. --recursive is a commonly used switch with Wget and not having a test to prevent regressions on it is very bad. Lastly, I have spent some time on this issue today and figured out the issue. The SUSPEND_POST_DATA logic is so written that the macro is called on every 301/302 redirect. This did not pose any issues till now since when NOT using --post-file or --post-data, the respective variables would be holding a NULL value which is backed up and then restored without any problems. However, with the addition of the --method option, I am explicitly setting opt.method to POST on RESUME_POST_DATA since the macro is supposed to be called only during POST requests. This is the source of the error. The actual Segmentation Fault was being caused since I, in all my foolishness did not strdup, but instead directly assigned the string to a pointer. I am fixing this issue, but it is a terribly ugly hack. If someone could help improve it I'd be most truly grateful. I have a couple of ideas, but I will need to work them out and implement them when I have the time. The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp on a NULL String, and we cannot initialize opt.method since that would break some sanity checks which are in place so that --post-* and --body-* commands don't conflict with each other. 0001-Fix-bug-in-program-logic-and-Seg-Fault.patch Description: Binary data
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi Darshit, I understand that your patch was just a quick hack. But even than you should avoid doing opt.method == "POST" for string comparisons. This definitely not portable. Not every compiler/linker aggregates two occurrences of the same static string into one address in memory. You should use strcmp or strcasecmp to check if opt.method points to "POST". Regards, Tim Am Dienstag, 30. April 2013 schrieb Darshit Shah: > Okay, I could not prevent myself from looking into it. > > It seems as if the SUSPEND_POST_DATA macro was being called during a > recursive download attempt. > > Attaching a hack around the situation. Will look more deeply when I have > more time to identify what caused the regression. > > On Tue, Apr 30, 2013 at 9:09 PM, Darshit Shah wrote: > > > I'm in the middle of University exams at the moment. I'll still have a > > look at it tomorrow when I get a breather. > > > > However, it looks like wget is converting any method to a POST request > > which is weird since that should have caused it to fail most of the tests. > > > > I'll have to look into it and check what is causing this issue. > > > > On Tue, Apr 30, 2013 at 8:46 PM, Tim Ruehsen wrote: > > > >> Hi, > >> > >> you can even reproduce it with a simple > >> wget -r http://translationproject.org/latest/make > >> > >> Darshit, maybe you can have a look at it. It has something to do with > >> opt.method (set to read-only "POST" in http.c, line 1772). > >> > >> Even with --method=GET opt.method points to "POST". And the code tries to > >> uppercase it inline. > >> > >> Regards, Tim > >> > >> Am Tuesday 30 April 2013 schrieb Stefano Lattarini: > >> > Here is the reproducer: > >> > > >> > $ wget --passive-ftp -nv --recursive --level=1 --no-directories > >> > --no-parent \ --no-check-certificate -A '*.po' > >> > http://translationproject.org/latest/make 2013-04-30 16:41:25 > >> > URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" > >> [1] > >> > Segmentation fault > >> > > >> > Sorry, I don't have to look into this more deeply ATM, not to provide > >> > further feedback. Hope you can reproduce this yourself; otherwise, I'll > >> > try to get back to you in the next days. > >> > > >> > Regards, and HTH, > >> > Stefano > >> > >> Mit freundlichem Gruß > >> > >> Tim Rühsen > >> > > > > > > > > -- > > Thanking You, > > Darshit Shah > > Research Lead, Code Innovation > > Kill Code Phobia. > > B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani > > > > > > -- > Thanking You, > Darshit Shah > Research Lead, Code Innovation > Kill Code Phobia. > B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani >
Re: [Bug-wget] Segmentation fault with current development version of wget
On 04/30/2013 07:01 PM, Darshit Shah wrote: > Okay, I could not prevent myself from looking into it. > > It seems as if the SUSPEND_POST_DATA macro was being called during a > recursive download attempt. > > Attaching a hack around the situation. Will look more deeply when I have > more time to identify what caused the regression. > I can confirm this temporary fix works for me. Thanks, Stefano
Re: [Bug-wget] Segmentation fault with current development version of wget
Okay, I could not prevent myself from looking into it. It seems as if the SUSPEND_POST_DATA macro was being called during a recursive download attempt. Attaching a hack around the situation. Will look more deeply when I have more time to identify what caused the regression. On Tue, Apr 30, 2013 at 9:09 PM, Darshit Shah wrote: > I'm in the middle of University exams at the moment. I'll still have a > look at it tomorrow when I get a breather. > > However, it looks like wget is converting any method to a POST request > which is weird since that should have caused it to fail most of the tests. > > I'll have to look into it and check what is causing this issue. > > On Tue, Apr 30, 2013 at 8:46 PM, Tim Ruehsen wrote: > >> Hi, >> >> you can even reproduce it with a simple >> wget -r http://translationproject.org/latest/make >> >> Darshit, maybe you can have a look at it. It has something to do with >> opt.method (set to read-only "POST" in http.c, line 1772). >> >> Even with --method=GET opt.method points to "POST". And the code tries to >> uppercase it inline. >> >> Regards, Tim >> >> Am Tuesday 30 April 2013 schrieb Stefano Lattarini: >> > Here is the reproducer: >> > >> > $ wget --passive-ftp -nv --recursive --level=1 --no-directories >> > --no-parent \ --no-check-certificate -A '*.po' >> > http://translationproject.org/latest/make 2013-04-30 16:41:25 >> > URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" >> [1] >> > Segmentation fault >> > >> > Sorry, I don't have to look into this more deeply ATM, not to provide >> > further feedback. Hope you can reproduce this yourself; otherwise, I'll >> > try to get back to you in the next days. >> > >> > Regards, and HTH, >> > Stefano >> >> Mit freundlichem Gruß >> >> Tim Rühsen >> > > > > -- > Thanking You, > Darshit Shah > Research Lead, Code Innovation > Kill Code Phobia. > B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani > -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Phobia. B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani 0001-Fix-segfault-in-recursive-calls.patch Description: Binary data
Re: [Bug-wget] Segmentation fault with current development version of wget
I'm in the middle of University exams at the moment. I'll still have a look at it tomorrow when I get a breather. However, it looks like wget is converting any method to a POST request which is weird since that should have caused it to fail most of the tests. I'll have to look into it and check what is causing this issue. On Tue, Apr 30, 2013 at 8:46 PM, Tim Ruehsen wrote: > Hi, > > you can even reproduce it with a simple > wget -r http://translationproject.org/latest/make > > Darshit, maybe you can have a look at it. It has something to do with > opt.method (set to read-only "POST" in http.c, line 1772). > > Even with --method=GET opt.method points to "POST". And the code tries to > uppercase it inline. > > Regards, Tim > > Am Tuesday 30 April 2013 schrieb Stefano Lattarini: > > Here is the reproducer: > > > > $ wget --passive-ftp -nv --recursive --level=1 --no-directories > > --no-parent \ --no-check-certificate -A '*.po' > > http://translationproject.org/latest/make 2013-04-30 16:41:25 > > URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" > [1] > > Segmentation fault > > > > Sorry, I don't have to look into this more deeply ATM, not to provide > > further feedback. Hope you can reproduce this yourself; otherwise, I'll > > try to get back to you in the next days. > > > > Regards, and HTH, > > Stefano > > Mit freundlichem Gruß > > Tim Rühsen > -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Phobia. B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
Re: [Bug-wget] Segmentation fault with current development version of wget
Hi, you can even reproduce it with a simple wget -r http://translationproject.org/latest/make Darshit, maybe you can have a look at it. It has something to do with opt.method (set to read-only "POST" in http.c, line 1772). Even with --method=GET opt.method points to "POST". And the code tries to uppercase it inline. Regards, Tim Am Tuesday 30 April 2013 schrieb Stefano Lattarini: > Here is the reproducer: > > $ wget --passive-ftp -nv --recursive --level=1 --no-directories > --no-parent \ --no-check-certificate -A '*.po' > http://translationproject.org/latest/make 2013-04-30 16:41:25 > URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" [1] > Segmentation fault > > Sorry, I don't have to look into this more deeply ATM, not to provide > further feedback. Hope you can reproduce this yourself; otherwise, I'll > try to get back to you in the next days. > > Regards, and HTH, > Stefano Mit freundlichem Gruß Tim Rühsen
[Bug-wget] Segmentation fault with current development version of wget
Here is the reproducer: $ wget --passive-ftp -nv --recursive --level=1 --no-directories --no-parent \ --no-check-certificate -A '*.po' http://translationproject.org/latest/make 2013-04-30 16:41:25 URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" [1] Segmentation fault Sorry, I don't have to look into this more deeply ATM, not to provide further feedback. Hope you can reproduce this yourself; otherwise, I'll try to get back to you in the next days. Regards, and HTH, Stefano