Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Daniel Stenberg

On Thu, 2 May 2013, Giuseppe Scrivano wrote:

RFC 2606 doesn't seem very clear about it, and I can't find anywhere that 
PUT/OPTIONS/ANYTHING should be handled differently than POST wrt 
redirections.  I don't see why suspending a PUT request would be incorrect. 
Darshit, do you have any pointer?


Please use the httpbis documents instead of solely relying on RFC2616: 
http://datatracker.ietf.org/doc/draft-ietf-httpbis-p2-semantics/?include_text=1
they're much updated and leaves less to interpret and is now based on many 
years of real-world HTTP use.


While on the subject of redirects, you may also like this: 
http://tools.ietf.org/html/draft-reschke-http-status-308-07


--

 / daniel.haxx.se



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Giuseppe Scrivano
Gijs van Tulder  writes:

> Hi Darshit,
>
> Darshit Shah wrote:
>> Unless we explicitly check for opt.method = POST, this will cause a lot
>> of issues. Since the macro is being called on every redirect, even
>> HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is
>> not the behaviour we want.
>
> Ah, I see it's more complicated. If I understand RFC 2616 correctly
> (section 10.3), suspending a PUT request in response to a redirect is
> incorrect. But so is changing a POST request to a GET in response to
> anything but a 303. But that's expected (and the current behaviour of
> Wget):

Thanks you both for the feedback.  Indeed, my first patch doesn't
consider this case.

RFC 2606 doesn't seem very clear about it, and I can't find anywhere
that PUT/OPTIONS/ANYTHING should be handled differently than POST wrt
redirections.  I don't see why suspending a PUT request would be
incorrect.  Darshit, do you have any pointer?

-- 
Giuseppe



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Gijs van Tulder

Hi Darshit,

Darshit Shah wrote:
> Unless we explicitly check for opt.method = POST, this will cause a lot
> of issues. Since the macro is being called on every redirect, even
> HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is
> not the behaviour we want.

Ah, I see it's more complicated. If I understand RFC 2616 correctly 
(section 10.3), suspending a PUT request in response to a redirect is 
incorrect. But so is changing a POST request to a GET in response to 
anything but a 303. But that's expected (and the current behaviour of Wget):


  Note: When automatically redirecting a POST request after
  receiving a 301 status code, some existing HTTP/1.0 user agents
  will erroneously change it into a GET request.

I give up.

Regards,

Gijs



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Darshit Shah
>
> Still, even if the sanitization is removed: I think it would be better if
> RESTORE_POST_DATA restores the previous value of opt.method, instead of
> overwriting it with a hardcoded "POST". Isn't it?
>
> As double safety yes. Maybe we should do that.


> A related question: how is a redirect response to a PUT request handled?
> How should it be handled?
>

I'd rather that Giuseppe or someone else answers this in more detail.

I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA
> macro is called (by retrieve_url in retr.c). If that's true, then later on
> the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA.
>
> In case, we wish to use the SUSPEND_POST_DATA macro for other commands
too, then we must make it store method in a backup string and retrieve it.
In any case, I think storing and retrieving opt.method is probably a good,
safe option that we must implement.


> Regards,
>
> Gijs
>
>


-- 
Thanking You,
Darshit Shah
Research Lead, Code Innovation
Kill Code Phobia.
B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani


Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Gijs van Tulder

Hi Giuseppe,

Dropping the bit that sanitizes the opt.method is probably a good idea. 
(Perhaps I shouldn't have replied to your patch directly.)


Still, even if the sanitization is removed: I think it would be better 
if RESTORE_POST_DATA restores the previous value of opt.method, instead 
of overwriting it with a hardcoded "POST". Isn't it?


A related question: how is a redirect response to a PUT request handled? 
How should it be handled?


I haven't tried it, but it looks like in that case the SUSPEND_POST_DATA 
macro is called (by retrieve_url in retr.c). If that's true, then later 
on the opt.method would be 'restored' to "POST" by RESTORE_POST_DATA.


Regards,

Gijs


Op 01-05-13 22:16 schreef Giuseppe Scrivano:

hi Gijs,

Gijs van Tulder  writes:


Giuseppe Scrivano wrote:

what about this patch?  Any comment?


Another suggestion: why not save the original opt.method, set
opt.method to NULL and put the original opt.method back later?


thanks for your suggestion but I think we should drop the code that
modifies opt.method, since we have to sanitize it only when it is
specified as argument.  Objections?






Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Darshit Shah
On Thu, May 2, 2013 at 12:56 AM, Gijs van Tulder  wrote:

> Giuseppe Scrivano wrote:
> > what about this patch?  Any comment?
>
> Another suggestion: why not save the original opt.method, set opt.method
> to NULL and put the original opt.method back later?
>
> Gijs
>

Unless we explicitly check for opt.method = POST, this will cause a lot of
issues. Since the macro is being called on every redirect, even
HEAD/PUT/OPTIONS headers will get suspended to give way to GET. That is not
the behaviour we want.
And if we are explicitly writing the above test, then why not simply allow
the xstrdup to copy the string? (Although, I believe using a const *string
may be more efficient. )

@Tim: The various tests using -r are all in spider mode. This part of the
code is not reached when in spider mode.
opt.method && strcasecmp(opt.method,"POST") == 0 provides for more readable
code. Maybe we should use that.

@Giuseppe: Thanks for that patch! I didn't think along those lines, adding
a new command. It also cleans up the code from http.c. It shouldn't have
been there in the first place.


Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Giuseppe Scrivano
hi Gijs,

Gijs van Tulder  writes:

> Giuseppe Scrivano wrote:
>> what about this patch?  Any comment?
>
> Another suggestion: why not save the original opt.method, set
> opt.method to NULL and put the original opt.method back later?

thanks for your suggestion but I think we should drop the code that
modifies opt.method, since we have to sanitize it only when it is
specified as argument.  Objections?

-- 
Giuseppe



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Gijs van Tulder

Giuseppe Scrivano wrote:
> what about this patch?  Any comment?

Another suggestion: why not save the original opt.method, set opt.method 
to NULL and put the original opt.method back later?


Gijs


diff --git a/src/retr.c b/src/retr.c
index d51b7e7..2aee578 100644
--- a/src/retr.c
+++ b/src/retr.c
@@ -679,22 +679,23 @@ calc_rate (wgint bytes, double secs, int *units)

 #define SUSPEND_POST_DATA do {  \
   post_data_suspended = true;   \
   saved_post_data = opt.body_data;  \
   saved_post_file_name = opt.body_file; \
+  saved_method = opt.method;\
   opt.body_data = NULL; \
   opt.body_file = NULL; \
   opt.method = NULL;\
 } while (0)

 #define RESTORE_POST_DATA do {  \
   if (post_data_suspended)  \
 {   \
   opt.body_data = saved_post_data;  \
   opt.body_file = saved_post_file_name; \
+  opt.method = saved_method;\
   post_data_suspended = false;  \
-  opt.method = "POST";  \
 }   \
 } while (0)

 static char *getproxy (struct url *);

@@ -721,10 +722,11 @@ retrieve_url (struct url * orig_parsed, const char 
*origurl, char **file,

   int redirection_count = 0;

   bool post_data_suspended = false;
   char *saved_post_data = NULL;
   char *saved_post_file_name = NULL;
+  char *saved_method = NULL;

   /* If dt is NULL, use local storage.  */
   if (!dt)
 {
   dt = &dummy;




Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Tim Rühsen
Am Mittwoch, 1. Mai 2013 schrieb Darshit Shah:
> First, sorry for the quick and dirty hack which was the perfect example of
> how NOT to do things.

Than it was a good example ;-)

> Secondly, it lies upon me that this feature wasn't tested before submitting
> the patch. I had however relied on the Test Environment and since it passed
> everything there, I thought it was working correctly. Guess, we should add
> a test for this soon. --recursive is a commonly used switch with Wget and
> not having a test to prevent regressions on it is very bad.

There are several tests using -r.
The question is why the problem doesn't come out.

> I am fixing this issue, but it is a terribly ugly hack. If someone could
> help improve it I'd be most truly grateful.
> I have a couple of ideas, but I will need to work them out and implement
> them when I have the time.
> 
> The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp
> on a NULL String, and we cannot initialize opt.method since that would
> break some sanity checks which are in place so that --post-* and --body-*
> commands don't conflict with each other.

Your test isn't really ugly.
I (and most C programmers) favor
opt.method && strcasecmp(opt.method,"POST") == 0
instead of
opt.method ? strcasecmp(opt.method,"POST") == 0 : false
But that is not really important.

It is pretty common, that one or both args to strcmp may be NULL. The really 
ugly thing is, that there are no string function alternatives that handle NULL 
pointers. I regularly use my own versions like strcmp_null() to avoid extra 
checks.

Regards, Tim



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Giuseppe Scrivano
Darshit Shah  writes:

> I am fixing this issue, but it is a terribly ugly hack. If someone could
> help improve it I'd be most truly grateful.
> I have a couple of ideas, but I will need to work them out and implement
> them when I have the time.
>
> The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp
> on a NULL String, and we cannot initialize opt.method since that would
> break some sanity checks which are in place so that --post-* and --body-*
> commands don't conflict with each other.

what about this patch?  Any comment?

diff --git a/src/http.c b/src/http.c
index 25ad474..6b23382 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1766,12 +1766,8 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, 
struct url *proxy,
 if (head_only)
   meth = "HEAD";
 else if (opt.method)
-  {
-char *q;
-for (q = opt.method; *q; ++q)
-  *q = c_toupper (*q);
-meth = opt.method;
-  }
+  meth = opt.method;
 /* Use the full path, i.e. one that includes the leading slash and
the query string.  E.g. if u->path is "foo/bar" and u->query is
"param=value", full_path will be "/foo/bar?param=value".  */
diff --git a/src/init.c b/src/init.c
index 813781f..370506f 100644
--- a/src/init.c
+++ b/src/init.c
@@ -87,6 +87,7 @@ CMD_DECLARE (cmd_directory_vector);
 CMD_DECLARE (cmd_number);
 CMD_DECLARE (cmd_number_inf);
 CMD_DECLARE (cmd_string);
+CMD_DECLARE (cmd_string_uppercase);
 CMD_DECLARE (cmd_file);
 CMD_DECLARE (cmd_directory);
 CMD_DECLARE (cmd_time);
@@ -212,7 +213,7 @@ static const struct {
   { "logfile",  &opt.lfilename, cmd_file },
   { "login",&opt.ftp_user,  cmd_string },/* deprecated*/
   { "maxredirect",  &opt.max_redirect,  cmd_number },
-  { "method",   &opt.method,cmd_string },
+  { "method",   &opt.method,cmd_string_uppercase },
   { "mirror",   NULL,   cmd_spec_mirror },
   { "netrc",&opt.netrc, cmd_boolean },
   { "noclobber",&opt.noclobber, cmd_boolean },
@@ -959,8 +960,24 @@ cmd_string (const char *com, const char *val, void *place)
   return true;
 }
 
+/* Like cmd_string but ensure the string is upper case.  */
+static bool
+cmd_string_uppercase (const char *com, const char *val, void *place)
+{
+  char *q;
+  bool ret = cmd_string (com, val, place);
+  q = *((char **) place);
+  if (!ret || q == NULL)
+return false;
+
+  while (*q)
+*q++ = c_toupper (*q);
+
+  return true;
+}
+
 
-/* Like the above, but handles tilde-expansion when reading a user's
+/* Like cmd_string, but handles tilde-expansion when reading a user's
`.wgetrc'.  In that case, and if VAL begins with `~', the tilde
gets expanded to the user's home directory.  */
 static bool


-- 
Giuseppe



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Darshit Shah
Couple of things.
First, sorry for the quick and dirty hack which was the perfect example of
how NOT to do things.

Secondly, it lies upon me that this feature wasn't tested before submitting
the patch. I had however relied on the Test Environment and since it passed
everything there, I thought it was working correctly. Guess, we should add
a test for this soon. --recursive is a commonly used switch with Wget and
not having a test to prevent regressions on it is very bad.

Lastly, I have spent some time on this issue today and figured out the
issue.
The SUSPEND_POST_DATA logic is so written that the macro is called on every
301/302 redirect. This did not pose any issues till now since when NOT
using --post-file or --post-data, the respective variables would be holding
a NULL value which is backed up and then restored without any problems.
However, with the addition of the --method option, I am explicitly setting
opt.method to POST on RESUME_POST_DATA since the macro is supposed to be
called only during POST requests. This is the source of the error.

The actual Segmentation Fault was being caused since I, in all my
foolishness did not strdup, but instead directly assigned the string to a
pointer.

I am fixing this issue, but it is a terribly ugly hack. If someone could
help improve it I'd be most truly grateful.
I have a couple of ideas, but I will need to work them out and implement
them when I have the time.

The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp
on a NULL String, and we cannot initialize opt.method since that would
break some sanity checks which are in place so that --post-* and --body-*
commands don't conflict with each other.


0001-Fix-bug-in-program-logic-and-Seg-Fault.patch
Description: Binary data