Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-05 Thread Yousong Zhou
On 5 April 2014 17:28, Jure Grabnar  wrote:
> Hi,
>
>>
>> >
>> > That's true. "type" is currently only used to filter out types which
>> > Wget
>> > doesn't support.
>> > Do you think parsing it ("type") is irrelevant?
>>
>> IMHO, if it will not be used in the near future, then better document
>> or remove it.
>
>
> I tried removing elect_resources() (essentially removing "type" attribute)
> and it mostly works.
> It fails when "bittorrent" url resource has top priority. In this case it
> HTTP downloads what looks to me like a tracker info.
> Since checksum differs from original file (extracted from metalink file)
> download fails.
>
> I also merged two metalink files (header from the first file and resources
> from the second file) and Wget crashes. I found out there are some issues
> with temporary files.
>
> I do believe checking types is more fail-safe since these issues do not
> occur there. At least "bittorrent" resources have to be eliminated
> beforehand or make Wget somehow aware of them.

Yes, I agree that the parsing is needed for filtering out schemes like
ed2k and bittorrent.

yousong



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-05 Thread Jure Grabnar
Hi,


> >
> > That's true. "type" is currently only used to filter out types which Wget
> > doesn't support.
> > Do you think parsing it ("type") is irrelevant?
>
> IMHO, if it will not be used in the near future, then better document
> or remove it.


I tried removing elect_resources() (essentially removing "type" attribute)
and it mostly works.
It fails when "bittorrent" url resource has top priority. In this case it
HTTP downloads what looks to me like a tracker info.
Since checksum differs from original file (extracted from metalink file)
download fails.

I also merged two metalink files (header from the first file and resources
from the second file) and Wget crashes. I found out there are some issues
with temporary files.

I do believe checking types is more fail-safe since these issues do not
occur there. At least "bittorrent" resources have to be eliminated
beforehand or make Wget somehow aware of them.

Regards,

Jure Grabnar


Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-01 Thread Yousong Zhou
Hi,

On 1 April 2014 23:02, Jure Grabnar  wrote:
>
> On 1 April 2014 10:39, Yousong Zhou  wrote:
>>
>> On 1 April 2014 15:48, Jure Grabnar  wrote:
>> > Hi,
>> >
>> > I debugged code before writing the 1st patch and found out that if
>> > "type"
>> > attribute is not present in v3.0, libmetalink completly ignores it (URL
>> > is
>> > not present in resources!).
>> > If you write "type" attribute in v4.0, libmetalink ignores it (only
>> > "type",
>> > URL is still present in resources!). So you have to find out protocol
>> > type
>> > from URL in v4.0.
>>
>> But the type attribute is currently not used by wget.  I cannot find
>> any reference to it outside metalink.c.  Anyway, IIUC, types like
>> torrent, ed2k, etc. are not in the realm of wget.
>

I just checked 4.1.2.5 of metalink 3.0 spec.  It says when the "type"
attribute is missing users can derive if it is for BitTorrent by
examining the suffix of the URL.  That's bad.  URL is only for
Universal Resource Locator, it doesn't end with a specific name to
indicate its type.  I may say that libmetalink does the right thing by
ignoring those metalink:url element.

>
> That's true. "type" is currently only used to filter out types which Wget
> doesn't support.
> Do you think parsing it ("type") is irrelevant?

IMHO, if it will not be used in the near future, then better document
or remove it.

>
> Regards,
>
> Jure Grabnar
>



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-01 Thread Jure Grabnar
On 1 April 2014 10:39, Yousong Zhou  wrote:

> On 1 April 2014 15:48, Jure Grabnar  wrote:
> > Hi,
> >
> > I debugged code before writing the 1st patch and found out that if "type"
> > attribute is not present in v3.0, libmetalink completly ignores it (URL
> is
> > not present in resources!).
> > If you write "type" attribute in v4.0, libmetalink ignores it (only
> "type",
> > URL is still present in resources!). So you have to find out protocol
> type
> > from URL in v4.0.
>
> But the type attribute is currently not used by wget.  I cannot find
> any reference to it outside metalink.c.  Anyway, IIUC, types like
> torrent, ed2k, etc. are not in the realm of wget.


That's true. "type" is currently only used to filter out types which Wget
doesn't support.
Do you think parsing it ("type") is irrelevant?

Regards,

Jure Grabnar


Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-01 Thread Yousong Zhou
On 1 April 2014 15:48, Jure Grabnar  wrote:
> Hi,
>
> I debugged code before writing the 1st patch and found out that if "type"
> attribute is not present in v3.0, libmetalink completly ignores it (URL is
> not present in resources!).
> If you write "type" attribute in v4.0, libmetalink ignores it (only "type",
> URL is still present in resources!). So you have to find out protocol type
> from URL in v4.0.

But the type attribute is currently not used by wget.  I cannot find
any reference to it outside metalink.c.  Anyway, IIUC, types like
torrent, ed2k, etc. are not in the realm of wget.

yousong

> This was the main purpose of the 1st patch.
>
>
> On 1 April 2014 03:20, Yousong Zhou  wrote:
>>
>> Hi, Jure.
>>
>> On 1 April 2014 03:46, Jure Grabnar  wrote:
>> > Hello,
>> >
>> > thanks for your feedback! I corrected the first patch.
>>
>> Then the 1st one is fine with me.
>>
>> I am not fluent with Metalink and libmetalink on how it handles the
>> type attribute.  In version 4.0 of the standard, there is no type
>> attribute for metalink:url element, only metalink:metaurl has it.
>> Though not explicitly using a word like "must" in version 3.0 of the
>> spec, looks like type attribute is a required one there (See 4.1.2.4
>> of the 3.0 spec).
>
>
> I thought so too, but if you take a look at 4.1.2.5 section of the v3.0
> spec, the last example shows that "type" attribute can be omitted.
>
>>
>> If that is the case, then the metalink file is not a
>> standard-compliant one if that attribute is missing.  Maybe later
>> people want a way to ignore those non-compliant metalink:url element.
>> But let that be another story when the need actually came up.  :)
>
>
> Then libmetalink should be tweaked a bit, to allow non-compliant 
> elements, because currently it just ignores them (v3.0).
> Although, to be honest, they could just switch to v4.0, where "type" is
> optional and properly parsed by libmetalink. :)
>
> Regards,
>
> Jure Grabnar
>
>



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-01 Thread Jure Grabnar
Hi,

I debugged code before writing the 1st patch and found out that if "type"
attribute is not present in v3.0, libmetalink completly ignores it (URL is
not present in resources!).
If you write "type" attribute in v4.0, libmetalink ignores it (only "type",
URL is still present in resources!). So you have to find out protocol type
from URL in v4.0.
This was the main purpose of the 1st patch.


On 1 April 2014 03:20, Yousong Zhou  wrote:

> Hi, Jure.
>
> On 1 April 2014 03:46, Jure Grabnar  wrote:
> > Hello,
> >
> > thanks for your feedback! I corrected the first patch.
>
> Then the 1st one is fine with me.
>
> I am not fluent with Metalink and libmetalink on how it handles the
> type attribute.  In version 4.0 of the standard, there is no type
> attribute for metalink:url element, only metalink:metaurl has it.
> Though not explicitly using a word like "must" in version 3.0 of the
> spec, looks like type attribute is a required one there (See 4.1.2.4
> of the 3.0 spec).


I thought so too, but if you take a look at 4.1.2.5 section of the v3.0
spec, the last example shows that "type" attribute can be omitted.


> If that is the case, then the metalink file is not a
> standard-compliant one if that attribute is missing.  Maybe later
> people want a way to ignore those non-compliant metalink:url element.
> But let that be another story when the need actually came up.  :)
>

Then libmetalink should be tweaked a bit, to allow non-compliant 
elements, because currently it just ignores them (v3.0).
Although, to be honest, they could just switch to v4.0, where "type" is
optional and properly parsed by libmetalink. :)

Regards,

Jure Grabnar


Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-31 Thread Yousong Zhou
Hi, Jure.

On 1 April 2014 03:46, Jure Grabnar  wrote:
> Hello,
>
> thanks for your feedback! I corrected the first patch.

Then the 1st one is fine with me.

I am not fluent with Metalink and libmetalink on how it handles the
type attribute.  In version 4.0 of the standard, there is no type
attribute for metalink:url element, only metalink:metaurl has it.
Though not explicitly using a word like "must" in version 3.0 of the
spec, looks like type attribute is a required one there (See 4.1.2.4
of the 3.0 spec). If that is the case, then the metalink file is not a
standard-compliant one if that attribute is missing.  Maybe later
people want a way to ignore those non-compliant metalink:url element.
But let that be another story when the need actually came up.  :)


   yousong



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-31 Thread Jure Grabnar
Hello,

thanks for your feedback! I corrected the first patch.

On 31 March 2014 04:01, Yousong Zhou  wrote:

> Hi,
>
> On 28 March 2014 20:33, Jure Grabnar  wrote:
> > Hi,
> >
> > Thank you Yousong. I've listened to your advice and changed type of
> > resource->type to
> > enum url_scheme. Now it looks much cleaner.
>
> Using enum is a step forward.
>
> > @@ -134,7 +135,20 @@ parse_metalink(char *input_file)
> >++(file->num_of_res);
> >
> >resource->url = xstrdup ((*resources)->url);
> > -  resource->type = ((*resources)->type ? xstrdup
> ((*resources)->type) : NULL);
> > +
> > +  if ((*resources)->type)
> > +{
> > +  /* Append "://" to resource type so url_scheme()
> recognizes type */
> > +  char *temp_url = malloc ( strlen ( (*resources)->type) +
> 4);
> > +  sprintf (temp_url, "%s://", (*resources)->type);
> > +
> > +  resource->type = url_scheme (temp_url);
> > +
> > +  free (temp_url);
> > +}
>
This is a little hacky.  Adding a utility function like
> url_scheme_str_to_enum() will be better.


I added url_scheme_str_to_enum() function.


> > +  else
> > +resource->type = url_scheme (resource->url);
> > +
> >resource->location = ((*resources)->location ? xstrdup
> ((*resources)->location) : NULL);
> >resource->preference = (*resources)->preference;
> >resource->maxconnections = (*resources)->maxconnections;
> > @@ -143,7 +157,7 @@ parse_metalink(char *input_file)
> >(file->resources) = resource;
> >  }
> >
> > -  for (checksums = (*files)->checksums; *checksums; ++checksums)
> > +  for (checksums = (*files)->checksums; checksums && *checksums;
> ++checksums)
>
> Good catch.  Should do the same NULL check for (*files)->resources.
>
> >  {
> >mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
> >
> >
>
> <...>
>
> > @@ -215,19 +229,25 @@ elect_resources (mlink *mlink)
> >
> >while (res_next = res->next)
> >  {
> > -  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type,
> "http"))
> > +  if (schemes_are_similar_p (res_next->type, SCHEME_INVALID))
> >  {
> >res->next = res_next->next;
> >free(res_next);
> > +
> > +  --(file->num_of_res);
> >  }
> >else
> >  res = res_next;
> >  }
> >res = file->resources;
> > -  if (strcmp(res->type, "ftp") && strcmp(res->type, "http"))
> > +  if (schemes_are_similar_p (res->type, SCHEME_INVALID))
> >  {
> >file->resources = res->next;
>
> If I am right, this will set it to NULL if file->num_of_res is 1.
>

You're right. I took a second look and it's ok. I was afraid res->next was
a dangling pointer.


> > -  free(res);
> > +  free (res);
> > +
> > +  --(file->num_of_res);
> > +  if (!file->num_of_res)
> > +file->resources = NULL;
>
> So explicitly setting it to NULL is not needed.
>
> >  }
> >  }
> >  }
> >
>
> >
> > I also added check for whenever there's no resources available to
> download a
> > file.
> >
> > Second patch remains unchanged.
> >
> > Regards,
> >
> >
> > Jure Grabnar
>

Regards,

Jure Grabnar
From 67297ae805d42af60fbcce4f804353259e3d8015 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Mon, 31 Mar 2014 21:37:54 +0200
Subject: [PATCH 1/2] Fix metalink issues when type is not present

---
 src/ChangeLog  | 14 ++
 src/metalink.c | 23 ---
 src/metalink.h |  4 +++-
 src/retr.c | 10 ++
 src/url.c  | 22 ++
 src/url.h  |  1 +
 6 files changed, 66 insertions(+), 8 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 537a707..74d061e 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,17 @@
+2014-03-31	Jure Grabnar  
+
+	* metalink.h: Change type of 'mlink_resource.type' to enum url_scheme.
+	Include url.h.
+	* metalink.c (parse_metalink): Find out resources' protocol type from URL if
+	not present in  tag in metalink file.
+	Include url.h.
+	(elect_resources): Do not crash when protocol type is NULL.
+	Count number of resources still available, if zero, assign NULL.
+	* url.c:
+	(url_scheme_str_to_enum): New function: extracts protocol type from string
+	without "://" suffix.
+	* retr.c: Add check for number of resources available. Do not crash when zero.
+
 2014-03-26  Darshit Shah  
 
 	* ftp.c (getftp): Rearrange parameters to fix compiler warning
diff --git a/src/metalink.c b/src/metalink.c
index 76db2fa..4ff565d 100644
--- a/src/metalink.c
+++ b/src/metalink.c
@@ -42,6 +42,7 @@ as that of the covered work.  */
 #include "sha256.h"
 #include "metalink.h"
 #include "utils.h"
+#include "url.h"
 
 
 #define HASH_TYPES 3
@@ -119,7 +120,7 @@ parse_metalink(char *input_file)
   file->chunk_checksum = NULL;
 

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-30 Thread Yousong Zhou
Hi,

On 28 March 2014 20:33, Jure Grabnar  wrote:
> Hi,
>
> Thank you Yousong. I've listened to your advice and changed type of
> resource->type to
> enum url_scheme. Now it looks much cleaner.

Using enum is a step forward.

> @@ -134,7 +135,20 @@ parse_metalink(char *input_file)
>++(file->num_of_res);
>
>resource->url = xstrdup ((*resources)->url);
> -  resource->type = ((*resources)->type ? xstrdup 
> ((*resources)->type) : NULL);
> +
> +  if ((*resources)->type)
> +{
> +  /* Append "://" to resource type so url_scheme() recognizes 
> type */
> +  char *temp_url = malloc ( strlen ( (*resources)->type) + 4);
> +  sprintf (temp_url, "%s://", (*resources)->type);
> +
> +  resource->type = url_scheme (temp_url);
> +
> +  free (temp_url);
> +}

This is a little hacky.  Adding a utility function like
url_scheme_str_to_enum() will be better.

> +  else
> +resource->type = url_scheme (resource->url);
> +
>resource->location = ((*resources)->location ? xstrdup 
> ((*resources)->location) : NULL);
>resource->preference = (*resources)->preference;
>resource->maxconnections = (*resources)->maxconnections;
> @@ -143,7 +157,7 @@ parse_metalink(char *input_file)
>(file->resources) = resource;
>  }
>
> -  for (checksums = (*files)->checksums; *checksums; ++checksums)
> +  for (checksums = (*files)->checksums; checksums && *checksums; 
> ++checksums)

Good catch.  Should do the same NULL check for (*files)->resources.

>  {
>mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
>
>

<...>

> @@ -215,19 +229,25 @@ elect_resources (mlink *mlink)
>
>while (res_next = res->next)
>  {
> -  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type, 
> "http"))
> +  if (schemes_are_similar_p (res_next->type, SCHEME_INVALID))
>  {
>res->next = res_next->next;
>free(res_next);
> +
> +  --(file->num_of_res);
>  }
>else
>  res = res_next;
>  }
>res = file->resources;
> -  if (strcmp(res->type, "ftp") && strcmp(res->type, "http"))
> +  if (schemes_are_similar_p (res->type, SCHEME_INVALID))
>  {
>file->resources = res->next;

If I am right, this will set it to NULL if file->num_of_res is 1.

> -  free(res);
> +  free (res);
> +
> +  --(file->num_of_res);
> +  if (!file->num_of_res)
> +file->resources = NULL;

So explicitly setting it to NULL is not needed.

>  }
>  }
>  }
>

>
> I also added check for whenever there's no resources available to download a
> file.
>
> Second patch remains unchanged.
>
> Regards,
>
>
> Jure Grabnar



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-28 Thread Jure Grabnar
Patches were based off the wrong branch. I'm resending the correct ones.

I apologize for spam.

Regards,


Jure Grabnar


On 28 March 2014 13:33, Jure Grabnar  wrote:

> Hi,
>
> Thank you Yousong. I've listened to your advice and changed type of
> resource->type to
> enum url_scheme. Now it looks much cleaner.
>
> I also added check for whenever there's no resources available to download
> a file.
>
> Second patch remains unchanged.
>
> Regards,
>
>
> Jure Grabnar
>
>
> On 27 March 2014 16:58, Jure Grabnar  wrote:
>
>> I didn't have time since the last upload (I'm planning to change
>> 'resource->type' type to enum url_scheme). I'll try to upload it
>> today/tomorrow.
>>
>>  Regards,
>>
>>
>> Jure Grabnar
>>
>>
>> On 27 March 2014 14:41, Darshit Shah  wrote:
>>
>>> Any updates on this set of patches?
>>>
>>> On Sat, Mar 22, 2014 at 12:05 PM, Yousong Zhou 
>>> wrote:
>>> > Hi, Jure.
>>> >
>>> > On 22 March 2014 18:02, Jure Grabnar  wrote:
>>> >> Hi,
>>> >>
>>> >> thank you for your feedback, Darshit, Yousong!
>>> >>
>>> >> I reverted magic number back to its original state ('tmp2'), because
>>> it
>>> >> should
>>> >> be there (I overlooked that 'tmp' variable is changed in the very next
>>> >> statement).
>>> >>
>>> >> Duplicated line is removed.
>>> >>
>>> >> I also changed resource->type to point at dynamic memory.
>>> >
>>> > +  if (type)
>>> > +{
>>> > +  resource->type = malloc (strlen (type));
>>> > +  sprintf(resource->type, type);
>>> > +}
>>> >
>>> > xstrdup() is better because that is how existing code does it.  And
>>> > you may want to know that using a variable as the format string is not
>>> > a good practice for secure code.
>>> >
>>> > yousong
>>> >
>>> >>
>>> >> They say third's time's the charm. :) I hope it's ok now.
>>> >>
>>> >> Regards,
>>> >>
>>> >>
>>> >> Jure Grabnar
>>> >>
>>> >>
>>>
>>>
>>>
>>> --
>>> Thanking You,
>>> Darshit Shah
>>>
>>
>>
>
From 766e13fbdef04be26226b9c3cd4e6ce9078e0af9 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Fri, 28 Mar 2014 15:01:01 +0100
Subject: [PATCH 1/2] Fix metalink issues when type is not present

---
 src/ChangeLog  | 11 +++
 src/metalink.c | 31 +--
 src/metalink.h |  4 +++-
 src/retr.c | 10 ++
 4 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5d1147e..3ababfd 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,14 @@
+2014-03-28	Jure Grabnar  
+
+	* metalink.h: Change type of 'mlink_resource.type' to enum url_scheme.
+	Include url.h.
+	* metalink.c (parse_metalink): Find out resources' protocol type from URL if
+	not present in  tag in metalink file.
+	Include url.h.
+	(elect_resources): Do not crash when protocol type is NULL.
+	Count number of resources still available, if zero, assign NULL.
+	* retr.c: Add check for number of resources available. Do not crash when zero.
+
 2014-03-19  Yousong Zhou  
 
 	* init.c, main.c, options.h: Add option --start-pos for specifying
diff --git a/src/metalink.c b/src/metalink.c
index 76db2fa..4c8f02b 100644
--- a/src/metalink.c
+++ b/src/metalink.c
@@ -42,6 +42,7 @@ as that of the covered work.  */
 #include "sha256.h"
 #include "metalink.h"
 #include "utils.h"
+#include "url.h"
 
 
 #define HASH_TYPES 3
@@ -134,7 +135,20 @@ parse_metalink(char *input_file)
   ++(file->num_of_res);
 
   resource->url = xstrdup ((*resources)->url);
-  resource->type = ((*resources)->type ? xstrdup ((*resources)->type) : NULL);
+
+  if ((*resources)->type)
+{
+  /* Append "://" to resource type so url_scheme() recognizes type */
+  char *temp_url = malloc ( strlen ( (*resources)->type) + 4);
+  sprintf (temp_url, "%s://", (*resources)->type);
+
+  resource->type = url_scheme (temp_url);
+
+  free (temp_url);
+}
+  else
+resource->type = url_scheme (resource->url);
+
   resource->location = ((*resources)->location ? xstrdup ((*resources)->location) : NULL);
   resource->preference = (*resources)->preference;
   resource->maxconnections = (*resources)->maxconnections;
@@ -143,7 +157,7 @@ parse_metalink(char *input_file)
   (file->resources) = resource;
 }
 
-  for (checksums = (*files)->checksums; *checksums; ++checksums)
+  for (checksums = (*files)->checksums; checksums && *checksums; ++checksums)
 {
   mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
 
@@ -215,19 +229,25 @@ elect_resources (mlink *mlink)
 
   while (res_next = res->next)
 {
-  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http"))
+  if (schemes_are_similar_p (res_next->type, SCHEME_INVALID))
 {
   res->next = res_next->next;
   free(res_next);
+
+  --(file->num_o

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-28 Thread Jure Grabnar
Hi,

Thank you Yousong. I've listened to your advice and changed type of
resource->type to
enum url_scheme. Now it looks much cleaner.

I also added check for whenever there's no resources available to download
a file.

Second patch remains unchanged.

Regards,


Jure Grabnar


On 27 March 2014 16:58, Jure Grabnar  wrote:

> I didn't have time since the last upload (I'm planning to change
> 'resource->type' type to enum url_scheme). I'll try to upload it
> today/tomorrow.
>
>  Regards,
>
>
> Jure Grabnar
>
>
> On 27 March 2014 14:41, Darshit Shah  wrote:
>
>> Any updates on this set of patches?
>>
>> On Sat, Mar 22, 2014 at 12:05 PM, Yousong Zhou 
>> wrote:
>> > Hi, Jure.
>> >
>> > On 22 March 2014 18:02, Jure Grabnar  wrote:
>> >> Hi,
>> >>
>> >> thank you for your feedback, Darshit, Yousong!
>> >>
>> >> I reverted magic number back to its original state ('tmp2'), because it
>> >> should
>> >> be there (I overlooked that 'tmp' variable is changed in the very next
>> >> statement).
>> >>
>> >> Duplicated line is removed.
>> >>
>> >> I also changed resource->type to point at dynamic memory.
>> >
>> > +  if (type)
>> > +{
>> > +  resource->type = malloc (strlen (type));
>> > +  sprintf(resource->type, type);
>> > +}
>> >
>> > xstrdup() is better because that is how existing code does it.  And
>> > you may want to know that using a variable as the format string is not
>> > a good practice for secure code.
>> >
>> > yousong
>> >
>> >>
>> >> They say third's time's the charm. :) I hope it's ok now.
>> >>
>> >> Regards,
>> >>
>> >>
>> >> Jure Grabnar
>> >>
>> >>
>>
>>
>>
>> --
>> Thanking You,
>> Darshit Shah
>>
>
>
From 1fae0900e4474f27b6c20520304b1c1b69321690 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Fri, 28 Mar 2014 13:13:40 +0100
Subject: [PATCH 1/2] Fix metalink issues when type is not present

---
 src/ChangeLog  | 11 +++
 src/metalink.c | 31 +--
 src/metalink.h |  4 +++-
 src/retr.c | 10 ++
 4 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index f222037..8c3ba1b 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,14 @@
+2014-03-28	Jure Grabnar  
+
+	* metalink.h: Change type of 'mlink_resource.type' to enum url_scheme.
+	Include url.h.
+	* metalink.c (parse_metalink): Find out resources' protocol type from URL if
+	not present in  tag in metalink file.
+	Include url.h.
+	(elect_resources): Do not crash when protocol type is NULL.
+	Count number of resources still available, if zero, assign NULL.
+	* retr.c: Add check for number of resources available. Do not crash when zero.
+
 2014-03-04  Giuseppe Scrivano  
 
 	* http.c (modify_param_value, extract_param): Aesthetic change.
diff --git a/src/metalink.c b/src/metalink.c
index 76db2fa..4c8f02b 100644
--- a/src/metalink.c
+++ b/src/metalink.c
@@ -42,6 +42,7 @@ as that of the covered work.  */
 #include "sha256.h"
 #include "metalink.h"
 #include "utils.h"
+#include "url.h"
 
 
 #define HASH_TYPES 3
@@ -134,7 +135,20 @@ parse_metalink(char *input_file)
   ++(file->num_of_res);
 
   resource->url = xstrdup ((*resources)->url);
-  resource->type = ((*resources)->type ? xstrdup ((*resources)->type) : NULL);
+
+  if ((*resources)->type)
+{
+  /* Append "://" to resource type so url_scheme() recognizes type */
+  char *temp_url = malloc ( strlen ( (*resources)->type) + 4);
+  sprintf (temp_url, "%s://", (*resources)->type);
+
+  resource->type = url_scheme (temp_url);
+
+  free (temp_url);
+}
+  else
+resource->type = url_scheme (resource->url);
+
   resource->location = ((*resources)->location ? xstrdup ((*resources)->location) : NULL);
   resource->preference = (*resources)->preference;
   resource->maxconnections = (*resources)->maxconnections;
@@ -143,7 +157,7 @@ parse_metalink(char *input_file)
   (file->resources) = resource;
 }
 
-  for (checksums = (*files)->checksums; *checksums; ++checksums)
+  for (checksums = (*files)->checksums; checksums && *checksums; ++checksums)
 {
   mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
 
@@ -215,19 +229,25 @@ elect_resources (mlink *mlink)
 
   while (res_next = res->next)
 {
-  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http"))
+  if (schemes_are_similar_p (res_next->type, SCHEME_INVALID))
 {
   res->next = res_next->next;
   free(res_next);
+
+  --(file->num_of_res);
 }
   else
 res = res_next;
 }
   res = file->resources;
-  if (strcmp(res->type, "ftp") && strcmp(res->type, "http"))
+  if (schemes_are_similar_p (res->type, SCHEME_INVALID))
 {
   file-

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-27 Thread Jure Grabnar
I didn't have time since the last upload (I'm planning to change
'resource->type' type to enum url_scheme). I'll try to upload it
today/tomorrow.

Regards,


Jure Grabnar


On 27 March 2014 14:41, Darshit Shah  wrote:

> Any updates on this set of patches?
>
> On Sat, Mar 22, 2014 at 12:05 PM, Yousong Zhou 
> wrote:
> > Hi, Jure.
> >
> > On 22 March 2014 18:02, Jure Grabnar  wrote:
> >> Hi,
> >>
> >> thank you for your feedback, Darshit, Yousong!
> >>
> >> I reverted magic number back to its original state ('tmp2'), because it
> >> should
> >> be there (I overlooked that 'tmp' variable is changed in the very next
> >> statement).
> >>
> >> Duplicated line is removed.
> >>
> >> I also changed resource->type to point at dynamic memory.
> >
> > +  if (type)
> > +{
> > +  resource->type = malloc (strlen (type));
> > +  sprintf(resource->type, type);
> > +}
> >
> > xstrdup() is better because that is how existing code does it.  And
> > you may want to know that using a variable as the format string is not
> > a good practice for secure code.
> >
> > yousong
> >
> >>
> >> They say third's time's the charm. :) I hope it's ok now.
> >>
> >> Regards,
> >>
> >>
> >> Jure Grabnar
> >>
> >>
>
>
>
> --
> Thanking You,
> Darshit Shah
>


Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-27 Thread Darshit Shah
Any updates on this set of patches?

On Sat, Mar 22, 2014 at 12:05 PM, Yousong Zhou  wrote:
> Hi, Jure.
>
> On 22 March 2014 18:02, Jure Grabnar  wrote:
>> Hi,
>>
>> thank you for your feedback, Darshit, Yousong!
>>
>> I reverted magic number back to its original state ('tmp2'), because it
>> should
>> be there (I overlooked that 'tmp' variable is changed in the very next
>> statement).
>>
>> Duplicated line is removed.
>>
>> I also changed resource->type to point at dynamic memory.
>
> +  if (type)
> +{
> +  resource->type = malloc (strlen (type));
> +  sprintf(resource->type, type);
> +}
>
> xstrdup() is better because that is how existing code does it.  And
> you may want to know that using a variable as the format string is not
> a good practice for secure code.
>
> yousong
>
>>
>> They say third's time's the charm. :) I hope it's ok now.
>>
>> Regards,
>>
>>
>> Jure Grabnar
>>
>>



-- 
Thanking You,
Darshit Shah



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-22 Thread Yousong Zhou
Hi, Jure.

On 22 March 2014 18:02, Jure Grabnar  wrote:
> Hi,
>
> thank you for your feedback, Darshit, Yousong!
>
> I reverted magic number back to its original state ('tmp2'), because it
> should
> be there (I overlooked that 'tmp' variable is changed in the very next
> statement).
>
> Duplicated line is removed.
>
> I also changed resource->type to point at dynamic memory.

+  if (type)
+{
+  resource->type = malloc (strlen (type));
+  sprintf(resource->type, type);
+}

xstrdup() is better because that is how existing code does it.  And
you may want to know that using a variable as the format string is not
a good practice for secure code.

yousong

>
> They say third's time's the charm. :) I hope it's ok now.
>
> Regards,
>
>
> Jure Grabnar
>
>



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-22 Thread Jure Grabnar
Hi,

thank you for your feedback, Darshit, Yousong!

I reverted magic number back to its original state ('tmp2'), because it
should
be there (I overlooked that 'tmp' variable is changed in the very next
statement).

Duplicated line is removed.

I also changed resource->type to point at dynamic memory.

They say third's time's the charm. :) I hope it's ok now.

Regards,


Jure Grabnar


On 21 March 2014 14:31, Yousong Zhou  wrote:

> Hi, Jure.
>
> On 21 March 2014 03:23, Jure Grabnar  wrote:
> > Thank you for you feedback Darshit. I changed my proposal according to
> your
> > advices. Hopefully a new version is better.
> >
> > I'm also sending corrected patches, again thanks to your review, Darshit.
> > First patch allows Metalink to have optional argument "type" in 
> > field. Where type is not present, it extracts protocol type from URL
> string.
> >
>
> On the 1st patch, static "char *" value should not be assigned to
> resource->type that will later be free()'ed.
>
>
>yousong
>
From 622b8d63b15c6217c63f39e4eddfbde5c7462636 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Sat, 22 Mar 2014 10:38:16 +0100
Subject: [PATCH 1/2] Fix metalink issues when type is not present

---
 src/ChangeLog  |  6 ++
 src/metalink.c | 44 
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index f222037..6e4729c 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2014-03-20	Jure Grabnar  
+
+	* metalink.c (parse_metalink): Find out resources' protocol type from URL.
+	Do not crash when there're no resources.
+	(elect_resources): Do not crash when protocol type is NULL.
+
 2014-03-04  Giuseppe Scrivano  
 
 	* http.c (modify_param_value, extract_param): Aesthetic change.
diff --git a/src/metalink.c b/src/metalink.c
index 76db2fa..18bc1ee 100644
--- a/src/metalink.c
+++ b/src/metalink.c
@@ -42,6 +42,7 @@ as that of the covered work.  */
 #include "sha256.h"
 #include "metalink.h"
 #include "utils.h"
+#include "url.h"
 
 
 #define HASH_TYPES 3
@@ -134,7 +135,42 @@ parse_metalink(char *input_file)
   ++(file->num_of_res);
 
   resource->url = xstrdup ((*resources)->url);
-  resource->type = ((*resources)->type ? xstrdup ((*resources)->type) : NULL);
+
+  if ((*resources)->type)
+resource->type = xstrdup ((*resources)->type);
+  else
+{
+  const char *type = NULL;
+  enum url_scheme res_scheme = url_scheme (resource->url);
+
+  switch (res_scheme)
+{
+case SCHEME_HTTP:
+case SCHEME_HTTPS:
+  type = "http";
+  break;
+case SCHEME_FTP:
+  type = "ftp";
+  break;
+default:
+  type = NULL;
+}
+
+  int len_url = strlen (resource->url);
+  if (len_url >= 8)
+{
+  const char *url_ending = &(resource->url[len_url-8]);
+  if (0 == strncasecmp (url_ending, ".torrent", 8))
+type = "bittorrent";
+}
+
+  if (type)
+{
+  resource->type = malloc (strlen (type));
+  sprintf(resource->type, type);
+}
+}
+
   resource->location = ((*resources)->location ? xstrdup ((*resources)->location) : NULL);
   resource->preference = (*resources)->preference;
   resource->maxconnections = (*resources)->maxconnections;
@@ -143,7 +179,7 @@ parse_metalink(char *input_file)
   (file->resources) = resource;
 }
 
-  for (checksums = (*files)->checksums; *checksums; ++checksums)
+  for (checksums = (*files)->checksums; checksums && *checksums; ++checksums)
 {
   mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
 
@@ -215,7 +251,7 @@ elect_resources (mlink *mlink)
 
   while (res_next = res->next)
 {
-  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http"))
+  if (res_next->type == NULL || (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http")))
 {
   res->next = res_next->next;
   free(res_next);
@@ -224,7 +260,7 @@ elect_resources (mlink *mlink)
 res = res_next;
 }
   res = file->resources;
-  if (strcmp(res->type, "ftp") && strcmp(res->type, "http"))
+  if (res->type == NULL || (strcmp(res->type, "ftp") && strcmp(res->type, "http")))
 {
   file->resources = res->next;
   free(res);
-- 
1.9.0

From 28bd23b824d1bf73424ce5d9669e817c3e6ae4c2 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Sat, 22 Mar 2014 10:40:24 +0100
Subject: [PATCH 2/2] Fix some compiler warnings

---
 src/ChangeLog  | 18 ++
 src/iri.h  |  2 +-
 src/metalink.c | 11 +--
 s

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-21 Thread Yousong Zhou
Hi, Jure.

On 21 March 2014 03:23, Jure Grabnar  wrote:
> Thank you for you feedback Darshit. I changed my proposal according to your
> advices. Hopefully a new version is better.
>
> I'm also sending corrected patches, again thanks to your review, Darshit.
> First patch allows Metalink to have optional argument "type" in 
> field. Where type is not present, it extracts protocol type from URL string.
>

On the 1st patch, static "char *" value should not be assigned to
resource->type that will later be free()'ed.


   yousong



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-21 Thread Darshit Shah
Hi Jure,

I have only had time to skim through these. The first seems fine.
However, you have magic numbers in the second patch.

Where does the 1 in tmp+1 come from? Why is it there? What is the significance?
Even if it is being used only once, you should #define it so that we
know what it is and why its there.

Lines 233 and 234 in the second patch are exactly the same. Why so?

On Thu, Mar 20, 2014 at 8:23 PM, Jure Grabnar  wrote:
> Thank you for you feedback Darshit. I changed my proposal according to your
> advices. Hopefully a new version is better.
>
> I'm also sending corrected patches, again thanks to your review, Darshit.
> First patch allows Metalink to have optional argument "type" in  field.
> Where type is not present, it extracts protocol type from URL string.
>
> Second patch fixes few compiler warnings.  Details are written in ChangeLog.
>
> Regards,
>
>
> Jure Grabnar
>
>
>
> On 20 March 2014 13:13, Darshit Shah  wrote:
>>
>> I have a few comments about your proposal.
>>
>> 1. When downloading multiple files from multiple servers, there is
>> also the use case where you may download multiple files while each is
>> downloaded using multiple threads. Think about this:
>>
>> I have a 2 files I'm trying to download, both are 4 GB each. I'd
>> rather download them with 4 jobs each using 8 different servers than
>> as 2 jobs in all. It is a case which should otherwise be covered int
>> he implementation, but it should be there in your proposal too.
>>
>> Think about how this might be implemented, and how this *might* differ
>> from the other cases.
>>
>> 2. You state re-implement Metalink 3.0 code as the first point in your
>> timeline. So, during this period will you not be looking at Metalink
>> 4.0 compatibility at all? If so, when will you be looking at Metalink
>> 4? Just mention it somewhere.
>>
>> 3. It looks slightly off when the dates on your timeline are not
>> continuous. Don't put any breaks there. You can mention Weekend, if
>> you want, but it should be documented. It's just easier to read when
>> the dates flow continuously.
>>
>> 4. If you have exams, relax for that period. Take some time off. Don't
>> mess up both your code and your exams. I suggest you concentrate on
>> your exams during that time.
>>
>> 5. You give yourself 5 days to merge code and write documentation.
>> That's not happening. Trust me. The merging process will take a lot
>> more time. You're going to be writing a lot of code that edits lots of
>> files in mailine. This code will not be simple to merge directly.
>> Also, documentation takes time.
>>
>> There's also some other higher level comments about your timeline and work
>> flow:
>>
>> You wish to implement/fix Metalink first. But Metalink requires
>> concurrent downloading capability.
>> You may want to rethink the order in which you're implementing these
>> features.
>>
>> On Wed, Mar 19, 2014 at 7:40 PM, Jure Grabnar  wrote:
>> > Hi,
>> >
>> > my proposal is available on
>> >
>> > http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/toomanysecrets/5629499534213120
>> > I'm looking forward to your review.
>> >
>> > Regards,
>> >
>> >
>> > Jure Grabnar
>> >
>> >
>> >
>> > On 18 March 2014 01:06, Darshit Shah  wrote:
>> >>
>> >> Hi Jure,
>> >>
>> >> Thanks for your patches. However, I do have a few comments about the
>> >> same:
>> >>
>> >> 1. Trailing Whitespaces: This is essentially extra whitespaces at the
>> >> end of a line or on a blank line. See [1] and [2] for more
>> >> information.
>> >> 2. The indentation is mostly right, but sometimes off.
>> >> 3. Your first patch is missing a ChangeLog. Every commit must be
>> >> accompanied by a ChangeLog entry, no matter how trivial it is.
>> >> 4. Your 2nd patch seems to revert things from the first one. This
>> >> usually means some cleanup is needed.
>> >>
>> >> I'm not completely sure of some of the details of the lines you change
>> >> in your second patch, but they seem a little sketchy. I'll have to dig
>> >> into the code and check it out.
>> >>
>> >> Also, for a non-trivial (>10 LoC) patch, you'll first need to submit
>> >> your copyright assignment to the FSF.
>> >> Giuseppe will arrange for the documents as soon as your patch is ready.
>> >>
>> >> The code however, does fix a segfault and maybe a few compiler
>> >> warnings. When it fixes something, an explanation is usually a nice
>> >> idea.
>> >>
>> >> [1]
>> >>
>> >> http://codeimpossible.com/2012/04/02/Trailing-whitespace-is-evil-Don-t-commit-evil-into-your-repo-/
>> >> [2]
>> >>
>> >> https://stackoverflow.com/questions/1583406/why-does-git-care-about-trailing-whitespace-in-my-files
>> >>
>> >> On Mon, Mar 17, 2014 at 6:56 PM, Jure Grabnar 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > this patch fixes some of compiler warnings. I was uncertain for the
>> >> > remaining ones (5) - I believe some of them might be stubs for
>> >> > upcoming
>> >> > features.
>> >> >
>> >> >  Best Regards,
>> >> >
>> >> >
>> >> > Jure Grabnar (tooma

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-20 Thread Jure Grabnar
Thank you for you feedback Darshit. I changed my proposal according to your
advices. Hopefully a new version is better.

I'm also sending corrected patches, again thanks to your review, Darshit.
First patch allows Metalink to have optional argument "type" in 
field. Where type is not present, it extracts protocol type from URL string.

Second patch fixes few compiler warnings.  Details are written in ChangeLog.

Regards,


Jure Grabnar


On 20 March 2014 13:13, Darshit Shah  wrote:

> I have a few comments about your proposal.
>
> 1. When downloading multiple files from multiple servers, there is
> also the use case where you may download multiple files while each is
> downloaded using multiple threads. Think about this:
>
> I have a 2 files I'm trying to download, both are 4 GB each. I'd
> rather download them with 4 jobs each using 8 different servers than
> as 2 jobs in all. It is a case which should otherwise be covered int
> he implementation, but it should be there in your proposal too.
>
> Think about how this might be implemented, and how this *might* differ
> from the other cases.
>
> 2. You state re-implement Metalink 3.0 code as the first point in your
> timeline. So, during this period will you not be looking at Metalink
> 4.0 compatibility at all? If so, when will you be looking at Metalink
> 4? Just mention it somewhere.
>
> 3. It looks slightly off when the dates on your timeline are not
> continuous. Don't put any breaks there. You can mention Weekend, if
> you want, but it should be documented. It's just easier to read when
> the dates flow continuously.
>
> 4. If you have exams, relax for that period. Take some time off. Don't
> mess up both your code and your exams. I suggest you concentrate on
> your exams during that time.
>
> 5. You give yourself 5 days to merge code and write documentation.
> That's not happening. Trust me. The merging process will take a lot
> more time. You're going to be writing a lot of code that edits lots of
> files in mailine. This code will not be simple to merge directly.
> Also, documentation takes time.
>
> There's also some other higher level comments about your timeline and work
> flow:
>
> You wish to implement/fix Metalink first. But Metalink requires
> concurrent downloading capability.
> You may want to rethink the order in which you're implementing these
> features.
>
> On Wed, Mar 19, 2014 at 7:40 PM, Jure Grabnar  wrote:
> > Hi,
> >
> > my proposal is available on
> >
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/toomanysecrets/5629499534213120
> > I'm looking forward to your review.
> >
> > Regards,
> >
> >
> > Jure Grabnar
> >
> >
> >
> > On 18 March 2014 01:06, Darshit Shah  wrote:
> >>
> >> Hi Jure,
> >>
> >> Thanks for your patches. However, I do have a few comments about the
> same:
> >>
> >> 1. Trailing Whitespaces: This is essentially extra whitespaces at the
> >> end of a line or on a blank line. See [1] and [2] for more
> >> information.
> >> 2. The indentation is mostly right, but sometimes off.
> >> 3. Your first patch is missing a ChangeLog. Every commit must be
> >> accompanied by a ChangeLog entry, no matter how trivial it is.
> >> 4. Your 2nd patch seems to revert things from the first one. This
> >> usually means some cleanup is needed.
> >>
> >> I'm not completely sure of some of the details of the lines you change
> >> in your second patch, but they seem a little sketchy. I'll have to dig
> >> into the code and check it out.
> >>
> >> Also, for a non-trivial (>10 LoC) patch, you'll first need to submit
> >> your copyright assignment to the FSF.
> >> Giuseppe will arrange for the documents as soon as your patch is ready.
> >>
> >> The code however, does fix a segfault and maybe a few compiler
> >> warnings. When it fixes something, an explanation is usually a nice
> >> idea.
> >>
> >> [1]
> >>
> http://codeimpossible.com/2012/04/02/Trailing-whitespace-is-evil-Don-t-commit-evil-into-your-repo-/
> >> [2]
> >>
> https://stackoverflow.com/questions/1583406/why-does-git-care-about-trailing-whitespace-in-my-files
> >>
> >> On Mon, Mar 17, 2014 at 6:56 PM, Jure Grabnar 
> wrote:
> >> > Hi,
> >> >
> >> > this patch fixes some of compiler warnings. I was uncertain for the
> >> > remaining ones (5) - I believe some of them might be stubs for
> upcoming
> >> > features.
> >> >
> >> >  Best Regards,
> >> >
> >> >
> >> > Jure Grabnar (toomanysecrets)
> >>
> >>
> >>
> >> --
> >> Thanking You,
> >> Darshit Shah
> >
> >
>
>
>
> --
> Thanking You,
> Darshit Shah
>
From 149cba12e3687d3197360a639b0cb1365361dc68 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Thu, 20 Mar 2014 17:59:59 +0100
Subject: [PATCH 1/2] Fix metalink issues when type is not present.

---
 src/ChangeLog  |  6 ++
 src/metalink.c | 38 +-
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index f222037..6e4729c 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @

Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-20 Thread Darshit Shah
I have a few comments about your proposal.

1. When downloading multiple files from multiple servers, there is
also the use case where you may download multiple files while each is
downloaded using multiple threads. Think about this:

I have a 2 files I'm trying to download, both are 4 GB each. I'd
rather download them with 4 jobs each using 8 different servers than
as 2 jobs in all. It is a case which should otherwise be covered int
he implementation, but it should be there in your proposal too.

Think about how this might be implemented, and how this *might* differ
from the other cases.

2. You state re-implement Metalink 3.0 code as the first point in your
timeline. So, during this period will you not be looking at Metalink
4.0 compatibility at all? If so, when will you be looking at Metalink
4? Just mention it somewhere.

3. It looks slightly off when the dates on your timeline are not
continuous. Don't put any breaks there. You can mention Weekend, if
you want, but it should be documented. It's just easier to read when
the dates flow continuously.

4. If you have exams, relax for that period. Take some time off. Don't
mess up both your code and your exams. I suggest you concentrate on
your exams during that time.

5. You give yourself 5 days to merge code and write documentation.
That's not happening. Trust me. The merging process will take a lot
more time. You're going to be writing a lot of code that edits lots of
files in mailine. This code will not be simple to merge directly.
Also, documentation takes time.

There's also some other higher level comments about your timeline and work flow:

You wish to implement/fix Metalink first. But Metalink requires
concurrent downloading capability.
You may want to rethink the order in which you're implementing these features.

On Wed, Mar 19, 2014 at 7:40 PM, Jure Grabnar  wrote:
> Hi,
>
> my proposal is available on
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/toomanysecrets/5629499534213120
> I'm looking forward to your review.
>
> Regards,
>
>
> Jure Grabnar
>
>
>
> On 18 March 2014 01:06, Darshit Shah  wrote:
>>
>> Hi Jure,
>>
>> Thanks for your patches. However, I do have a few comments about the same:
>>
>> 1. Trailing Whitespaces: This is essentially extra whitespaces at the
>> end of a line or on a blank line. See [1] and [2] for more
>> information.
>> 2. The indentation is mostly right, but sometimes off.
>> 3. Your first patch is missing a ChangeLog. Every commit must be
>> accompanied by a ChangeLog entry, no matter how trivial it is.
>> 4. Your 2nd patch seems to revert things from the first one. This
>> usually means some cleanup is needed.
>>
>> I'm not completely sure of some of the details of the lines you change
>> in your second patch, but they seem a little sketchy. I'll have to dig
>> into the code and check it out.
>>
>> Also, for a non-trivial (>10 LoC) patch, you'll first need to submit
>> your copyright assignment to the FSF.
>> Giuseppe will arrange for the documents as soon as your patch is ready.
>>
>> The code however, does fix a segfault and maybe a few compiler
>> warnings. When it fixes something, an explanation is usually a nice
>> idea.
>>
>> [1]
>> http://codeimpossible.com/2012/04/02/Trailing-whitespace-is-evil-Don-t-commit-evil-into-your-repo-/
>> [2]
>> https://stackoverflow.com/questions/1583406/why-does-git-care-about-trailing-whitespace-in-my-files
>>
>> On Mon, Mar 17, 2014 at 6:56 PM, Jure Grabnar  wrote:
>> > Hi,
>> >
>> > this patch fixes some of compiler warnings. I was uncertain for the
>> > remaining ones (5) - I believe some of them might be stubs for upcoming
>> > features.
>> >
>> >  Best Regards,
>> >
>> >
>> > Jure Grabnar (toomanysecrets)
>>
>>
>>
>> --
>> Thanking You,
>> Darshit Shah
>
>



-- 
Thanking You,
Darshit Shah



Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-19 Thread Jure Grabnar
Hi,

my proposal is available on
http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/toomanysecrets/5629499534213120
I'm looking forward to your review.

Regards,


Jure Grabnar


On 18 March 2014 01:06, Darshit Shah  wrote:

> Hi Jure,
>
> Thanks for your patches. However, I do have a few comments about the same:
>
> 1. Trailing Whitespaces: This is essentially extra whitespaces at the
> end of a line or on a blank line. See [1] and [2] for more
> information.
> 2. The indentation is mostly right, but sometimes off.
> 3. Your first patch is missing a ChangeLog. Every commit must be
> accompanied by a ChangeLog entry, no matter how trivial it is.
> 4. Your 2nd patch seems to revert things from the first one. This
> usually means some cleanup is needed.
>
> I'm not completely sure of some of the details of the lines you change
> in your second patch, but they seem a little sketchy. I'll have to dig
> into the code and check it out.
>
> Also, for a non-trivial (>10 LoC) patch, you'll first need to submit
> your copyright assignment to the FSF.
> Giuseppe will arrange for the documents as soon as your patch is ready.
>
> The code however, does fix a segfault and maybe a few compiler
> warnings. When it fixes something, an explanation is usually a nice
> idea.
>
> [1]
> http://codeimpossible.com/2012/04/02/Trailing-whitespace-is-evil-Don-t-commit-evil-into-your-repo-/
> [2]
> https://stackoverflow.com/questions/1583406/why-does-git-care-about-trailing-whitespace-in-my-files
>
> On Mon, Mar 17, 2014 at 6:56 PM, Jure Grabnar  wrote:
> > Hi,
> >
> > this patch fixes some of compiler warnings. I was uncertain for the
> > remaining ones (5) - I believe some of them might be stubs for upcoming
> > features.
> >
> >  Best Regards,
> >
> >
> > Jure Grabnar (toomanysecrets)
>
>
>
> --
> Thanking You,
> Darshit Shah
>


Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-17 Thread Darshit Shah
Hi Jure,

Thanks for your patches. However, I do have a few comments about the same:

1. Trailing Whitespaces: This is essentially extra whitespaces at the
end of a line or on a blank line. See [1] and [2] for more
information.
2. The indentation is mostly right, but sometimes off.
3. Your first patch is missing a ChangeLog. Every commit must be
accompanied by a ChangeLog entry, no matter how trivial it is.
4. Your 2nd patch seems to revert things from the first one. This
usually means some cleanup is needed.

I'm not completely sure of some of the details of the lines you change
in your second patch, but they seem a little sketchy. I'll have to dig
into the code and check it out.

Also, for a non-trivial (>10 LoC) patch, you'll first need to submit
your copyright assignment to the FSF.
Giuseppe will arrange for the documents as soon as your patch is ready.

The code however, does fix a segfault and maybe a few compiler
warnings. When it fixes something, an explanation is usually a nice
idea.

[1] 
http://codeimpossible.com/2012/04/02/Trailing-whitespace-is-evil-Don-t-commit-evil-into-your-repo-/
[2] 
https://stackoverflow.com/questions/1583406/why-does-git-care-about-trailing-whitespace-in-my-files

On Mon, Mar 17, 2014 at 6:56 PM, Jure Grabnar  wrote:
> Hi,
>
> this patch fixes some of compiler warnings. I was uncertain for the
> remaining ones (5) - I believe some of them might be stubs for upcoming
> features.
>
>  Best Regards,
>
>
> Jure Grabnar (toomanysecrets)



-- 
Thanking You,
Darshit Shah



[Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-03-17 Thread Jure Grabnar
Hi,

this patch fixes some of compiler warnings. I was uncertain for the
remaining ones (5) - I believe some of them might be stubs for upcoming
features.

 Best Regards,


Jure Grabnar (toomanysecrets)
From f81e0c25456d1c60d998b62cb1ea2740ddd1a353 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Fri, 14 Mar 2014 08:38:25 +0100
Subject: [PATCH 1/2] Fixed some metalink issues with openSUSE.iso

Still returns memory corruption bug.
---
 src/metalink.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/metalink.c b/src/metalink.c
index 76db2fa..9bb2274 100644
--- a/src/metalink.c
+++ b/src/metalink.c
@@ -42,6 +42,7 @@ as that of the covered work.  */
 #include "sha256.h"
 #include "metalink.h"
 #include "utils.h"
+#include "url.h"
 
 
 #define HASH_TYPES 3
@@ -134,7 +135,28 @@ parse_metalink(char *input_file)
   ++(file->num_of_res);
 
   resource->url = xstrdup ((*resources)->url);
-  resource->type = ((*resources)->type ? xstrdup ((*resources)->type) : NULL);
+  
+  if ((*resources)->type)
+resource->type = xstrdup((*resources)->type);
+  else
+{
+enum url_scheme res_scheme = url_scheme(resource->url);
+switch (res_scheme) 
+  {
+  case SCHEME_HTTP:
+resource->type = "http";
+break;
+  case SCHEME_HTTPS:
+resource->type = "https";
+break;
+  case SCHEME_FTP:
+resource->type = "ftp";
+break;
+  default:
+resource->type = NULL;
+  }
+}
+  
   resource->location = ((*resources)->location ? xstrdup ((*resources)->location) : NULL);
   resource->preference = (*resources)->preference;
   resource->maxconnections = (*resources)->maxconnections;
@@ -143,7 +165,7 @@ parse_metalink(char *input_file)
   (file->resources) = resource;
 }
 
-  for (checksums = (*files)->checksums; *checksums; ++checksums)
+  for (checksums = (*files)->checksums; checksums && *checksums; ++checksums)
 {
   mlink_checksum *checksum = malloc (sizeof(mlink_checksum));
 
@@ -215,7 +237,7 @@ elect_resources (mlink *mlink)
 
   while (res_next = res->next)
 {
-  if (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http"))
+  if (res_next->type == NULL || (strcmp(res_next->type, "ftp") && strcmp(res_next->type, "http")))
 {
   res->next = res_next->next;
   free(res_next);
@@ -224,7 +246,7 @@ elect_resources (mlink *mlink)
 res = res_next;
 }
   res = file->resources;
-  if (strcmp(res->type, "ftp") && strcmp(res->type, "http"))
+  if (res->type == NULL || (strcmp(res->type, "ftp") && strcmp(res->type, "http")))
 {
   file->resources = res->next;
   free(res);
-- 
1.9.0

From 59623d0171b3d1aecc9c41f646807416d392cdc7 Mon Sep 17 00:00:00 2001
From: Jure Grabnar 
Date: Mon, 17 Mar 2014 16:41:16 +0100
Subject: [PATCH 2/2] Fix some compiler warnings.

---
 src/ChangeLog  | 19 +++
 src/http.c |  5 +
 src/iri.h  |  2 +-
 src/metalink.c | 55 +++
 src/multi.c|  8 
 src/multi.h|  2 --
 src/progress.c |  1 -
 src/recur.c|  2 ++
 src/retr.c |  4 ++--
 9 files changed, 64 insertions(+), 34 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index f222037..693e75d 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,22 @@
+2014-03-17	Jure Grabnar  
+	
+	* http.c (gethttp): Remove unused variable 'tmp2'.
+	* iri.h: Fix macro parse_charset(str) to not cause 'left-hand operand of
+	comma expression has no effect'.
+	* metalink.c (parse_metalink): Remove unused variable 'piece_hash'.
+	(elect_resources): Add parentheses to assigment.
+	(elect_checksums): Same. Add cast to (unsigned char *).
+	(veriy_file_hash): Add casts to (char *).
+	* multi.h: Remove prototype of static function segmented_retrieve_url().
+	* multi.c: Include "retr.h". Add prototype of static function 
+	segmented_retrieve_url();
+ 	(collect_thread): Add return value to suppress compiler warning.
+	(segmented_retrieve_url): Same.
+	* recur.c (start_retrieve_url): Same.
+	* progress.c: Removed unused variable 'header'.
+	* retr.c (retrieve_from_file): Remove unused variables 'i', 'url_err', 
+	'retries'.
+
 2014-03-04  Giuseppe Scrivano  
 
 	* http.c (modify_param_value, extract_param): Aesthetic change.
diff --git a/src/http.c b/src/http.c
index c5c4d8a..fa627a7 100644
--- a/src/http.c
+++ b/src/http.c
@@ -2872,9 +2872,6 @@ read_header:
   char *tmp = strchr (type, ';');
   if (tmp)
 {
-  /* sXXXav: only needed if IRI support is enabled */
-  char *tmp2 = tmp + 1;
-
   while (tmp > type && c_isspace (tmp[-