Commit e39be3283836b8cb7b9ee320456eefb2a2fda173 added a message that said links will not be followed whenever the nofollow attribute is found in a page. It didn't take into account that with -e robots=off (and equivalents) links will still be followed.
This bug has been noticed multiple times:
* https://www.reddit.com/r/DataHoarder/comments/mprq89/wget_respects_nofollow_attribute_despite_e/
* https://gist.github.com/simonw/27e810771137408fd7834ad153750c41#gistcomment-3648191
* https://superuser.com/questions/1494761/wget-wont-ignore-no-follow-attributes

This commit makes it so that this message is only printed when a nofollow
link is found and the norobots convention is respected.
---
 src/html-url.c | 3 ---
 src/recur.c    | 1 +
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/html-url.c b/src/html-url.c
index 2d79ca49..acf7f515 100644
--- a/src/html-url.c
+++ b/src/html-url.c
@@ -837,9 +837,6 @@ get_urls_html_fm (const char *file, const struct file_memory *fm,
 #endif
   xfree (meta_charset);
 
-  if (ctx.nofollow) {
-    logprintf(LOG_VERBOSE, _("no-follow attribute found in %s. Will not follow any links on this page\n"), file);
-  }
   DEBUGP (("no-follow in %s: %d\n", file, ctx.nofollow));
 
   if (meta_disallow_follow)
diff --git a/src/recur.c b/src/recur.c
index 7bc4ec42..3c4136d1 100644
--- a/src/recur.c
+++ b/src/recur.c
@@ -427,6 +427,7 @@ retrieve_tree (struct url *start_url_parsed, struct iri *pi)
 
               if (opt.use_robots && meta_disallow_follow)
                 {
+                  logprintf(LOG_VERBOSE, _("no-follow attribute found in %s. Will not follow any links on this page\n"), file);
                   free_urlpos (children);
                   children = NULL;
                 }
-- 
2.29.3
