Ah HA!  I don't know about the bug in wget, but I found the oddness in
the site being crawled which was *causing* wget to trip *its* bug:

<script
src=”http://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js”></script>

Took forever to spot this: somebody put "pretty quotes" in a CSS file in
a WordPress theme. Browsers and wget alike don't recognize the pretty
quotes as quotes for coding purposes, so the whole string, quotes
included, gets treated as a relative URL and you end up trying to fetch
a really, really broken URL:

http://www.[redacted]/%E2%80%9Dhttp:/ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js%E2%80%9D
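
For anyone who wants to see how that URL comes about, here is a rough
sketch in Python. It is not wget's actual code path, just the same
relative-URL resolution done with the standard library, and
www.example.com is a made-up stand-in for the redacted site:

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit

# Hypothetical stand-in for the redacted site being crawled.
PAGE_URL = "http://www.example.com/"

# The src value as a parser sees it: U+201D at both ends, no real quotes.
RAW_SRC = "\u201dhttp://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js\u201d"


def resolve(page_url, raw_src):
    """Resolve raw_src against page_url and percent-encode the path."""
    # The leading curly quote means no scheme is recognized, so the
    # value is joined onto the page URL as a relative reference.
    joined = urljoin(page_url, raw_src)
    parts = urlsplit(joined)
    # Non-ASCII characters are sent as percent-encoded UTF-8 bytes.
    encoded_path = quote(parts.path, safe="/:")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


broken_url = resolve(PAGE_URL, RAW_SRC)
print(broken_url)
```

The printed URL should have the same shape as the one above: the page's
host, then %E2%80%9D (the UTF-8 bytes of the closing pretty quote), then
the googleapis address as a path.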

Most browsers just try to get that URL, fail at it, and move on with
life, but the new version of wget in 12.04 actually *segfaults* when it
encounters it, which of course only happens if recursion is turned on.
If it helps: you really do ONLY get this in recursion; an attempt to
fetch the botched URL manually - either using the percent-encoded form,
or using the pretty quotes directly at the shell - results in the
expected 404, not a segfault.
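
Since this took so long to spot by eye, here is a small sketch of a
scan that would have flagged it. It assumes the problem always looks
like an "=" followed directly by a typographic quote; the function name
and the sample text are made up for illustration:

```python
import re

# Typographic quotes: U+2018, U+2019, U+201C, U+201D.
CURLY_AFTER_EQUALS = re.compile(r"=\s*[\u2018\u2019\u201c\u201d]")


def find_curly_quoted_attrs(text):
    """Return (line_number, line) pairs where '=' is followed by a curly quote."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if CURLY_AFTER_EQUALS.search(line):
            hits.append((number, line.strip()))
    return hits


sample = (
    "<script\n"
    "src=\u201dhttp://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js\u201d></script>\n"
    '<link rel="stylesheet" href="style.css">\n'
)

hits = find_curly_quoted_attrs(sample)
for number, line in hits:
    print(number, line)
```

Running the same function over the contents of each theme file should
point straight at the offending line.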

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1022124

Title:
  segfault in wget 1.13.4
