# Skips all double-encoded [ui]ris because it reinterprets them, outside url.c:reencode_escapes(), probably in iri.c.
wget --iri -mr http://www.liteirc.net/mirrors/siyobik.info/reference.html

# Works
wget --no-iri -mr http://www.liteirc.net/mirrors/siyobik.info/reference.html

Correct [ui]ri: http://www.liteirc.net/mirrors/siyobik.info/instruction/XLAT%252FXLATB.html (200)
Incorrect [ui]ri: http://www.liteirc.net/mirrors/siyobik.info/instruction/XLAT%2FXLATB.html (404)

# pcnt_decode(pcnt_decode("%252F") -> "%2F") -> "/"

Simple-but-incomplete hackaround: use --no-iri

To improve compatibility when mirroring international sites, the iri code path could approximate the behavior of url.c/url_parse() by avoiding unnecessary modification of the [ui]ris extracted during --mirror, possibly around the time it adds them to or dequeues them from the queue.

Best,
Barry Allard
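
P.S. The double decode described above can be reproduced outside wget. The following is a minimal sketch in Python, not wget's code: it assumes urllib.parse.unquote as a stand-in for the pcnt_decode shorthand used above, and the variable names are made up for illustration.

```python
from urllib.parse import unquote

# Path segment as linked from the mirrored page: the file name contains a
# literal "%2F", so the "%" itself is escaped as "%25".
published = "XLAT%252FXLATB.html"

# One decode pass: "%25" becomes "%", leaving the escape "%2F" in the text.
once = unquote(published)

# A second decode pass treats that "%2F" as an escape and yields "/",
# which changes the path structure.
twice = unquote(once)

print(once)
print(twice)
```

If a URI passes through the second step before being re-escaped and requested, the "%252F" form that returns 200 can no longer be recovered, which would be consistent with the 404 shown above.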
