[Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-05 Thread Orange Tsai
Hi

I just found that there is a CRLF Injection in the latest version of Wget
1.19.

Wget uses urlencode to encode CRLF in PATH part but doesn't use in HOST
port. So an attacker can inject arbitrary header in the request.


For example:

# This will fail
$ wget 'http://127.0.0.1/%0d%0Cookie: hi'

GET /%0d%0Cookie:%20hi HTTP/1.1
User-Agent: Wget/1.19 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: 127.0.0.1
Connection: Keep-Alive


# This will work
$ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'

GET / HTTP/1.1
User-Agent: Wget/1.19 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [127.0.0.1
*cookie: hi*
]
Connection: Keep-Alive


Wish you aware this, thanks for your reading :)

-- 
- Orange -


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Tim Ruehsen
On Monday, March 6, 2017 3:42:24 AM CET Orange Tsai wrote:
> Hi
> 
> I just found that there is a CRLF Injection in the latest version of Wget
> 1.19.
> 
> Wget uses urlencode to encode CRLF in PATH part but doesn't use in HOST
> port. So an attacker can inject arbitrary header in the request.
> 
> 
> For example:
> 
> # This will fail
> $ wget 'http://127.0.0.1/%0d%0Cookie: hi'
> 
> GET /%0d%0Cookie:%20hi HTTP/1.1
> User-Agent: Wget/1.19 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: 127.0.0.1
> Connection: Keep-Alive
> 
> 
> # This will work
> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'
> 
> GET / HTTP/1.1
> User-Agent: Wget/1.19 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: [127.0.0.1
> *cookie: hi*
> ]
> Connection: Keep-Alive
> 
> 
> Wish you aware this, thanks for your reading :)

Thanks, just pushed a commit, not allowing control chars in host part.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Dale R. Worley
Orange Tsai  writes:
> # This will work
> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'

Not even considering the effect on headers, it's surprising that wget
doesn't produce an immediate error, since
"127.0.0.1%0d%0aCookie%3a hi%0a" is syntactically invalid as a host
part.  Why doesn't wget's URL parser detect that?  I'm sure the new
patch is an improvement, but it's surprising that the old code didn't
detect that was an invalid URL anyway, since it contains characters that
aren't permitted in those locations.

Dale



Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Eli Zaretskii
> From: Tim Ruehsen 
> Date: Mon, 06 Mar 2017 10:17:25 +0100
> Cc: Orange Tsai 
> 
> Thanks, just pushed a commit, not allowing control chars in host part.

Hmm... is it really enough to reject only ASCII control characters?
Maybe we should also reject control characters from other Unicode
ranges?  Just a thought.



Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Orange Tsai
I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
evil.com`, not `good.com`.
Most of URL parser will recognize `good.com` is host part. Like this
advisory, https://curl.haxx.se/docs/adv_20161102J.html
It seem more dangerous if a developer still rely on the result of parse URL
than my original report.

Some testing:
$ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'

Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
PHP scheme=http, host=127.2.2.2, port=
Perl scheme=http, host=127.2.2.2, port=80
Ruby2 scheme=http, host=127.2.2.2, port=
GO scheme=http, host=127.2.2.2, port=
Java scheme=http, host=, port=-1
JS scheme=http, host=127.2.2.2, port=null



But it seems also the same root cause and fixed at this patch. :)
By the way, would you mind that allocating a CVE-ID to address this?

2017-03-07 0:11 GMT+08:00 Eli Zaretskii :

> > From: Tim Ruehsen 
> > Date: Mon, 06 Mar 2017 10:17:25 +0100
> > Cc: Orange Tsai 
> >
> > Thanks, just pushed a commit, not allowing control chars in host part.
>
> Hmm... is it really enough to reject only ASCII control characters?
> Maybe we should also reject control characters from other Unicode
> ranges?  Just a thought.
>



-- 
- Orange -


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Tim Rühsen
On Dienstag, 7. März 2017 02:01:06 CET Orange Tsai wrote:
> I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
> evil.com`, not `good.com`.
> Most of URL parser will recognize `good.com` is host part. Like this
> advisory, https://curl.haxx.se/docs/adv_20161102J.html
> It seem more dangerous if a developer still rely on the result of parse URL
> than my original report.
> 
> Some testing:
> $ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'
> 
> Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
> PHP scheme=http, host=127.2.2.2, port=
> Perl scheme=http, host=127.2.2.2, port=80
> Ruby2 scheme=http, host=127.2.2.2, port=
> GO scheme=http, host=127.2.2.2, port=
> Java scheme=http, host=, port=-1
> JS scheme=http, host=127.2.2.2, port=null
> 
> 
> 
> But it seems also the same root cause and fixed at this patch. :)
> By the way, would you mind that allocating a CVE-ID to address this?

I'd appreciate that. But I never did that, so who does allocate a CVE how and 
where ? I am willing to learn :-)

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Orange Tsai
But still thanks :)
I will try to ask for allocating a CVE from https://cve.mitre.org/


2017-03-07 3:05 GMT+08:00 Orange Tsai :

> Oops
>
> That my fault. I sent the wrong mail.
>
> Very sorry :(
>
> 2017-03-07 3:03 GMT+08:00 Tim Rühsen :
>
>> On Dienstag, 7. März 2017 02:01:06 CET Orange Tsai wrote:
>> > I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
>> > evil.com`, not `good.com`.
>> > Most of URL parser will recognize `good.com` is host part. Like this
>> > advisory, https://curl.haxx.se/docs/adv_20161102J.html
>> > It seem more dangerous if a developer still rely on the result of parse
>> URL
>> > than my original report.
>> >
>> > Some testing:
>> > $ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'
>> >
>> > Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
>> > PHP scheme=http, host=127.2.2.2, port=
>> > Perl scheme=http, host=127.2.2.2, port=80
>> > Ruby2 scheme=http, host=127.2.2.2, port=
>> > GO scheme=http, host=127.2.2.2, port=
>> > Java scheme=http, host=, port=-1
>> > JS scheme=http, host=127.2.2.2, port=null
>> >
>> >
>> >
>> > But it seems also the same root cause and fixed at this patch. :)
>> > By the way, would you mind that allocating a CVE-ID to address this?
>>
>> I'd appreciate that. But I never did that, so who does allocate a CVE how
>> and
>> where ? I am willing to learn :-)
>>
>> Tim
>>
>
>
>
> --
> - Orange -
>



-- 
- Orange -


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Orange Tsai
Oops

That my fault. I sent the wrong mail.

Very sorry :(

2017-03-07 3:03 GMT+08:00 Tim Rühsen :

> On Dienstag, 7. März 2017 02:01:06 CET Orange Tsai wrote:
> > I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
> > evil.com`, not `good.com`.
> > Most of URL parser will recognize `good.com` is host part. Like this
> > advisory, https://curl.haxx.se/docs/adv_20161102J.html
> > It seem more dangerous if a developer still rely on the result of parse
> URL
> > than my original report.
> >
> > Some testing:
> > $ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'
> >
> > Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
> > PHP scheme=http, host=127.2.2.2, port=
> > Perl scheme=http, host=127.2.2.2, port=80
> > Ruby2 scheme=http, host=127.2.2.2, port=
> > GO scheme=http, host=127.2.2.2, port=
> > Java scheme=http, host=, port=-1
> > JS scheme=http, host=127.2.2.2, port=null
> >
> >
> >
> > But it seems also the same root cause and fixed at this patch. :)
> > By the way, would you mind that allocating a CVE-ID to address this?
>
> I'd appreciate that. But I never did that, so who does allocate a CVE how
> and
> where ? I am willing to learn :-)
>
> Tim
>



-- 
- Orange -


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Tim Rühsen
On Montag, 6. März 2017 18:11:52 CET Eli Zaretskii wrote:
> > From: Tim Ruehsen 
> > Date: Mon, 06 Mar 2017 10:17:25 +0100
> > Cc: Orange Tsai 
> > 
> > Thanks, just pushed a commit, not allowing control chars in host part.
> 
> Hmm... is it really enough to reject only ASCII control characters?
> Maybe we should also reject control characters from other Unicode
> ranges?  Just a thought.

That is a different issue. And non-ASCII chars will be translated into 
punycode form if libidn2 / IRI support is built in. With --disable-iri we 
perhaps should reject any non-ASCII chars !? I am open to a patch...

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Tim Rühsen
On Dienstag, 7. März 2017 02:01:06 CET Orange Tsai wrote:
> I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
> evil.com`, not `good.com`.
> Most of URL parser will recognize `good.com` is host part. Like this
> advisory, https://curl.haxx.se/docs/adv_20161102J.html

The advisory is different in details (it's about # in userinfo, which is 
forbidden regarding RFC 3986).

userinfo does not contain '@' and since 
authority   = [ userinfo "@" ] host [ ":" port ]
we know the userinfo is 'user' and than begins the host part.

What is not correct in your example is that the port is not followed by /. So 
this kind of 'garbage' should result in an error (curl and wget2 ignore 
garbage after the port, which might not be correct, but is 'relaxed' style of 
parsing).

> It seem more dangerous if a developer still rely on the result of parse URL
> than my original report.
> 
> Some testing:
> $ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'
> 
> Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
> PHP scheme=http, host=127.2.2.2, port=
> Perl scheme=http, host=127.2.2.2, port=80
> Ruby2 scheme=http, host=127.2.2.2, port=
> GO scheme=http, host=127.2.2.2, port=
> Java scheme=http, host=, port=-1
> JS scheme=http, host=127.2.2.2, port=null

The only parser that handles it correctly is Java: returning an error.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-06 Thread Orange Tsai
Sorry, please ignore that. That issue is not related to Wget. I sent to the
wrong mailing list. It's my fault :(

Tim Rühsen 於 2017年3月7日 週二,上午4:26寫道:

> On Dienstag, 7. März 2017 02:01:06 CET Orange Tsai wrote:
> > I am surprise that `http://u...@evil.com:8...@good.com` will connect to `
> > evil.com`, not `good.com`.
> > Most of URL parser will recognize `good.com` is host part. Like this
> > advisory, https://curl.haxx.se/docs/adv_20161102J.html
>
> The advisory is different in details (it's about # in userinfo, which is
> forbidden regarding RFC 3986).
>
> userinfo does not contain '@' and since
> authority   = [ userinfo "@" ] host [ ":" port ]
> we know the userinfo is 'user' and than begins the host part.
>
> What is not correct in your example is that the port is not followed by /.
> So
> this kind of 'garbage' should result in an error (curl and wget2 ignore
> garbage after the port, which might not be correct, but is 'relaxed' style
> of
> parsing).
>
> > It seem more dangerous if a developer still rely on the result of parse
> URL
> > than my original report.
> >
> > Some testing:
> > $ python try.py 'http://user@127.3.3.3:80@127.2.2.2/x'
> >
> > Python scheme=http, host=user@127.3.3.3:80@127.2.2.2, port=
> > PHP scheme=http, host=127.2.2.2, port=
> > Perl scheme=http, host=127.2.2.2, port=80
> > Ruby2 scheme=http, host=127.2.2.2, port=
> > GO scheme=http, host=127.2.2.2, port=
> > Java scheme=http, host=, port=-1
> > JS scheme=http, host=127.2.2.2, port=null
>
> The only parser that handles it correctly is Java: returning an error.
>
> Tim
>
-- 
- Orange -


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-07 Thread Ander Juaristi
Hi Dale,

On 06/03/17 16:47, Dale R. Worley wrote:
> Orange Tsai  writes:
>> # This will work
>> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'
> 
> Not even considering the effect on headers, it's surprising that wget
> doesn't produce an immediate error, since
> "127.0.0.1%0d%0aCookie%3a hi%0a" is syntactically invalid as a host
> part.  Why doesn't wget's URL parser detect that?

Simply because it first splits the URL into several parts according to
the delimiters, and then decodes the percent-encoding.

Additionally for the host part it also checks whether it's an IP address
and the IDNA stuff, but yeah you raise a good point. Other than that the
host part is treated similarly to the other parts.

So all in a rush I see RFC 1034 says a domain name should have "any one
of the 52 alphabetic characters A through Z in upper case and a through
z in lower case", and digits, basically.

Do you think it's enough to just blacklist anything outside
[a-z0-9\.\-_], or is there something else to be done?

> I'm sure the new
> patch is an improvement, but it's surprising that the old code didn't
> detect that was an invalid URL anyway, since it contains characters that
> aren't permitted in those locations.
> 
> Dale
> 



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-08 Thread Tim Ruehsen
On Tuesday, March 7, 2017 10:19:15 PM CET Ander Juaristi wrote:
> Hi Dale,
> 
> On 06/03/17 16:47, Dale R. Worley wrote:
> > Orange Tsai  writes:
> >> # This will work
> >> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'
> > 
> > Not even considering the effect on headers, it's surprising that wget
> > doesn't produce an immediate error, since
> > "127.0.0.1%0d%0aCookie%3a hi%0a" is syntactically invalid as a host
> > part.  Why doesn't wget's URL parser detect that?
> 
> Simply because it first splits the URL into several parts according to
> the delimiters, and then decodes the percent-encoding.
> 
> Additionally for the host part it also checks whether it's an IP address
> and the IDNA stuff, but yeah you raise a good point. Other than that the
> host part is treated similarly to the other parts.
> 
> So all in a rush I see RFC 1034 says a domain name should have "any one
> of the 52 alphabetic characters A through Z in upper case and a through
> z in lower case", and digits, basically.
> 
> Do you think it's enough to just blacklist anything outside
> [a-z0-9\.\-_], or is there something else to be done?

We are talking about URL validity, not DNS (That is what we use IDNA 2008 / 
TR46 for). Wget should comply to RFC3986 + RFC7320, basically. There are some 
subtleties of course, a good starting point is https://daniel.haxx.se/blog/
2017/01/30/one-url-standard-please/. Thanks, Daniel.

Now we see that 'relaxed' parsing has it's caveats and may easily result in 
security issues.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Vulnerability Report - CRLF Injection in Wget Host Part

2017-03-08 Thread Dale R. Worley
Ander Juaristi  writes:
> On 06/03/17 16:47, Dale R. Worley wrote:
>> Orange Tsai  writes:
>>> # This will work
>>> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'
>> 
>> Not even considering the effect on headers, it's surprising that wget
>> doesn't produce an immediate error, since
>> "127.0.0.1%0d%0aCookie%3a hi%0a" is syntactically invalid as a host
>> part.  Why doesn't wget's URL parser detect that?
>
> Simply because it first splits the URL into several parts according to
> the delimiters, and then decodes the percent-encoding.
>
> Additionally for the host part it also checks whether it's an IP address
> and the IDNA stuff, but yeah you raise a good point. Other than that the
> host part is treated similarly to the other parts.

Ah, I looked into RFC 3986, and the generic syntax *does* allow the host
part to contain %-escapes.  But in any case,
"127.0.0.1Cookie:hi" is not parsable as an IPv4
address.  (Always beware of parsing functions that stop when they see
the first invalid character!)

(Also, shouldn't the above example have ended "hi%0d%0a/"?)

Dale