Nevertheless,
I just wanted to point out that not every library seems to properly
validate/sanitize all the input:
(core/data/url/handlers/redirect.py)
    # fix a possible malformed URL
    urlparts = urlparse.urlparse(newurl)
    if not urlparts.path:
        urlparts = list(urlparts)
        urlparts[2] = "/"
    newurl = urlparse.urlunparse(urlparts)
urlparse, for example, won't complain about a URL like
"http://foobar.com:some_non_integer_input/foo...".
>>> urlparse.urlparse("http://w3af.org:fooo:myhost.com/foo?bar=bla").netloc
'w3af.org:fooo:myhost.com'
>>> urlparse.urlparse("http://w3af.org:fooo:myhost.com/foo?bar=bla").hostname
'w3af.org'
>>>
It will raise an exception when you access the "port" attribute, but there is
no type casting (and hence no check) performed for the "netloc" attribute.
Worse: no validation is performed when you unparse the urlparse object.
So in the end, newurl could be --something---.
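A minimal sketch of how the redirect handler could guard against this. It is only an illustration under some assumptions: it uses Python 3's urllib.parse (the snippet above uses the Python 2 urlparse module), and normalize_url is a hypothetical name, not a function that exists in w3af. It relies on the behaviour shown above: parsing succeeds, and only reading "port" triggers the integer conversion.

```python
from urllib.parse import urlparse, urlunparse


def normalize_url(newurl):
    """Hypothetical helper: add a missing path, but reject a bad port."""
    urlparts = urlparse(newurl)
    try:
        # Parsing alone does not validate the port; reading the
        # attribute is what forces the conversion to an integer.
        _ = urlparts.port
    except ValueError:
        raise ValueError("malformed port in URL: %r" % (newurl,))
    if not urlparts.path:
        # Same fix as in redirect.py: an empty path becomes "/".
        urlparts = urlparts._replace(path="/")
    return urlunparse(urlparts)
```

With this check, the "w3af.org:fooo:myhost.com" URL from above is rejected before it is unparsed, instead of being passed on unchanged.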
I'm kind of afraid of bugs like that, but this topic isn't related to the UTF-8
stuff anymore...
Regards,
Daniel
On 16.02.2012 at 15:35, Andres Riancho wrote:
> Daniel,
>
> On Thu, Feb 16, 2012 at 10:38 AM, Daniel Zulla
> <[email protected]> wrote:
>> All software has vulnerabilities, it's in their nature :)
>>
>>
>> Right.
>>
>> Not really. As soon as the byte string enters w3af, the best
>> thing to do is to decode it using the best encoding available (the one
>> in the Content-Type header, or some other we might have in the HTTP
>> response); after that, all the rest of w3af's code simply forgets
>> about encodings and uses the unicode string.
>>
>>
>> Cool.
>>
>> Vulnerable to what?
>>
>>
>> A forced crash. I can't see any validation of the incoming data. E.g.:
>> Is resp.code really an integer > 100 and < 900?
>
> That's because the validation is done in httplib; please see "def
> _read_status(self):" in httplib.py. We use urllib2, which uses
> httplib, so we don't have to worry about that. The worst thing that
> can happen is that we get a BadStatusLine exception, and we're handling
> those in our code in order to avoid crashes.
>
>> We're not assuming that; if the response is not HTTP, then httplib,
>> or urllib, or urllib2 (don't really know which one) will fail and
>> raise an exception.
>>
>>
>> That's my point. I would like to be sure about that. Because, for example,
>> if additional C++-based code is added to w3af one day, and there are ways
>> to bypass filters or to cause exceptions, a Python exception could very
>> quickly turn into a really dangerous, exploitable flaw in code referenced
>> through PyQt4 or Cython.
>
> Could be, but we ARE doing proper error handling in xUrllib and httplib.py
>
>> Could you explain this to me in a little more detail? I tried to
>> google for ChunkOfUnidentified or ChunkOfUnidentifiedData and found
>> nothing.
>>
>>
>> http://docs.python.org/release/3.0.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit
>
> Quoting you: "Everything is a ChunkOfUnidentified data until it gets
> converted to a string. If it's a string, it's Unicode and everything is
> fine. If not, everything breaks immediately."
>
> "Everything is a ChunkOfUnidentified data until it gets converted to a
> string. If it's a string, it's Unicode and everything is fine." That's
> what we're doing now at w3af. We receive a string of bytes and convert
> it to a unicode string based on the encoding that was indicated by the
> HTTP response. In some cases we get errors in the conversion
> (for various reasons that would also occur in py3k); that's
> why we have those bugs.
>
> "If not, everything breaks immediately." We're trying to avoid that :)
> The problem is that if we use errors=ignore/replace we end up in a
> situation where we don't know about the errors and can't fix them.
>
> PS: Please check how to properly answer emails inline so that it is
> then easier to answer back :)
>
>> Regards,
>> Daniel
>> On 16.02.2012 at 14:26, Andres Riancho wrote:
>>
>> Daniel,
>>
>> On Thu, Feb 16, 2012 at 10:07 AM, Daniel Zulla
>> <[email protected]> wrote:
>>
>> I have analyzed some closed source vulnerability scanners, and audited open
>> source scanners like skipfish.
>>
>> Some of them are ironically vulnerable. Somebody may create an apache2
>> module that recognizes attacks in order to force penetration testers'
>> software to crash (or worse, e.g. to execute arbitrary code).
>>
>>
>> All software has vulnerabilities, it's in their nature :)
>>
>> errors=ignore or errors=replace may be a nice way to go, but - here are my
>> two cents:
>>
>> Treating HTTP responses as an "UnidentifiedChunkOfPossiblyMaliciousData" for
>> as long as possible is definitely the right way to go.
>>
>>
>> Not really. As soon as the byte string enters w3af, the best
>> thing to do is to decode it using the best encoding available (the one
>> in the Content-Type header, or some other we might have in the HTTP
>> response); after that, all the rest of w3af's code simply forgets
>> about encodings and uses the unicode string.
>>
>> I haven't audited or reviewed httplib, but the "from_httplib_resp"
>> method looks extremely vulnerable:
>>
>>
>> Vulnerable to what?
>>
>>     resp = httplibresp
>>     code, msg, hdrs, body = (resp.code, resp.msg, resp.info(), resp.read())
>>
>>     if original_url:
>>         url_inst = url_object(resp.geturl(), original_url.encoding)
>>     else:
>>         url_inst = original_url = url_object(resp.geturl())
>>
>>     charset = getattr(httplibresp, 'encoding', None)
>>     return httpResponse(code, body, hdrs, url_inst,
>>                         original_url, msg, charset=charset)
>>
>>
>> I am just skeptical about assuming that the response of a webserver is valid
>> HTTP.
>>
>>
>> We're not assuming that; if the response is not HTTP, then httplib,
>> or urllib, or urllib2 (don't really know which one) will fail and
>> raise an exception.
>>
>> That's why I mentioned py3k - it's exactly how Python 3 handles external
>> data:
>>
>> Everything is a ChunkOfUnidentified data until it gets converted to a
>> string. If it's a string, it's Unicode and everything is fine. If not,
>> everything breaks immediately.
>>
>>
>> Could you explain this to me in a little more detail? I tried to
>> google for ChunkOfUnidentified or ChunkOfUnidentifiedData and found
>> nothing.
>>
>>
>> Regards,
>>
>> Daniel
>>
>>
>> On 16.02.2012 at 13:33, Andres Riancho wrote:
>>
>>
>> sends a string of bytes back to you in the HTTP response.
>>
>>
>> Do you have some code / an example where those exceptions usually appear in
>> the current w3af code?
>>
>>
>> Regards,
>>
>> Daniel
>>
>>
>> On 15.02.2012 at 22:06, Javier Andalia wrote:
>>
>>
>> Hello Daniel,
>>
>>
>> On Wed, Feb 15, 2012 at 5:11 PM, Daniel Zulla
>>
>> <[email protected]> wrote:
>>
>> What about switching over to Python3?
>>
>> It solves the UnicodeDecodeError madness.
>>
>>
>> Can you please be more specific? What exactly do you have in mind?
>>
>>
>> Maybe I'm wrong, but the way I see it w3af would still
>> receive/transmit encoded bytes so there's no way to skip the
>> bytestring_to_unicode and unicode_to_bytestring conversions. Not even
>> in py3k.
>>
>>
>> Regards,
>>
>>
>> Javier
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>> Andrés Riancho
>> Director of Web Security at Rapid7 LLC
>> Founder at Bonsai Information Security
>> Project Leader at w3af
>>
>>
_______________________________________________
W3af-develop mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/w3af-develop