Hi,

>>     What I think is missing from the XSS plugin is the ability to know
>> WHERE the user controlled information is echoed back. By where I mean
>> x or y:
>>
>> <tag parameter="x">y</tag>
>>
>>     When the plugin identifies that the user controlled information is
>> being used in x, then it should check if it is possible to escape from
>> the parameter string using another " (in some cases another '), if its
>> not possible because the character is escaped, then xss is not
>> possible (correct me if I'm missing some edge case here). In the y
>> case, the plugin should check if it is able to send < AND > and they
>> don't get escaped.
>>
>>     I think that it would be really cool to add this logic to the
>> plugin, which will transform the act of sending the vectors in a
>> "proof of concept" because the plugin will already know its
>> exploitable in some way.
>>
>>     Got my idea? What do you think?
>>   
>>     
> Yes, it sounds like a great idea! This is the way I think it *should*
> work :
>
> 1 Send out reflector probes (as right now)
> 2 Check responses, if reflected continue (as right now)
> 3 Check in what context the reflection occurred (may be several - each
> one must be tested for):
>    <tag param="a" param='a2'>b</tag>
> special case :
> <script>foo="c"  ; baz = 'c2'</script>
>
> 4. Based on context, determine what characters are needed to break
> context (if needed) and possibly execute xss. In examples above :
> a: "
> a2:'
> b:<>
> c:" or <>
> c2: ' or <>
>
> 5. Test the generated list of characters clumped together
> 6. Optional step (thoroughtest) : If the parameter was not reflected at
> all this time, test each character separately
> 7. Report any findings where context can be broken
> 8. Optional step (xss-vector): Check extra chars such as () and others
> needed to construct a vector
> 8b: Test xss-vectors based on allowed chars
> 8c: Report xss
>
> Would you agree?
> Then only one xss module is needed, if it can be configured to do only
> up to step 8 (and always reports if markup can be broken).
> /Martin
>
>   
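Steps 3 and 4 of the quoted list can be sketched roughly as follows. This is only an illustration, written for Python 3's `html.parser` rather than the Python 2 `HTMLParser` module used in the attached plugin; the class name `ContextSketch`, the helper `break_chars_for` and the probe string `zqxj` are made up for the example. Script contexts and unquoted attributes are deliberately left out.

```python
from html.parser import HTMLParser


class ContextSketch(HTMLParser):
    """Collects the characters needed to break out of each context in
    which the probe string is reflected (hypothetical helper)."""

    def __init__(self, probe):
        super().__init__()
        self.probe = probe
        self.break_chars = set()

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        pos = 0
        for name, value in attrs:
            # Attributes without a value are reported as None
            if value is None or self.probe not in value:
                continue
            # Locate this reflection in the raw tag text; the character
            # just before the value is the quote that was used, if any.
            idx = raw.find(value, pos)
            if idx > 0:
                pos = idx + len(value)
                if raw[idx - 1] in "\"'":
                    self.break_chars.add(raw[idx - 1])

    def handle_data(self, data):
        # Reflection in element content: markup is needed to break out
        if self.probe in data:
            self.break_chars.update("<>")


def break_chars_for(html, probe):
    parser = ContextSketch(probe)
    parser.feed(html)
    parser.close()
    return parser.break_chars


print(break_chars_for('<tag param="zqxj" other=\'zqxj\'>zqxj</tag>', "zqxj"))
```

For the example tag from step 3, the returned set contains the double quote, the single quote and both angle brackets, which is the union of the per-context lists in step 4.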

I have now implemented the scenario described above, except for steps 8,
8b and 8c. It uses a Contextparser, based on the Python HTMLParser, to keep
track of where a particular payload was reflected. Based on that, a list of
interesting characters is generated. After that, it basically works
like before, except that I didn't add back the actual xss-vectors. Notes:
* If reflection occurs several times, like

<input value="foo"/><p>You searched for foo</p>

the generated list will be " and <>. When the next step tests [",<,>]
and the server responds with

<input value="%22<>"/><p>You searched for "&lt;&gt;</p>

the current logic will think this is XSS. It would be easy to use the
same context parser in this step as well to eliminate those false
positives. However, since the input to the parser the second time will most
likely contain 'broken' html, the parser will throw errors when parsing it...

* The response body (and payload) is made lowercase before searching to
make the searches case insensitive. When I tested the plugin against
some sites, I noticed that one of them capitalized my input in a title tag.
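
The two notes above can be reproduced with a small standalone sketch. It is not the plugin code: the function names `build_probe` and `naive_allowed` are invented here, fixed markers are used instead of random ones so the result is repeatable, and the response body is a made-up variant of the example above in which the server also upper-cases part of the echoed text.

```python
def build_probe(chars, markers):
    """Chain markers and special characters as m0 c0 m1 c1 m2 ...

    Returns the payload and, for each character, the substring that
    indicates the character came back unencoded.
    """
    payload = markers[0]
    expected = {}
    for char, before, after in zip(chars, markers, markers[1:]):
        payload += char + after
        expected[char] = before + char + after
    return payload, expected


def naive_allowed(body, expected):
    """Context-blind check: lowercase the body, then substring search."""
    body = body.lower()
    return sorted(c for c, needle in expected.items() if needle in body)


payload, expected = build_probe(['"', '<', '>'], ['aaaa', 'bbb', 'ccc', 'ddd'])

# Hypothetical response: the attribute encodes the quote, the text node
# encodes the angle brackets and changes the case of part of the echo.
body = ('<input value="aaaa%22bbb<ccc>ddd"/>'
        '<p>You searched for AAAA"BBB&lt;ccc&gt;ddd</p>')

print(naive_allowed(body, expected))
```

All three characters are reported here although none of them breaks the context it was generated for: the quote only survives in the text node and the angle brackets only survive inside the attribute value. Lowercasing the body is what lets the upper-cased echo still match.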

Regards,
Martin Holst Swende
'''
xss.py

Copyright 2006 Andres Riancho

This file is part of w3af, w3af.sourceforge.net .

w3af is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation version 2 of the License.

w3af is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with w3af; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

'''

import core.controllers.outputManager as om

# options
from core.data.options.option import option
from core.data.options.optionList import optionList

from core.controllers.basePlugin.baseAuditPlugin import baseAuditPlugin
from core.data.fuzzer.fuzzer import createMutants, createRandAlNum

import core.data.kb.knowledgeBase as kb
import core.data.kb.vuln as vuln
import core.data.constants.severity as severity
import re
import HTMLParser

class Contextparser(HTMLParser.HTMLParser):
    """
    Parser which determines where in the html context a certain payload was found. After parsing,
    the getCharacters() can be called to get interesting characters to test with
    It may also discover that parameters are echoed back as pure javascript, in which 
    case it will report directly to the kb
    example:
    http://bad.programming.com?time=getTime()
    <script>getTime()</script>
    Not seen often though...
    Based on the python HTMLParser
    @author Martin Holst Swende 2010
    """
    
    def __init__(self, payload, mutant, response):
        """
        @param payload the payload to look for
        @param mutant needed to be able to report to the kb
        @param response needed to be able to report to the kb
        """
        HTMLParser.HTMLParser.__init__(self)
        self.mutant = mutant
        self.response = response
        
        self.payload = payload
        self.inScriptingContext = False
        self.interestingCharacters = {}
    def debug(self,arg):
        #print(arg)
        om.out.debug(arg)
    def getCharacters(self):
        """ 
        Returns the interesting characters determined after context analysis
        """ 
        return self.interestingCharacters.keys()
    
    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)
        self.handle_endtag(tag)
    
    def handle_starttag(self, tag, attrs):
        #Note when we are in a scripting context
        if(tag == 'script'):
            self.inScriptingContext = True
        #Check for payload
        for (attr, val) in attrs : 
            if attr.find(self.payload ) > -1:
                self.debug("Payload found in tag attribute (weird!):"+attr)
            # val is None for attributes without a value, e.g. <input disabled>
            if val is not None and val.find(self.payload) > -1 :
                #Now, need to find out what character was used for quoting
                tag = self.get_starttag_text()
                index = tag.find(val)
                quoteChar = tag[index-1]
                self.interestingCharacters[quoteChar] = True
                self.debug("Payload found in tag attribute value: %s =%s...%s" % (attr,quoteChar,quoteChar))
                
    def handle_endtag(self, tag):
        if tag == 'script':
            self.inScriptingContext = False
            
    def handle_data(self, data):
        if data.find(self.payload) >-1:
            if self.inScriptingContext:
                self.debug("Payload found in scripting context:"+data)
                #We assume it is in a variable declaration, such as
                # <script> 
                #   var currentPage="index.jsp?foo=PAYLOAD"
                # ...
                # </script>
                #---
                #Therefore, we just parse the data until we reach the payload, and 
                # count the quotations
                self.parseJS(data,self.payload)
            else:
                self.debug("Payload found in tag data value: "+data)
            
            self.interestingCharacters["<"] = True
            self.interestingCharacters[">"] = True
                
    def parseJS(self,data,payload):
        """Parse the javascript for quotes"""
        i = 0
        n = len(data)
        context_quoted=False
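        # Match a backslash, a quote character, or the (escaped) payload
        interesting_js = re.compile(r'[\\"\']|' + re.escape(payload))
        while i < n:
            match = interesting_js.search(data, i) # ' or " or \ or the payload
            if not match:
                break # Nothing interesting left to scan
            i = match.start()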
            startswith = data.startswith
            if startswith('\\',i): # Escape char, ignore next if it is one of " or '
                if startswith('"',i+1) or startswith("'",i+1):
                    i = i+1
            elif startswith('"',i):
                if context_quoted == '"':
                    context_quoted= False
                else:
                    context_quoted = '"'
            elif startswith("'",i):
                if context_quoted == "'":
                    context_quoted= False
                else:
                    context_quoted = "'"
            elif startswith(payload,i): # The payload!
                if context_quoted:
                    self.debug("Found payload quoted within "+context_quoted)
                    self.interestingCharacters[context_quoted] = True
                else:
                    self.debug("Found payload unquoted (strange!)")
                                # Save it to the KB
                    v = vuln.vuln( self.mutant )
                    v.setId( self.response.id )
                    v.setName( 'XSS' )
                    v.setSeverity(severity.MEDIUM)
                    msg = 'It appears javascript is executed directly via browser input, which would be XSS. This was found at ' + self.mutant.foundAt() 
                    msg += '\r\nThis needs to be verified manually'
                    v.setDesc( msg )
                    kb.kb.append( self, 'Probable xss', v )
   
            i=i+1
           
    def handle_comment(self, data):
        if data.find(self.payload) > -1:
            self.debug("Payload found in comment (weird!): "+data)

    # Stuff like <!
    def handle_decl(self, decl):
        if decl.find(self.payload) > -1:
            self.debug("Payload found in declaration (weird!): "+decl)

    # Stuff like <?
    def handle_pi(self, data):
        if data.find(self.payload) > -1:
            self.debug("Payload found in processing instruction (weird!): "+data)

    

class xss2(baseAuditPlugin):
    '''
    Find cross site scripting vulnerabilities.
    @author: Martin Holst Swende 2010, based on xss.py by Andres Riancho
    '''
    #Default chars are used when html parsing fails
    defaultCharacters = ["<",">","'",'"']
    
    def __init__(self):
        baseAuditPlugin.__init__(self)
        
        # Some internal variables to keep track of remote web application sanitization
        self._fuzzableRequests = []
        self._xssMutants = []
        self._thoroughTest = False
        
    def audit(self, freq ):
        '''
        Tests an URL for potential XSS vulnerabilities.
        
        @param freq: A fuzzableRequest
        '''
        om.out.debug( 'xss2 plugin is testing: ' + freq.getURL() )
        
        # This list is just to test if the parameter is echoed back
        fake_mutants = createMutants( freq , ['', ] )
            
        for mutant in fake_mutants:
            # verify if the variable we are fuzzing is actually being echoed back
            # and get characters to test
            chars_to_test = self._is_echoed( mutant )
            if chars_to_test:
                    self._quick_test(mutant, chars_to_test)    
    
    def _report(self,allowed, mutant, response):
        '''
        Writes to kb or log if anything interesting is found
        '''
        if len(allowed)  > 0 :
            # Save it to the KB
            v = vuln.vuln( mutant )
            v.setId( response.id )
            v.setName( 'XSS' )
            v.setSeverity(severity.MEDIUM)
            msg = 'HTML or javascript special characters that are not properly encoded and can break context were found at : ' + mutant.foundAt() 
            msg += '\r\n The following characters were echoed on the page in an unsafe manner: %s \r\n' % ''.join(allowed)
            v.setDesc( msg )

            kb.kb.append( self, 'Probable xss', v )
            
    def _thorough_test(self,mutant, special_characters):
        '''
        Tests for injection of special characters by creating one request per character
        @return: nothing, stores result in kb
        '''
        
        # Remember the original value so it can be restored afterwards
        oldValue = mutant.getModValue() 
        allowed = []
        
        for char in special_characters: 
            prefix = str( self.generatePayload( 3 ) )
            suffix = str( self.generatePayload( 3 ) ) 
            payload = prefix + char + suffix
        
            mutant.setModValue(payload)
        
            # send
            response = self._sendMutant( mutant, analyze=False )
            data = response.getBody().lower()
            # Analyze the response
            if data.find(payload) > -1 :
                allowed.append(char)
                self._report(allowed, mutant, response)
                #What about response header ? 
        
        mutant.setModValue(oldValue)
        
    def _quick_test(self, mutant, special_characters):
        '''
        Tests for injection of special characters by bundling them in one request
        @return: nothing, stores result in kb
        '''
        
        # matchList maps each character to the string we expect to find in the
        # response if that character is reflected without encoding
        matchList = {}
        allowed = []
        #First prefix is longer, used as fingerprint later to check if
        # the payload was dropped
        prefix = str( self.generatePayload( 4 ) )
        fingerprint = prefix
        payload = prefix
        for char in special_characters: 
            suffix = str(  self.generatePayload( 3 ) )  
            payload = payload + char + suffix     
            matchList[char] = prefix+ char + suffix
            prefix = suffix
        
        oldValue = mutant.getModValue() 
        mutant.setModValue(payload)          
        
        # send
        response = self._sendMutant( mutant, analyze=False )
        data = response.getBody().lower()
        # Analyze the response
        for char,match in matchList.items():
                if data.find(match) > -1 :
                    allowed.append(char)
                #What about response header ? 
        
        if len(allowed) == 0 : # No reflections
            #Was it encoded or dropped ? 
            #If dropped, we should try again with one character at a time
            if data.find(fingerprint) == -1 and self._thoroughTest:
                om.out.debug('Payload dropped, trying one char at a time')
                #Parameter was not echoed, proceed with one char at a time
                mutant.setModValue(oldValue)
                return self._thorough_test(mutant, special_characters)
           
        self._report(allowed, mutant, response)
        
    def generatePayload(self,length):
        '''
        Generate a lowercase random alnum string. The payload is lowercase since the
        response body is made to lowercase before the analysis starts. Parameters may be echoed
        like <title>Home - Payload</title> where payload has been capitalized. In order to
        achieve case insensitivity and still use string.find instead of expensive regexps, this 
        approach is used. 
        '''
        return createRandAlNum( length ).lower()
        
    def _is_echoed( self, mutant):
        '''
        Verify if the parameter we are fuzzing is really being echoed back in the
        HTML response or not. If it isn't echoed there is no chance we are going to
        find a reflected XSS here.
        
        If the parameter is echoed, the html is parsed to find out exactly what 
        characters are interesting in the context where the reflection(s) occurred. 
        
        Also please note that I send a random alphanumeric value, and not a numeric
        value, because even if the number is echoed back (and only numbers are echoed
        back by the application) that won't be of any use in the XSS detection.
        
        @parameter mutant: The request to send.
        @return: False if variable is not echoed, otherwise a list of characters to test
         based on the context where the reflection occurred
        '''
        # Create a random number and assign it to the mutant modified
        # parameter
        rndNum = str( self.generatePayload( 5 ) )
        oldValue = mutant.getModValue() 
        mutant.setModValue(rndNum)

        # send
        response = self._sendMutant( mutant, analyze=False )
        
        # Analyze and return response
        data = response.getBody().lower()
        if rndNum in data:
            om.out.debug('The variable ' + mutant.getVar() + ' is being echoed back.' )
            c = Contextparser(rndNum, mutant,response)
            try:
                c.feed(data)
                # restore the mutant values
                mutant.setModValue(oldValue)
                return c.getCharacters()
            except Exception:
                om.out.debug("Error occurred while parsing, returning all default characters for testing")
                # restore the mutant values
                mutant.setModValue(oldValue)
                return self.defaultCharacters
                
        else:
            om.out.debug('The variable ' + mutant.getVar() + ' is NOT being echoed back.' )
            return False
    
  
 

    def getOptions( self ):
        '''
        @return: A list of option objects for this plugin.
        '''
        d1 = 'Thorough test'
        h1 = 'If set to True, w3af will test for each special character separately, in case a payload is dropped'
        o1 = option('thoroughTest', self._thoroughTest, d1, 'boolean', help=h1)
        
        ol = optionList()
        ol.add(o1)
        return ol
    
    def setOptions( self, optionsMap ):
        '''
        This method sets all the options that are configured using the user interface 
        generated by the framework using the result of getOptions().
        
        @parameter optionsMap: A dictionary with the options for the plugin.
        @return: No value is returned.
        '''
        self._thoroughTest = optionsMap['thoroughTest'].getValue()
       
    def getPluginDeps( self ):
        '''
        @return: A list with the names of the plugins that should be run before the
        current one.
        '''
        return []

    def getLongDesc( self ):
        '''
        @return: A DETAILED description of the plugin functions and features.
        '''
        return '''
        This plugin finds potential Cross Site Scripting (XSS) vulnerabilities.
        
        One configurable parameter exists:
            - thoroughTest (default False)
        
        The plugin checks if parameters are reflected onto the page. IF reflection 
        occurs, the page is parsed to determine what characters are needed to break
        out of context and execute javascript. 
        
        The plugin then tries several characters per request to see if they are reflected
        without modification.  If the server should cease to reflect the payload, 
        by dropping it completely, the plugin will check each character one by one if thoroughTest is True,
        which makes more requests but has a higher chance of not getting dropped.
        '''
        

def test(html, payload):
    c = Contextparser(payload,False,False)
    c.feed(html)
    print("Interesting characters: "+str(c.getCharacters()))
if __name__ == "__main__":
    payload ="abcdef"

    tests = [
    "<html><body><input value='%s'/><input value=\"%s\"/>"%( payload,payload),
    "<html><body><input value=\"appa'asd%sasdf'adf\"/><a href='test'/></html>" % payload,
    "<html><body><tag value=\"foo\"/>%s<a href='test'/></html>" % payload,
    "<html><body><script>foo=\"%s\"</script></html>" % payload,
    ]
    for html in tests:
        test(html,payload)
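
The quote-counting idea behind parseJS above can also be shown as a standalone function. The sketch below is a simplified variation and not the plugin's method: the name `enclosing_quote` is invented, it only looks at the first occurrence of the payload, and it treats a different quote character inside an open string as plain text instead of switching context.

```python
def enclosing_quote(js, payload):
    """Return the quote character enclosing the first occurrence of
    payload in js, or None if the payload is not inside a string
    literal (or is not present at all)."""
    quote = None  # currently open quote character, if any
    i = 0
    while i < len(js):
        if js.startswith(payload, i):
            return quote
        ch = js[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch in "\"'":
            if quote is None:
                quote = ch    # opening a string literal
            elif quote == ch:
                quote = None  # closing it
        i += 1
    return None


print(enclosing_quote('var currentPage="index.jsp?foo=zqxj";', "zqxj"))
print(enclosing_quote("getTime(zqxj)", "zqxj"))
```

A result of `None` for a payload that is present corresponds to the "unquoted" case that the plugin reports directly to the knowledge base.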
    
    

    
_______________________________________________
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop
