Hi,
it's awfully quiet on the list. Is everyone in Vegas? ;)
I've attached a patch which adds black- and whitelist functionality to w3af, to
be able to restrict the scanning. This is a merge of my own code and Zach
Jansen's code, which he posted to the list on July 7th and which I didn't see
before I started coding... I'd be glad if you could review and test it. It
seemed to work in the tests I ran.
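
For anyone reviewing, the check boils down to roughly the following. This is a
simplified, standalone sketch and not the patch code itself; the function name
is_blocked and the example.com patterns are made up for illustration.

```python
import re

def is_blocked(uri, blacklist, whitelist):
    """Return True if uri must not be requested.

    Hypothetical helper, not part of w3af: blacklist and whitelist are
    lists of regex strings, checked in the same order as in the patch.
    """
    # First: a single blacklist hit blocks the URI.
    for pattern in blacklist:
        if re.match(pattern, uri):
            return True
    # Second: the URI has to match at least one whitelist entry.
    return not any(re.match(pattern, uri) for pattern in whitelist)

# Made-up patterns, only for illustration.
blacklist = [r'http://example\.com/logout.*']
whitelist = [r'http://example\.com/.*']

print(is_blocked('http://example.com/index.php', blacklist, whitelist))   # False
print(is_blocked('http://example.com/logout.php', blacklist, whitelist))  # True
print(is_blocked('http://www.google.com/search', blacklist, whitelist))   # True
```

Note that re.match only anchors at the start of the string, so a pattern has to
end in '$' if it is supposed to cover the whole URL.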
My code (hopefully) improves his by adding support for more than one regex and
also the ability to switch to wildcard mode, which is easier to type but less
powerful. The first improvement comes with a problem though: As Python
interprets commas as list item separators, you currently can't use commas in
your regex. Maybe someone has an idea on how to solve this elegantly.
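
To make the two points concrete, here is a small standalone sketch. The
wildcard part uses fnmatch.translate, as the patch does. The comma part is only
one possible idea and an assumption, not something the patch implements: parse
the option string with the csv module, so that a quoted item may contain
commas. The helper name split_patterns and the sample patterns are made up.

```python
import csv
import fnmatch
import re

def split_patterns(raw):
    """Split a comma separated option string into patterns.

    Hypothetical helper: items wrapped in double quotes may contain commas.
    """
    return next(csv.reader([raw], skipinitialspace=True))

# The first regex contains a comma ({1,3}), so it is quoted.
raw = r'"http://example\.com/a{1,3}", http://example\.org/.*'
patterns = split_patterns(raw)
print(patterns)

# Wildcard mode: translate the shell-style pattern into a regex first.
wildcard = 'http://example.com/admin/*'
regex = re.compile(fnmatch.translate(wildcard))
print(bool(regex.match('http://example.com/admin/users.php')))  # True
print(bool(regex.match('http://example.com/index.php')))        # False
```

The drawback of the quoting idea is that users would have to know about the
quoting rule, so it trades one surprise for another.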
There's still the remaining problem that this patch will "break" some of
the passive discovery plugins, as Zach already wrote, like the ones using
Archive.org or Google. Before I go and implement a fix, I wanted to discuss
what I have come up with. I can think of two options at the moment:
1. Add a static list to the whitelisting code with all the URLs we do not want
to restrict. Not a good solution IMHO, as the user's decision gets overridden
without their knowledge. Maybe they really want to restrict requests to
Google.
2. Add an "override whitelist" config item to all affected plugins, so the
user can decide whether a plugin is allowed to bypass the whitelist. This
would save the user the hassle of adding whitelist entries to the config, as
they only have to tick a box.
Are there any other solutions I'm missing?
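
A rough sketch of how option 2 could look at the lowest layer. Nothing of this
exists yet: the function name, the override_whitelist flag and the decision to
keep the blacklist active even when the override is set are all assumptions
made for the sake of discussion.

```python
import re

def is_blocked_for_plugin(uri, blacklist, whitelist, override_whitelist=False):
    """Hypothetical check; override_whitelist models the per-plugin tick."""
    # Assumption: the blacklist always applies, override or not.
    if any(re.match(p, uri) for p in blacklist):
        return True
    # A plugin with the override set skips the whitelist test.
    if override_whitelist:
        return False
    return not any(re.match(p, uri) for p in whitelist)

# Made-up whitelist that restricts the scan to a single host.
whitelist = [r'http://example\.com/.*']
url = 'http://www.google.com/search?q=site:example.com'

print(is_blocked_for_plugin(url, [], whitelist))                           # True
print(is_blocked_for_plugin(url, [], whitelist, override_whitelist=True))  # False
```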
Patrick
--
The Plague: You wanted to know who I am, Zero Cool? Well, let me explain
the New World Order. Governments and corporations need people
like you and me. We are Samurai... the Keyboard Cowboys... and
all those other people who have no idea what's going on are
the cattle... Moooo.
(Hackers)
diff --git a/core/controllers/miscSettings.py b/core/controllers/miscSettings.py
index b71bd78..daaf6d8 100644
--- a/core/controllers/miscSettings.py
+++ b/core/controllers/miscSettings.py
@@ -30,6 +30,9 @@ from core.data.options.optionList import optionList
# Raise errors
from core.controllers.w3afException import w3afException
+import re
+import fnmatch
+
class miscSettings(configurable):
'''
@@ -53,8 +56,10 @@ class miscSettings(configurable):
cf.cf.save('interface', 'eth0' )
cf.cf.save('localAddress', '127.0.0.1' )
cf.cf.save('demo', False )
- cf.cf.save('nonTargets', [] )
cf.cf.save('exportFuzzableRequests', '')
+ cf.cf.save('useWildcardMatching', False )
+ cf.cf.save('blacklistRegex', [] )
+ cf.cf.save('whitelistRegex', ['.*'] )
def getOptions( self ):
'''
@@ -77,43 +82,63 @@ class miscSettings(configurable):
d5 = 'A list with all fuzzable header names'
o5 = option('fuzzableHeaders', cf.cf.getData('fuzzableHeaders'), d5,
'list', tabid='Fuzzer parameters')
- d15 = 'Indicates what HTML form combo values w3af plugins will use: all, tb, tmb, t, b'
- h15 = 'Indicates what HTML form combo values, e.g. select options values, w3af plugins will use: all (All values), tb (only top and bottom values), tmb (top, middle and bottom values), t (top values), b (bottom values)'
- o15 = option('fuzzFormComboValues', cf.cf.getData('fuzzFormComboValues'), d15, 'string', help=h15, tabid='Fuzzer parameters')
+ d6 = 'Indicates what HTML form combo values w3af plugins will use: all, tb, tmb, t, b'
+ h6 = 'Indicates what HTML form combo values, e.g. select options values, w3af plugins will use: all (All values), tb (only top and bottom values), tmb (top, middle and bottom values), t (top values), b (bottom values)'
+ o6 = option('fuzzFormComboValues', cf.cf.getData('fuzzFormComboValues'), d6, 'string', help=h6, tabid='Fuzzer parameters')
######## Core parameters ########
- d6 = 'Automatic dependency enabling for plugins'
- h6 = 'If autoDependencies is enabled, and pluginA depends on pluginB that wasn\'t enabled, then pluginB is automatically enabled.'
- o6 = option('autoDependencies', cf.cf.getData('autoDependencies'), d6, 'boolean', help=h6, tabid='Core settings')
+ d7 = 'Automatic dependency enabling for plugins'
+ h7 = 'If autoDependencies is enabled, and pluginA depends on pluginB that wasn\'t enabled, then pluginB is automatically enabled.'
+ o7 = option('autoDependencies', cf.cf.getData('autoDependencies'), d7, 'boolean', help=h7, tabid='Core settings')
- d7 = 'Maximum depth of the discovery phase'
- h7 = 'For example, if set to 10, the webSpider plugin will only follow 10 link levels while spidering the site. This applies to the whole discovery phase; not only to the webSpider.'
- o7 = option('maxDepth', cf.cf.getData('maxDepth'), d7, 'integer', help=h7, tabid='Core settings')
+ d8 = 'Maximum depth of the discovery phase'
+ h8 = 'For example, if set to 10, the webSpider plugin will only follow 10 link levels while spidering the site. This applies to the whole discovery phase; not only to the webSpider.'
+ o8 = option('maxDepth', cf.cf.getData('maxDepth'), d8, 'integer', help=h8, tabid='Core settings')
- d8 = 'Maximum number of threads that the w3af process will spawn'
- h8 = 'The maximum valid number of threads is 100.'
- o8 = option('maxThreads', cf.cf.getData('maxThreads'), d8, 'integer', tabid='Core settings', help=h8)
+ d9 = 'Maximum number of threads that the w3af process will spawn'
+ h9 = 'The maximum valid number of threads is 100.'
+ o9 = option('maxThreads', cf.cf.getData('maxThreads'), d9, 'integer', tabid='Core settings', help=h9)
- d9 = 'Maximum number of times the discovery function is called'
- o9 = option('maxDiscoveryLoops', cf.cf.getData('maxDiscoveryLoops'), d9, 'integer', tabid='Core settings')
+ d10 = 'Maximum number of times the discovery function is called'
+ o10 = option('maxDiscoveryLoops', cf.cf.getData('maxDiscoveryLoops'), d10, 'integer', tabid='Core settings')
######## Network parameters ########
- d10 = 'Local interface name to use when sniffing, doing reverse connections, etc.'
- o10 = option('interface', cf.cf.getData('interface'), d10, 'string', tabid='Network settings')
+ d11 = 'Local interface name to use when sniffing, doing reverse connections, etc.'
+ o11 = option('interface', cf.cf.getData('interface'), d11, 'string', tabid='Network settings')
- d11 = 'Local IP address to use when doing reverse connections'
- o11 = option('localAddress', cf.cf.getData('localAddress'), d11, 'string', tabid='Core settings')
+ d12 = 'Local IP address to use when doing reverse connections'
+ o12 = option('localAddress', cf.cf.getData('localAddress'), d12, 'string', tabid='Core settings')
######### Misc ###########
- d12 = 'Enable this when you are doing a demo in a conference'
- o12 = option('demo', cf.cf.getData('demo'), d12, 'boolean', tabid='Misc settings')
+ d13 = 'Enable this when you are doing a demo in a conference'
+ o13 = option('demo', cf.cf.getData('demo'), d13, 'boolean', tabid='Misc settings')
+
+ d14 = 'Use wildcard- instead of regex matching for defining black- or whitelists'
+ h14 = ('Per default, w3af uses regex matching for the black- and '
+ 'whitelist. If useWildcardMatching is set, the target black- and '
+ 'whitelist will use wildcard patterns instead of regular '
+ 'expressions for matching, which are easier to define.')
+ o14 = option('useWildcardMatching', cf.cf.getData('useWildcardMatching'), d14, 'boolean', tabid='Target settings', help=h14)
- d13 = 'A comma separated list of URLs that w3af should completely ignore'
- h13 = 'Sometimes it\'s a good idea to ignore some URLs and test them manually'
- o13 = option('nonTargets', cf.cf.getData('nonTargets'), d13, 'list', tabid='Misc settings')
+ ######### Targets #########
+ #
+ # XXX There's one bug here: You can't use commas (',') in the regexes, as
+ # they will be recognized as list item separators by Python...
+ d15 = 'A comma separated blacklist of URLs that w3af should completely ignore'
+ h15 = ('URLs in the blacklist will not be tested by w3af. Please use '
+ 'regular expressions to specify the URLs, or, if useWildcardMatching '
+ 'is enabled, wildcard patterns')
+ o15 = option('blacklistRegex', cf.cf.getData('blacklistRegex'), d15, 'list', tabid='Target settings', help=h15)
+
+ d16 = 'A comma separated whitelist that every URL has to match before it is tested'
+ h16 = ('If a whitelist is given, only targets matching one of the '
+ 'patterns in the list will be tested. Please use regular '
+ 'expressions to specify the URLs, or, if useWildcardMatching is '
+ 'enabled, wildcard patterns')
+ o16 = option('whitelistRegex', cf.cf.getData('whitelistRegex'), d16, 'list', tabid='Target settings', help=h16)
- d14 = 'Export all discovered fuzzable requests to the given file (CSV)'
- o14 = option('exportFuzzableRequests', cf.cf.getData('exportFuzzableRequests'), d14, 'string', tabid='Export fuzzable Requests')
+ d17 = 'Export all discovered fuzzable requests to the given file (CSV)'
+ o17 = option('exportFuzzableRequests', cf.cf.getData('exportFuzzableRequests'), d17, 'string', tabid='Export fuzzable Requests')
ol = optionList()
ol.add(o1)
@@ -131,6 +156,8 @@ class miscSettings(configurable):
ol.add(o13)
ol.add(o14)
ol.add(o15)
+ ol.add(o16)
+ ol.add(o17)
return ol
def getDesc( self ):
@@ -161,8 +188,23 @@ class miscSettings(configurable):
cf.cf.save('interface', optionsMap['interface'].getValue() )
cf.cf.save('localAddress', optionsMap['localAddress'].getValue() )
cf.cf.save('demo', optionsMap['demo'].getValue() )
- cf.cf.save('nonTargets', optionsMap['nonTargets'].getValue() )
cf.cf.save('exportFuzzableRequests', optionsMap['exportFuzzableRequests'].getValue() )
+ cf.cf.save('useWildcardMatching', optionsMap['useWildcardMatching'].getValue() )
+
+ for l in ( ( optionsMap['blacklistRegex'].getValue(), 'blacklistRegex' ),
+ ( optionsMap['whitelistRegex'].getValue(), 'whitelistRegex' ) ):
+ the_list, var_name = l
+ if optionsMap['useWildcardMatching'].getValue():
+ regex_list = map( fnmatch.translate, the_list )
+ else:
+ regex_list = the_list
+ for regex in regex_list:
+ try:
+ re.compile( regex )
+ except:
+ msg = 'You specified an invalid regular expression: " %s".' % regex
+ raise w3afException(msg)
+ cf.cf.save( var_name, the_list )
# This is an undercover call to __init__ :) , so I can set all default parameters.
miscSettings()
diff --git a/core/data/url/xUrllib.py b/core/data/url/xUrllib.py
index ab16622..dc64182 100644
--- a/core/data/url/xUrllib.py
+++ b/core/data/url/xUrllib.py
@@ -55,6 +55,8 @@ import core.data.kb.knowledgeBase as kb
# This is a singleton that's used for assigning request IDs
from core.controllers.misc.number_generator import consecutive_number_generator
+# For the blacklisting code
+import fnmatch
class sizeExceeded( Exception ):
pass
@@ -89,6 +91,10 @@ class xUrllib:
self._paused = False
self._mustStop = False
self._ignore_errors_conf = False
+
+ # Whitelist/blacklist regex options
+ self._compiled_whitelist_re = None
+ self._compiled_blacklist_re = None
def pause(self, pauseYesNo):
'''
@@ -214,20 +220,40 @@ class xUrllib:
req = urllib2.Request( uri )
req = self._addHeaders( req )
return req.headers
-
+
def _isBlacklisted( self, uri ):
'''
- If the user configured w3af to ignore a URL, we are going to be applying that configuration here.
+ If the user configured w3af to ignore a URL, we apply that configuration here.
This is the lowest layer inside w3af.
'''
- listOfNonTargets = cf.cf.getData('nonTargets') or []
- for u in listOfNonTargets:
- if urlParser.uri2url( uri ) == urlParser.uri2url( u ):
- msg = 'The URL you are trying to reach was configured as a non-target. ( '
- msg += uri +' ). Returning an empty response.'
+ # Don't recompile if we don't have to. Seems like there might be a better way to do this.
+ # Part of init maybe?
+ for l in ( ( '_compiled_whitelist_re', 'whitelistRegex' ),
+ ( '_compiled_blacklist_re', 'blacklistRegex' ) ):
+ var, option_name = l
+ if getattr( self, var ) == None:
+ the_list = cf.cf.getData( option_name )
+ if cf.cf.getData( 'useWildcardMatching' ):
+ the_list = map( fnmatch.translate, the_list )
+ setattr( self, var, map( re.compile, the_list ) )
+
+ # Test against the regex's
+ #
+ # First: blacklist
+ for regex in self._compiled_blacklist_re:
+ if regex.match( uri ):
+ msg = ( 'The URL you are trying to reach was configured as a '
+ 'non-target via blacklistRegex. ( %s ). Returning an empty '
+ 'response.' ) % uri
om.out.debug( msg )
return True
-
+ # Second: whitelist
+ if filter(lambda regex: regex.match( uri ), self._compiled_whitelist_re) == []:
+ msg = ( 'The URL you are trying to reach was configured as a '
+ 'non-target via whitelistRegex. ( %s ). Returning an empty '
+ 'response.' ) % uri
+ om.out.information( msg )
+ return True
return False
def sendRawRequest( self, head, postdata, fixContentLength=True, get_size=True):
_______________________________________________
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop