Tomas,

On Fri, Oct 5, 2012 at 2:14 PM, Andres Riancho <andres.rian...@gmail.com> wrote:
> Tomas,
>
> On Thu, Oct 4, 2012 at 9:50 PM, Andres Riancho <andres.rian...@gmail.com> wrote:
>> Tomas,
>>
>> On Sat, Sep 29, 2012 at 12:02 AM, Tomas Velazquez
>> <tomas.velazqu...@gmail.com> wrote:
>>> Andres,
>>>
>>> The web_spider alone works well, but if you use it together with
>>> dir_bruter something strange happens, because web_spider does not
>>> crawl all the directories ...
>>>
>>> dir_bruter works well for me with this fix ;)
>>>
>>> Index: dir_bruter.py
>>> ===================================================================
>>> --- dir_bruter.py (revision 5824)
>>> +++ dir_bruter.py (working copy)
>>> @@ -73,12 +73,12 @@
>>>          base_url = fuzzable_request.getURL().baseUrl()
>>>
>>>          if base_url not in self._already_tested:
>>> +            self._already_tested.add( base_url )
>>>              self._bruteforce_directories( base_url )
>>> -            self._already_tested.add( base_url )
>>>
>>>          if self._be_recursive and domain_path not in self._already_tested:
>>> +            self._already_tested.add( domain_path )
>>>              self._bruteforce_directories( domain_path )
>>> -            self._already_tested.add( domain_path )
>>>
>>>      def _dir_name_generator(self, base_path):
>>>          '''
>>
>> I completely trust you on this one, and it would be very easy for me to
>> simply apply this patch to the code, but I want to understand what's
>> going on. I'm starting to work on this issue now, give me some minutes
>> and I might have something. I'll try to use TDD :) First reproduce the
>> issue, then write a test that fails, understand what happens and
>> finally apply the patch that fixes it.
>
> All right, after some tests that led to the fix of other unrelated
> bugs, I was able to come back to this issue. First, I realized that
> your patch was against threading2, which made things a little bit
> easier for me :) Then, after running the plugin several times, I
> realized that this was a threading issue:
>
> - dir_bruter#1 starts and finds some new directories
> - In line 125 we're doing: self.output_queue.put(fr)
> - dir_bruter#1 continues to bruteforce other directories
> - dir_bruter#1.output_queue is consumed by the core and then sent
>   again to another dir_bruter instance: #2
> - Because #1 is still running, self._already_tested hasn't been
>   populated yet, so dir_bruter#2 will run again with an already used
>   base_url/domain_path, which leads to the duplicates we're seeing.
>
> Ahhh, the threading horror!
>
> Now I'll check if I can write a test for this, potentially a check
> that verifies that no dups are returned, but... it is hard to really
> check... we'll see.
>
> All in all, thanks for the good bug report and the patch, which I
> applied to the threading2 branch. I hope to finish this branch in a
> couple of weeks (no engagements for me so far!) and then the users
> will get all these fixes.
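To make the race easier to see, here is a minimal, self-contained sketch
in plain Python. It is not the real plugin code: handle() and the
module-level set just stand in for the plugin's crawl entry point and
self._already_tested, and the sleep() stands in for the slow HTTP work.

import threading
import time

already_tested = set()   # plays the role of self._already_tested

def bruteforce_directories(base_url):
    # Stand-in for the real (slow) HTTP bruteforcing loop.
    time.sleep(0.5)
    print('bruteforcing %s' % base_url)

def handle(base_url):
    if base_url not in already_tested:
        # Buggy ordering: calling add() only after the slow step leaves a
        # long window in which a second instance passes the same check and
        # bruteforces the same URL again. The applied patch swaps these two
        # lines so the URL is marked as tested before the slow work starts.
        bruteforce_directories(base_url)
        already_tested.add(base_url)

threads = [threading.Thread(target=handle, args=('http://localhost/test/',))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

With the add() moved before the slow call, as in your patch, the window in
which a second instance can pass the check shrinks to almost nothing;
strictly speaking, a lock around the check-and-add would be needed to close
it completely.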
All right, I was unable to write a test for this bug, but I improved the
pre-existing tests:

https://sourceforge.net/apps/trac/w3af/changeset/5837
sourceforge.net/apps/trac/w3af/changeset/5838

Once again, thanks for your ideas. Now I'm going to start reviewing your
other email :)

Regards,

> Regards,
>
>> Regards,
>>
>>> On Fri, Sep 28, 2012 at 9:09 PM, Tomas Velazquez
>>> <tomas.velazqu...@gmail.com> wrote:
>>>> Andres,
>>>>
>>>> I'm sorry, the redundancy also exists in the threading2 branch.
>>>>
>>>> Let me explain the test:
>>>> - Directory listing exists in all directories except /.
>>>> - The oneword.txt wordlist contains hide_folder.
>>>>
>>>> Problems found:
>>>> - dir_bruter brute forces the same directory:
>>>>   http://localhost/ 4 times
>>>>   http://localhost/test/ 2 times
>>>>   http://localhost/test/hide_folder/ 2 times
>>>>   http://localhost/test/hide_folder/another/ 1 time
>>>>   None of the directories inside another/ are brute forced, at any depth.
>>>> - web_spider does not crawl to the maximum directory depth.
>>>>
>>>> Result:
>>>> Found 6 URLs and 6 different points of injection.
>>>> The list of URLs is:
>>>> - http://localhost/test/hide_folder/another/1/
>>>> - http://localhost/test/hide_folder/test.txt
>>>> - http://localhost/
>>>> - http://localhost/test/hide_folder/another/
>>>> - http://localhost/test/hide_folder/
>>>> - http://localhost/test/
>>>>
>>>> Test script:
>>>> plugins
>>>> crawl web_spider dir_bruter
>>>> crawl config dir_bruter
>>>> set wordlist /tmp/oneword.txt
>>>> back
>>>> back
>>>> target
>>>> set target http://localhost/test/
>>>> back
>>>> start
>>>>
>>>> I hope you can reproduce it, thanks a lot for your work!
>>>>
>>>> PS: I like the new plugin filename homogenization ;)
>>>>
>>>> On Fri, Sep 28, 2012 at 1:56 AM, Andres Riancho <andres.rian...@gmail.com>
>>>> wrote:
>>>>> Tomas,
>>>>>
>>>>> Thanks for the patch! I've been working on improvements in my
>>>>> threading2 branch, where I think this was fixed [0], could you please
>>>>> verify?
>>>>>
>>>>> [0] http://sourceforge.net/apps/trac/w3af/browser/branches/threading2/plugins/crawl/dir_bruter.py
>>>>>
>>>>> On Tue, Sep 25, 2012 at 9:27 PM, Tomas Velazquez
>>>>> <tomas.velazqu...@gmail.com> wrote:
>>>>> > Hi list,
>>>>> >
>>>>> > I see that dir_bruter brute forces the same folder more than once. This
>>>>> > redundancy increases if you add other plugins like webSpider.
>>>>> >
>>>>> > Regards,
>>>>> >
>>>>> > Possible patch:
>>>>> >
>>>>> > Index: dir_bruter.py
>>>>> > ===================================================================
>>>>> > --- dir_bruter.py (revision 5824)
>>>>> > +++ dir_bruter.py (working copy)
>>>>> > @@ -53,6 +53,7 @@
>>>>> >          # Internal variables
>>>>> >          self._fuzzable_requests = []
>>>>> >          self._tested_base_url = False
>>>>> > +        self._already_done = []
>>>>> >
>>>>> >      def discover(self, fuzzableRequest ):
>>>>> >          '''
>>>>> > @@ -82,6 +83,9 @@
>>>>> >                  to_test.append( domain_path )
>>>>> >
>>>>> >          for base_path in to_test:
>>>>> > +            # Check if the url is a folder and if it has already been bruteforced
>>>>> > +            if base_path.url_string.endswith('/') and filter(lambda x: x.url_string == base_path.url_string, self._already_done) == []:
>>>>> > +                self._already_done.append(base_path)
>>>>> >              # Send the requests using threads:
>>>>> >              self._run_async( meth=self._bruteforce_directories,

--
Andrés Riancho
Project Leader at w3af - http://w3af.org/
Web Application Attack and Audit Framework
Twitter: @w3af
GPG: 0x93C344F3