Re: [W3af-develop] core/data/db/history.py and .trace files
On 02/09/2011 02:31 AM, Steve Pinkham wrote:
> On 02/08/2011 08:08 PM, Andres Riancho wrote:
>> Steve,
>>> noSQL servers are usually fast because they are in-memory systems. sqlite can be used in that mode also if you like.
>> mongodb is not an in-memory db!
> In practice, it is. It stores all indexes in memory and uses memory-mapped files. It will automatically consume all available memory (which is a good thing or a bad thing depending on what else you want to use the server for).
> http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-MakesureyourindexescanfitinRAM
> http://www.mongodb.org/display/DOCS/Caching

Hi all,

I have to say I disagree that MongoDB should be called a memory DB. There are such things as in-memory databases, e.g. H2, and MongoDB is not one of them. Those databases keep *all* data in memory, which is a different matter from using memory for indices (which are orders of magnitude smaller than the data) and caching (which I would guess all daemon-mode databases try to do as well as possible).

I also disagree that they are usually fast because they are in-memory systems. They are usually fast because they basically let the 'C' in Brewer's CAP theorem suffer; that is to say, they do not enforce consistency across all nodes. This allows for better partition tolerance and availability. They often employ eventual consistency. An example of an eventual-consistency system is the Internet's DNS. Individual nodes (DNS servers) may give stale information about a hostname, but eventually updates reach all nodes and the system is consistent again. Why is this important? Like the DNS example, such a system can be built without any locks on readers or writers. Since it is OK for a reader to get 'stale' information, a writer can create a new version of a data post first, then update the pointer. Neither reader nor writer has to wait.
Some databases, such as CouchDB, have gone further, using MVCC with built-in revision control to handle simultaneous modification of data on several nodes. CouchDB also has an append-only file format to further eliminate locking at the filesystem level (and to ensure that file corruption cannot occur).

Having said all this, I concur that using e.g. MongoDB for w3af is probably not necessary; it sounds strange that sqlite would be unable to handle the somewhat modest amounts of data we're talking about. Also, I can see that concerns arise over whether to really switch to a daemon-mode database. That totally depends on the purpose of w3af: if the purpose is to be a good scanner which is easy to use and install, a daemon DB is a bad choice. If the purpose is to be the best, regardless of ease of installation and use, then I wouldn't blink before switching to a daemon database if that gives any advantage.

Two more comments I disagree with: "It's useful in distributed, massively parallel systems, but offers no real benefit for single user databases" and "noSQL is just the new term for key-value stores". It is true that it is useful for distributed, massively parallel systems, but there are also advantages to using it for data which fits the dynamic (schemaless) model. Having no schema enforced by the database does not mean that the database is just a disk-based hash table with blobs for values. I would instead say that noSQL is more like a new generation of object databases, but now with generic APIs (json/bson/http) and wide language support. Certain kinds of data fit very well into these models.

I have written a proxy which saves HTTP traffic into a MongoDB (http://martin.swende.se/hg/#hatkit_proxy-t1/) and a framework to analyse traffic from this database (http://martin.swende.se/hg#hatkit_fiddler-t1). HTTP traffic looks very non-uniform.
Some requests are basically "GET / HTTP/1.1" while others contain forms or JSON and lots and lots of headers. Using MongoDB, it is possible to represent the data more at an object level, e.g.:

    { request:  { method: "GET",
                  headers: { Content-Length: 1233, Host: "foobar.com", Foo: "bar" },
                  parameters: { gaz: "onk" } },
      response: { ... } }

MongoDB has very powerful querying facilities (http://www.mongodb.org/display/DOCS/Advanced+Queries). Since the object is stored with this structure in the database, it is possible to reach into objects (http://www.mongodb.org/display/DOCS/Dot+Notation+%28Reaching+into+Objects%29) and perform e.g. these kinds of queries: give me response.body where request.parameters.filename exists, or give me request.body.parameters where request.body.parameters.__viewstate does not exist. Also, MongoDB has very powerful aggregation mechanisms (http://www.mongodb.org/display/DOCS/Aggregation), where queries like the following can be used: organized by request.headers.host, give me all unique
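To make the dot-notation queries above concrete, here is a small sketch that simulates the "reach into objects" idea in plain Python, so it runs without a MongoDB server. The real pymongo filter is shown in a comment; the collection name `traffic` is made up for illustration.

```python
# Plain-Python sketch of the dot-notation query described in the mail.
# The equivalent real MongoDB filter (pymongo syntax) would be:
#     db.traffic.find({"request.parameters.filename": {"$exists": True}},
#                     {"response.body": 1})
# The collection name 'traffic' is illustrative only.

def reach(doc, dotted_path):
    """Walk a nested dict along a MongoDB-style dotted path.

    Returns (found, value); found is False when any key is missing.
    """
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node

traffic = [
    {"request": {"method": "GET",
                 "headers": {"Host": "foobar.com"},
                 "parameters": {"filename": "a.txt"}},
     "response": {"body": "hit"}},
    {"request": {"method": "GET",
                 "headers": {"Host": "foobar.com"},
                 "parameters": {"gaz": "onk"}},
     "response": {"body": "miss"}},
]

# "give me response.body where request.parameters.filename exists"
bodies = [reach(d, "response.body")[1]
          for d in traffic
          if reach(d, "request.parameters.filename")[0]]
print(bodies)  # ['hit']
```

The point is that the nesting of the stored document, not a schema, determines what can be queried; MongoDB just does this server-side with indexes.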
Re: [W3af-develop] core/data/db/history.py and .trace files
The only nosql databases I've used so far are key/value oriented (mostly Riak). You've convinced me that there might be some benefits to document-oriented storage I haven't considered, so thank you for that.

On 02/09/2011 02:39 PM, Martin Holst Swende wrote:
> I also disagree that they are usually fast because they are in-memory systems. They are usually fast because they basically let the 'C' in Brewer's CAP theorem suffer, that is to say that they do not enforce consistency across all nodes.

I still must stick by my guns on this part. I think understanding where the performance wins come from is important. Performance wins come from breaking consistency across nodes, but not only that. They (usually) also break every part of ACID, often even on each node. Most nosql databases don't support transactions, and the defaults don't sync in-memory results with the filesystem. For example, mongodb only syncs data with permanent storage once a minute. Most relational databases sync every transaction. If you're willing to forgo that requirement, relational databases can be much faster too.

http://www.mongodb.org/display/DOCS/Durability+and+Repair#DurabilityandRepair-%7B%7B%5Csyncdelay%7D%7DCommandLineOption

This non-syncing behaviour is why I say its performance comes from acting like an in-memory database. Because, quite frankly, it is ;-). Most nosql databases act like this, depending on the syncing of multiple servers to maintain data instead of permanent-storage syncing. Yes, some nosql databases (including mongodb) are adding more advanced durability and transactional features as time goes on.

I don't wish to argue that nosql is non-useful, just to say that you won't necessarily get better performance under the same memory-space and data-integrity requirements. There's no magic pixie dust in nosql (just like there is none in Ruby ;-). Nosql is a buzzword, and I admit most of my response came from an adverse reaction to that.
If the message was "we should explore document-oriented storage" (which is what mongoDB is) or "maybe key-value stores are all we need", I'm much less hostile ;-)

--
| Steven Pinkham, Security Consultant |
| http://www.mavensecurity.com        |
| GPG public key ID CD31CAFB          |

--
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE: Pinpoint memory and threading errors before they happen. Find and fix more than 250 security defects in the development cycle. Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
___
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop
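Steve's durability point above ("sqlite can be used in that mode also") can be sketched with standard sqlite3 calls. Nothing below is w3af-specific; the table and data are made up for illustration.

```python
# Sketch: sqlite can trade durability for speed much like the noSQL
# defaults discussed above. Both modes are standard sqlite3 usage.
import os
import sqlite3
import tempfile

# 1) A pure in-memory database: nothing ever touches disk.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
mem.execute("INSERT INTO kv VALUES (?, ?)", ("id-246", "trace data"))
row = mem.execute("SELECT v FROM kv WHERE k = ?", ("id-246",)).fetchone()
print(row[0])  # trace data

# 2) An on-disk database with fsync-per-commit disabled, approximating
#    the "sync once in a while" behaviour of the noSQL defaults.
path = os.path.join(tempfile.mkdtemp(), "history.sqlite")
disk = sqlite3.connect(path)
disk.execute("PRAGMA synchronous = OFF")  # don't fsync on every commit
disk.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
with disk:  # one transaction for the whole batch
    disk.executemany("INSERT INTO kv VALUES (?, ?)",
                     [(str(i), "x" * 100) for i in range(1000)])
count = disk.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
print(count)  # 1000
```

With `PRAGMA synchronous = OFF` a crash can lose recent commits, which is exactly the trade-off being debated in this thread.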
Re: [W3af-develop] core/data/db/history.py and .trace files
A very common noSQL, document-related database is IBM Lotus Notes (.nsf). These DBs are usually used for team-room applications or for storing transactional data between IBM mainframes and other platforms like HP NonStop, most often seen in banking environments. Inside IBM these .nsf databases are used for almost everything (team rooms, phone DBs, HR DBs, physical security DBs).

Regards
--
Leandro Reox

On Fri, Feb 4, 2011 at 8:39 PM, Andres Riancho andres.rian...@gmail.com wrote:
> Agreed, we need this fixed. Lots of bug reports about it in Trac.
> --
> Andres Riancho
> On Feb 4, 2011 6:51 p.m., Taras ox...@oxdef.info wrote:
>> Andres, I didn't use noSQL databases but it can be interesting research =) But for the first lets simply fix this bug with files. Do we know about any noSQL database that's file based like sqlite? Maybe we could use this s...

--
The modern datacenter depends on network connectivity to access resources and provide services. The best practices for maximizing a physical server's connectivity to a physical network are well understood - see how these rules translate into the virtual world?
http://p.sf.net/sfu/oracle-sfdevnlfb
Re: [W3af-develop] core/data/db/history.py and .trace files
Lean,

Do you know if the format is open? Do we have a Python binding to write to them? Any clue on how they scale in performance when saving thousands of records?

Regards,

On Tue, Feb 8, 2011 at 4:42 PM, Leandro Reox leandro.r...@gmail.com wrote:
> A very common noSQL, document-related database is IBM Lotus Notes (.nsf). These DBs are usually used for team-room applications or for storing transactional data between IBM mainframes and other platforms like HP NonStop, most often seen in banking environments. Inside IBM these .nsf databases are used for almost everything (team rooms, phone DBs, HR DBs, physical security DBs).
> Regards
> --
> Leandro Reox

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
Re: [W3af-develop] core/data/db/history.py and .trace files
Andres,

Sadly, the format is not open. There are a few ways to write and retrieve data from this kind of database via Python (like Jython + the notes.jar classes, NotesSQL drivers on Windows, etc.). Regarding performance, databases with 100,000 records with attachments are very common on IBM infrastructure, and the database itself performs like a charm.

Have you considered using the open-source alternative to .nsf, mongodb? It's a document-oriented database like .nsf, with an open format, fully compatible with Python, and the performance is pretty awesome.

Regards

On Tue, Feb 8, 2011 at 5:08 PM, Andres Riancho andres.rian...@gmail.com wrote:
> Lean,
> Do you know if the format is open? Do we have a Python binding to write to them? Any clue on how they scale in performance when saving thousands of records?
> Regards,

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
Re: [W3af-develop] core/data/db/history.py and .trace files
The only issue with mongodb is that it's a daemon; I'm not sure if we want to have mongod as a w3af dependency. It could complicate the packaging and install process.

Regards,
--
Andres Riancho

On Feb 8, 2011 6:39 p.m., Leandro Reox leandro.r...@gmail.com wrote:
> Here is living proof of MongoDB deployed in large-scale scenarios:
> http://www.mongodb.org/display/DOCS/Production+Deployments
> Regards
> Lean
Re: [W3af-develop] core/data/db/history.py and .trace files
Steve,

On Tue, Feb 8, 2011 at 8:45 PM, Steve Pinkham steve.pink...@gmail.com wrote:
> On 02/03/2011 12:04 PM, Andres Riancho wrote:
>> Do we know about any noSQL database that's file based like sqlite? Maybe we could use this small rewrite to compare the performance of those backends. Regards,
> I'm somewhat at a loss of what you think noSQL will buy you. It's useful in distributed, massively parallel systems, but offers no real benefit for single user databases.

I disagree. I've seen how sqlite3's performance impacted the framework's performance before, mainly because of its slow access (SELECT). From what I can understand of the noSQL databases, access to any row should be ultra fast, even if we save whole HTTP requests and responses to it.

>> noSQL is just the new term for key-value stores.
> Yes. Berkeley DB is what was used as a file-based key-value store before sqlite, but has no major benefits in most uses over sqlite, which is why it didn't spring to mind. ;-) If you have many threads writing concurrently, BDB can be faster, but you get a great decrease in functionality as a cost.
> http://en.wikipedia.org/wiki/Berkeley_DB

I already took a look into BDB and for some reason I discarded it; now I don't remember why :(

> Here's one set of benchmarks. For a low number of records, BDB was faster; for a high number of records, sqlite was faster. Both should be fast enough. You shouldn't need the transactional capabilities where sqlite was the slowest.
> http://www.sqlite.org/cvstrac/wiki?p=KeyValueDatabase

What I read from this performance test is: BDB is faster in 90% of the cases. In the cases where BDB is faster, it's ~50% faster on average.

Regards,

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
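The BDB-vs-sqlite benchmark being discussed measures plain key/value access. A minimal file-based key/value store on top of sqlite3 looks like this; the table layout is illustrative only, not w3af's actual history schema.

```python
# Minimal sqlite3-backed key/value store, the kind of usage the
# BDB-vs-sqlite benchmark compares. Schema is illustrative only.
import sqlite3

class SqliteKV:
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)")

    def put(self, key, value):
        with self.conn:  # one transaction per write
            self.conn.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

store = SqliteKV()
store.put("trace-246", b"GET / HTTP/1.1\r\n...")
print(store.get("trace-246"))
```

The per-write transaction is where sqlite pays its durability cost in the benchmark; batching many `put` calls into one transaction changes the numbers dramatically.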
Re: [W3af-develop] core/data/db/history.py and .trace files
Steve,

On Tue, Feb 8, 2011 at 9:07 PM, Andres Riancho andres.rian...@gmail.com wrote:
>> Berkeley DB is what was used as a file-based key-value store before sqlite, but has no major benefits in most uses over sqlite, which is why it didn't spring to mind. ;-) If you have many threads writing concurrently, BDB can be faster, but you get a great decrease in functionality as a cost.
>> http://en.wikipedia.org/wiki/Berkeley_DB
> I already took a look into BDB and for some reason I discarded it; now I don't remember why :(

Ahh, this is why! "Deprecated since version 2.6: The bsddb module has been deprecated for removal in Python 3.0." [0]

[0] http://docs.python.org/library/bsddb.html

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
Re: [W3af-develop] core/data/db/history.py and .trace files
On 02/08/2011 07:07 PM, Andres Riancho wrote:
> Steve,
>> I'm somewhat at a loss of what you think noSQL will buy you. It's useful in distributed, massively parallel systems, but offers no real benefit for single user databases.
> I disagree. I've seen how sqlite3's performance impacted the framework's performance before, mainly because of its slow access (SELECT). From what I can understand of the noSQL databases, access to any row should be ultra fast, even if we save whole HTTP requests and responses to it.

noSQL servers are usually fast because they are in-memory systems. sqlite can be used in that mode also if you like. If SELECT is your problem, you're probably not indexing properly or your selects are waiting on writes. Both are fixable. That said, if all you ever want is a key-value store and you never see yourself using any more complicated searches than that, maybe a key-value store is for you. Otherwise, writing better selects and tuning your indexing is probably a bigger win. I haven't looked at what you're using the database for or how you have it tuned yet, but I'll try to soon.

>> Here's one set of benchmarks. For a low number of records, BDB was faster; for a high number of records, sqlite was faster. Both should be fast enough. You shouldn't need the transactional capabilities where sqlite was the slowest.
>> http://www.sqlite.org/cvstrac/wiki?p=KeyValueDatabase
> What I read from this performance test is: BDB is faster in 90% of the cases. In the cases where BDB is faster, it's ~50% faster on average.

What I see is that if you're making more than 100,000 selects/second in a web app scanner, you seriously screwed up somewhere and need to be caching more.
Being 2x or 10x faster will still lose to better design.

--
| Steven Pinkham, Security Consultant |
| http://www.mavensecurity.com        |
| GPG public key ID CD31CAFB          |
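Steve's "not indexing properly" diagnosis is easy to check in sqlite itself with EXPLAIN QUERY PLAN. The table and column names below are made up for illustration; this is not w3af's actual history schema.

```python
# Sketch of the "index properly" point: the same SELECT before and
# after adding an index. Schema is illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE history (
                    id INTEGER PRIMARY KEY,
                    url TEXT,
                    code INTEGER)""")
conn.executemany("INSERT INTO history (url, code) VALUES (?, ?)",
                 [("http://site/%d" % i, 200) for i in range(10000)])

# Without an index on url, this SELECT scans the whole table;
# EXPLAIN QUERY PLAN shows which access path sqlite chooses.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM history WHERE url = ?",
    ("http://site/123",)).fetchall()

# Adding an index turns the full scan into a B-tree lookup.
conn.execute("CREATE INDEX idx_history_url ON history (url)")
rows = conn.execute("SELECT id FROM history WHERE url = ?",
                    ("http://site/123",)).fetchall()
print(rows)  # [(124,)]
```

Running EXPLAIN QUERY PLAN again after the CREATE INDEX reports the query using `idx_history_url` instead of a table scan, which is the whole difference Steve is pointing at.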
Re: [W3af-develop] core/data/db/history.py and .trace files
On 02/08/2011 08:08 PM, Andres Riancho wrote:
> Steve,
>> noSQL servers are usually fast because they are in-memory systems. sqlite can be used in that mode also if you like.
> mongodb is not an in-memory db!

In practice, it is. It stores all indexes in memory and uses memory-mapped files. It will automatically consume all available memory (which is a good thing or a bad thing depending on what else you want to use the server for).

http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-MakesureyourindexescanfitinRAM
http://www.mongodb.org/display/DOCS/Caching

--
| Steven Pinkham, Security Consultant |
| http://www.mavensecurity.com        |
| GPG public key ID CD31CAFB          |
Re: [W3af-develop] core/data/db/history.py and .trace files
Andres,

I didn't use noSQL databases, but it could be interesting research =) But first, let's simply fix this bug with the files.

> Do we know about any noSQL database that's file based like sqlite? Maybe we could use this small rewrite to compare the performance of those backends. Regards,

On Mon, Jan 31, 2011 at 6:38 PM, Andres Riancho andres.rian...@gmail.com wrote:
> Taras,
>
> On Mon, Jan 31, 2011 at 6:08 PM, Taras ox...@oxdef.info wrote:
>> Andres,
>> Oh, it is a bad and a good bug at the same time =) The bad side is that the bug is not trivial to reproduce and it occurs suddenly. But it looks like I found the problem. It is caused by a mismatch between the DB file and the transaction files (*.trace) when the target is changed. The DB file is initialized at application start and then passed around through the KB global object, but the transaction files are stored in the get_home_dir() + 'sessions' + 'db_' + sessionName directory, and this directory can change after start!
>> Steps to reproduce:
>> 1. run ./w3af_gui
>> 2. launch the proxy tool and test some site like http://pentagon.afis.osd.mil ;)
>> 3. close the proxy tool and try to scan some *different* site, e.g. http://www.defense.gov
>> 4. launch the proxy tool again
>> Current result: you must see this cruel exception
>
> Good to see that we know how to reproduce this vulnerability! I've assigned it to you to fix at your earliest convenience :)
> https://sourceforge.net/apps/trac/w3af/ticket/161417
>
>> So the solution is to use a single directory for the transaction files, named similarly to the DB file, and not use sessionName to generate the path every time.
>
> Agreed.
>
>> The good side of this bug is the opportunity to make one more improvement in dealing with this *big* number of session transaction files. We need to delete them at the end of the session (when w3af is being closed).
>
> Yep, we should use only one file there.
>
>> I can fix it in the nearest days, or you can of course assign it to another person if we need to fix it e.g. tomorrow =)
>
> Thanks!
>
>> On Mon, 2011-01-31 at 09:49 -0300, Andres Riancho wrote:
>>> Oxdef,
>>> We've been getting a lot [0] of automatic bug reports that look like this:
>>> w3afException: An internal error ocurred while searching for id 246. Original exception: [Errno 2] No such file or directory: '/root/.w3af/sessions/some-site.com-2011-Jan-31_12-56-05/246.trace'
>>> The only location where .trace files are created is in core/data/db/history.py. Do you have any idea on why this might happen? How can we fix it? Thanks!
>>> [0] https://sourceforge.net/apps/trac/w3af/search?q=.trace
>>> Regards,

--
Taras
http://oxdef.info
Software is like sex: it's better when it's free. - Linus Torvalds
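Taras's proposed fix (derive the trace directory from the DB file itself, not from the current session name) can be sketched as below. All names here (`get_trace_dir`, `trace_path`, `db_path`) are hypothetical helpers for illustration, not w3af's actual API.

```python
# Sketch of the proposed fix: key the .trace directory off the DB
# file's own path, computed once, so a later session/target change
# cannot orphan the trace files. Function names are hypothetical.
import os
import tempfile

def get_trace_dir(db_path):
    """Place trace files next to the session DB so both always agree."""
    base, _ = os.path.splitext(db_path)
    trace_dir = base + ".traces"
    if not os.path.isdir(trace_dir):
        os.makedirs(trace_dir)
    return trace_dir

def trace_path(db_path, request_id):
    return os.path.join(get_trace_dir(db_path), "%d.trace" % request_id)

db = os.path.join(tempfile.mkdtemp(), "session.db")
p = trace_path(db, 246)
print(os.path.basename(p))  # 246.trace
```

Because the trace directory is a pure function of the DB path, there is no second piece of mutable state (the session name) to fall out of sync, which is exactly the mismatch behind the `246.trace` exceptions.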
Re: [W3af-develop] core/data/db/history.py and .trace files
Agreed, we need this fixed. Lots of bug reports about it in Trac.
--
Andres Riancho

On Feb 4, 2011 6:51 p.m., Taras ox...@oxdef.info wrote:
> Andres, I didn't use noSQL databases but it can be interesting research =) But for the first lets simply fix this bug with files. Do we know about any noSQL database that's file based like sqlite? Maybe we could use this s...
[W3af-develop] core/data/db/history.py and .trace files
Oxdef,

We've been getting a lot [0] of automatic bug reports that look like this:

w3afException: An internal error ocurred while searching for id 246. Original exception: [Errno 2] No such file or directory: '/root/.w3af/sessions/some-site.com-2011-Jan-31_12-56-05/246.trace'

The only location where .trace files are created is in core/data/db/history.py. Do you have any idea on why this might happen? How can we fix it? Thanks!

[0] https://sourceforge.net/apps/trac/w3af/search?q=.trace

Regards,
--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af

--
Special Offer -- Download ArcSight Logger for FREE (a $49 USD value)! Finally, a world-class log management solution at an even better price - free! Download using promo code Free_Logger_4_Dev2Dev. Offer expires February 28th, so secure your free ArcSight Logger TODAY!
http://p.sf.net/sfu/arcsight-sfd2d
___
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop
Re: [W3af-develop] core/data/db/history.py and .trace files
Taras,

On Mon, Jan 31, 2011 at 6:08 PM, Taras ox...@oxdef.info wrote:
> Andres,
> Oh, it is a bad and a good bug at the same time =) The bad side is that the bug is not trivial to reproduce and it occurs suddenly. But it looks like I found the problem. It is caused by a mismatch between the DB file and the transaction files (*.trace) when the target is changed. The DB file is initialized at application start and then passed around through the KB global object, but the transaction files are stored in the get_home_dir() + 'sessions' + 'db_' + sessionName directory, and this directory can change after start!
> Steps to reproduce:
> 1. run ./w3af_gui
> 2. launch the proxy tool and test some site like http://pentagon.afis.osd.mil ;)
> 3. close the proxy tool and try to scan some *different* site, e.g. http://www.defense.gov
> 4. launch the proxy tool again
> Current result: you must see this cruel exception

Good to see that we know how to reproduce this vulnerability! I've assigned it to you to fix at your earliest convenience :)
https://sourceforge.net/apps/trac/w3af/ticket/161417

> So the solution is to use a single directory for the transaction files, named similarly to the DB file, and not use sessionName to generate the path every time.

Agreed.

> The good side of this bug is the opportunity to make one more improvement in dealing with this *big* number of session transaction files. We need to delete them at the end of the session (when w3af is being closed).

Yep, we should use only one file there.

> I can fix it in the nearest days, or you can of course assign it to another person if we need to fix it e.g. tomorrow =)

Thanks!

> On Mon, 2011-01-31 at 09:49 -0300, Andres Riancho wrote:
>> Oxdef,
>> We've been getting a lot [0] of automatic bug reports that look like this:
>> w3afException: An internal error ocurred while searching for id 246. Original exception: [Errno 2] No such file or directory: '/root/.w3af/sessions/some-site.com-2011-Jan-31_12-56-05/246.trace'
>> The only location where .trace files are created is in core/data/db/history.py. Do you have any idea on why this might happen? How can we fix it? Thanks!
>> [0] https://sourceforge.net/apps/trac/w3af/search?q=.trace
>> Regards,

--
Taras
http://oxdef.info
Software is like sex: it's better when it's free. - Linus Torvalds

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af