website catcher

2005-07-03 Thread jwaixs
Hello, I'm busy building a kind of webpage framework in Python, but there's a small problem with it. The framework should open a page, parse it, extract some other information from it, and store the result in some kind of fast storage. This storage needs to be very fast so every

Re: website catcher

2005-07-03 Thread [EMAIL PROTECTED]
You can fetch the content of a URL like this: http://www.python.org/doc/current/lib/node478.html. From there you can parse it and then store the result, e.g. in a dictionary; that will give you a well-performing solution. -- http://mail.python.org/mailman/listinfo/python-list
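A minimal sketch of the fetch, parse, and store-in-a-dictionary idea from this reply. It assumes Python 3's `urllib.request` rather than the module the link above documented at the time, and the names `page_cache`, `fetch`, `parse_title` and `get_parsed` are hypothetical. The title-extracting parser only stands in for whatever parsing the framework really does.

```python
import re
from urllib.request import urlopen

# In-memory store: URL -> parsed result. It lives only as long as the process.
page_cache = {}


def fetch(url):
    """Download a page and return its text (needs network access)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_title(html):
    """Stand-in parser: return the <title> text, or None if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def get_parsed(url, fetcher=fetch, parser=parse_title):
    """Return the parsed result for url, fetching and parsing only on a miss."""
    if url not in page_cache:
        page_cache[url] = parser(fetcher(url))
    return page_cache[url]
```

Calling `get_parsed(url)` twice downloads the page only once. Note that the dictionary disappears when the process exits, so in a plain cgi-bin script it would be rebuilt on every request.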

Re: website catcher

2005-07-03 Thread jwaixs
Thank you, but that's not what I mean. I don't want some kind of client-side parser. I mean the page has already been parsed and is ready to be read, and I want to store it for later use. I need some kind of database that won't exit when the cgi-bin script has finished. This database needs to be

Re: website catcher

2005-07-03 Thread Diez B. Roggisch
jwaixs wrote:
> Thank you, but it's not what I mean. I don't want some kind of client
> parser thing. But I mean the page is already been parsed and ready to
> be read. But I want to store this page for more use. I need some kind
> of database that won't exit if the cgi-bin script has finished. Thi

Re: website catcher

2005-07-03 Thread jwaixs
If I put the parsed websites in, for example, a hash table, it will be at least 5 times faster than putting them in a file stored on a slow hard drive. Memory is a lot faster than hard disk storage. And if a lot of people ask for a page, all of them have to open

Re: website catcher

2005-07-03 Thread Diez B. Roggisch
jwaixs wrote:
> If I should put the parsedwebsites in, for example, a tablehash it will
> be at least 5 times faster than just putting it in a file that needs to
> be stored on a slow harddrive. Memory is a lot faster than harddisk
> space. And if there would be a lot of people asking for a page al

Re: website catcher

2005-07-03 Thread jwaixs
Well, thank you for your advice. So I have a couple of solutions, but it can't become a server on its own, so this means I will deal with files. Thank you for your advice; I'll first make it work... then the server. Noud Aldenhoven
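One way the file-based route could look, as a sketch only: the standard-library `shelve` module keeps a dictionary-like store on disk, so a parsed page survives after the cgi-bin script exits. The function name `load_or_build` and its parameters are hypothetical, and `shelve` is an assumption here, not something the thread settled on.

```python
import shelve


def load_or_build(db_path, key, build):
    """Return the stored value for key, calling build(key) only on a miss.

    The shelf is opened and closed on every call, which suits a short-lived
    CGI script. The values must be picklable.
    """
    with shelve.open(db_path) as db:
        if key not in db:
            db[key] = build(key)
        return db[key]
```

A shelf does not coordinate concurrent writers, so several CGI processes writing at once would need file locking around the call.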

Re: website catcher

2005-07-03 Thread gene tani
Maybe look at HarvestMan: http://cheeseshop.python.org/HarvestMan/1.4%20final

Re: website catcher

2005-07-03 Thread Mike Meyer
"jwaixs" <[EMAIL PROTECTED]> writes: > If I should put the parsedwebsites in, for example, a tablehash it will > be at least 5 times faster than just putting it in a file that needs to > be stored on a slow harddrive. Memory is a lot faster than harddisk > space. And if there would be a lot of peo

Re: website catcher

2005-07-06 Thread Michael Ströder
jwaixs wrote:
> I need some kind
> of database that won't exit if the cgi-bin script has finished. This
> database need to be open all the time and communicate very easily with
> the cgi-bin framwork main class.

Maybe long-running multi-threaded processes for FastCGI, SCGI or similar is what you'r
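A rough sketch of the long-running-process idea. The standard library has no FastCGI or SCGI server, so this swaps in WSGI with `wsgiref` purely as a stand-in. The point it illustrates is the same: the process stays alive between requests, so a module-level dictionary can act as the always-open store. The names `parsed_pages`, `parse_page`, `app` and `main`, and the host and port, are hypothetical.

```python
from wsgiref.simple_server import make_server

# Lives as long as the server process, so it survives across requests.
parsed_pages = {}


def parse_page(path):
    """Hypothetical stand-in for the expensive parse step."""
    return "parsed content for " + path


def app(environ, start_response):
    """WSGI application: parse each path once, then answer from memory."""
    path = environ.get("PATH_INFO", "/")
    if path not in parsed_pages:
        parsed_pages[path] = parse_page(path)
    body = parsed_pages[path].encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def main():
    """Serve forever on localhost:8000 (host and port chosen arbitrarily)."""
    with make_server("localhost", 8000, app) as server:
        server.serve_forever()
```

Calling `main()` starts the server. `wsgiref` handles one request at a time; under a multi-threaded server the check-then-store on `parsed_pages` would want a lock.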