On Sun, May 08, 2005 at 09:48:07AM +0200, Nacho wrote:
> > On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> > > I am considering how to crawl a site which is dynamically generated,
> > > and create a static version of all generated pages (or selected
[...]
> 
> Well, I don't know an "elegant" solution... one dirty approach would be
> to first download the site with "wget -r"; then you would get lots of
> files with names like this:
> 
> index.php?lang=es&tipo=obras&com=extracto
> index.php?lang=es&tipo=obras&com=lista
> index.php?lang=es&tipo=obras&com=susobras
> 
> So it would be quite easy to write a simple Perl script that replaces
> the special characters with others more "static-like", and you would
> get something like:
> 
> index_lang-es_tipo-obras_com-extracto.html
> index_lang-es_tipo-obras_com-lista.html
> index_lang-es_tipo-obras_com-susobras.html
> 
> Also, surely you would have to parse the content of each file and
> substitute the links inside them.
> 
> Maybe too complicated?
Yes... that is the kind of thing I was imagining (something along the
lines of the sketch below).  It will probably be quite simple once I
get started.  But first I need to find time :-(

Thanks for the pointer.

-- 
richard
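P.S. For the archive, a rough and untested sketch of that
rename-and-rewrite pass, in Perl since that is what Nacho suggested.
It assumes wget -r has left files named like the index.php?... examples
quoted above sitting in the current directory; a real run over wget's
whole directory tree, and links written with &amp; instead of a bare &,
would need a little more work:

#!/usr/bin/perl
use strict;
use warnings;

# Turn a dynamic file name into a "static-like" one, e.g.
#   index.php?lang=es&tipo=obras&com=extracto
#   -> index_lang-es_tipo-obras_com-extracto.html
sub staticize {
    my ($name) = @_;
    return $name unless $name =~ /\?/;
    $name =~ s/\.php\?/_/;   # "index.php?lang=..." -> "index_lang=..."
    $name =~ s/=/-/g;        # key=value            -> key-value
    $name =~ s/&/_/g;        # separate parameters with "_"
    return "$name.html";
}

# First pass: rename every file that has a query string in its name.
my %newname;
for my $old (grep { -f && /\?/ } glob('*')) {
    my $new = staticize($old);
    $newname{$old} = $new;
    rename $old, $new or warn "rename $old -> $new failed: $!";
}

# Second pass: rewrite links inside the renamed files so they point at
# the new static names instead of the old index.php?... URLs.
for my $file (values %newname) {
    open my $in, '<', $file or do { warn "cannot read $file: $!"; next };
    my $html = do { local $/; <$in> };   # slurp the whole file
    close $in;
    for my $old (keys %newname) {
        my $new = $newname{$old};
        $html =~ s/\Q$old\E/$new/g;
    }
    open my $out, '>', $file or do { warn "cannot write $file: $!"; next };
    print {$out} $html;
    close $out;
}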