Hello,

I am working on a website [1] whose purpose is to let visitors browse a _virtual_ filesystem made of all the files shipped in Debian packages, and then view or compare those files.
The problem is that Google will never finish indexing the 10 million pages (not over my home DSL, at least)...

My first plan is to track unstable, then provide a kind of news feed for search engines. [my DebCamp8 plan]

The second improvement is to actually prevent Google from indexing useless pages. The question is: which pages are useful, and which are useless?

My current (quick) list is:

^/etc/.*$
^/var/lib/dpkg/.*$
^/usr/share/doc/[^/]*/[^/]*$

Any suggestions?

Franklin

[1] http://sysinf0.klabs.be/
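P.S. To make the list easier to discuss, here is a minimal Python sketch that checks a path against the three patterns. Only the regular expressions come from the list in this message; the names `PATTERNS` and `on_list` are made up for illustration, and this is not how the site itself is implemented.

```python
import re

# The three patterns from the list in this message, unchanged.
PATTERNS = [
    r"^/etc/.*$",
    r"^/var/lib/dpkg/.*$",
    r"^/usr/share/doc/[^/]*/[^/]*$",
]

# Compile once so repeated checks are cheap.
COMPILED = [re.compile(p) for p in PATTERNS]


def on_list(path: str) -> bool:
    """Return True if the path matches any pattern on the list.

    Hypothetical helper, for illustration only.
    """
    return any(rx.match(path) for rx in COMPILED)


if __name__ == "__main__":
    for sample in (
        "/etc/passwd",
        "/usr/share/doc/bash/changelog.gz",
        "/usr/share/doc/bash/examples/startup-files",
        "/usr/bin/ls",
    ):
        print(sample, on_list(sample))
```

Note that the third pattern only matches files directly inside a package's doc directory; anything in a subdirectory such as examples/ does not match.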