Hello,

I am working on a website[1] whose purpose is to let visitors browse
a _virtual_ filesystem made of all the files shipped in Debian
packages, and then view or compare those files.

The problem is that Google will never finish indexing the 10 million
pages (not on my home DSL, at least)...

My first plan is to track unstable, then provide a kind of news feed for
search engines. [my DebCamp8 plan]

The second improvement is to actually prevent Google from indexing
useless pages. The question is which pages are useful and which are
useless. My current (quick) list is:
 ^/etc/.*$
 ^/var/lib/dpkg/.*$
 ^/usr/share/doc/[^/]*/[^/]*$
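
To make the list easier to discuss, here is a minimal Python sketch of how
a path could be checked against these three patterns. The names
LISTED_PATTERNS and is_listed are hypothetical and not part of the site's
code. The sketch only reports whether a path is on the list; what to do
with listed pages is left to the caller.

```python
import re

# The three patterns from the list above, copied verbatim.
LISTED_PATTERNS = [
    r"^/etc/.*$",
    r"^/var/lib/dpkg/.*$",
    r"^/usr/share/doc/[^/]*/[^/]*$",
]

# Compile once so repeated lookups stay cheap.
_COMPILED = [re.compile(pattern) for pattern in LISTED_PATTERNS]


def is_listed(path):
    """Return True if the virtual filesystem path matches any listed pattern."""
    return any(regex.match(path) for regex in _COMPILED)


if __name__ == "__main__":
    # Sample paths chosen for illustration only.
    for sample in (
        "/etc/apt/sources.list",
        "/usr/share/doc/bash/copyright",
        "/usr/share/doc/bash/examples/startup-files",
        "/usr/bin/bash",
    ):
        print(sample, is_listed(sample))
```

Note that the third pattern only matches files directly inside a package's
doc directory: [^/]* cannot cross a slash, so anything in a deeper
subdirectory falls outside the list.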

Any suggestions?

Franklin

[1] http://sysinf0.klabs.be/

