- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: limit index to domain, text files with specific extensions, i.e. .lib, .spi, etc
A page of a site is crawled only if there is a matching Server, Realm, or Subnet command and no Disallow command prevents it.

The following command crawls the whole target site, following all links on it to look for URLs:

    Server hrefonly http://www.domain.com/

The following command indexes all URLs that end in .lib or .spi:

    Realm regex http://www.domain.com/.*\.(lib|spi)$

Also make sure that the remote HTTP server supplies a text/plain, text/html, or text/xml Content-Type header with these files. Otherwise you need to specify a parser to convert the data to one of the types listed above.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1226881847
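To sanity-check the Realm pattern before running the indexer, it can be tried against a few sample URLs. This is only a rough sketch: it uses Python's re module, whose matching may differ in details (case handling, anchoring) from the regex engine the indexer uses, the host is spelled www.domain.com as in the Server command, and the helper name and sample URLs are made up for illustration.

```python
import re

# Pattern copied from the Realm command; host spelled as in the Server command.
# Assumption: unanchored search at the start, "$" anchors the end of the URL.
REALM_PATTERN = re.compile(r"http://www.domain.com/.*\.(lib|spi)$")


def would_be_indexed(url: str) -> bool:
    """Hypothetical helper: True if the URL matches the Realm pattern."""
    return REALM_PATTERN.search(url) is not None


# Made-up sample URLs, for illustration only.
samples = [
    "http://www.domain.com/models/opamp.lib",
    "http://www.domain.com/a/b/filter.spi",
    "http://www.domain.com/index.html",
    "http://www.domain.com/models/opamp.lib.bak",
]

for url in samples:
    print(url, would_be_indexed(url))
```

URLs that do not match the Realm pattern can still be fetched and scanned for links through the Server hrefonly command; the pattern only decides which ones end up in the index.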