Hi there!
I am sure my team and I could do this, but we are a German company.
If this doesn't matter, please reply to this and the work can start!
Yours,
Krystian Stolorz / DatenSturm
----- Original Message -----
From: Richard Barnett <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, M
Hello everyone,
Our company is seeking an organization that can create, or already has,
a bot designed to crawl Affiliate Merchant sites and update a
local Oracle database. The functionality we are seeking is similar to
popular comparison sites such as MySimon.com, Dealtime.com and
Bottomdollar.
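(Not from the original thread: a minimal sketch of the crawl-and-update loop the post describes. The HTML fragment, the regex, and the table layout are all hypothetical, and `sqlite3` stands in for the Oracle database mentioned above.)

```python
import re
import sqlite3

# Hypothetical listing fragment, standing in for a fetched merchant page.
PAGE = """
<div class="product"><span class="name">Widget</span><span class="price">$19.99</span></div>
<div class="product"><span class="name">Gadget</span><span class="price">$4.50</span></div>
"""

def extract_offers(html):
    """Pull (name, price) pairs out of a listing page (assumed markup)."""
    pattern = re.compile(r'class="name">([^<]+)</span><span class="price">\$([0-9.]+)')
    return [(name, float(price)) for name, price in pattern.findall(html)]

# sqlite3 stands in for the local Oracle database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE offers (name TEXT PRIMARY KEY, price REAL)")
for name, price in extract_offers(PAGE):
    # Upsert: each crawl refreshes the stored price for a known product.
    db.execute(
        "INSERT INTO offers (name, price) VALUES (?, ?) "
        "ON CONFLICT(name) DO UPDATE SET price = excluded.price",
        (name, price),
    )
db.commit()
print(db.execute("SELECT name, price FROM offers ORDER BY name").fetchall())
# -> [('Gadget', 4.5), ('Widget', 19.99)]
```

A production version would replace `PAGE` with fetched pages (respecting each site's robots.txt, as discussed below in this digest) and a real HTML parser rather than a regex.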
> So:
>
> I was looking at a robots.txt file and it had a series of disallow
> instructions for various user agents, and then at the bottom was a full
> disallow:
[...]
> Wouldn't this just disallow everyone from everything?
No, it would disallow everyone but a ... d, and those named agents are
limited only by the specified disallows in their own records.
Bert Van Kets:
>It even overrides the other disallows.
There is no override. The Robots Exclusion Standard says of User-agent:
"(If the value is '*', the record describes the default access policy for
any robot that) has not [yet] matched any of the other records."
(http://info.webcrawler.com/mak/projects/robots/)
Jonathan Knoll:
>User-agent: aa
>Disallow: /cgi-bin
>Disallow: /stuff
>Disallow: /x.html
[...]
>User-agent: *
>Disallow: /
>Wouldn't this just disallow everyone from everything?
No, the file is perfectly OK...
The "*" has a special meaning in the Standard:
"every other User-agent (i.e. any robot not matched by an earlier record)".
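(Not from the original thread: the matching rule described above can be checked with Python's standard-library `urllib.robotparser`, feeding it the exact file quoted in the question.)

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse("""
User-agent: aa
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html

User-agent: *
Disallow: /
""".splitlines())

# "aa" matches its own record, so only the listed paths are off-limits.
print(rp.can_fetch("aa", "/y.html"))  # -> True
print(rp.can_fetch("aa", "/x.html"))  # -> False

# Any other agent falls through to the "*" record: everything disallowed.
print(rp.can_fetch("bb", "/y.html"))  # -> False
```

This is exactly the behavior the Standard prescribes: the `*` record is a default, consulted only for robots that matched none of the named records, not an override of them.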