I am having trouble getting search engines to index my site. I built it with
Struts, and the index.html page uses some tags to run business logic before
finally reaching my page. The URL is http://www.theuniquepear.com

I talked to some co-workers, and they suggested I watch my access log so I
can see which files the search bots are indexing. I thought I had the access
log turned on for the site, and I do see entries when someone hits my web
site, but as far as the search bots go, this is all I see in my logs each day:

$ cat  localhost_access_log.2006-02-07.txt | less
- - [07/Feb/2006:03:44:55 -0600] "GET /robots.txt HTTP/1.0" 404 985
- - [07/Feb/2006:03:46:21 -0600] "GET / HTTP/1.0" 200 844
- - [07/Feb/2006:03:51:57 -0600] "GET /robots.txt HTTP/1.0" 404 985
- - [07/Feb/2006:03:52:42 -0600] "GET /unique/welcome.do?OVRAW=home%20decorating%20ideas&OVKEY=home
- - [07/Feb/2006:03:52:44 -0600] "GET /unique/includes/siteWide.css HTTP/1.1" 200 15402
- - [07/Feb/2006:03:52:44 -0600] "GET /unique/images/header_pear.jpg HTTP/1.1" 200 11227

I see the requests for robots.txt, but I have no idea where the bots go from
there, or what they are doing.

I turned on the access log in server.xml like so:
        <Valve className="org.apache.catalina.valves.AccessLogValve"
                 directory="logs"  prefix="localhost_access_log." suffix=".txt"
                 pattern="common" resolveHosts="false"/>

The log snippet above was produced with this configuration.
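
For comparison, here is a sketch of the same Valve with a richer pattern. It
assumes the Tomcat version in use accepts the "combined" pattern name, which
the AccessLogValve documentation describes as the common format plus the
Referer and User-Agent request headers. I have not tried it on this site. The
User-Agent field is where crawlers normally identify themselves.

```xml
<!-- Assumption: "combined" is supported by this Tomcat version's AccessLogValve.
     It is the common format plus the Referer and User-Agent request headers. -->
<Valve className="org.apache.catalina.valves.AccessLogValve"
       directory="logs" prefix="localhost_access_log." suffix=".txt"
       pattern="combined" resolveHosts="false"/>
```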

Does anyone know how to get more detailed log entries, or can anyone tell me
what the robots.txt requests above are doing?

