Yeah, you still don't have the agent configured.  Every agent property
in your file has an empty <value></value> element.  At a minimum, you
need to set http.agent.name to a non-empty value.
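
For illustration, here is a minimal sketch of what a filled-in
nutch-site.xml might look like. The agent name, description, URL, and
email are placeholders I made up, not values from your setup, so
substitute your own:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>http.agent.name</name>
  <!-- Hypothetical value: a single word related to your organization. -->
  <value>examplecrawler</value>
</property>

<property>
  <name>http.agent.description</name>
  <!-- Hypothetical value: appears in parentheses after the agent name. -->
  <value>Example test crawler</value>
</property>

<property>
  <name>http.agent.url</name>
  <!-- Hypothetical value: a page explaining what the crawler does. -->
  <value>http://www.example.com/crawler.html</value>
</property>

<property>
  <name>http.agent.email</name>
  <!-- Hypothetical value: mangled to discourage address harvesting. -->
  <value>info at example dot com</value>
</property>
</configuration>
```

Only http.agent.name is described as "MUST NOT be empty"; I have filled
in the other three because your file already lists them, but they can
be adjusted or left for later.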



On 6/15/07, karan thakral <[EMAIL PROTECTED]> wrote:
> I am running the crawl under Cygwin on Windows,
>
> but the crawl output is not correct.
>
> During fetch it says: fetch: the document could not be fetched, java runtime
> exception, agent not configured
>
> My nutch-site.xml is as follows:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
> <property>
>   <name>http.agent.name</name>
>   <value></value>
>   <description>HTTP 'User-Agent' request header. MUST NOT be empty -
>   please set this to a single word uniquely related to your organization.
>
>   NOTE: You should also check other related properties:
>
>   http.robots.agents
>   http.agent.description
>   http.agent.url
>   http.agent.email
>   http.agent.version
>
>   and set their values appropriately.
>
>   </description>
> </property>
>
> <property>
>   <name>http.agent.description</name>
>   <value></value>
>   <description>Further description of our bot- this text is used in
>   the User-Agent header.  It appears in parenthesis after the agent name.
>   </description>
> </property>
>
> <property>
>   <name>http.agent.url</name>
>   <value></value>
>   <description>A URL to advertise in the User-Agent header.  This will
>    appear in parenthesis after the agent name. Custom dictates that this
>    should be a URL of a page explaining the purpose and behavior of this
>    crawler.
>   </description>
> </property>
>
> <property>
>   <name>http.agent.email</name>
>   <value></value>
>   <description>An email address to advertise in the HTTP 'From' request
>    header and User-Agent header. A good practice is to mangle this
>    address (e.g. 'info at example dot com') to avoid spamming.
>   </description>
> </property>
> </configuration>
>
> but the error is still there.
>
> Also, please shed some light on searching through the web
> interface once the crawl has completed successfully.
> --
> With Regards
> Karan Thakral
>


-- 
"Conscious decisions by conscious minds are what make reality real"

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
