I am running the crawl command under Cygwin on Windows,

but the crawl output is not correct.

During the fetch step it says: fetch: the document could not be fetched, java runtime
exception, agent not configured

My nutch-site.xml is as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
 <name>http.agent.name</name>
 <value></value>
 <description>HTTP 'User-Agent' request header. MUST NOT be empty -
 please set this to a single word uniquely related to your organization.

 NOTE: You should also check other related properties:

 http.robots.agents
 http.agent.description
 http.agent.url
 http.agent.email
 http.agent.version

 and set their values appropriately.

 </description>
</property>

<property>
 <name>http.agent.description</name>
 <value></value>
 <description>Further description of our bot- this text is used in
 the User-Agent header.  It appears in parenthesis after the agent name.
 </description>
</property>

<property>
 <name>http.agent.url</name>
 <value></value>
 <description>A URL to advertise in the User-Agent header.  This will
  appear in parenthesis after the agent name. Custom dictates that this
  should be a URL of a page explaining the purpose and behavior of this
  crawler.
 </description>
</property>

<property>
 <name>http.agent.email</name>
 <value></value>
 <description>An email address to advertise in the HTTP 'From' request
  header and User-Agent header. A good practice is to mangle this
  address (e.g. 'info at example dot com') to avoid spamming.
 </description>
</property>
</configuration>

But the error is still there.
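
One thing that stands out in the file above: every <value> element is empty, while the description of http.agent.name says it MUST NOT be empty, which may well be what "agent not configured" refers to. A minimal sketch of a non-empty setting follows; the name MyTestCrawler is only a hypothetical placeholder, not a value from the actual setup, and the related http.agent.* properties listed in the description may need values too.

```xml
<!-- Goes inside the <configuration> element of nutch-site.xml. -->
<!-- "MyTestCrawler" is a hypothetical placeholder; substitute a single
     word uniquely related to your organization, as the description asks. -->
<property>
 <name>http.agent.name</name>
 <value>MyTestCrawler</value>
</property>
```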

Also, please shed some light on how to search the crawled data through the web
interface once the crawl succeeds.
--
With Regards
Karan Thakral
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
