Yeah, you still don't have the agent configured. All of the agent properties in your file are blank (each <value></value> element needs something between the tags). So, at a minimum, you need to configure an agent name (http.agent.name).
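As a minimal sketch, the first property in your nutch-site.xml could look something like this once it is filled in. The value MyTestCrawler is only a placeholder made up for illustration, not anything Nutch expects; pick a single word related to your own organization:

```xml
<property>
  <name>http.agent.name</name>
  <!-- Placeholder value (an assumption for illustration); replace with your own single word -->
  <value>MyTestCrawler</value>
</property>
```

The other http.agent.* properties in your file can be filled in the same way.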
On 6/15/07, karan thakral <[EMAIL PROTECTED]> wrote:
> i m using crawl on the cygwin while working on windows
>
> but the crawl output is not proper
>
> during fetch its says fetch: the document could not be fetched java runtime
> exception agent not configured
>
> my nutch-site.xml is as follows
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
> <property>
> <name>http.agent.name</name>
> <value></value>
> <description>HTTP 'User-Agent' request header. MUST NOT be empty -
> please set this to a single word uniquely related to your organization.
>
> NOTE: You should also check other related properties:
>
> http.robots.agents
> http.agent.description
> http.agent.url
> http.agent.email
> http.agent.version
>
> and set their values appropriately.
>
> </description>
> </property>
>
> <property>
> <name>http.agent.description</name>
> <value></value>
> <description>Further description of our bot- this text is used in
> the User-Agent header. It appears in parenthesis after the agent name.
> </description>
> </property>
>
> <property>
> <name>http.agent.url</name>
> <value></value>
> <description>A URL to advertise in the User-Agent header. This will
> appear in parenthesis after the agent name. Custom dictates that this
> should be a URL of a page explaining the purpose and behavior of this
> crawler.
> </description>
> </property>
>
> <property>
> <name>http.agent.email</name>
> <value></value>
> <description>An email address to advertise in the HTTP 'From' request
> header and User-Agent header. A good practice is to mangle this
> address (e.g. 'info at example dot com') to avoid spamming.
> </description>
> </property>
> </configuration>
>
> but still thrs error
>
> also please throw some light on the searching of info through the web
> interface after the crawl is made successful
> --
> With Regards
> Karan Thakral
>

--
"Conscious decisions by conscious minds are what make reality real"

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
