Thank you for your response. I was confused to see that summaries were generated only for certain URLs.

Is there any date set for the 1.0 release?

Elena

2008/11/25 Dennis Kubes <[EMAIL PROTECTED]>

> Elena wrote:
>
>> Hello everyone,
>>
>> I am using Nutch with the Solr plugin, and I am having a problem indexing
>> redirected URLs. While Solr generates its fields just fine, as if they
>> belonged to the redirected URL, Nutch leaves the summary field empty. It
>> seems as if Nutch tries to generate the summary of the original URL and
>> then makes the query to Solr, which then follows the redirect and fills
>> the rest of the fields using the final URL. But I am not quite sure of
>> this.
>
> It depends on what version of Nutch you are using. This was a problem with
> some older Trunk versions. The problem is that Nutch has the concept of a
> representative URL for redirects. Redirects have an original and a
> redirected-to URL. Logic dictates which of those is stored as the URL and
> which is displayed on search results pages. Most of the problems with this
> mismatch have been fixed in recent patches and should be deployed in a
> new 1.0 release in the next week or so.
>
>> I would like to know how Nutch generates summaries and why it leaves
>> them empty when redirecting. Perhaps there is a command to generate
>> one field in particular after the indexing is done.
>
> Summaries are generated, at query time, from the full text of the web
> page stored in ParseText under segments. The
> org.apache.nutch.searcher.Summarizer plugins are what actually return the
> summary text. By default it uses the summary-basic plugin.
>
> Dennis
>
>> Thanks!
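
For anyone following the thread, the query-time lookup Dennis describes can be pictured with a small toy model. This is only a sketch in Python, not Nutch code: the function names, the example URLs, and the idea that the page text is keyed by the redirect target are all assumptions made for illustration. It shows why a summary comes back empty when the text is stored under one URL and looked up under the other.

```python
# Toy model only: this is NOT Nutch code. All names below are hypothetical.

def summarize(text, query, width=40):
    """Return a snippet of `text` around the first occurrence of `query`."""
    pos = text.lower().find(query.lower())
    if pos < 0:
        return ""  # query term not present (or no text at all)
    start = max(0, pos - width)
    end = min(len(text), pos + len(query) + width)
    return text[start:end].strip()


def summary_for(url, query, parse_text):
    """Look up the stored page text by URL at query time, then summarize it."""
    text = parse_text.get(url, "")  # a missing key yields an empty text
    return summarize(text, query)


# Assumption: the parsed text is stored under the redirect target only.
parse_text = {
    "http://example.com/new": "Nutch is an open source web crawler built on Lucene.",
}

# Looking up by the redirect target finds text; the original URL finds none.
found = summary_for("http://example.com/new", "crawler", parse_text)
empty = summary_for("http://example.com/old", "crawler", parse_text)
```

In this model the fix is to make the lookup key and the storage key agree, which is roughly what a single representative URL is meant to guarantee.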
