https://issues.apache.org/jira/browse/NUTCH-1005
> Hi,
>
> Since I'm relatively new to Nutch/Solr, I was wondering if the following
> would make sense:
>
> Headings in web pages (h1, h2, h3) should be more important than any
> other content of the page, so if a match to a query turns up in a
> heading, the ranking of the document should be higher. In order to boost
> a field, I would need to separately index it - this would mean on
> parsing the crawled pages, I would need to strip out the headings h1, h2
> and h3, index them in separate fields, and remove them from the content
> field. I presume I would have to modify the HTML Parser and Index Basic
> plugin for this, or is there an easier solution?
>
> Any input appreciated,
> Elisabeth
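
To make the proposal concrete, here is a rough sketch of the splitting step described above: pull the text of h1, h2 and h3 out into their own fields and keep everything else in a content field. This is not Nutch plugin code and does not use any Nutch or Solr API; it is a standalone Python illustration using only the standard library's `html.parser`. The names `HeadingSplitter` and `split_headings`, and the field names `h1`, `h2`, `h3` and `content`, are assumptions made up for this example.

```python
from html.parser import HTMLParser

# Heading levels to index separately (taken from the proposal above).
HEADING_TAGS = {"h1", "h2", "h3"}


class HeadingSplitter(HTMLParser):
    """Hypothetical helper: routes heading text and body text to separate fields."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        # One list per heading level, plus one for the remaining text.
        self.fields = {tag: [] for tag in sorted(HEADING_TAGS)}
        self.fields["content"] = []
        self._current = None  # heading tag we are currently inside, if any
        self._buffer = []     # text pieces collected for that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS and self._current is None:
            self._current = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join inline fragments (e.g. <em> inside a heading) and
            # normalise whitespace before storing the heading text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.fields[tag].append(text)
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)
        else:
            # Text outside headings goes to the content field only.
            # Simplification: script/style text is not filtered out here.
            text = " ".join(data.split())
            if text:
                self.fields["content"].append(text)


def split_headings(html):
    """Return a dict with 'h1', 'h2', 'h3' and 'content' lists of strings."""
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    page = "<h1>Nutch <em>Guide</em></h1><p>Intro text.</p><h2>Setup</h2>"
    print(split_headings(page))
```

Once headings live in their own fields, they could be weighted at query time rather than at index time, for instance through the `qf` parameter of Solr's DisMax query parser, which accepts per-field boosts. Whether the heading text should also stay in the content field is a design choice; the sketch removes it, as the message suggests.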

