Is there some way, out of the box, to tell Solr that
we're only interested in indexing certain parts of a page? For example,
let's say I have a bunch of pages in my site that contain some common
navigation elements, roughly like this:
<html>
<head><title></title></head>
<body>
<div id="myNavBar">
Stuff here about parts of my site
</div>
<div id="navBar2">
More stuff about other parts of the site
</div>
....A bunch of stuff particular to each individual page...
</body>
</html>
Is there some way to either tell Solr not to index what's in the two
divs whenever it encounters them (and it will, in nearly every page) or,
failing that, to somehow easily give content in those areas a large
negative score in order to get the same effect?
FWIW, we are using Nutch to do the crawling, but as I understand it
there's no way to get Nutch to skip only parts of pages without writing
custom code, right?
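
To make the effect I'm after concrete, here is a rough sketch in plain
Python (standard library only) of the kind of filtering I mean. It is not
Solr or Nutch code; the names NavStripper, SKIP_IDS and indexable_text are
made up for illustration, and it assumes the navigation always lives in
divs with those two ids.

```python
from html.parser import HTMLParser

# Hypothetical illustration, not part of Solr or Nutch: ids of the divs
# whose content should be kept out of the index.
SKIP_IDS = {"myNavBar", "navBar2"}


class NavStripper(HTMLParser):
    """Collects page text, skipping anything inside the listed divs."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped div
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.skip_depth:
            # A div nested inside a skipped div: track it so the
            # matching close tag does not end the skip too early.
            self.skip_depth += 1
        elif dict(attrs).get("id") in SKIP_IDS:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text of the page with the navigation divs left out."""
    parser = NavStripper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Calling indexable_text() on a page like the one above would hand back only
the page-specific text, which is what I'd like to end up in the index.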