If you look at the source of BasicIndexingFilter.java (part of the
index-basic plugin), you will see that this is where those default
fields get indexed. So you can either write a new plugin that lets you
configure which properties are indexed, or disable this plugin. Here is
the relevant snippet from the filter:

    if (host != null) {
      // add host as un-stored, indexed and tokenized
      doc.add(new Field("host", host, Field.Store.NO,
                        Field.Index.TOKENIZED));
      // add site as un-stored, indexed and un-tokenized
      doc.add(new Field("site", host, Field.Store.NO,
                        Field.Index.UN_TOKENIZED));
    }

    // url is both stored and indexed, so it's both searchable and returned
    doc.add(new Field("url", url.toString(), Field.Store.YES,
                      Field.Index.TOKENIZED));

    // content is indexed, so that it's searchable, but not stored in index
    doc.add(new Field("content", parse.getText(), Field.Store.NO,
                      Field.Index.TOKENIZED));

    // anchors are indexed, so they're searchable, but not stored in index
    try {
      String[] anchors = (inlinks != null ? inlinks.getAnchors()
                                          : new String[0]);
      for (int i = 0; i < anchors.length; i++) {
        doc.add(new Field("anchor", anchors[i],
                          Field.Store.NO, Field.Index.TOKENIZED));
      }
    } catch (IOException ioe) {
      if (LOG.isWarnEnabled()) {
        LOG.warn("BasicIndexingFilter: can't get anchors for "
                 + url.toString());
      }
    }

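
To make the "configurable plugin" option a bit more concrete, here is a
minimal sketch of the idea: keep a whitelist of field names and only add
the fields that appear in it. Note that FieldWhitelist, its methods and
the comma-separated list format are all names I made up for
illustration. It is plain Java and does not use the Nutch or Lucene API
at all, and it simplifies each field to a single value (the real filter
adds several "anchor" values).

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT part of Nutch: decides which field names may be indexed.
final class FieldWhitelist {
    private final Set<String> allowed = new LinkedHashSet<String>();

    // Parses a comma-separated list of field names, e.g. "url, objects".
    public FieldWhitelist(String commaSeparated) {
        for (String name : commaSeparated.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                allowed.add(trimmed);
            }
        }
    }

    // True if the given field name is on the whitelist.
    public boolean accepts(String fieldName) {
        return allowed.contains(fieldName);
    }

    // Returns a copy of the input containing only whitelisted fields, in input order.
    public Map<String, String> filter(Map<String, String> fields) {
        Map<String, String> kept = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (accepts(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed example data: the default fields plus the custom "objects" field.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("host", "example.com");
        fields.put("url", "http://example.com/");
        fields.put("content", "page text");
        fields.put("objects", "http://example.com/movie.swf");

        FieldWhitelist whitelist = new FieldWhitelist("url, objects");
        // Only "url" and "objects" survive the filter.
        System.out.println(whitelist.filter(fields));
    }
}
```

In a real indexing filter you would presumably read the list from the
configuration and check accepts(...) before each doc.add(...) call in
the snippet above, rather than filtering a map afterwards.
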
On 4/3/07, Ratnesh,V2Solutions India
<[EMAIL PROTECTED]> wrote:
>
> Yes, exactly.
>
> That is just what I want. Do you have a solution for this?
>
> Looking forward to your reply.
>
> Thanks
>
>
> Siddharth Jonathan wrote:
> >
> > Do you mean, how do you get rid of some of the fields that are indexed by
> > default, e.g. content, anchor text, etc.?
> >
> > Jonathan
> > On 4/2/07, Ratnesh,V2Solutions India
> > <[EMAIL PROTECTED]>
> > wrote:
> >>
> >>
> >> Hi,
> >> I have written a plugin that finds the number of object tags in an HTML
> >> page and their corresponding URLs.
> >> I am storing "objects" as the field name and the page URL as the value.
> >>
> >> In the end I am only interested in searching on the "objects" field, not
> >> on the fields that are already indexed by default.
> >>
> >> So how do I delete the index fields that are already stored?
> >>
> >> Looking forward to your reply and any valuable input.
> >>
> >> Thanks to the Nutch community
> >> --
> >> View this message in context:
> >> http://www.nabble.com/How-to-delete-already-stored-indexed-fields----tf3504164.html#a9786377
> >> Sent from the Nutch - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/How-to-delete-already-stored-indexed-fields----tf3504164.html#a9803792
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
--
"Conscious decisions by conscious minds are what make reality real"
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general