Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "Lucene" page has been changed by AlexMc.
The comment on this change is: Please add any useful Lucene info here - or pointers to it.

http://wiki.apache.org/nutch/Lucene

--------------------------------------------------

New page:

== Lucene Jumping Off Point ==
This page will provide links to various Lucene pages in this wiki.

More information about Lucene can be found at its website: http://lucene.apache.org

Note that rather than using Lucene in-process, the preferred solution nowadays is to use a separate Solr server.

Nutch used to be a Lucene subproject but became a top-level project in 2010.

== Lucene dynamic attributes ==
Torsten Krah asks:

{{{
> pre 1.0 Days, it was possible to have dynamic attributes in lucene, because
> the API let you do such things (Lucene document access).
>
> How to do the same in 1.0> - using 1.1 the API the NutchDocument does only
> know name and value, but if i don't know the name (dynamic attribute via
> HtmlParser, meta tags indexing) - how can i still index them? Or is this
> impossible with the lucene backend now?
}}}

Andrzej Bialecki replies:

It's still possible to do this, but it's undocumented... Here's a quick howto: in your IndexingFilter, whenever you want to add a previously undeclared field you need to declare its Lucene options on a per-document level like this:

{{{
String fieldName = "myMetaField";
String value = "undeclared meta value";
Metadata meta = nutchDocument.getDocumentMeta();
meta.add(LuceneConstants.FIELD_PREFIX + fieldName, LuceneConstants.STORE_YES);
meta.add(LuceneConstants.FIELD_PREFIX + fieldName, LuceneConstants.INDEX_TOKENIZED);
//... etc, add those field options that you want
// and add the field value
nutchDocument.add(fieldName, value);
}}}
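
The howto above can be tried out without a Nutch installation using the following minimal sketch. It only models the pattern: the field options are stored in the document metadata under a prefixed key, and the value is then added under the plain field name. The classes `Meta` and `Doc`, the helpers `addDynamicField` and `optionsFor`, and the constant values are all hypothetical stand-ins invented for illustration. They are not Nutch's `Metadata`, `NutchDocument` or `LuceneConstants`, whose real values and behaviour may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DynamicFieldSketch {
    // Placeholder values (assumptions); the real constants live in Nutch's LuceneConstants.
    static final String FIELD_PREFIX = "lucene.field.";
    static final String STORE_YES = "store.yes";
    static final String INDEX_TOKENIZED = "index.tokenized";

    /** Hypothetical stand-in for a multi-valued metadata container. */
    static class Meta {
        private final Map<String, List<String>> values = new LinkedHashMap<>();

        void add(String name, String value) {
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        List<String> get(String name) {
            return values.getOrDefault(name, new ArrayList<>());
        }
    }

    /** Hypothetical stand-in for a document holding field values plus per-document metadata. */
    static class Doc {
        final Meta documentMeta = new Meta();
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Declares the options for a field whose name is only known at run time, then adds its value. */
    static void addDynamicField(Doc doc, String fieldName, String value) {
        doc.documentMeta.add(FIELD_PREFIX + fieldName, STORE_YES);
        doc.documentMeta.add(FIELD_PREFIX + fieldName, INDEX_TOKENIZED);
        doc.add(fieldName, value);
    }

    /** Looks up the options declared for a field, as an index writer might do. */
    static List<String> optionsFor(Doc doc, String fieldName) {
        return doc.documentMeta.get(FIELD_PREFIX + fieldName);
    }

    public static void main(String[] args) {
        // Meta tags as an HTML parser might hand them over; names are not known in advance.
        Map<String, String> metaTags = new LinkedHashMap<>();
        metaTags.put("author", "Jane Doe");
        metaTags.put("keywords", "nutch, lucene");

        Doc doc = new Doc();
        for (Map.Entry<String, String> tag : metaTags.entrySet()) {
            addDynamicField(doc, "meta_" + tag.getKey(), tag.getValue());
        }
        for (String name : doc.fields.keySet()) {
            System.out.println(name + " = " + doc.fields.get(name)
                    + " options = " + optionsFor(doc, name));
        }
    }
}
```

In a real IndexingFilter the same two steps would be applied to the NutchDocument passed into the filter, using the actual LuceneConstants values, as in Andrzej's snippet.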

