Sorry, I meant it doesn't get to doc.add.

David

On 24 Nov 2009, at 11:27, "[email protected]" <[email protected]> wrote:
I thought I did, but I thought that before I ran bin/nutch index (or solrindex) it would be stored somewhere. It does seem to be getting to the doc.add bit, which makes me think the variable is empty.

{code}
public void addIndexBackendOptions(Configuration conf) {
  LOG.warn("+_+_You called me _+_+");
  LuceneWriter.addFieldOptions("html_filter_data", STORE.YES, INDEX.UNTOKENIZED, conf);
}

public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
    CrawlDatum datum, Inlinks inlinks) throws IndexingException {
  LOG.warn("________________________FILTER_______________________");
  String html_filter_data = parse.getData().getMeta("html_filter_data");
  if (html_filter_data != null) {
    LOG.warn("________________________Adding filter data_______________________");
    doc.add("html_filter_data", html_filter_data);
  }
  return doc;
}
{code}

On 24 November 2009 at 12:05 Andrzej Bialecki <[email protected]> wrote:
> [email protected] wrote:
> > Hi All,
> >
> > I think I have just about finished my plugin (Nutch 1.0), which adds extra
> > metadata during parsing. The problem I am having is that it doesn't seem to
> > be adding the data to the system (checked via luke or readseg). I looked in
> > the wiki, but it seems to be for 0.9 and the syntax looks different.
> >
> > {code}
> > public ParseResult filter(Content content, ParseResult parseResult,
> >     HTMLMetaTags metaTags, DocumentFragment doc) {
> >   Metadata metadata = new Metadata();
> >   // parse the content
> >   DocumentFragment root;
> >   String docTrans;
> >   try {
> >     byte[] contentInOctets = content.getContent();
> >     String input = new String(contentInOctets);
> >     XSLTSimpleTransform DocTransform = new XSLTSimpleTransform();
> >     docTrans = DocTransform.doTransform(input);
> >     Parse parse = parseResult.get(content.getUrl());
> >     metadata = parse.getData().getParseMeta();
> >     metadata.add("filter_html_data", docTrans);
> >
> >   } catch (Exception e) {
> >     e.printStackTrace(LogUtil.getWarnStream(LOG));
> >   }
> >
> >   return parseResult;
> > }
> > {code}
>
> Did you declare that you are adding this field in the
> IndexingFilter.addIndexBackendOptions(..) ? See how other indexing
> plugins do this.
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
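
One thing the two snippets in this thread suggest, offered as an observation rather than a confirmed diagnosis: the parse filter stores its value under "filter_html_data", while the indexing filter looks up "html_filter_data". The sketch below uses a plain java.util.Map as a stand-in for the Nutch parse metadata. The class MetaKeyCheck and its method names are hypothetical and not part of Nutch. It only illustrates how a key mismatch of this kind makes the lookup return null, so that a branch like the doc.add one is never reached.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parse metadata; not part of Nutch.
public class MetaKeyCheck {
    // Key written by the parse filter quoted in the thread.
    public static final String WRITTEN_KEY = "filter_html_data";
    // Key read by the indexing filter quoted in the thread.
    public static final String READ_KEY = "html_filter_data";

    // Models the parse step: store the transformed text under writeKey.
    public static Map<String, String> parseStep(String writeKey, String value) {
        Map<String, String> meta = new HashMap<>();
        meta.put(writeKey, value);
        return meta;
    }

    // Models the indexing step: true only when the value is found,
    // i.e. when the doc.add branch would be executed.
    public static boolean indexStepAdds(Map<String, String> meta, String readKey) {
        return meta.get(readKey) != null;
    }

    public static void main(String[] args) {
        // Different keys on the write and read side: nothing is added.
        Map<String, String> mismatched = parseStep(WRITTEN_KEY, "transformed text");
        System.out.println(indexStepAdds(mismatched, READ_KEY)); // prints false

        // Same key on both sides: the value is found.
        Map<String, String> matched = parseStep(READ_KEY, "transformed text");
        System.out.println(indexStepAdds(matched, READ_KEY)); // prints true
    }
}
```

If that is what is happening here, sharing one constant for the key between the parse filter and the indexing filter would rule the mismatch out.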
