Felix Zimmermann wrote:
Hi,

I converted ARC files (Heritrix) into segments using ArcSegmentCreator.
The next step is to index these segments with solrindex, but this needs
the crawldb and linkdb.

You can run the regular updatedb and invertlinks commands on the segments to build the crawldb and linkdb. From there you should be able to use your choice of indexer.
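
A minimal sketch of that sequence, written as a small Python wrapper that only assembles the three `bin/nutch` invocations and prints them by default. The paths, the Solr URL, and the helper names `build_commands` and `run_all` are hypothetical. The argument order for `solrindex` is assumed from older Nutch 1.x usage and differs between releases, so check the usage message each command prints in your version before running anything.

```python
import subprocess

# Assumption: the script is started from the Nutch installation directory.
NUTCH = "bin/nutch"


def build_commands(crawldb, linkdb, segments_dir, solr_url):
    """Return the three Nutch invocations as argv lists, in run order."""
    return [
        # 1. Create or update the crawldb from the converted segments.
        [NUTCH, "updatedb", crawldb, "-dir", segments_dir],
        # 2. Create the linkdb by inverting the links found in the segments.
        [NUTCH, "invertlinks", linkdb, "-dir", segments_dir],
        # 3. Send the segments to Solr. Argument order is an assumption
        #    based on older 1.x releases; newer ones may differ.
        [NUTCH, "solrindex", solr_url, crawldb, linkdb, "-dir", segments_dir],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical locations; replace with your own.
    cmds = build_commands(
        "crawl/crawldb",
        "crawl/linkdb",
        "crawl/segments",
        "http://localhost:8983/solr",
    )
    run_all(cmds)  # dry run: prints the commands without executing them
```

Calling `run_all(cmds, dry_run=False)` would actually execute the steps; the dry run is the default so the commands can be compared against your Nutch version first.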

Dennis


1. Is it possible to index these segments into Solr with Nutch, without
a crawldb and linkdb?

2. Or do I have to use NutchWAX to create these files, and then use
Nutch to put the data into Solr?

I know that it is possible to create a writer for direct import from
Heritrix to Solr, but I am not able to do this myself.

Thanks!

