Hi, I converted ARC files (from Heritrix) into Nutch segments using ArcSegmentCreator. The next step is to index these segments with solrindex, but that command requires a crawldb and a linkdb.
1. Is it possible to index these segments into Solr with Nutch without a crawldb and a linkdb?
2. Or do I have to use NutchWAX to generate the crawldb and linkdb, and then use Nutch to put the data into Solr?

I know it is possible to write a writer that imports directly from Heritrix into Solr, but I am not able to do that myself. Thanks!
