Alright, first post to this list and I hope the question
is not too stupid or misplaced ...

what I currently have:
- a nicely working Solr 1.3 index with information about some
entities (e.g. organisations), indexed from an RDBMS. Many of these
entities have a URL pointing to further information, e.g. the
website of an institute or company.

- an installation of Nutch 0.9 with which I can crawl the
URLs that I extract from the RDBMS mentioned above and put
into a seed file

- tutorials about how to put crawled and indexed data from
Nutch 1.0 (which I could install without problems) into a separate
Solr index


what I want:
- combine the indexed information from the RDBMS and from the websites
in one Solr index, so that I can search both at once and still
use all the Solr features. E.g. having the following
(example) fields in one document:

<doc>
  <name-from-RDBMS>
  <indexed-content-from-RDBMS>
  <indexed-content-from-website>
  <URL>
  <...>
</doc>
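
To make the target a bit more concrete, here is a rough sketch (in Python)
of the kind of merge I have in mind. It is only an illustration under my
own assumptions: the field names (id, name, description, url,
website_content), the sample data and the two helper functions are all
made up, and it assumes the crawled text can be looked up by exactly the
same URL string that is stored in the RDBMS. The only Solr-specific part
is the <add><doc><field name="..."> XML update message.

```python
import xml.etree.ElementTree as ET


def merge_records(db_rows, crawled_pages):
    """Join RDBMS rows with crawled page text on the shared URL.

    db_rows:       list of dicts, one per entity (hypothetical field names)
    crawled_pages: dict mapping URL -> extracted page text
    """
    merged = []
    for row in db_rows:
        doc = dict(row)
        # Entities without a crawled page get an empty content field.
        doc["website_content"] = crawled_pages.get(row["url"], "")
        merged.append(doc)
    return merged


def to_solr_add_xml(docs):
    """Serialise the merged documents as a Solr <add> update message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Made-up sample data standing in for the RDBMS export and the crawl output.
db_rows = [
    {
        "id": "org-1",
        "name": "Example Institute",
        "description": "Research on search engines",
        "url": "http://example.org/",
    },
]
crawled_pages = {"http://example.org/": "Welcome to the Example Institute ..."}

xml_message = to_solr_add_xml(merge_records(db_rows, crawled_pages))
print(xml_message)
```

The resulting message would then be posted to the Solr update handler,
with matching fields declared in schema.xml. What I do not know is how to
get the page text out of Nutch in a form that could feed such a merge.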

Any input appreciated!

Cheers, Sönke
