Hi Mari,

It depends on a few things:

* How many records are stored in your MySQL databases?
* How often will updates occur?
* How many db records / index documents are changed per update?

I would suggest starting with a single Solr core. That way you can concentrate on the basics and do not need to deal with more advanced topics like sharding. If you run into performance issues later on, you can switch to a multi-core setup.
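
To make the single-core idea a bit more concrete, here is a rough sketch (not a definitive setup) of how one index could serve both a search across all collections and a search within one collection. It assumes every document carries a field naming its collection; the field name "collection", the core name "archive", the host and port, and the helper build_search_url are all made up for illustration. Only the q, fq and wt request parameters are standard Solr.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, port, core name and field name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/archive/select"
COLLECTION_FIELD = "collection"


def build_search_url(text, collection=None):
    """Build a Solr select URL, optionally restricted to one collection.

    Assumes the collection name contains no double quotes.
    """
    params = [("q", text), ("wt", "json")]
    if collection is not None:
        # fq is Solr's filter query: it narrows the result set
        # without influencing relevance scoring.
        params.append(("fq", f'{COLLECTION_FIELD}:"{collection}"'))
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Search every collection at once.
    print(build_search_url("Iraq"))
    # Search only the (hypothetical) poster collection.
    print(build_search_url("Iraq", collection="posters"))
```

With this layout, fields that only apply to some collections (e.g. duration for TV shows) would simply be left empty on the other records.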

-Sascha

Mari Masuda wrote:
Hello,

I am new to Solr and am in the beginning planning stage of a large project and 
could use some advice so as not to make a huge design blunder that I will 
regret down the road.

Currently I have about 10 MySQL databases that store information about 
different archival collections.  For example, we have data and metadata about a 
political poster collection, a television program, documents and photographs of 
and about a famous author, etc.  My job is to work with the staff archivists to 
come up with a standard metadata template so the 10 databases can be 
consolidated into one.

Currently the info in these databases is accessed through 10 different sets of 
PHP pages that were written a long time ago for PHP 4.  My plan is to write a 
new Java application that will handle both public display of the info as well 
as an administrative interface so that staff members can add or edit the 
records.

I have decided to use Solr as the search mechanism for this project.  Because the info in each of 
our 10 collections is slightly different (e.g., a record about a poster does not contain duration 
information, but a record about a TV show does) I was thinking it would be good to separate each 
collection's index into a separate Solr core so that commits coming from one collection do not bog 
down the other unrelated collections.  One reservation I have is that eventually we would like to 
be able to type in "Iraq" and find records across all of the collections at once instead 
of having to search each collection separately.  Although I don't know anything about it at this 
stage, I did Google "sharding" after reading someone's recent post on this list and it 
sounds like that may be a potential answer to my question.  Does anyone have any advice on how I 
should initially set up Solr for my situation?  I am slowly making my way through the wiki and 
RTFMing, but I wanted to see what the experts have to say because at this point I don't really 
know where to start.

Thank you very much,
Mari
