Hi Mari,
it depends on a few things:
* How many records are stored in your MySQL databases?
* How often will updates occur?
* How many db records / index documents are changed per update?
I would suggest starting with a single Solr core. That way, you can
concentrate on the basics and do not need to deal with more advanced
things like sharding. If you run into performance issues later on,
you can switch to a multi-core setup.
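
To illustrate how a single core can still serve both kinds of search, here is
a minimal sketch in Java (since the new application is planned in Java). It
assumes each indexed document carries a hypothetical "collection" field, and
that Solr is reachable at a hypothetical local URL; neither is taken from your
setup. Searching across all collections is then a plain query, and searching
within one collection adds a filter query (the standard "q" and "fq" request
parameters).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SingleCoreQuery {
    // Hypothetical select URL of a single Solr core; adjust to your installation.
    static final String SELECT_URL = "http://localhost:8983/solr/select";

    /**
     * Builds a Solr select URL for the given search terms.
     * If collection is non-null, the search is restricted to that collection
     * via a filter query on the (assumed) "collection" field.
     */
    static String buildUrl(String terms, String collection) {
        StringBuilder url = new StringBuilder(SELECT_URL);
        url.append("?q=").append(URLEncoder.encode(terms, StandardCharsets.UTF_8));
        if (collection != null) {
            String filter = "collection:" + collection;
            url.append("&fq=").append(URLEncoder.encode(filter, StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Search every collection at once.
        System.out.println(buildUrl("Iraq", null));
        // Search only the (hypothetical) poster collection.
        System.out.println(buildUrl("Iraq", "posters"));
    }
}
```

Fields that only some collections use (such as a duration for TV programs)
can simply be left empty on the other records, as long as the schema does not
mark them as required.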
-Sascha
Mari Masuda wrote:
Hello,
I am new to Solr and am in the beginning planning stage of a large project and
could use some advice so as not to make a huge design blunder that I will
regret down the road.
Currently I have about 10 MySQL databases that store information about
different archival collections. For example, we have data and metadata about a
political poster collection, a television program, documents and photographs of
and about a famous author, etc. My job is to work with the staff archivists to
come up with a standard metadata template so the 10 databases can be
consolidated into one.
Currently the info in these databases is accessed through 10 different sets of
PHP pages that were written a long time ago for PHP 4. My plan is to write a
new Java application that will handle both public display of the info as well
as an administrative interface so that staff members can add or edit the
records.
I have decided to use Solr as the search mechanism for this project. Because
the info in each of our 10 collections is slightly different (e.g., a record
about a poster does not contain duration information, but a record about a TV
show does), I was thinking it would be good to separate each collection's index
into a separate Solr core so that commits coming from one collection do not bog
down the other unrelated collections. One reservation I have is that eventually
we would like to be able to type in "Iraq" and find records across all of the
collections at once instead of having to search each collection separately.
Although I don't know anything about it at this stage, I did Google "sharding"
after reading someone's recent post on this list and it sounds like that may be
a potential answer to my question. Does anyone have any advice on how I should
initially set up Solr for my situation? I am slowly making my way through the
wiki and RTFMing, but I wanted to see what the experts have to say because at
this point I don't really know where to start.
Thank you very much,
Mari