On 3/2/2017 5:14 PM, Shawn Heisey wrote:
On 3/2/2017 2:58 PM, Daniel Miller wrote:
I'm asking for some guidance on how I might
optimize Solr.
I use Solr for work. I use Dovecot for personal domains. I have not
used them together. I probably should -- my personal mailbox is many
gigabytes and would benefit from a boost in search performance.
With Thunderbird, searches on header fields like sender or subject
don't change much. Body searches, though, show an unbelievable
difference. And of course other clients, especially mobile clients,
benefit tremendously.
What I don't know is:
1. Is it possible to split the "indexes" (I'm still learning Solr
vocabulary) without creating separate "cores" (which to me means
separate Java instances)?
2. Can these separate "indexes" be created on-demand - or do they
need to be explicitly created prior to use?
Here's a paragraph that hopefully clears up most confusion about Solr
terminology. This is applicable to SolrCloud:
Collections are made up of one or more shards. Shards are made up of
one or more replicas. Each replica is a core. One replica from each
shard is elected as the leader of that shard, and if there are multiple
replicas, the leader role can move between them in response to a change
in cluster state.
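
To make that hierarchy concrete, here is a toy model in Python. It is not
Solr code; the class names, field names, and the "node" label are all made
up for this sketch. It only mirrors the relationships described above:
collection -> shards -> replicas, each replica backed by one core, and one
leader per shard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    core_name: str          # each replica is backed by exactly one core
    node: str               # hypothetical label for the Solr instance (JVM) hosting it
    is_leader: bool = False


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def elect_leader(self, core_name: str) -> None:
        """Mark exactly one replica as leader; calling again moves the role."""
        if not any(r.core_name == core_name for r in self.replicas):
            raise ValueError(f"no replica named {core_name!r} in shard {self.name!r}")
        for r in self.replicas:
            r.is_leader = (r.core_name == core_name)

    def leader(self) -> Replica:
        """Return the current leader of this shard."""
        leaders = [r for r in self.replicas if r.is_leader]
        if len(leaders) != 1:
            raise RuntimeError(f"shard {self.name!r} has {len(leaders)} leaders")
        return leaders[0]


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def core_count(self) -> int:
        """Total number of cores that make up the collection."""
        return sum(len(s.replicas) for s in self.shards)


# One collection, one shard, two replicas spread over two Solr instances.
shard1 = Shard("shard1", [
    Replica("dovecot_shard1_replica1", node="solr-a"),
    Replica("dovecot_shard1_replica2", node="solr-b"),
])
shard1.elect_leader("dovecot_shard1_replica1")
dovecot = Collection("dovecot", [shard1])
print(dovecot.core_count(), shard1.leader().node)
```

In this picture a "collection per user" with a single shard and a single
replica is just one core per user.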
Further info: One Solr instance (JVM) can handle many cores. SolrCloud
allows multiple Solr instances to coordinate with each other (via
ZooKeeper) and form a whole cluster. Without SolrCloud, you have cores,
but no collections and no replicas. Sharding is possible without
SolrCloud, but is handled mostly manually.
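
As a sketch of what "mostly manually" can look like on the query side:
without SolrCloud, a request can list the cores to search in the "shards"
request parameter, and the core receiving the request merges the results.
The host and core names below are hypothetical, and the exact form Solr
accepts for shard addresses may differ between versions, so treat this as
an illustration rather than a recipe. Deciding which core each document is
indexed into is left to you in this setup.

```python
from urllib.parse import urlencode


def distributed_query_url(entry_core_url, shard_cores, query):
    """Build a select URL asking one core to fan the query out to several cores."""
    params = {
        "q": query,
        # Comma-separated list of the cores that each hold part of the index.
        "shards": ",".join(shard_cores),
    }
    # Keep ':', '/' and ',' unescaped so the URL stays readable.
    return entry_core_url.rstrip("/") + "/select?" + urlencode(params, safe=":/,")


# Hypothetical host and core names, for illustration only.
shards = [
    "solr.server.local:8983/solr/mail_a",
    "solr.server.local:8983/solr/mail_b",
]
url = distributed_query_url(
    "http://solr.server.local:8983/solr/mail_a", shards, "body:invoice"
)
print(url)
```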
What I think I want is to create a single collection, with a
shard/replica/core per user. Or maybe I'm wanting a separate collection
per user - which would again mean a single shard/replica/core. But it
seems like each shard/replica/core is a separate instance.
Without modifying Dovecot source, I can have it generate URLs like
"http://solr.server.local:8983/solr/dovecot/" (which is what I do now)
or maybe, "http://solr.server.local:8983/solr/dovecot_user/" or even
"http://solr.server.local:8983/solr/dovecot/dovecot_user". But I'm not
understanding how, if possible, I can have the indexes created
appropriately to support such access. The only examples I've seen use
either separate ports or IPs for listeners.
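
Since one Solr instance can serve many cores on the same port, the
"dovecot_user" style of URL maps naturally onto one core per user. The
sketch below only builds URLs and sends nothing. It assumes a made-up
naming scheme ("dovecot_" plus a sanitized user name) and an assumed
configset called "dovecot" that already exists on the server. The second
function builds a CoreAdmin CREATE request; as far as I know, a core has
to be created before requests to its URL will succeed.

```python
import re
from urllib.parse import urlencode

# Base address taken from the example URLs above.
SOLR_BASE = "http://solr.server.local:8983/solr"


def core_name_for(user):
    """Hypothetical naming scheme: 'dovecot_' plus a sanitized user name."""
    return "dovecot_" + re.sub(r"[^A-Za-z0-9_-]", "_", user)


def core_url(user):
    """URL Dovecot would be pointed at for this user's core."""
    return f"{SOLR_BASE}/{core_name_for(user)}/"


def create_core_url(user, config_set="dovecot"):
    """CoreAdmin CREATE request for the user's core.

    'config_set' is an assumed configset name; it must already exist.
    """
    params = {
        "action": "CREATE",
        "name": core_name_for(user),
        "configSet": config_set,
    }
    return f"{SOLR_BASE}/admin/cores?" + urlencode(params)


for user in ["daniel", "alice@example.com"]:
    print(core_url(user))
    print(create_core_url(user))
```

All of these cores live in the same JVM and answer on the same port, so no
extra listeners are involved.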
One thing to note: SolrCloud begins to have performance issues when the
number of collections in the cloud reaches the low hundreds. It's not
going to scale very well with a collection per user or per mailbox
unless there aren't very many users.
At the moment, without digging into Dovecot code, it doesn't look like a
per-mailbox option exists. But per-user certainly does - and in my case
I have fewer than 100 users so it shouldn't be an issue - if I get it to
work.
Daniel