bq: Is each shard/replica/core in fact a separate instance?

No. I'm defining "instance" here as a JVM running Solr. And be careful here, a "shard" is made up of one or more "replicas". Those replicas may or may not be distributed amongst separate JVMs/machines. Each replica of a given shard has the same documents in it.
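To make the terminology concrete, here is a rough Python sketch of how those pieces relate. It's only a toy model, not anything from the Solr API: the class names, the core names and the "dovecot" collection layout are all made up for illustration. The point is just that several shards/replicas can live inside one JVM.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Instance:
    """One JVM running Solr (hypothetical model, not a Solr class)."""
    host: str
    port: int


@dataclass
class Replica:
    """A replica is a core hosted by exactly one instance."""
    core_name: str
    instance: Instance


@dataclass
class Shard:
    """A shard is made up of one or more replicas holding the same docs."""
    name: str
    replicas: list = field(default_factory=list)


@dataclass
class Collection:
    """A collection is made up of one or more shards."""
    name: str
    shards: list = field(default_factory=list)

    def instances(self):
        # Distinct JVMs hosting at least one replica of this collection.
        return {r.instance for s in self.shards for r in s.replicas}

    def core_count(self):
        # Every replica is a core, so count replicas across all shards.
        return sum(len(s.replicas) for s in self.shards)


# Assumed layout: two shards, one replica each, all in a single JVM.
jvm = Instance("localhost", 8983)
dovecot = Collection("dovecot", [
    Shard("shard1", [Replica("dovecot_shard1_replica1", jvm)]),
    Shard("shard2", [Replica("dovecot_shard2_replica1", jvm)]),
])

print(dovecot.core_count(), "cores on", len(dovecot.instances()), "instance(s)")
```

In this toy layout there are two cores but only one instance, which is the distinction I'm drawing above.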
A "replica" is a specialized "core". The term "replica" is generally confined to talking about SolrCloud. So, in SolrCloud a "collection" is made up of one or more "shards". Each shard is made up of one or more "replicas". Each Solr instance can host one or more "cores". I've seen hundreds of cores hosted by a single JVM.

bq: If I'm running on a single machine - would I then have multiple "cores" listening on multiple ports?

No. They're each addressed by a separate URL on the same port, i.e.
http://localhost:8983/solr/core1
http://localhost:8983/solr/core2
etc. If you have more than one JVM on a single machine, _then_ you address them by different ports.

bq: If so - I'm thinking there'd be no benefit.

It Depends (tm). There's some loss since each core has some overhead. There's some gain because certain operations (filterCache comes to mind) operate over all the docs in a core so having one core has some memory costs. Not to mention that scoring happens over all the docs in a core, so the response time may be quicker with multiple cores (yes, fq clauses help with this, but they have their own overhead).

If you're not using SolrCloud, you can use "Transient Cores" to limit the number of cores in memory at any given point. Smaller heap required, better performance characteristics. That presupposes that your usage pattern is "user signs on, searches for a bit and signs off", i.e. you're not supporting all users searching simultaneously.

Best,
Erick

On Sun, Mar 5, 2017 at 12:13 AM, Daniel Miller <dmil...@amfes.com> wrote:
> On 3/4/2017 12:00 PM, Shawn Heisey wrote:
>>
>> On 3/3/2017 11:28 PM, Daniel Miller wrote:
>>>
>>> What I think I want is create a single collection, with a
>>> shard/replica/core per user. Or maybe I'm wanting a separate
>>> collection per user - which would again mean a single
>>> shard/replica/core. But it seems like each shard/replica/core is a
>>> separate instance.
>>
>> Manual sharding (implicit) is something you can do, but it does mean a
>> LOT of individual cores. Many shards/replicas can cause just as many
>> performance issues as many collections.
>
>
> Sorry to keep hitting the same point - but I'm still not understanding. Is
> each shard/replica/core in fact a separate instance? If I'm running on a
> single machine - would I then have multiple "cores" listening on multiple
> ports? If so - I'm thinking there'd be no benefit.
>
>>
>>> Without modifying Dovecot source, I can have it generate URL's like,
>>> "http://solr.server.local:8983/solr/dovecot/" (which is what I do now)
>>> or maybe, "http://solr.server.local:8983/solr/dovecot_user/" or even
>>> "http://solr.server.local:8983/solr/dovecot/dovecot_user". But I'm
>>> not understanding how, if possible, I can have the indexes created
>>> appropriately to support such access. The only examples I've seen use
>>> either separate ports or ip's for listeners.
>>
>> If you use shards, the shard name would be a URL parameter, not part of
>> the URL path. Can Dovecot do that?
>
>
> Not without modifying the source - which may indeed be appropriate. What I'm
> still not clear on (actually there's a lot...) is:
>
> Without using multiple servers for redundancy or distributed search - would
> splitting the index offer any performance benefit? If not, there's probably
> no point in continuing and digging into Dovecot internals.
>
> Daniel
>