Here is a small snippet that I copy-pasted from Shawn Heisey (who I think is a core contributor, and he's good):
> One thing to note: SolrCloud begins to have performance issues when the number of collections in the cloud reaches the low hundreds. It's not going to scale very well with a collection per user or per mailbox unless there aren't very many users. There are people looking into how to scale better, but this hasn't really gone anywhere yet. Here's one issue about it, with a lot of very dense comments:
>
> https://issues.apache.org/jira/browse/SOLR-7191
>
> On Tue, Apr 11, 2017 at 9:11 PM, Dorian Hoxha <dorian.ho...@gmail.com> wrote:
>
> > > And this overhead depends on what? I mean, if I create an empty collection will it take up much heap size just for "being there"?
> >
> > Yes. You can search on elastic-search/solr/lucene mailing lists and see that it's true. But nobody has `empty` collections, so yours will have a schema and some data/segments and translog.
> >
> > On Tue, Apr 11, 2017 at 7:41 PM, jpereira <jpereira...@gmail.com> wrote:
> >
> > > The way the data is spread across the cluster is not really uniform. Most of shards have way lower than 50GB; I would say about 15% of the total shards have more than 50GB.
> > >
> > > Dorian Hoxha wrote
> > > > Each shard is a lucene index which has a lot of overhead.
> > >
> > > And this overhead depends on what? I mean, if I create an empty collection will it take up much heap size just for "being there"?
> > >
> > > Dorian Hoxha wrote
> > > > I don't know about static/dynamic memory-issue though.
> > >
> > > I could not find anything related in the docs or the mailing list either, but I'm still not ready to discard this suspicion...
> > >
> > > Again, thx for your time
> > >
> > > --
> > > View this message in context: http://lucene.472066.n3.nabble.com/Dynamic-schema-memory-consumption-tp4329184p4329367.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
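The practical takeaway from Shawn's note is to avoid a collection per user and instead share one collection across tenants, isolating each user with a routing key and a filter query. Below is a minimal SolrJ sketch of that shared-collection pattern; the collection name `mail`, the `user_id` field, and the composite-ID route key are my own illustrative assumptions, not anything from the thread.

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class SharedCollectionSketch {
    public static void main(String[] args) throws Exception {
        // One collection ("mail") shared by all users instead of a collection per user.
        try (CloudSolrClient solr = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            solr.setDefaultCollection("mail");

            // Composite-ID routing: "user42!msg-1001" places all of user42's
            // documents on the same shard.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "user42!msg-1001");
            doc.addField("user_id", "user42");
            doc.addField("subject", "quarterly report");
            solr.add(doc);
            solr.commit();

            // Per-user search: a filter query isolates the tenant, and the
            // _route_ parameter limits the request to that user's shard.
            SolrQuery query = new SolrQuery("subject:report");
            query.addFilterQuery("user_id:user42");
            query.set("_route_", "user42!");
            QueryResponse response = solr.query(query);
            System.out.println("hits for user42: " + response.getResults().getNumFound());
        }
    }
}
```

With composite-ID routing, all of a user's documents land on the same shard, so per-user searches can be confined to a single shard via the `_route_` parameter, and you never pay the per-collection heap overhead the thread is describing.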