On 4/7/2021 1:41 PM, Natarajan, Rajeswari wrote:
If there is any way to get the index size of a tenant in a collection where multiple tenants co-exist under the compositeId router scheme, let me know. We need to somehow track each tenant's index size to see if it grows too big, and document count is not proportional to index size in our case.
There isn't any way to do that. Because of the way Lucene indexes are designed, obtaining that information is currently impossible, and it would likely take a VERY large amount of development effort to make it possible. I would guess that even if it were possible, obtaining that information would be very expensive in terms of system resources and time.
The best you can do with current technology is estimate the size based on document count compared to the whole index. But if each tenant has very different kinds of data in the index, that method would probably give you inaccurate information.
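To make the proportional estimate concrete, here is a minimal sketch in Python. The function name and its inputs are made up for illustration. It assumes you already have the tenant's document count, the collection's total document count, and the total index size in bytes from whatever monitoring you use. It does nothing more than scale the index size by the tenant's share of documents, so it carries the same inaccuracy when tenants hold very different kinds of data.

```python
def estimate_tenant_index_bytes(tenant_docs: int, total_docs: int, index_bytes: int) -> float:
    """Rough per-tenant size estimate: index size scaled by document share.

    Hypothetical helper, not part of Solr or Lucene. All three inputs are
    assumed to have been gathered separately by the caller.
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    if not 0 <= tenant_docs <= total_docs:
        raise ValueError("tenant_docs must be between 0 and total_docs")
    # Assumes every document costs about the same amount of disk space,
    # which is exactly the assumption that breaks for dissimilar tenants.
    return index_bytes * tenant_docs / total_docs


if __name__ == "__main__":
    # Example: a tenant holding 250 of 1000 documents in a 4,000,000-byte index.
    print(estimate_tenant_index_bytes(250, 1000, 4_000_000))
```

Treat the result as a trend indicator rather than a measurement.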
If you want each tenant to have its own collection, one option is to set up multiple SolrCloud installs, which can share one ZooKeeper ensemble by using a different chroot value for each one, and put only a few hundred collections in each cloud. This would probably require a lot of additional hardware, and because of the Lucene economies of scale that Walter was talking about, multiple collections WILL be larger on disk than multiple tenants in one collection.
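As an illustration of the chroot idea, here is a small Python sketch that builds one ZooKeeper connection string per cloud from a shared ensemble. The host names and chroot paths are placeholders, and the helper is hypothetical. The only thing it relies on is the usual connection string format, where the chroot path is appended once after the last host:port pair.

```python
def zk_host_with_chroot(ensemble: list[str], chroot: str) -> str:
    """Build a ZooKeeper connection string with a chroot suffix.

    Hypothetical helper for illustration. `ensemble` is a list of
    host:port strings shared by every cloud; `chroot` is the path that
    keeps one cloud's data separate from the others.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one host:port")
    if not chroot.startswith("/") or chroot == "/":
        raise ValueError("chroot must be a non-root path starting with '/'")
    # The chroot is appended once, after the final host, not after each host.
    return ",".join(ensemble) + chroot


if __name__ == "__main__":
    shared = ["zk1:2181", "zk2:2181", "zk3:2181"]  # placeholder hosts
    for name in ("/cloud-a", "/cloud-b"):  # placeholder chroot paths
        print(zk_host_with_chroot(shared, name))
```

Each resulting string would be what one cloud's nodes use as their ZooKeeper host setting, so the clouds never see each other's collections.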
Thanks, Shawn
