Could you please guide me on how to reasonably estimate the disk size of a Lucene 4.x index (specifically version 4.8.1), including the worst-case scenario?
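One way to approach such an estimate is to index a representative sample, measure its size on disk, and scale up. The Python sketch below illustrates that arithmetic only. It assumes index size grows roughly linearly with document count, which may not hold when term dictionaries or stored fields behave differently at scale. The function names, the sample figures, and the `headroom` multiplier are all hypothetical placeholders, not measured values or anything taken from Lucene itself.

```python
import os


def index_size_bytes(index_dir):
    """Sum the sizes of all regular files directly inside an index directory."""
    return sum(
        os.path.getsize(os.path.join(index_dir, name))
        for name in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, name))
    )


def estimate_index_size(sample_bytes, sample_docs, target_docs, headroom=3.0):
    """Linearly extrapolate a sample index to a target document count.

    headroom is an assumed multiplier for transient disk usage (for example
    while segments are merged); it is a placeholder, not a measured value.
    Returns (average bytes per doc, steady-state bytes, worst-case bytes).
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    avg_doc_bytes = sample_bytes / sample_docs
    steady_bytes = sample_bytes * target_docs / sample_docs
    return avg_doc_bytes, steady_bytes, steady_bytes * headroom


# Hypothetical sample: 1 million documents producing a 2 GiB index,
# extrapolated to 250 million documents.
avg_doc, steady, worst = estimate_index_size(
    sample_bytes=2 * 1024**3,
    sample_docs=1_000_000,
    target_docs=250_000_000,
)
gib = 1024**3
print(f"average doc size: {avg_doc:.0f} bytes")
print(f"steady-state estimate: {steady / gib:.0f} GiB")
print(f"worst-case estimate: {worst / gib:.0f} GiB")
```

In practice `sample_bytes` would come from `index_size_bytes()` run against a sample index built with the same analyzers and field settings as production, since those choices drive the per-document size.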
I have referred to the formula and Excel sheet shared at https://lucidworks.com/blog/estimating-memory-and-storage-for-lucenesolr/, but it seems to have been devised for Lucene 2.9, and I am not sure whether it holds true for 4.x. In my case, the actual index size comes out close to, or higher than, the estimated worst case. One of our enterprise customers has even observed an index 3 times larger than the size estimated from the Excel sheet.

Alternatively, is there a known average document size in a Lucene index (for a reasonably sized data set) that I could extrapolate to the complete set of 250 million documents?

Thanks,
Gaurav