Hi,

We observed that with the setting "compressed=true" the index size is 
around 0.66 times the size of the actual log file, whereas without 
compressed=true the index size is about 2.6 times the log file size.

Our sample Solr document size is approximately 1000 bytes. In addition to the 
text data, we have around 9 metadata tags associated with it. 

We need to display all of the metadata values in the GUI, and hence we are 
setting stored=true in our schema.xml.
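
To make the setup concrete, the field definitions in question look roughly 
like the sketch below. The field names and types here are placeholders for 
illustration, not our actual schema; only the stored and compressed 
attributes matter for this question.

    <!-- hypothetical field names; stored="true" so values can be shown in the GUI -->
    <field name="log_text"  type="text"   indexed="true" stored="true" compressed="true"/>
    <field name="meta_host" type="string" indexed="true" stored="true" compressed="true"/>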

Now the question is: how does the compressed=true flag impact indexing and 
querying operations? I expect there will be CPU utilization spikes, since 
the indexed data has to be compressed (during indexing) and uncompressed 
(during querying). I am mainly looking for any benchmarks for this 
scenario.

The expected volume of incoming data is approximately 400 GB per day, so it 
is very important for us to evaluate compressed=true, given its effect on 
file system utilization and index sizing.
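
As a rough back-of-the-envelope estimate, assuming the 400 GB per day refers 
to raw log volume and that the ratios measured on our sample hold at that 
scale:

    with compressed=true:     400 GB x 0.66 =  264 GB of index per day
    without compressed=true:  400 GB x 2.6  = 1040 GB of index per day

That is a difference of roughly 776 GB of index growth per day.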

Any help would be greatly appreciated.

Thanks,
sS


      
