Solr 1.4 indexes well on a dedicated physical server (Windows Server 2008). Indexing around 1 million full-text documents (around 4 GB in total) takes around 20 minutes with a heap size of 512 MB to 1 GB and 4 GB of RAM.
However, on a VM with 4 GB of RAM, the first indexing run took 50 minutes. There were no network delays and no RAM issues. When I then increased the RAM to 8 GB and increased the heap size, the indexing time went up to 2 hours, which was really strange. Apart from SQL Server, no other process is running on the machine. I have not checked file I/O, though. Can that be a bottleneck? Does Solr have any known issues running in a virtualized environment?

I read a paper today by Brian & Harry, "ON THE RESPONSE TIME OF A SOLR SEARCH ENGINE IN A VIRTUALIZED ENVIRONMENT". They claim that performance deteriorates when RAM is increased while Solr is running on a VM, but that is with respect to query times, not indexing times. I am a bit confused as to why indexing took longer on the VM when I repeated the same test a second time with increased heap size and RAM.
</PRE> <PRE>