On Fri, 2008-03-07 at 15:38 +0100, [EMAIL PROTECTED] wrote: > > With a commit after every add: (286 sec / 10,000 docs) 28.6 ms. > > With a commit after every 100 add: (12 sec / 10,000 docs) 1.2 ms. > > Only one commit: (8 sec / 10,000 docs) 0.8 ms. > > Of couse. If you need so less time to create a document than a commit which > may take, lets say 10 - 500 ms, will slow down indexing heavily.
I tried doing the same with 1 million documents and contrary to my expectations, the amortized time for commits didn't increase for the frequent commits. I thought that the merging of the segments were more expensive. But then again - the resulting index is "only" 1.3GB. With a commit after every add: (21818 sec / 1,000,000 docs) 21.818 ms. With a commit after every 100 add: (1879 sec / 1,000,000 docs) 1.879 ms. Only one commit: (1270 sec / 1,000,000 docs) 1.270 ms. While I were at it, I tried running the test on SSDs and fast harddisks on our dual-core test-server. 2 Sandisk solid state drives in RAID 0: With a commit after every add: (1460 sec / 1,000,000 docs) 1.460 ms. With a commit after every 100 add: (409 sec / 1,000,000 docs) 0.400 ms. Only one commit: (321 sec / 1,000,000 docs) 0.317 ms. 2 MTRON solid state drives in RAID 0: With a commit after every add: (1495 sec / 1,000,000 docs) 1.495 ms. With a commit after every 100 add: (353 sec / 1,000,000 docs) 0.353 ms. Only one commit: (296 sec / 1,000,000 docs) 0.296 ms. 2 10.000 RPM harddisks in RAID 0: With a commit after every add: (1781 sec / 1,000,000 docs) 1.781 ms. With a commit after every 100 add: (380 sec / 1,000,000 docs) 0.373 ms. Only one commit: (335 sec / 1,000,000 docs) 0.323 ms. 2 15.000 RPM harddisks in RAID 1: With a commit after every add: (1876 sec / 1,000,000 docs) 1.876 ms. With a commit after every 100 add: (458 sec / 1,000,000 docs) 0.458 ms. Only one commit: (361 sec / 1,000,000 docs) ms. Not enough data for real statistics, I know, but on the faster server system, the penalty for frequent commits doesn't appear to be so bad. > So it really depends on the use case and how long it takes to index a single > document, inclusive retrieval of the document from ist source. Absolutely. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]