How much time is it really taking? It may be fast enough that the VM startup is "hiding" the result. In Kimbro's example, a full scan of 149,025 documents took about 12 minutes. With 2,000 documents it could take about 1/75 of that time, or roughly 10 seconds. If you are printing the output to the screen, the two runs may seem the same. Try an XPath search for the last document added, with and without the index. Just the one document, not all of them.
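As a rough check of that estimate, here is a small Python sketch that scales Kimbro's full-scan time linearly down to 2,000 documents. It assumes scan time is proportional to the number of documents, which is only an approximation, and the function name is made up for illustration.

```python
def estimate_scan_seconds(ref_docs, ref_seconds, target_docs):
    """Scale a measured full-scan time linearly to a different collection size."""
    return ref_seconds * target_docs / ref_docs


# Kimbro's figures: 149,025 documents scanned in about 12 minutes.
ref_docs = 149_025
ref_seconds = 12 * 60

estimate = estimate_scan_seconds(ref_docs, ref_seconds, 2_000)
ratio = ref_docs / 2_000

print(f"collection is about 1/{ratio:.0f} the size")
print(f"estimated full scan: {estimate:.1f} seconds")
```

Kimbro's index-based query took about 2 seconds, most of it VM startup, so a gap of this size is easy to miss when the results are scrolling past on screen.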
HTH,
Mark

Kimbro did some tests last September. He wrote this:

"As I've been working out some issues with the CORBA system I've been working on getting larger document sets into the server. My largest set right now is 149,025 documents in a single collection. The server can easily handle more documents this is just the largest dataset I have available right now.

Here are some stats to give us a better idea where we stand. These are run against the current CVS version with one exception. I used OpenORB for the server ORB instead of JacORB. JacORB was still used for the client. It's likely we'll need to switch to OpenORB overall as even the latest JacORB leaks memory on the server.

computer: 750MHZ P3 256MB RAM Laptop running Mandrake Linux 8
jdk: Sun 1.3.0_04

Dataset size: 149,025 documents 601MB
Insertion time (no indexes): 1 hour 45 minutes which is roughly 1,424 docs per minute or 24 per second.
Collection size: 657MB
Document retrieval: 2 seconds (including VM startup which is most of the time)
Full collection scan query /disc[id = '11041c03']: 12 minutes
Index creation: 13.5 minutes
Index based query /disc[id = '11041c03']: 2.12 seconds (including VM startup which is most of that time)
Index size 164MB

The data set consists of documents similar to the following.

<?xml version="1.0"?>
<disc>
   <id>11041c03</id>
   <length>1054</length>
   <title>Orchestral Manoeuvres In The Dark / The OMD Remixes (Single)</title>
   <genre>cddb/misc</genre>
   <track index="1" offset="150">Enola Gay (OMD vs Sash! Radio Edit)</track>
   <track index="2" offset="18790"> (2)Souvenir (Moby Remix)</track>
   <track index="3" offset="39790"> (3)Electricity (The Micronauts Remix)</track>
</disc>

Kimbro Staken"

Sreeni Chippada wrote:
> Hi,
> I am new to xindice. I added a few documents as DOMs and ran xpath
> query successfully. Then I added an index on the collection and ran the
> query. It takes same amount of time.
>
> Here are the details:
>
> My document structure looks like this:
>
> <INVOICE>
> <BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no)
> .
> .
> .
> </INVOICE>
>
> I loaded about 2000 documents.
>
> When I run 'xindiceadmin xpath -c /db/test -q
> /INOVICE/BILL_INVOICE.bill_ref_no' I get all the
> /INOVICE/BILL_INVOICE.bill_ref_no elements.
>
> Then ran the following command to add an index.
>
> xindiceadmin ai -c /db/test -n BillRefNum -p
> /INOVICE/BILL_INVOICE.bill_ref_no
>
> Now if run the same query as above, it still takes same time. Looks like it
> not using the index i created.
>
> Appreciate any help.
>
> Thanks,
> Sreeni
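
To illustrate the advice to search for one document rather than all of them, here is a toy Python sketch. It is not how Xindice implements indexes; the in-memory documents, the dictionary used as an index, and all function names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

TAG = "BILL_INVOICE.bill_ref_no"

# Hypothetical stand-in for the collection: 2,000 small invoice documents.
docs = [
    ET.fromstring(f"<INVOICE><{TAG}>{n}</{TAG}></INVOICE>")
    for n in range(1, 2001)
]


def ref_no(doc):
    """Return the text of the bill_ref_no child element, or None."""
    for child in doc:
        if child.tag == TAG:
            return child.text
    return None


def scan_for(value):
    """Full scan: inspect every document, as a query without an index must."""
    return [d for d in docs if ref_no(d) == value]


# Build the index once: value -> list of matching documents.
index = {}
for d in docs:
    index.setdefault(ref_no(d), []).append(d)


def indexed_lookup(value):
    """Index lookup: go straight to the matching documents."""
    return index.get(value, [])


def all_ref_nos():
    """Returning every element touches every document, index or not."""
    return [ref_no(d) for d in docs]


# Both approaches find the same document; only the amount of work differs.
assert scan_for("2000") == indexed_lookup("2000")
print(len(indexed_lookup("2000")), "match;", len(all_ref_nos()), "elements in total")
```

In this sketch only the single-value lookup gets any benefit from the index. A query that returns every bill_ref_no element has to visit every document either way.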
