What platform are you running on?

Mark
Sreeni Chippada wrote:

> Mark,
> It took me about 3 minutes to load about 2300 documents in 102MB.
> It took me 31 sec to index /INVOICE/BILL_INVOICE.bill_ref_no.
> Now, I deleted that collection and added a new collection with 22
> documents/1MB(just to make it simple)
> I did not index the xpath. If run
> xindiceadmin xpath -c /db/lucent -q
> /INVOICE/BILL_INVOICE.bill_ref_no
> I get all the 22 documents.
> If I run
> xindiceadmin xpath -c /db/lucent -q
> /INVOICE/[BILL_INVOICE.bill_ref_no="2"]
> I get nothing. I hope this query is correct.
>
> What do you mean by 'VM startup is "hiding" the result' ? Could this
> be what is happening in my case.
>
> Thanks,
> Sreeni
>
>
>
> -----Original Message-----
> From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 04, 2002 4:50 PM
> To: [email protected]
> Subject: Re: indexing/xpath query question
>
> How much time is it really taking? It maybe fast enough that
> the VM startup is "hiding" the result. In Kimbros example,
> he did a complete search of 149,025 documents in less than
> 12 minutes. If you have 2,000 documents, then it could
> take 1/75 of the time or about 10 seconds. If you are printing
> the output to the screen it may seem the same. Try doing an XPath
> search for the last document added, with and without the index.
> Just the one document, not all of them.
>
> HTH,
> Mark
>
> Kimbro did some tests last September:
>
> He wrote this:
> "As I've been working out some issues with the CORBA system I've been
> working on getting larger document sets into the server. My largest set
> right now is 149,025 documents in a single collection. The server can
> easily handle more documents this is just the largest dataset I have
> available right now. Here are some stats to give us a better idea where we
> stand. These are run against the current CVS version with one exception. I
> used OpenORB for the server ORB instead of JacORB. JacORB was still used
> for the client. It's likely we'll need to switch to OpenORB overall as
> even the latest JacORB leaks memory on the server.
>
> computer: 750MHZ P3 256MB RAM Laptop running Mandrake Linux 8
> jdk: Sun 1.3.0_04
> Dataset size: 149,025 documents 601MB
> Insertion time (no indexes): 1 hour 45 minutes which is roughly 1,424 docs
> per minute or 24 per second.
> Collection size: 657MB
> Document retrieval: 2 seconds (including VM startup which is most of the
> time)
> Full collection scan query /disc[id = '11041c03']: 12 minutes
> Index creation: 13.5 minutes
> Index based query /disc[id = '11041c03']: 2.12 seconds (including VM
> startup which is most of that time)
> Index size 164MB
>
> The data set consists of documents similar to the following.
>
> <?xml version="1.0"?>
> <disc>
> <id>11041c03</id>
> <length>1054</length>
> <title>Orchestral Manoeuvres In The Dark / The OMD Remixes (Single)</title>
> <genre>cddb/misc</genre>
> <track index="1" offset="150">Enola Gay (OMD vs Sash! Radio Edit)</track>
> <track index="2" offset="18790"> (2)Souvenir (Moby Remix)</track>
> <track index="3" offset="39790"> (3)Electricity (The Micronauts
> Remix)</track>
> </disc>
>
> Kimbro Staken"
>
> Sreeni Chippada wrote:
>
> > Hi,
> > I am new to xindice. I added a few documents as DOMs and ran xpath
> > query successfully. Then I added an index on the collection and ran the
> > query. It takes same amount of time.
> >
> > Here are the details:
> >
> > My document structure looks like this:
> >
> > <INVOICE>
> > <BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no)
> > .
> > .
> > .
> > </INVOICE>
> >
> > I loaded about 2000 documents.
> >
> > When I run 'xindiceadmin xpath -c /db/test -q
> > /INOVICE/BILL_INVOICE.bill_ref_no' I get all the
> > /INOVICE/BILL_INVOICE.bill_ref_no elements.
> >
> > Then ran the following command to add an index.
> >
> > xindiceadmin ai -c /db/test -n BillRefNum -p
> > /INOVICE/BILL_INVOICE.bill_ref_no
> >
> > Now if run the same query as above, it still takes same time. Looks like it
> > not using the index i created.
> >
> > Appreciate any help.
> >
> > Thanks,
> > Sreeni
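
A side note on the second query quoted above: in XPath 1.0 a predicate has to follow a node test, so /INVOICE/[BILL_INVOICE.bill_ref_no="2"] is not a valid path. The usual form attaches the predicate to the INVOICE step: /INVOICE[BILL_INVOICE.bill_ref_no="2"]. That may be why the query returns nothing. The sketch below checks the corrected form with Python's standard xml.etree.ElementTree rather than Xindice, so it only illustrates the path syntax. The <collection> wrapper, the sample invoices, and the find_invoices helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the collection: three tiny invoices wrapped in a
# made-up <collection> element so one path expression can scan all of them.
DOCS = """
<collection>
  <INVOICE><BILL_INVOICE.bill_ref_no>1</BILL_INVOICE.bill_ref_no></INVOICE>
  <INVOICE><BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no></INVOICE>
  <INVOICE><BILL_INVOICE.bill_ref_no>3</BILL_INVOICE.bill_ref_no></INVOICE>
</collection>
"""


def find_invoices(xml_text, ref_no):
    """Return the INVOICE elements whose bill_ref_no child equals ref_no."""
    root = ET.fromstring(xml_text)
    # The predicate is attached directly to the INVOICE step; there is no
    # "/" between the element name and the opening bracket.
    return root.findall(f"INVOICE[BILL_INVOICE.bill_ref_no='{ref_no}']")


if __name__ == "__main__":
    for invoice in find_invoices(DOCS, "2"):
        print(invoice.findtext("BILL_INVOICE.bill_ref_no"))
```

If the corrected path behaves the same way under xindiceadmin, it should return only the invoice whose bill_ref_no is 2, which also makes it a reasonable single-document query for comparing timings with and without the index.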
