Windows likes to do funny things with quotes; try single quotes. I haven't used the command-line tools for queries yet, so maybe someone else can comment on the proper format.
Mark

Sreeni Chippada wrote:

> Windows 2000.
>
> -----Original Message-----
> From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 05, 2002 12:28 PM
> To: [email protected]
> Subject: Re: indexing/xpath query question
>
> What platform are you running on?
>
> Mark
>
> Sreeni Chippada wrote:
>
> > Mark,
> > It took me about 3 minutes to load about 2300 documents in 102MB.
> > It took me 31 sec to index /INVOICE/BILL_INVOICE.bill_ref_no.
> > Now, I deleted that collection and added a new collection with 22
> > documents/1MB (just to make it simple).
> > I did not index the xpath. If I run
> > xindiceadmin xpath -c /db/lucent -q
> > /INVOICE/BILL_INVOICE.bill_ref_no
> > I get all the 22 documents.
> > If I run
> > xindiceadmin xpath -c /db/lucent -q
> > /INVOICE/[BILL_INVOICE.bill_ref_no="2"]
> > I get nothing. I hope this query is correct.
> >
> > What do you mean by 'VM startup is "hiding" the result'? Could this
> > be what is happening in my case?
> >
> > Thanks,
> > Sreeni
> >
> > -----Original Message-----
> > From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> > Sent: Monday, March 04, 2002 4:50 PM
> > To: [email protected]
> > Subject: Re: indexing/xpath query question
> >
> > How much time is it really taking? It may be fast enough that
> > the VM startup is "hiding" the result. In Kimbro's example,
> > he did a complete search of 149,025 documents in less than
> > 12 minutes. If you have 2,000 documents, then it could
> > take 1/75 of the time or about 10 seconds. If you are printing
> > the output to the screen it may seem the same. Try doing an XPath
> > search for the last document added, with and without the index.
> > Just the one document, not all of them.
> >
> > HTH,
> > Mark
> >
> > Kimbro did some tests last September:
> >
> > He wrote this:
> > "As I've been working out some issues with the CORBA system I've been
> > working on getting larger document sets into the server.
> > My largest set
> > right now is 149,025 documents in a single collection. The server can
> > easily handle more documents this is just the largest dataset I have
> > available right now. Here are some stats to give us a better idea where we
> > stand. These are run against the current CVS version with one exception. I
> > used OpenORB for the server ORB instead of JacORB. JacORB was still used
> > for the client. It's likely we'll need to switch to OpenORB overall as
> > even the latest JacORB leaks memory on the server.
> >
> > computer: 750MHZ P3 256MB RAM Laptop running Mandrake Linux 8
> > jdk: Sun 1.3.0_04
> > Dataset size: 149,025 documents 601MB
> > Insertion time (no indexes): 1 hour 45 minutes which is roughly 1,424 docs
> > per minute or 24 per second.
> > Collection size: 657MB
> > Document retrieval: 2 seconds (including VM startup which is most of the
> > time)
> > Full collection scan query /disc[id = '11041c03']: 12 minutes
> > Index creation: 13.5 minutes
> > Index based query /disc[id = '11041c03']: 2.12 seconds (including VM
> > startup which is most of that time)
> > Index size 164MB
> >
> > The data set consists of documents similar to the following.
> >
> > <?xml version="1.0"?>
> > <disc>
> > <id>11041c03</id>
> > <length>1054</length>
> > <title>Orchestral Manoeuvres In The Dark / The OMD Remixes (Single)</title>
> > <genre>cddb/misc</genre>
> > <track index="1" offset="150">Enola Gay (OMD vs Sash! Radio Edit)</track>
> > <track index="2" offset="18790"> (2)Souvenir (Moby Remix)</track>
> > <track index="3" offset="39790"> (3)Electricity (The Micronauts Remix)</track>
> > </disc>
> >
> > Kimbro Staken"
> >
> > Sreeni Chippada wrote:
> >
> > > Hi,
> > > I am new to xindice. I added a few documents as DOMs and ran xpath
> > > query successfully. Then I added an index on the collection and ran the
> > > query. It takes same amount of time.
> > >
> > > Here are the details:
> > >
> > > My document structure looks like this:
> > >
> > > <INVOICE>
> > > <BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no>
> > > .
> > > .
> > > .
> > > </INVOICE>
> > >
> > > I loaded about 2000 documents.
> > >
> > > When I run 'xindiceadmin xpath -c /db/test -q
> > > /INVOICE/BILL_INVOICE.bill_ref_no' I get all the
> > > /INVOICE/BILL_INVOICE.bill_ref_no elements.
> > >
> > > Then I ran the following command to add an index.
> > >
> > > xindiceadmin ai -c /db/test -n BillRefNum -p
> > > /INVOICE/BILL_INVOICE.bill_ref_no
> > >
> > > Now if I run the same query as above, it still takes the same time. Looks like
> > > it is not using the index I created.
> > >
> > > Appreciate any help.
> > >
> > > Thanks,
> > > Sreeni
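
For anyone wanting to check the "1/75 of the time or about 10 seconds" estimate quoted in this thread, here is a rough sketch in Python. It only uses the two figures from Kimbro's benchmark (149,025 documents, 12 minutes for a full collection scan). The function name and the assumption that scan time scales linearly with document count are mine, not anything Xindice guarantees, and VM startup time is not included.

```python
# Figures taken from Kimbro's benchmark as quoted in this thread.
BENCH_DOCS = 149_025          # documents in the benchmark collection
BENCH_SCAN_SECONDS = 12 * 60  # full collection scan: 12 minutes


def estimate_scan_seconds(doc_count):
    """Estimate an unindexed scan time for doc_count documents.

    Assumption (hypothetical helper): scan time grows linearly with the
    number of documents, so we simply scale the benchmark figure.
    """
    return BENCH_SCAN_SECONDS * doc_count / BENCH_DOCS


if __name__ == "__main__":
    # 2,000 documents is the collection size mentioned in the original post;
    # 22 is the size of the smaller test collection.
    for docs in (2_000, 22):
        print(f"{docs} documents: roughly {estimate_scan_seconds(docs):.2f} s")
```

If the estimate for a collection comes out well below the roughly 2 seconds of VM startup reported in the benchmark, an indexed and an unindexed query could look about the same from the command line, which is the "hiding" effect Mark describes.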
