the Tue May 10 21:33:17 2005

Windows likes to do funny things with quotes; try single
quotes.   I haven't used the command-line tools for queries
yet, so maybe someone else can comment on the proper
format.
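
On the query format itself: in XPath a predicate attaches directly to the
step it filters, so /INVOICE[BILL_INVOICE.bill_ref_no="2"] (no slash before
the bracket) is the usual form. The sketch below checks that shape with
Python's standard-library ElementTree. ElementTree supports only a subset of
XPath and is not Xindice's query engine, so treat this as an illustration of
the syntax, not of Xindice's behaviour. The sample documents and the
"collection" wrapper element are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents shaped like the INVOICE structure in this thread.
docs = [
    "<INVOICE><BILL_INVOICE.bill_ref_no>1</BILL_INVOICE.bill_ref_no></INVOICE>",
    "<INVOICE><BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no></INVOICE>",
    "<INVOICE><BILL_INVOICE.bill_ref_no>3</BILL_INVOICE.bill_ref_no></INVOICE>",
]

# Assumption: a synthetic wrapper element stands in for the collection,
# so one query can be run across all documents.
collection = ET.Element("collection")
for text in docs:
    collection.append(ET.fromstring(text))

# Path without a predicate: selects the bill_ref_no element of every invoice.
all_refs = collection.findall("INVOICE/BILL_INVOICE.bill_ref_no")

# Predicate attached directly to the INVOICE step: selects only the
# invoices whose bill_ref_no child has the text "2".
matching = collection.findall("INVOICE[BILL_INVOICE.bill_ref_no='2']")

print(len(all_refs), len(matching))
```

The first query returns every bill_ref_no element, much like the unfiltered
command in the quoted message; the second returns only the matching invoice.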


Mark

Sreeni Chippada wrote:

> Windows 2000.
>
> -----Original Message-----
> From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 05, 2002 12:28 PM
> To: [email protected]
> Subject: Re: indexing/xpath query question
>
> What platform are you running on?
>
> Mark
>
> Sreeni Chippada wrote:
>
> > Mark,
> >         It took me about 3 minutes to load about 2300 documents in 102MB.
> >         It took me 31 sec to index /INVOICE/BILL_INVOICE.bill_ref_no.
> >         Now, I deleted that collection and added a new collection with 22
> > documents/1MB(just to make it simple)
> >         I did not index the xpath. If I run
> >                 xindiceadmin xpath -c /db/lucent -q /INVOICE/BILL_INVOICE.bill_ref_no
> >         I get all 22 documents.
> >         If I run
> >                 xindiceadmin xpath -c /db/lucent -q /INVOICE/[BILL_INVOICE.bill_ref_no="2"]
> >         I get nothing. I hope this query is correct.
> >
> >         What do you mean by 'VM startup is "hiding" the result'? Could this
> > be what is happening in my case?
> >
> > Thanks,
> > Sreeni
> >
> >
> >
> > -----Original Message-----
> > From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> > Sent: Monday, March 04, 2002 4:50 PM
> > To: [email protected]
> > Subject: Re: indexing/xpath query question
> >
> > How much time is it really taking?   It may be fast enough that
> > the VM startup is "hiding" the result.   In Kimbro's example,
> > he did a complete search of 149,025 documents in less than
> > 12 minutes.   If you have 2,000 documents, then it could
> > take 1/75 of the time or about 10 seconds.   If you are printing
> > the output to the screen it may seem the same.   Try doing an XPath
> > search for the last document added, with and without the index.
> > Just the one document, not all of them.
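
The estimate above can be reproduced with a few lines of arithmetic. This is
only a sketch: it assumes scan time scales linearly with the number of
documents and that the documents and hardware are comparable to those in
Kimbro's test, neither of which is established in this thread. The variable
names are made up for the example.

```python
# Figures quoted from Kimbro's full collection scan.
reference_docs = 149_025
reference_scan_seconds = 12 * 60  # "12 minutes"

# Approximate size of the collection being discussed.
local_docs = 2_000

# Assumption: scan time is proportional to document count.
ratio = reference_docs / local_docs
estimated_scan_seconds = reference_scan_seconds / ratio

print(f"ratio: about 1/{ratio:.1f} of the reference collection")
print(f"estimated full scan: about {estimated_scan_seconds:.1f} s")
```

Under those assumptions an unindexed scan of 2,000 documents comes out just
under 10 seconds.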
> >
> > HTH,
> > Mark
> >
> > Kimbro did some tests last September:
> >
> > He wrote this:
> > "As I've been working out some issues with the CORBA system I've been
> > working on getting larger document sets into the server. My largest set
> > right now is 149,025 documents in a single collection. The server can
> > easily handle more documents; this is just the largest dataset I have
> > available right now. Here are some stats to give us a better idea where we
> > stand. These are run against the current CVS version with one exception. I
> > used OpenORB for the server ORB  instead of JacORB. JacORB was still used
> > for the client. It's likely we'll need to switch to OpenORB overall as
> > even the latest JacORB leaks memory on the server.
> >
> > computer: 750MHZ P3 256MB RAM Laptop running Mandrake Linux 8
> > jdk: Sun 1.3.0_04
> > Dataset size: 149,025 documents 601MB
> > Insertion time (no indexes): 1 hour 45 minutes which is roughly 1,424 docs
> > per minute or 24 per second.
> > Collection size: 657MB
> > Document retrieval: 2 seconds (including VM startup which is most of the
> > time)
> > Full collection scan query /disc[id = '11041c03']: 12 minutes
> > Index creation: 13.5 minutes
> > Index based query /disc[id = '11041c03']: 2.12 seconds (including VM
> > startup which is most of that time)
> > Index size 164MB
> >
> > The data set consists of documents similar to the following.
> >
> > <?xml version="1.0"?>
> > <disc>
> > <id>11041c03</id>
> > <length>1054</length>
> > <title>Orchestral Manoeuvres In The Dark / The OMD Remixes (Single)</title>
> > <genre>cddb/misc</genre>
> > <track index="1" offset="150">Enola Gay (OMD vs Sash! Radio Edit)</track>
> > <track index="2" offset="18790"> (2)Souvenir (Moby Remix)</track>
> > <track index="3" offset="39790"> (3)Electricity (The Micronauts Remix)</track>
> > </disc>
> >
> > Kimbro Staken"
> >
> > Sreeni Chippada wrote:
> >
> > > Hi,
> > >         I am new to Xindice. I added a few documents as DOMs and ran an XPath
> > > query successfully. Then I added an index on the collection and ran the
> > > query again. It takes the same amount of time.
> > >
> > > Here are the details:
> > >
> > > My document structure looks like this:
> > >
> > > <INVOICE>
> > >         <BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no>
> > >         .
> > >         .
> > >         .
> > > </INVOICE>
> > >
> > > I loaded about 2000 documents.
> > >
> > > When I run 'xindiceadmin xpath -c /db/test -q
> > > /INOVICE/BILL_INVOICE.bill_ref_no' I get all the
> > > /INOVICE/BILL_INVOICE.bill_ref_no elements.
> > >
> > > Then I ran the following command to add an index.
> > >
> > > xindiceadmin ai -c /db/test -n BillRefNum  -p
> > > /INOVICE/BILL_INVOICE.bill_ref_no
> > >
> > > Now if I run the same query as above, it still takes the same time. It looks
> > > like it is not using the index I created.
> > >
> > > Appreciate any help.
> > >
> > > Thanks,
> > > Sreeni
