Sounds like a good sample to me.   The collection could be smaller if your
document is mostly tags.   I would guess that the internal storage is not
just your raw document, but a parsed version in which each tag is probably
represented by a number.   If the ratio of your data to your
tags goes way up, then you will probably see a difference.   I don't know
this for a fact; I don't actually code Xindice, I just play a coder on television.

Tom/Kimbro can give more info or correct me if I am wrong :-).

Mark
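
To illustrate the guess above (tags stored as numbers rather than as text), here is a minimal Python sketch. It is not Xindice's actual storage format, which the message above does not claim to know; the function name encode_tags and the sample document are made up for illustration, and the regular expression only handles tags without attributes.

```python
import re

def encode_tags(xml_text):
    """Replace each distinct tag name with a small integer id (illustration only)."""
    symbols = {}  # tag name -> integer id, assigned in order of first appearance

    def swap(match):
        slash, name = match.group(1), match.group(2)
        ident = symbols.setdefault(name, len(symbols))
        return "<%s%d>" % (slash, ident)

    # Matches <name> and </name>; tags with attributes are left untouched.
    encoded = re.sub(r"<(/?)([A-Za-z_][\w.\-]*)>", swap, xml_text)
    return encoded, symbols

# Hypothetical tag-heavy document: long tag names around very little data.
doc = ("<INVOICE><BILL_INVOICE.bill_ref_no>2</BILL_INVOICE.bill_ref_no>"
       "</INVOICE>") * 100

encoded, symbols = encode_tags(doc)
print("raw length:    ", len(doc))
print("encoded length:", len(encoded))
print("symbol table:  ", symbols)
```

When the tag names are long and the data values are short, the encoded form is much smaller than the raw text; as the data grows relative to the tags, the saving shrinks, which is the effect described above.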

Sreeni Chippada wrote:

> For each dataset, I have taken a few samples.
> For example, for 16MB, items 2, 100, 185, 370. All gave 210ms
> For the 1GB, items 2, 500, 1000, 2000, 5000, 10000, 20000, 2300.
> A couple of times I saw 400ms, but when I repeat the query, it takes
> 210/220ms.
> I also have other stuff running on the laptop.
>
> Why is the collection size approximately half, or less than half, the size
> of the dataset?
> I was expecting the collection size to be much bigger than the
> actual dataset size when inserted as a DOM.
>
>
> -----Original Message-----
> From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 06, 2002 5:15 PM
> To: [email protected]
> Subject: Re: indexing/xpath query question
>
> Thanks for the information, I haven't had time to run the tests yet!
> It appears that your access time is constant, which makes me think
> that it is hitting the same document in the same place every time.
> How random is your selection of the document?
>
> Or your machine and Xindice are fast enough that it doesn't matter ;-).
>
> thanks,
>
> Mark
>
> Sreeni Chippada wrote:
>
> > Hi Mark,
> >         Here are the stats I gathered.
> >         I will try to load a much bigger file later this week. I will post
> > those stats as well when I am done.
> >
> > Thanks,
> > Sreeni
> >
> > ********************* BEGIN  *****************
> >
> > Hardware: Dell Inspiron 800 / 512 MB
> >
> > Software: Microsoft Win2K / JDK 1.4.0
> >
> > Dataset Size: 16MB
> > Number of Documents : 373
> > Load Type: DOM
> > Collection Size: 7.46MB
> > Insertion Time : 125s
> > Index Populating Time: 5s 738ms
> > Index Size: 386KB
> > Retrieval Time : 210ms for any item with index/ 28 Secs without index
> >
> > Dataset Size: 102.5MB
> > Number of Documents : 2389
> > Load Type: DOM
> > Collection Size: 37MB
> > Insertion Time : 770 secs
> > Index Populating Time: 32s 878ms
> > Index Size: 3.14MB
> > Retrieval Time : 210ms for any item with index/ 192 Secs without index
> >
> > Dataset Size: 1025MB
> > Number of Documents : ~23890
> > Load Type: DOM
> > Collection Size: 432MB
> > Insertion Time : 770 secs
> > Index Populating Time: 6m 28s 258ms
> > Index Size: 26.258MB
> > Retrieval Time : 220ms for any item with index/ Didn't try without indexing
> >
> > **********************  END   ****************
> >
> > -----Original Message-----
> > From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, March 05, 2002 10:13 PM
> > To: [email protected]
> > Subject: Re: indexing/xpath query question
> >
> > How did your speed comparison with and without the index go?
> >
> > thanks,
> >
> > Mark
> >
> > Sreeni Chippada wrote:
> >
> > > Mark,
> > >         That worked.
> > >         xindiceadmin xpath -c /db/lucent -q
> > > "//*/BILL_INVOICE.bill_ref_no[text()='2']"
> > >         or
> > >         xindiceadmin xpath -c /db/lucent -q
> > > "//INVOICE/BILL_INVOICE.bill_ref_no[text()='2']"
> > >         gives the expected result.
> > >
> > >         Really appreciate your help. Thanks to all who responded to the
> > > mails.
> > >
> > > -Sreeni
> > >
> > > -----Original Message-----
> > > From: Mark J. Stang [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, March 05, 2002 7:32 PM
> > > To: [email protected]
> > > Subject: Re: indexing/xpath query question
> > >
> > > Sreeni,
> > >
> > > I tried it using different formats and the only one that worked was:
> > >
> > > xindiceadmin xpath -c /db/customers -q "//*/name[@lname='Stang']"
> > >
> > > In my case, my collection is customers. I am telling it to search everything
> > > starting at the root of the document, looking for any tag named "name" that
> > > has an attribute "lname" with a value of 'Stang'.   I had to put the quotes
> > > around the whole thing.   It didn't work any other way.
> > >
> > > Mark
> > >
> > > Sreeni Chippada wrote:
> > >
> > > > Thanks Tom.
> > > > But I still do not know why this does not work.
> > > > xindiceadmin xpath -c /db/test -q
> > > > /INVOICE/BILL_INVOICE.bill_ref_no[text()="2"]
> > > > Also tried using single quotes. Any suggestions? I tried from both the
> > > > command line and the Java API.
> > > >
> > > > Thanks,
> > > > Sreeni
> > > >
> > > > -----Original Message-----
> > > > From: Tom Bradford [mailto:[EMAIL PROTECTED]
> > > > Sent: Tuesday, March 05, 2002 3:00 PM
> > > > To: [email protected]
> > > > Subject: Re: indexing/xpath query question
> > > >
> > > > On Monday, March 4, 2002, at 01:13 PM, Sreeni Chippada wrote:
> > > > >       I am new to Xindice. I added a few documents as DOMs and ran an
> > > > > XPath query successfully. Then I added an index on the collection and ran
> > > > > the query. It takes the same amount of time.
> > > > >
> > > > > xindiceadmin ai -c /db/test -n BillRefNum  -p
> > > > > /INOVICE/BILL_INVOICE.bill_ref_no
> > > >
> > > > The IndexManager should have thrown an error when you tried to create
> > > > this index, because the pattern that you used is invalid.  This is a
> > > > bug in the IndexManager.
> > > >
> > > > Xindice indexing patterns *are not* XPaths, they are simple element,
> > > > attribute, or element/attribute combinations.
> > > >
> > > > You should have created your indexes like this:
> > > >
> > > > xindiceadmin ai -c /db/test -n BillRefNum -p BILL_INVOICE.bill_ref_no
> > > >
> > > > Read the Xindice Administrator docs for more information about
> > > > Indexing patterns.
> > > >
> > > > --
> > > > Tom Bradford - http://www.tbradford.org
> > > > Architect - XQRL (XQuery Engine) - http://www.xqrl.com
> > > > Apache Xindice (Native XML Database) - http://xml.apache.org/xindice
> > > > Project Labrador (Web Services Framework) - http://notdotnet.org
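
For reference, the size figures Sreeni posted in this thread can be turned into ratios with a few lines of Python. This is only arithmetic on the reported numbers; the variable and function names are made up for illustration, and the 386KB index is converted assuming 1MB = 1024KB.

```python
# (dataset MB, collection MB, index MB) as reported in the thread.
runs = [
    (16.0, 7.46, 386 / 1024),  # index reported as 386KB; 1MB = 1024KB assumed
    (102.5, 37.0, 3.14),
    (1025.0, 432.0, 26.258),
]

def ratios(dataset_mb, collection_mb, index_mb):
    """Return (collection/dataset, index/collection) for one test run."""
    return collection_mb / dataset_mb, index_mb / collection_mb

for run in runs:
    coll, idx = ratios(*run)
    print("dataset %7.1f MB: collection/dataset = %.2f, index/collection = %.3f"
          % (run[0], coll, idx))
```

In all three runs the stored collection is under half the size of the source dataset, which matches the observation that prompted the question about collection size.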
