I don't know the answers to these questions, since I'm still using 1.0. The CVS version differs considerably from 1.0: it is always embedded rather than run as a separate server, and it may use XML-RPC instead of CORBA, so check it out carefully.
While you're thinking about it, you could convert the plain text to XML files stored in a single directory. Then, once you've decided which version of Xindice to use, you could import the directory into a collection. Alternatively, you could take your existing database, extract each resource from the collection in question, and store it in a new collection that already has an index established. You could do this with the command-line tools or a small Java program (which may be even easier).

Jeff

----- Original Message -----
From: "Matthew Van Horn" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, November 11, 2002 8:54 PM
Subject: Re: Performance question (Am I doing something wrong?)

> Well it sounds worth a try - I just wish my program for populating the
> db didn't take so long. It takes about an hour and a half to read 8000
> plain text documents and convert them to XML for storage. Should I
> upgrade Xindice to the CVS version to avoid this bug? How does it
> compare to 1.0?
>
>
> On Tuesday, November 12, 2002, at 01:17 PM, Jeff Greif wrote:
>
> > Well, I'll try once more. It has been observed in
> > http://marc.theaimsgroup.com/?l=xindice-users&m=102155822118842&w=2
> > that owing to a bug (?) in xindice 1.0, under some or all circumstances,
> > creating the index after adding documents does not work right. If you
> > observed no difference in query time with and without the index, you
> > probably need to delete the collection, create the index, and then add the
> > documents again. In the cited message, query times dropped from 3 min. to
> > 30ms for a collection with 100,000 documents (or 120ms for 500,000
> > documents).
> >
> > Jeff
> > ----- Original Message -----
> > From: "Matthew Van Horn" <[EMAIL PROTECTED]>
> > To: <[email protected]>
> > Sent: Monday, November 11, 2002 7:08 PM
> > Subject: Re: Performance question (Am I doing something wrong?)
> >
> >
> >> On Tuesday, November 12, 2002, at 11:47 AM, Jeff Greif wrote:
> >>
> >> While this explanation may help for Beni's issue, my documents _are_
> >> fairly small with a minimum of duplicate elements, yet my queries are
> >> taking 3-4 *minutes*. I finally broke down and tried this
> >> programmatically instead of from command line. The following program
> >> runs the query in about 3:30. However, thinking that the first query
> >> might be untypically slow, I tried again and had it run a few slightly
> >> different queries in one run, and they all take that long. Does anyone
> >> know where I can start to look for possible causes of this, and ways to
> >> improve?
> >
> >
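
For the suggestion in this thread of converting the plain text to XML files in a single directory first, here is a minimal sketch using only the JDK. It is not the program discussed above. The class name TextToXml, the <document> and <body> element names, and the assumption that the input is UTF-8 encoded *.txt files are all hypothetical choices for illustration. It does not filter control characters that are illegal in XML 1.0, and it does not touch Xindice at all; importing the resulting directory into a collection is a separate step.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: turns each *.txt file in one directory into an
// *.xml file in another directory, ready for a later bulk import.
final class TextToXml {

    // Escape the characters that are markup-significant in XML text content.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap the text in an assumed <document id="..."><body>...</body></document>
    // layout. Replace this with whatever structure your queries need.
    static String toXml(String id, String text) {
        String safeId = escapeXml(id).replace("\"", "&quot;");
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<document id=\"" + safeId + "\">\n"
             + "  <body>" + escapeXml(text) + "</body>\n"
             + "</document>\n";
    }

    // Convert every *.txt file in inputDir; returns the number of files written.
    static int convertDirectory(Path inputDir, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        int count = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, "*.txt")) {
            for (Path txt : stream) {
                String name = txt.getFileName().toString();
                String id = name.substring(0, name.length() - ".txt".length());
                // Assumption: the source files are UTF-8.
                String text = new String(Files.readAllBytes(txt), StandardCharsets.UTF_8);
                Files.write(outputDir.resolve(id + ".xml"),
                            toXml(id, text).getBytes(StandardCharsets.UTF_8));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TextToXml <input-dir> <output-dir>");
            return;
        }
        int n = convertDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Converted " + n + " file(s)");
    }
}
```

Keeping the conversion separate from the database load means the slow text-to-XML step runs only once, and the output directory can then be loaded into whichever Xindice version you settle on, after the index has been created on the empty collection.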
