On Tuesday, November 12, 2002, at 11:47 AM, Jeff Greif wrote:
I think I see the problem. I believe (if I've read the source code
correctly) that in xindice, your index on <w> is essentially a map from
values of <w> to document keys.
...snippage...
Clearly, if I'm not confused, xindice is optimized for small documents with
not much repeating structure, and its indexing mechanism is not optimal for
the type of query you're doing.
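To make the index shape described above concrete, here is a minimal sketch of a map from element values to document keys. It is only an illustration of that description: ValueIndexSketch and its methods are hypothetical names, not Xindice classes. It also assumes, as my own reading, that a lookup narrows the search to whole documents, so each candidate document would still have to be walked to find the matching nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; these are not Xindice's actual classes.
class ValueIndexSketch {
    // Maps an indexed element value to the keys of the documents that contain it.
    private final Map<String, Set<String>> valueToDocKeys = new TreeMap<>();

    // Records that the document with the given key contains the given value.
    void add(String value, String docKey) {
        valueToDocKeys.computeIfAbsent(value, v -> new LinkedHashSet<>()).add(docKey);
    }

    // Returns candidate documents for a value, not the matching nodes inside them.
    List<String> candidateDocuments(String value) {
        Set<String> keys = valueToDocKeys.get(value);
        return keys == null ? new ArrayList<>() : new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        ValueIndexSketch index = new ValueIndexSketch();
        index.add("apple", "bigdoc");
        index.add("pear", "bigdoc");
        index.add("apple", "smalldoc");
        System.out.println(index.candidateDocuments("apple"));
    }
}
```

Under that assumption, one very large document with many repeated elements gains little from the index, because every lookup returns the same document key.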
While this explanation may help with Beni's issue, my documents _are_ fairly small, with a minimum of duplicate elements, yet my queries are taking 3-4 *minutes*. I finally broke down and tried this programmatically instead of from the command line. The following program runs the query in about 3:30. Thinking that the first query might be atypically slow, I tried again and had it run a few slightly different queries in one run, and they all take that long. Does anyone know where I can start looking for possible causes, and for ways to improve this?
package foo;

import org.dom4j.io.*;
import org.xmldb.api.*;
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;

public class QueryRunner {
    // Converts the W3C DOM documents returned by the query into dom4j documents.
    private static DOMReader xmlReader = new DOMReader();
    private static Database database = null;

    public static void main(String[] args) {
        String xpath = "/candidate[biographic_data/id='ANON2021']";
        try {
            // Load and register the Xindice XML:DB driver.
            Class xindiceDriver = Class.forName("org.apache.xindice.client.xmldb.DatabaseImpl");
            database = (Database) xindiceDriver.newInstance();
            DatabaseManager.registerDatabase(database);

            Collection col = DatabaseManager.getCollection("xmldb:xindice:///db/resumes");
            XPathQueryService service = (XPathQueryService) col.getService("XPathQueryService", "1.0");

            // Print a timestamp before and after the query to see how long it takes.
            System.out.println(String.valueOf(new java.util.Date()));
            ResourceSet rs = service.query(xpath);
            System.out.println(String.valueOf(new java.util.Date()));

            // Print each matching document as XML.
            ResourceIterator ri = rs.getIterator();
            while (ri.hasMoreResources()) {
                org.w3c.dom.Node node = ((XMLResource) ri.nextResource()).getContentAsDOM();
                System.out.println(xmlReader.read((org.w3c.dom.Document) node).asXML());
            }
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}
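
To compare the first query against later ones in the same run, the timestamps above could be replaced by something like the sketch below. QueryTimer and timeRuns are hypothetical names, not part of Xindice or the XML:DB API, and the stand-in workload is only there so the sketch runs without a database.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; not part of Xindice or the XML:DB API.
class QueryTimer {
    // Runs the task 'runs' times and returns the elapsed milliseconds of each run.
    static long[] timeRuns(Callable<?> task, int runs) throws Exception {
        long[] elapsedMillis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            elapsedMillis[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsedMillis;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload (assumption); in the program above this would
        // wrap the call to service.query(xpath) instead.
        Callable<Void> standIn = () -> { Thread.sleep(20); return null; };
        long[] times = timeRuns(standIn, 3);
        for (int i = 0; i < times.length; i++) {
            System.out.println("run " + (i + 1) + ": " + times[i] + " ms");
        }
    }
}
```

If every run reports roughly the same time, one-off startup cost is probably not the cause, which matches what I saw when running several queries in one go.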
