I've had the same problem. I actually solved it by switching collection compression to true, so uncompressed collections seem to have issues with index management.
Bye
At 16:55 23/02/2004, you wrote:
Re: your final question, there is currently no way to get just the doc ids.
The best you can do for efficiency is to select a small fragment of the resource, thus minimizing the excess bytes transmitted.
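A minimal sketch of the "small fragment" idea, run locally with the JDK's javax.xml.xpath rather than against Xindice. The class name, the in-memory map of documents, and the sample XML are hypothetical; only the link element and viewid attribute names come from this thread. The point is the shape of the query: ending it in /@viewid selects one attribute per hit instead of the whole element. Against Xindice the same query string would go through the query service, and the id would be read from each result with getDocumentId().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SmallFragmentSketch {

    /** Returns the ids of all documents in which the XPath expression selects at least one node. */
    static List<String> matchingIds(Map<String, String> docs, String xpathExpr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            try {
                // Parse the document text and evaluate the expression as a node set.
                NodeList hits = (NodeList) xpath.evaluate(
                        xpathExpr,
                        new InputSource(new StringReader(entry.getValue())),
                        XPathConstants.NODESET);
                if (hits.getLength() > 0) {
                    ids.add(entry.getKey());
                }
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Cannot evaluate: " + xpathExpr, e);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical sample documents keyed by document id.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "<page><link viewid=\"170\">some long link text</link></page>");
        docs.put("doc2", "<page><link viewid=\"2045\"/></page>");

        // Select only the attribute node, not the whole link element.
        System.out.println(matchingIds(docs, "//link[@viewid='170']/@viewid"));
    }
}
```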
WRT the index creation problem, you could help the project by writing a unit test demonstrating that index creation during collection creation fails, and filing a bug report in Bugzilla that includes your unit test.
Thanks!
-Terry
Sascha Kulawik wrote:
Have you tried running the unit tests?
No, not yet. The message at http://marc.theaimsgroup.com/?l=xindice-users&m=107426829426034&w=2 solved my problem, but I don't know why. So I no longer create the index during the creation of the collection, but afterwards, with the function given there. I've recreated the indexes with the patterns [EMAIL PROTECTED] and [EMAIL PROTECTED] - this is much better. A result for an XPath query takes 30ms with about 1000 documents - that's good enough for my use case.
Actually there is only one problem left: how can I speed up the queries where I don't need the XPath result? Is there any way to get a ResultSet back without any documents in it? In this case I need only the document IDs, via res.getDocumentId();
Thank you very much for your help,
Sascha
The test IndexedSearchTest in
java/tests/src/org/apache/xindice/integration/client/services
includes a number of tests that test not only whether or not indexed searching is working, but also test whether or not indexing speeds up the query. One of the tests uses the following query:
//phone[starts-with(@call, 'n')]
That is very similar to the query used in:
> If I'm doing a search like "//[EMAIL PROTECTED]'170']", everything works fine,
> except that it takes the same time as without an index.
The IndexedSearchTest indexer for this case uses the pattern "[EMAIL PROTECTED]" to speed up the //phone[starts-with(@call, 'n')] query. The pattern says index all "call" Attributes regardless of what Element they belong to.
Your indexer is defined with pattern "[EMAIL PROTECTED]". Since it does not index ALL possible viewid Attributes (only the viewid Attribute of the link Element) Xindice cannot use this index to search all occurrences of viewid Attributes. Thus, you see no speedup. Try pattern "[EMAIL PROTECTED]" instead.
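The reasoning above can be sketched as a toy check: an index can serve a query only if its pattern covers everything the query may touch. This is a simplified model, not Xindice's actual matching code. The class and method names are hypothetical, and the pattern strings are illustrative stand-ins written in an assumed element@attribute form with * as a wildcard; they are not copied from the messages in this thread.

```java
public class IndexPatternSketch {

    /** Splits "elem@attr" into {elem, attr}; attr is null for element-only patterns. */
    static String[] parse(String pattern) {
        int at = pattern.indexOf('@');
        if (at < 0) {
            return new String[] { pattern, null };
        }
        return new String[] { pattern.substring(0, at), pattern.substring(at + 1) };
    }

    /** One part covers another if it is the wildcard or the same name. */
    static boolean partCovers(String indexPart, String queryPart) {
        if (indexPart == null || queryPart == null) {
            // An element-only pattern and an attribute pattern never cover each other.
            return indexPart == null && queryPart == null;
        }
        return indexPart.equals("*") || indexPart.equals(queryPart);
    }

    /** True if an index built with indexPattern holds every value queryPattern can ask for. */
    static boolean covers(String indexPattern, String queryPattern) {
        String[] idx = parse(indexPattern);
        String[] qry = parse(queryPattern);
        return partCovers(idx[0], qry[0]) && partCovers(idx[1], qry[1]);
    }

    public static void main(String[] args) {
        // An index over every "call" attribute can serve a query on phone elements.
        System.out.println(covers("*@call", "phone@call"));
        // An index over link elements only cannot serve a query across all elements.
        System.out.println(covers("link@viewid", "*@viewid"));
        // Widening the index to all elements makes it usable for the narrower query too.
        System.out.println(covers("*@viewid", "link@viewid"));
    }
}
```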
I would expect to see the IndexedSearchTest fail if there is a problem.
Otherwise, perhaps you have a corrupted index. Try removing it and reindexing.
-Terry
Sascha Kulawik wrote:
Hello,
I'm finally getting a headache from the configuration of Xindice. I'm using Xindice 1.1b3 (currently I've tried a CVS checkout from this morning) in JBoss with Jetty as an exploded WAR archive.
I've created a collection with the following code snippet:
---------------------------------------------------------------
String collectionConfig =
    "<collection compressed=\"false\" name=\"" + collectionName + "\">" +
    "<filer class=\"org.apache.xindice.core.filer.BTreeFiler\" gzip=\"false\"/>" +
    "<indexes>" +
    "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\" name=\"internalLink_attr_idx\" pattern=\"[EMAIL PROTECTED]\" type=\"String\"/>" +
    "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\" name=\"document_attr_idx\" pattern=\"document\" type=\"String\"/>" +
    "</indexes>" +
    "</collection>";
col = DatabaseManager.getCollection(uri);
CollectionManager collman = (CollectionManager) col.getService("CollectionManager", "1.0");
try {
    collman.createCollection(collectionName, XercesHelper.string2Dom(collectionConfig));
} catch (Exception exe) {
    String errMsg = "Error during the converting of the Collection-String to XML-DOM";
    log.error(errMsg);
    throw new XMLDBException(ErrorCodes.VENDOR_ERROR, -1, errMsg, exe);
}
---------------------------------------------------------------
If I'm doing a search like "//[EMAIL PROTECTED]'170']", everything works fine, except that it takes the same time as without an index.
If I'm trying to search for "//[EMAIL PROTECTED]'2045']", nothing happens: no result, nothing. Without the index I do get some results back. This XPath search is very fast (80ms), but without any result it is obviously useless :) The idx file of the first one is about 30kB in size, the second one is 6kB - this is the default, I think.
For the first XPath query, the only thing that matters is whether this document exists in any XML document in the collection. I've seen on MARC that this could be done faster, so that the result of this XPath query will be only the document itself or the ID of the document. How is this possible?
Thank you all very much,
Sascha
