Re: your final question, there is currently no way
to get just the doc ids.

The best you can do for efficiency is
to select a small fragment of the resource,
which minimizes the excess bytes transmitted.
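
To illustrate what "select a small fragment" means, here is a minimal sketch that uses the JDK's built-in XPath engine as a stand-in for Xindice. The sample document and the class name SmallFragmentQuery are made up for illustration; only the link element and viewid attribute come from this thread. With Xindice, the same narrowing would be applied to the XPath string you pass to the query service, and each returned resource should still give you its id via getDocumentId().

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SmallFragmentQuery {
    // Hypothetical sample document; not taken from the real collection.
    static final String SAMPLE =
        "<document id=\"2045\">"
        + "<link viewid=\"170\">A fairly long link body that the caller does not need</link>"
        + "<link viewid=\"171\">another link</link>"
        + "</document>";

    /** Evaluates the XPath against the XML text and returns the text content of each matched node. */
    static String[] select(String xml, String xpath) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xp.evaluate(
                xpath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            String[] out = new String[nodes.getLength()];
            for (int i = 0; i < out.length; i++) {
                out[i] = nodes.item(i).getTextContent();
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Selecting the whole element returns its full content.
        String[] whole = select(SAMPLE, "//link[@viewid='170']");
        // Selecting only the attribute returns a much smaller fragment.
        String[] small = select(SAMPLE, "//link[@viewid='170']/@viewid");
        System.out.println("whole element: " + whole[0].length() + " chars");
        System.out.println("attribute only: " + small[0].length() + " chars");
    }
}
```

The second query matches the same documents as the first, but each hit carries only the attribute value rather than the whole element.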

WRT the index creation problem,
you could help the project if you can
write a unit test demonstrating that
index creation during collection creation
fails, and file a bug report in
Bugzilla that includes your unit test.

Thanks!

-Terry

Sascha Kulawik wrote:

Have you tried running the unit tests?



No, not yet. The post at http://marc.theaimsgroup.com/?l=xindice-users&m=107426829426034&w=2 solved my problem, but I don't know why. So I no longer create the index during the creation of the collection, but afterwards, with the function given there. I've recreated the indexes with the patterns [EMAIL PROTECTED] and [EMAIL PROTECTED] and this is much better. An XPath query takes 30 ms with about 1000 documents, which is good enough for my use case.

There is only one problem left: how can I speed up the queries where I don't need the XPath result? Is there any way to get a ResultSet back without any documents in it? In this case I need only the document IDs, from res.getDocumentId();

Thank you very much for your help,

Sascha



The test IndexedSearchTest in
java/tests/src/org/apache/xindice/integration/client/services
includes a number of tests that check not only whether indexed searching works, but also whether indexing speeds up the query. One of the tests uses the following query:


//phone[starts-with(@call, 'n')]

That is very similar to the query used in:

> If I'm doing a search like "//[EMAIL PROTECTED]'170']", everything works fine,
> except that it takes the same time as without an index.

The IndexedSearchTest indexer for this case uses the pattern "*@call" to speed up the //phone[starts-with(@call, 'n')] query. The pattern says to index all "call" Attributes regardless of what Element they belong to.

Your indexer is defined with pattern "link@viewid". Since it does not index ALL possible viewid Attributes (only the viewid Attribute of the link Element), Xindice cannot use this index to search all occurrences of viewid Attributes. Thus, you see no speedup. Try pattern "*@viewid" instead.
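
The reasoning above can be sketched as a small coverage check. This is a simplified model only, not Xindice's actual index-selection code: it assumes patterns of the form element@attribute, with * standing for any element, and the class and method names are invented for illustration.

```java
class IndexPatternModel {
    /**
     * Simplified model (an assumption, not the real Xindice logic):
     * returns true if an index built with the given pattern contains every
     * attribute occurrence that a query over queryElement/@queryAttribute
     * needs. A queryElement of "*" means "any element".
     */
    static boolean covers(String pattern, String queryElement, String queryAttribute) {
        int at = pattern.indexOf('@');
        if (at < 0) {
            return false; // element-only pattern; attribute lookups are not modelled here
        }
        String indexedElement = pattern.substring(0, at);
        String indexedAttribute = pattern.substring(at + 1);
        if (!indexedAttribute.equals(queryAttribute)) {
            return false; // index is on a different attribute
        }
        // A wildcard index covers every element; otherwise the elements must match.
        return indexedElement.equals("*") || indexedElement.equals(queryElement);
    }

    public static void main(String[] args) {
        // Wildcard index on "call" can serve a query restricted to phone elements.
        System.out.println(covers("*@call", "phone", "call"));
        // Index limited to link elements cannot serve a search over all viewid attributes.
        System.out.println(covers("link@viewid", "*", "viewid"));
        // Wildcard index on viewid can.
        System.out.println(covers("*@viewid", "*", "viewid"));
    }
}
```

In this model, an index restricted to one element is only usable when the query is restricted to that same element, which is why the wildcard pattern is the safer choice for a search across all elements.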

I would expect to see the IndexedSearchTest fail if there is a problem.
Otherwise, perhaps you have a corrupted index. Try removing it and reindexing.


-Terry

Sascha Kulawik wrote:



Hello,

I'm finally getting a headache from configuring Xindice. I'm using Xindice 1.1b3 (I've also tried a CVS checkout from this morning) in JBoss with Jetty, deployed as an exploded WAR archive.

I've created a collection with the following code snippet:

---------------------------------------------------------------
String collectionConfig =
    "<collection compressed=\"false\" name=\"" + collectionName + "\">"
    + "<filer class=\"org.apache.xindice.core.filer.BTreeFiler\" gzip=\"false\"/>"
    + "<indexes>"
    + "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\""
    + " name=\"internalLink_attr_idx\" pattern=\"link@viewid\" type=\"String\"/>"
    + "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\""
    + " name=\"document_attr_idx\" pattern=\"document\" type=\"String\"/>"
    + "</indexes>"
    + "</collection>";
col = DatabaseManager.getCollection(uri);
CollectionManager collman =
    (CollectionManager) col.getService("CollectionManager", "1.0");
try {
    collman.createCollection(collectionName,
        XercesHelper.string2Dom(collectionConfig));
} catch (Exception exe) {
    String errMsg = "Error during the converting of the Collection-String to XML-DOM";
    log.error(errMsg);
    throw new XMLDBException(ErrorCodes.VENDOR_ERROR, -1, errMsg, exe);
}
---------------------------------------------------------------


If I search for "//[EMAIL PROTECTED]'170']", everything works fine, except that it takes the same time as without an index.

If I search for "//[EMAIL PROTECTED]'2045']", nothing happens: no result, nothing. Without the index I get some results back. This XPath search is very fast (80 ms), but without any result it is obviously useless :) The idx file of the first index is about 30 kB in size and the second one is 6 kB, which I think is the default.

For the first XPath query, it only matters whether a match exists in any XML document in the collection. I've seen on MARC that this can be done faster, so that the result of the XPath query is only the document itself or the ID of the document. How is this possible?


Thank you all very much,

Sascha









