Hello,

> Hello all!
>
> We have a node with 50 children, each child has the property
> URL, this property holds the URL.
>
> We want to use XPATH to query the nodes like this:
>
> /jcr:root/websites/*[jcr:like(@URL, "%\/\/www.domain.com%")]
>
> However for some reason this query takes about 30 seconds to execute
>
> JackRabbit version is 1.3.1, repository is configured to use
> local filesystem storage and file bundle persistence manager.
>
> Could somebody please advice how can we speed up this query?

You cannot. There have been multiple mailing list threads before about
jcr:like with a pattern starting with a %. You should not use a leading
% if you want searches to perform well (this holds for pretty much any
search implementation I am aware of, independent of Lucene).

So, it will be much faster if you have two tests, for example,

    https://www.domain.com% OR http://www.domain.com%

OTOH, I still wouldn't like the % at the end. If you want to do it the
way I think it should be done (in other words, fast even if you have
millions of links), you should configure your property URL to be
analyzed with your own custom URL analyzer. See [1] at the bottom for
an explanation.

To summarize what you should do:

Add an indexing_configuration.xml to your SearchIndex configuration,
and add something like:

    <analyzers>
      <analyzer class="com.domain.www.your.analyzers.urlAnalyzer">
        <property>URL</property>
      </analyzer>
    </analyzers>

and simply create an analyzer that indexes only the domain part of a
URL as a single term, i.e. www.domain.com (this shouldn't be too hard).

Now, you can search for your URLs like:

    /jcr:root/websites/*[jcr:contains(@URL, "someurl")]

This works because, when searching, someurl is parsed with the same
analyzer, resulting in a search for a single term. That still takes
only a couple of ms even if your repository grows to tens of millions
of documents.

Hope all is clear (probably not trivial, but IMO the best solution for
what you want).

[1] http://wiki.apache.org/jackrabbit/IndexingConfiguration

>
> Thank you in advance!
>
> --
> Eugene N Dzhurinsky
>
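
To make the "index only the domain as a single term" idea a bit more concrete, here is a minimal Java sketch of just the extraction step. It is not a Lucene Analyzer and uses no Jackrabbit API; the class name UrlHostExtractor and the method extractHost are hypothetical, and it assumes the URL property holds absolute URLs with a scheme. A real custom analyzer would wrap logic along these lines and emit the result as its one token.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit or Lucene: it only shows the
// "reduce a URL to its host" step that a custom URL analyzer could perform
// before emitting the host as a single index term.
final class UrlHostExtractor {

    private UrlHostExtractor() {
    }

    // Returns the lower-cased host of an absolute URL, or null when the
    // input is null, cannot be parsed, or has no host component.
    static String extractHost(String url) {
        if (url == null) {
            return null;
        }
        try {
            String host = new URI(url.trim()).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.domain.com/index.html",
            "https://www.domain.com:8443/a/b?x=1",
            "not a url"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + extractHost(sample));
        }
    }
}
```

Since the same reduction would be applied both at indexing time and to the query string, an http and an https link to the same site end up as the same term, so no wildcard is needed at either end.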
