Re: MissingLastRevSeeker
Hi Julian, The LastRevRecoveryAgent is executed in 2 places: 1. On DocumentNodeStore startup, where the MissingLastRevSeeker is used to get potential candidates for recovery. 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds). Short description: the MissingLastRevSeeker will be called rarely in this case. Long description: in this case a less expensive query is executed to find all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have shut down unexpectedly and their 'leaseEndTime' has not expired, then MissingLastRevSeeker will check all potential candidates. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. +1, since it will be executed on every startup. RDBDocumentStore already maintains an index on the _modified property, so optimized querying is possible. Thanks Amit On Mon, Aug 25, 2014 at 7:36 PM, Julian Reschke julian.resc...@gmx.de wrote: Hi there, it appears that the MissingLastRevSeeker (oak-core), when run, will be very slow on large repos, unless they use a MongoDocumentStore (which has a special-cased query). Question: when will this code execute? I've seen it occasionally during benchmarking, but it doesn't seem to happen always. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. Best regards, Julian
Re: MissingLastRevSeeker
On 2014-08-26 08:03, Amit Jain wrote: Hi Julian, The LastRevRecoveryAgent is executed at 2 places 1. On DocumentNodeStore startup where the MissingLastRevSeeker is used to get potential candidates for recovery. 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds). Short description is that MissingLastRevSeeker will be called rarely in this case. Long description - In this case a less expensive query is executed to find out all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shutdown and their 'leaseEndTime' has not expired then MissingLastRevSeeker will check all potential candidates. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. +1. Since, It will be executed on every startup. RDBDocumentStore already maintains the index on _modified property so, optimized querying is possible. Thanks Amit OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? Best regards, Julian
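The two-step behaviour Amit describes for the periodic job (a cheap pass over the clusterNodes, then the expensive seeker only where needed) can be pictured roughly as below. This is a minimal sketch, not Oak's code: `ClusterNodeEntry`, `RecoveryJobSketch` and `runOnce` are hypothetical names, and the rule for deciding that a node needs recovery is passed in by the caller because the thread does not pin it down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.function.Predicate;

/** Hypothetical stand-in for a cluster node entry; not Oak's real class. */
final class ClusterNodeEntry {
    final int clusterId;
    final long leaseEndTime; // milliseconds since epoch

    ClusterNodeEntry(int clusterId, long leaseEndTime) {
        this.clusterId = clusterId;
        this.leaseEndTime = leaseEndTime;
    }
}

final class RecoveryJobSketch {
    /**
     * One run of the periodic job: a cheap pass over the cluster node
     * entries, then the expensive seeker only for entries that qualify.
     * Returns the cluster ids that were handed to the seeker.
     */
    static List<Integer> runOnce(List<ClusterNodeEntry> entries,
                                 Predicate<ClusterNodeEntry> needsRecovery,
                                 IntConsumer expensiveSeeker) {
        List<Integer> recovered = new ArrayList<>();
        for (ClusterNodeEntry entry : entries) {
            if (needsRecovery.test(entry)) {
                expensiveSeeker.accept(entry.clusterId);
                recovered.add(entry.clusterId);
            }
        }
        return recovered;
    }
}
```

If no entry qualifies, the seeker is never invoked, which would be consistent with Julian seeing it only occasionally during benchmarking.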
[VOTE] Release Apache Jackrabbit Oak 1.0.5
A candidate for the Jackrabbit Oak 1.0.5 release is available at: https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.5/ The release candidate is a zip archive of the sources in: https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.5/ The SHA1 checksum of the archive is 2cd71913fe66ba9491ee7edb4e82469e228412c9. A staged Maven repository is available for review at: https://repository.apache.org/ The command for running automated checks against this release candidate is: $ sh check-release.sh oak 1.0.5 2cd71913fe66ba9491ee7edb4e82469e228412c9 Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.5 [ ] -1 Do not release this package because... My vote is +1 Regards Thomas
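For reviewers who want to double-check the published SHA1 by hand, the comparison can be sketched in a few lines of Java. This is only an illustration using the standard `MessageDigest` API; `Sha1Sketch`, `sha1Hex` and `matches` are hypothetical names, and the snippet is not a description of what check-release.sh does internally.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha1Sketch {
    /** Returns the lower-case hex SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Compares the digest of data with an expected hex checksum. */
    static boolean matches(byte[] data, String expectedHex) {
        return sha1Hex(data).equalsIgnoreCase(expectedHex.trim());
    }
}
```

The bytes of the downloaded archive would be passed as `data` and the checksum from the vote mail as `expectedHex`.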
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.5
On 26.8.14 8:42 , Thomas Mueller wrote: Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [X] +1 Release this package as Apache Jackrabbit Oak 1.0.5 Michael
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.5
+1 all checks ok On Tue, Aug 26, 2014 at 8:42 AM, Thomas Mueller muel...@adobe.com wrote: A candidate for the Jackrabbit Oak 1.0.5 release is available at: https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.5/ The release candidate is a zip archive of the sources in: https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.5/ The SHA1 checksum of the archive is 2cd71913fe66ba9491ee7edb4e82469e228412c9. A staged Maven repository is available for review at: https://repository.apache.org/ The command for running automated checks against this release candidate is: $ sh check-release.sh oak 1.0.5 2cd71913fe66ba9491ee7edb4e82469e228412c9 Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.5 [ ] -1 Do not release this package because... My vote is +1 Regards Thomas
Re: JCR API implementation transparency
On 26.8.14 7:14 , Tobias Bocanegra wrote: IMO, this should work, even if the value is not a ValueImpl. In this case, it should fall back to the API methods to read the binary. WDYT? Ack. This is most likely a regression introduced with OAK-1164. Michael
Re: [DISCUSS] supporting faceting in Oak query engine
Hi Laurie, 2014-08-25 18:43 GMT+02:00 Laurie Byrum lby...@adobe.com: Hi Tommaso, I am happy to see this thread! ;-) Questions: Do you expect to want to support hierarchical or pivoted facets soonish? I would say 'why not' if we have a valid use case. If so, does that influence this decision? I think so, especially it would influence the way that may be implemented. Do you know how ACLs will come into play with your facet implementation? not yet, I think that's one of the open points (e.g. Lukas mentioned that HippoCMS did use 'virtual nodes' for them) we should take care of; each 'term' in the facet should be properly checked, but of course doing this kind of check at that fine grain would be costly so we need to come up with a solution which is both correct from the security point of view and performant. If so, does that influence this decision? :-) yes, I think so :) Any suggestions and / or feedback would be highly welcome, especially from potential users of this feature so that we properly tackle your requirements (if any). Thanks and regards, Tommaso Thanks! Laurie On 8/25/14 7:08 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hi all, since this has been asked every now and then [1] and since I think it's a pretty useful and common feature for search engine nowadays I'd like to discuss introduction of facets [2] for the Oak query engine. Pros: having facets in search results usually helps filtering (drill down) the results before browsing all of them, so the main usage would be for client code. Impact: probably change / addition in both the JCR and Oak APIs to support returning other than just nodes (a NodeIterator and a Cursor respectively). Right now a couple of ideas on how we could do that come to my mind, both based on the approach of having an Oak index for them: 1. a (multivalued) property index for facets, meaning we would store the facets in the repository, so that we would run a query against it to have the facets of an originating query. 2. 
a dedicated QueryIndex implementation, eventually leveraging Lucene faceting capabilities, which could use the Lucene index we already have, together with a sidecar index [3]. What do you think? Regards, Tommaso [1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets [2] : http://en.wikipedia.org/wiki/Faceted_search [3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
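The ACL concern raised in this exchange (each facet term must only reflect nodes the session may read) can be illustrated with a small counting sketch. It assumes the simplest and most expensive strategy, a per-row read check, which is exactly the fine-grained check Tommaso warns may be costly; `FacetRow`, `FacetCountSketch` and the `canRead` predicate are hypothetical and not part of the JCR or Oak APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical result row: a node path plus the value of the faceted property. */
final class FacetRow {
    final String path;
    final String term;

    FacetRow(String path, String term) {
        this.path = path;
        this.term = term;
    }
}

final class FacetCountSketch {
    /** Counts each term, skipping rows the current session may not read. */
    static Map<String, Integer> count(List<FacetRow> rows, Predicate<String> canRead) {
        Map<String, Integer> counts = new TreeMap<>();
        for (FacetRow row : rows) {
            if (canRead.test(row.path)) {
                counts.merge(row.term, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Counting without the `canRead` filter would be cheaper, but the counts themselves could then reveal the existence of nodes the user is not allowed to see.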
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.5
+1 Tommaso 2014-08-26 8:42 GMT+02:00 Thomas Mueller muel...@adobe.com: A candidate for the Jackrabbit Oak 1.0.5 release is available at: https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.5/ The release candidate is a zip archive of the sources in: https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.5/ The SHA1 checksum of the archive is 2cd71913fe66ba9491ee7edb4e82469e228412c9. A staged Maven repository is available for review at: https://repository.apache.org/ The command for running automated checks against this release candidate is: $ sh check-release.sh oak 1.0.5 2cd71913fe66ba9491ee7edb4e82469e228412c9 Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.5 [ ] -1 Do not release this package because... My vote is +1 Regards Thomas
Re: [DISCUSS] supporting faceting in Oak query engine
2014-08-25 19:02 GMT+02:00 Lukas Smith sm...@pooteeweet.org: Aloha, Aloha! you should definitely talk to the HippoCMS developers. They forked Jackrabbit 2.x to add facetting as virtual nodes. They ran into some performance issues but I am sure they still have value-able feedback on this. Cool, thanks for letting us know, if you or any other (from Hippo) would like to give some more insight on pros and cons of such an approach that'd be very good. Regards, Tommaso regards, Lukas Kahwe Smith On 25 Aug 2014, at 18:43, Laurie Byrum lby...@adobe.com wrote: Hi Tommaso, I am happy to see this thread! Questions: Do you expect to want to support hierarchical or pivoted facets soonish? If so, does that influence this decision? Do you know how ACLs will come into play with your facet implementation? If so, does that influence this decision? :-) Thanks! Laurie On 8/25/14 7:08 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hi all, since this has been asked every now and then [1] and since I think it's a pretty useful and common feature for search engine nowadays I'd like to discuss introduction of facets [2] for the Oak query engine. Pros: having facets in search results usually helps filtering (drill down) the results before browsing all of them, so the main usage would be for client code. Impact: probably change / addition in both the JCR and Oak APIs to support returning other than just nodes (a NodeIterator and a Cursor respectively). Right now a couple of ideas on how we could do that come to my mind, both based on the approach of having an Oak index for them: 1. a (multivalued) property index for facets, meaning we would store the facets in the repository, so that we would run a query against it to have the facets of an originating query. 2. a dedicated QueryIndex implementation, eventually leveraging Lucene faceting capabilities, which could use the Lucene index we already have, together with a sidecar index [3]. What do you think? 
Regards, Tommaso [1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets [2] : http://en.wikipedia.org/wiki/Faceted_search [3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
Re: MissingLastRevSeeker
Hi, OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? It would make sense to add a generic method which queries on a particular property (possibly limiting to only indexed ones), like below, to the DocumentStore interface. <T extends Document> List<T> queryProperty(Collection<T> collection, String indexedProperty, String fromKey, String toKey, int limit); Thoughts? Thanks Amit On Tue, Aug 26, 2014 at 12:03 PM, Julian Reschke julian.resc...@greenbytes.de wrote: On 2014-08-26 08:03, Amit Jain wrote: Hi Julian, The LastRevRecoveryAgent is executed at 2 places 1. On DocumentNodeStore startup where the MissingLastRevSeeker is used to get potential candidates for recovery. 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds). Short description is that MissingLastRevSeeker will be called rarely in this case. Long description - In this case a less expensive query is executed to find out all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shutdown and their 'leaseEndTime' has not expired then MissingLastRevSeeker will check all potential candidates. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. +1. Since, It will be executed on every startup. RDBDocumentStore already maintains the index on _modified property so, optimized querying is possible. Thanks Amit OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? Best regards, Julian
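To make the proposed method more concrete, here is one way a store could answer it from a sorted index. This is a sketch under stated assumptions: the thread does not say whether fromKey/toKey bound the property value or the document key, so the sketch assumes they bound the property value, exclusively on both ends; `IndexedPropertyStoreSketch` and its simplified, non-generic signature are hypothetical and not the DocumentStore API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory store; names are illustrative, not Oak's API. */
final class IndexedPropertyStoreSketch {
    // property name -> (property value -> ids of documents with that value)
    private final Map<String, NavigableMap<String, List<String>>> indexes = new TreeMap<>();

    void put(String id, String property, String value) {
        indexes.computeIfAbsent(property, p -> new TreeMap<>())
               .computeIfAbsent(value, v -> new ArrayList<>())
               .add(id);
    }

    /** Ids whose indexed property value lies strictly between fromKey and toKey. */
    List<String> queryProperty(String indexedProperty, String fromKey, String toKey, int limit) {
        List<String> result = new ArrayList<>();
        NavigableMap<String, List<String>> index = indexes.get(indexedProperty);
        if (index == null || fromKey.compareTo(toKey) >= 0) {
            return result; // no such index, or an empty range
        }
        for (List<String> ids : index.subMap(fromKey, false, toKey, false).values()) {
            for (String id : ids) {
                if (result.size() >= limit) {
                    return result;
                }
                result.add(id);
            }
        }
        return result;
    }
}
```

The point of the sketch is that the range lookup only touches index entries inside the range, which is the optimization an RDB or Mongo backend could get from its own index.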
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.5
On 26/08/2014 07:42, Thomas Mueller wrote: [X] +1 Release this package as Apache Jackrabbit Oak 1.0.5 Davide
Re: MissingLastRevSeeker
Hi, I would only add it if really necessary. We already have a very similar method:
    /**
     * Get a list of documents where the key is greater than a start value and
     * less than an end value. The returned documents are immutable.
     *
     * @param <T> the document type
     * @param collection the collection
     * @param fromKey the start value (excluding)
     * @param toKey the end value (excluding)
     * @param indexedProperty the name of the indexed property (optional)
     * @param startValue the minimum value of the indexed property
     * @param limit the maximum number of entries to return
     * @return the list (possibly empty)
     */
    @Nonnull
    <T extends Document> List<T> query(Collection<T> collection, String fromKey, String toKey, String indexedProperty, long startValue, int limit);
Can't we use this method to at least narrow down the query to the lower bound? I think for the purpose of the last rev seeker, this should be sufficient. Regards Marcel On 26/08/14 10:18, Amit Jain am...@ieee.org wrote: Hi, OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? It would make sense to add a generic method which queries on a particular property(possibly limiting to only indexed ones), like below, to the DocumentStore interface. <T extends Document> List<T> queryProperty(Collection<T> collection, String indexedProperty, String fromKey, String toKey, int limit); Thoughts? Thanks Amit On Tue, Aug 26, 2014 at 12:03 PM, Julian Reschke julian.resc...@greenbytes.de wrote: On 2014-08-26 08:03, Amit Jain wrote: Hi Julian, The LastRevRecoveryAgent is executed at 2 places 1. On DocumentNodeStore startup where the MissingLastRevSeeker is used to get potential candidates for recovery. 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds). Short description is that MissingLastRevSeeker will be called rarely in this case. 
Long description - In this case a less expensive query is executed to find out all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shutdown and their 'leaseEndTime' has not expired then MissingLastRevSeeker will check all potential candidates. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. +1. Since, It will be executed on every startup. RDBDocumentStore already maintains the index on _modified property so, optimized querying is possible. Thanks Amit OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? Best regards, Julian
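Marcel's suggestion relies on the contract in the quoted Javadoc: keys strictly between fromKey and toKey, and, when an indexed property is given, only documents whose value is at least startValue. A minimal in-memory model of that contract is sketched below; `RangeQuerySketch` is hypothetical, returns ids instead of documents, and says nothing about how MongoDocumentStore or RDBDocumentStore implement the method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory model of the quoted query() contract. */
final class RangeQuerySketch {
    // document id -> (property name -> numeric value)
    private final NavigableMap<String, Map<String, Long>> docs = new TreeMap<>();

    void put(String id, String property, long value) {
        docs.computeIfAbsent(id, k -> new TreeMap<>()).put(property, value);
    }

    /** Ids in (fromKey, toKey) whose indexedProperty, if given, is >= startValue. */
    List<String> query(String fromKey, String toKey,
                       String indexedProperty, long startValue, int limit) {
        List<String> ids = new ArrayList<>();
        if (fromKey.compareTo(toKey) >= 0) {
            return ids; // empty key range
        }
        for (Map.Entry<String, Map<String, Long>> entry
                : docs.subMap(fromKey, false, toKey, false).entrySet()) {
            if (ids.size() >= limit) {
                break;
            }
            if (indexedProperty != null) {
                Long value = entry.getValue().get(indexedProperty);
                if (value == null || value < startValue) {
                    continue; // property missing or below the lower bound
                }
            }
            ids.add(entry.getKey());
        }
        return ids;
    }
}
```

For the last rev seeker this would mean passing a key range that covers all documents, `_modified` as the indexed property and the recovery start time as `startValue`, so that only the lower bound narrows the result.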
Re: MissingLastRevSeeker
Hi, I was proposing the additional method for cases where we want to query the indexed properties other than _id like needed in MongoBlobReferenceIterator and MongoMissingLastRevSeeker. But, Can't we use this method to at least narrow down the query to the lower bound? I think for the purpose of the last rev seeker, this should be sufficient. Yes, this should speed up the query from what we have currently. So, right now we can make this change and see if further improvement is necessary. Will create a jira to track this. On Tue, Aug 26, 2014 at 2:37 PM, Marcel Reutegger mreut...@adobe.com wrote: Hi, I would only add it if really necessary. We already have a very similar method:
    /**
     * Get a list of documents where the key is greater than a start value and
     * less than an end value. The returned documents are immutable.
     *
     * @param <T> the document type
     * @param collection the collection
     * @param fromKey the start value (excluding)
     * @param toKey the end value (excluding)
     * @param indexedProperty the name of the indexed property (optional)
     * @param startValue the minimum value of the indexed property
     * @param limit the maximum number of entries to return
     * @return the list (possibly empty)
     */
    @Nonnull
    <T extends Document> List<T> query(Collection<T> collection, String fromKey, String toKey, String indexedProperty, long startValue, int limit);
Can't we use this method to at least narrow down the query to the lower bound? I think for the purpose of the last rev seeker, this should be sufficient. Regards Marcel On 26/08/14 10:18, Amit Jain am...@ieee.org wrote: Hi, OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? It would make sense to add a generic method which queries on a particular property(possibly limiting to only indexed ones), like below, to the DocumentStore interface. 
<T extends Document> List<T> queryProperty(Collection<T> collection, String indexedProperty, String fromKey, String toKey, int limit); Thoughts? Thanks Amit On Tue, Aug 26, 2014 at 12:03 PM, Julian Reschke julian.resc...@greenbytes.de wrote: On 2014-08-26 08:03, Amit Jain wrote: Hi Julian, The LastRevRecoveryAgent is executed at 2 places 1. On DocumentNodeStore startup where the MissingLastRevSeeker is used to get potential candidates for recovery. 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds). Short description is that MissingLastRevSeeker will be called rarely in this case. Long description - In this case a less expensive query is executed to find out all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shutdown and their 'leaseEndTime' has not expired then MissingLastRevSeeker will check all potential candidates. Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query. +1. Since, It will be executed on every startup. RDBDocumentStore already maintains the index on _modified property so, optimized querying is possible. Thanks Amit OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement? Best regards, Julian
Re: reindex improvements
Hi Davide, this would be nice indeed! wouldn’t that be “indexPath”, not “re-indexPath” ? Nicolas On 26 Aug 2014, at 12:04, Davide Giannella dav...@apache.org wrote: Hello team, when we trigger a reindex by setting `reindex=true` on the index definition, the algorithm scans the whole repository and reports each node as modified/added to the specified index. While this works with small repositories, it doesn't really scale to big ones. To take an extreme example: we have a repository of 2 million nodes with only 1 node carrying the required property. The reindex will keep going until all 2m nodes have been scanned. And with very active repositories where we change a lot of nodes, manually or not, we could virtually have an endless reindexing. In my experience with content repositories, clients are normally interested in querying only parts of them, for example /content. I was thinking that it could add value if we could add an additional property to the index definition: reindexPaths (multivalue, String). When this property is specified, the reindex will happen only on those paths, in the order in which they are specified, and it could potentially make the already indexed content available to the query engine for returning partial results whenever a path is completed. A single path could be just a path or a glob/regex. I'm for using a Java regex as it gives the end user a lot of power for fine tuning, but on the other hand regex evaluation is pretty slow... thoughts? Cheers Davide
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.5
On 2014-08-26 08:42, Thomas Mueller wrote: ... Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.5 [ ] -1 Do not release this package because... ... [X] +1 Release this package as Apache Jackrabbit Oak 1.0.5
Re: [DISCUSS] supporting faceting in Oak query engine
This looks useful Tommaso. With OAK-2005 we should be able to support multiple LuceneIndexes and manage them easily. If we can abstract all this out and just expose the facet information as virtual node that would simplify the stuff for end users. Probably we can have a read only NodeStore impl to expose the faceted data bound to a system path. Otherwise we would need to expose the Lucene API and OakDirectory Chetan Mehrotra On Tue, Aug 26, 2014 at 1:28 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: 2014-08-25 19:02 GMT+02:00 Lukas Smith sm...@pooteeweet.org: Aloha, Aloha! you should definitely talk to the HippoCMS developers. They forked Jackrabbit 2.x to add facetting as virtual nodes. They ran into some performance issues but I am sure they still have value-able feedback on this. Cool, thanks for letting us know, if you or any other (from Hippo) would like to give some more insight on pros and cons of such an approach that'd be very good. Regards, Tommaso regards, Lukas Kahwe Smith On 25 Aug 2014, at 18:43, Laurie Byrum lby...@adobe.com wrote: Hi Tommaso, I am happy to see this thread! Questions: Do you expect to want to support hierarchical or pivoted facets soonish? If so, does that influence this decision? Do you know how ACLs will come into play with your facet implementation? If so, does that influence this decision? :-) Thanks! Laurie On 8/25/14 7:08 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hi all, since this has been asked every now and then [1] and since I think it's a pretty useful and common feature for search engine nowadays I'd like to discuss introduction of facets [2] for the Oak query engine. Pros: having facets in search results usually helps filtering (drill down) the results before browsing all of them, so the main usage would be for client code. Impact: probably change / addition in both the JCR and Oak APIs to support returning other than just nodes (a NodeIterator and a Cursor respectively). 
Right now a couple of ideas on how we could do that come to my mind, both based on the approach of having an Oak index for them: 1. a (multivalued) property index for facets, meaning we would store the facets in the repository, so that we would run a query against it to have the facets of an originating query. 2. a dedicated QueryIndex implementation, eventually leveraging Lucene faceting capabilities, which could use the Lucene index we already have, together with a sidecar index [3]. What do you think? Regards, Tommaso [1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets [2] : http://en.wikipedia.org/wiki/Faceted_search [3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
Re: MissingLastRevSeeker
On 2014-08-26 08:03, Amit Jain wrote: Hi Julian, The LastRevRecoveryAgent is executed at 2 places 1. On DocumentNodeStore startup where the MissingLastRevSeeker is used to get potential candidates for recovery. Sure? I've been logging it, and I don't see it called on every startup... ... Best regards, Julian
Re: MissingLastRevSeeker
On 2014-08-26 11:32, Amit Jain wrote: Hi, I was proposing the additional method for cases where we want to query the indexed properties other than _id like needed in MongoBlobReferenceIterator and MongoMissingLastRevSeeker. But, Can't we use this method to at least narrow down the query to the lower bound? I think for the purpose of the last rev seeker, this should be sufficient. Yes, this should speed up the query from what we have currently. So, right now we can make this change and see if further improvement is necessary. Will create a jira to track this. ... +1. I can take over, if you want...
Re: reindex improvements
Hi Davide, So what would happen to the already-indexed content which wasn't in one of the reindexPaths? For example, let's say I'm building an index of a property called keywords. In the repo, I have: /content/foo@keywords=something /content/bar/one@keywords=something /content/bar/two@keywords=something And then I trigger a reindex with reindexPaths = /content/bar. Would //element(*)[@keywords='something'] still return /content/foo ? Regards, Justin On Tue, Aug 26, 2014 at 6:04 AM, Davide Giannella dav...@apache.org wrote: Hello team, when we trigger a reindex by setting `reindex=true` on the index definition, the algorithm scans the whole repository and reports each node as modified/added to the specified index. While this works with small repositories, it doesn't really scale to big ones. To take an extreme example: we have a repository of 2 million nodes with only 1 node carrying the required property. The reindex will keep going until all 2m nodes have been scanned. And with very active repositories where we change a lot of nodes, manually or not, we could virtually have an endless reindexing. In my experience with content repositories, clients are normally interested in querying only parts of them, for example /content. I was thinking that it could add value if we could add an additional property to the index definition: reindexPaths (multivalue, String). When this property is specified, the reindex will happen only on those paths, in the order in which they are specified, and it could potentially make the already indexed content available to the query engine for returning partial results whenever a path is completed. A single path could be just a path or a glob/regex. I'm for using a Java regex as it gives the end user a lot of power for fine tuning, but on the other hand regex evaluation is pretty slow... thoughts? Cheers Davide
Re: reindex improvements
On 26/08/2014 11:27, Nicolas Peltier wrote: Hi Davide, this would be nice indeed! wouldn’t that be “indexPath”, not “re-indexPath” ? I'd rather keep a sort of namespace in the property naming. By stating `reindexPath` it should be clear that it is only related to reindexing, and that if the index is then global (under /oak:index) it will index the whole repository. Any other opinions? I'm not convinced of my idea yet. There's something there that smells to me. :) Cheers Davide
Re: reindex improvements
On 26/08/2014 14:13, Justin Edelson wrote: Hi Davide, So what would happen to the already-indexed content which wasn't in one of the reindexPaths? For example, let's say I'm building an index of a property called keywords. In the repo, I have: /content/foo@keywords=something /content/bar/one@keywords=something /content/bar/two@keywords=something And then I trigger a reindex with reindexPaths = /content/bar. Would //element(*)[@keywords='something'] still return /content/foo ? As I see it, no. Currently, when reindexing, the :index node, where the actual index is stored, is deleted and recreated. I would keep the same approach. I'm thinking of this as an advanced feature that someone has to know how to use. So in the above example I would specify either: /content or /content/bar, /content/foo. It's a dangerous thing though. I can see it. :) D.
Re: reindex improvements
Hi, On Tue, Aug 26, 2014 at 10:01 AM, Davide Giannella dav...@apache.org wrote: On 26/08/2014 14:13, Justin Edelson wrote: Hi Davide, So what would happen to the already-indexed content which wasn't in one of the reindexPaths? For example, let's say I'm building an index of a property called keywords. In the repo, I have: /content/foo@keywords=something /content/bar/one@keywords=something /content/bar/two@keywords=something And then I trigger a reindex with reindexPaths = /content/bar. Would //element(*)[@keywords='something'] still return /content/foo ? As I see it, no. Currently, when reindexing, the :index node, where the actual index is stored, is deleted and recreated. I would keep the same approach. I'm thinking of this as an advanced feature that someone has to know how to use. So in the above example I would specify either: /content or /content/bar, /content/foo. It's a dangerous thing though. I can see it. :) In this case, I think Thomas's suggestion makes much more sense. Let's just add a property to the QID which allows an index to be restricted to particular paths. Regards, Justin D.
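Justin's three-node example can be checked against a simple path restriction. The sketch below assumes plain matching on whole path segments (no glob or regex, which Davide notes would be slower); `PathRestrictionSketch` and `isIncluded` are hypothetical names, not an existing Oak property or class.

```java
import java.util.List;

/** Hypothetical helper; the class and method names are illustrative only. */
final class PathRestrictionSketch {
    /** True if path equals one of the included paths or lies below it. */
    static boolean isIncluded(String path, List<String> includedPaths) {
        for (String included : includedPaths) {
            if ("/".equals(included)
                    || path.equals(included)
                    || path.startsWith(included + "/")) {
                return true;
            }
        }
        return false;
    }
}
```

With the restriction set to /content/bar, the nodes /content/bar/one and /content/bar/two are included while /content/foo is not, so a query served by that index would no longer return /content/foo. That matches Davide's answer and shows why the restriction has to be chosen with care.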