buildbot failure in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a new failure on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

http://ci.apache.org/builders/oak-trunk-win7/builds/676

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1631284
Blamelist: chetanm

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.7
[X] +1 Release this package as Apache Jackrabbit Oak 1.0.7
questions
Hi,

I have some questions regarding the POC I am working on:

Lifecycle management: I have seen that it is not implemented in Oak, although it is specified by JCR. Is there any plan to implement it?

Observation: How does it scale? I need to have some custom operations executed on node creation, move, deletion, etc. I guess Observation is the way to go, but I wonder how this scales in case I need to be able to handle several billion nodes.

ACL: How does it scale? If I query a large repo for nodes and only have access to a few of them, how does the filtering work?

JCR vs RDBMS: I come from the RDBMS world and I am pretty new to JCR, so I apologize if these are dumb questions:

* So far, I have manipulated the JCR API (nodes, properties, events, ...) and was able to cover my basic use cases. But, in a real application, I need OO modelling and, therefore, at some point, a way to map my business model to JCR nodes (something like an ORM). I found Jackrabbit OCM (http://jackrabbit.apache.org/5-with-jackrabbit-ocm.html) but nothing in Oak. Is there something in the pipe?

* What are the strategies and tooling for data migration? I mean, if I have millions of nodes of a certain type and need to make some modification to that type definition (adding a mandatory property or node, changing a property type, ...), how should I proceed?

Thanks in advance for your answers,
Mohamed
buildbot success in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a restored build on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

http://ci.apache.org/builders/oak-trunk-win7/builds/677

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1631302
Blamelist: mduerig

Build succeeded!

sincerely,
 -The Buildbot
Re: Reindex and external indexes - Possibility of stale index data
Hi,

On Mon, Oct 13, 2014 at 7:33 AM, Chetan Mehrotra chetan.mehro...@gmail.com wrote:
> If we set reindex to true in any index definition then Oak would remove
> the existing index content before performing the reindex. This would work
> fine if the index content is stored within the NodeStore itself.

It is important to also mention that this appears as a single commit thanks to the MVCC model (delete + set reindexed index), so there is no downtime to speak of: the original index is available during the reindex process.

> However if the index is stored externally, e.g. Solr or a Lucene index
> with persistence set to filesystem, then I think we currently do not
> remove the existing index data, which might lead to the index containing
> stale data.

Agreed, this is a problem when storing the index outside the repo.

The interesting part here is that only content updates might be affected: deleting a node will not resurface it, because the query engine reloads nodes to see if they are readable to the current session (ACL checks) and so skips over the nodes it can't read, if I remember correctly.

Focusing on the Lucene index now, I went through the code a bit (no proper tests yet) and it looks like it might not be affected by this that much. A reindex call has an empty before state, so Lucene will update all the documents it finds [0] - no stale content on updates here, just missing deleted-node events. So the remaining question is about identifying content that was deleted between the indexed state and the current head state. One simple solution is to run a 'remove all documents' query on the Lucene index, but that has the downside of making the index unusable while the indexing process runs, so I don't see it as a really good option - maybe only as a fallback of sorts.

> Should we provide any sort of callback for indexers when reindex is
> requested?

Thinking about this a bit, there's a simpler way of handling a reindex call. If you really need to know that the current call is actually a reindex, you can check whether the before state is the empty one on the root index editor.

best,
alex

[0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexEditor.java#L109
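The "empty before state" check suggested above can be illustrated with a short sketch. The names below (NodeState, IndexEditor, EMPTY) are hypothetical stand-ins for illustration only, not the actual Oak editor API; the point is simply that a reindex diffs against the empty state, so every existing node is reported as added and the root editor can detect the reindex from its before state:

```java
// Hypothetical stand-in for an Oak-style node state; not the real Oak API.
interface NodeState {
    boolean exists();
}

class IndexEditor {
    // The "missing" state a reindex diffs against.
    static final NodeState EMPTY = () -> false;

    private final NodeState before;

    IndexEditor(NodeState before) {
        this.before = before;
    }

    // During a reindex the diff runs against the empty state, so checking
    // the before state on the root editor is enough to detect a reindex;
    // no extra callback for indexers would be needed.
    boolean isReindex() {
        return !before.exists();
    }
}
```

Under this assumption, an external indexer could clear its stale store (e.g. the on-disk Lucene directory) exactly when `isReindex()` is true, instead of relying on a dedicated reindex callback.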
Re: Reindex and external indexes - Possibility of stale index data
Hi,

As for external Lucene indexes, what about this:

* in the :data node, store an index creation time, in milliseconds since 1970
* use that as a path prefix for the actual index files

So if the index is configured as follows:

/oak:index/lucene { path: /quickstart/repo/lucenIndex }

then internally, Oak Lucene would create a node

/oak:index/lucene/:dataInProgress { time: 1413189793297 }

Then it would use that UUID as the prefix for the directory, and the index is created in:

/quickstart/repo/lucenIndex/1413189793297

When the index is built, the node :dataInProgress is renamed to :data:

/oak:index/lucene/:data { time: 1413189793297 }

For reads, this directory would be used. When reindexing, two nodes and directories would temporarily exist:

/oak:index/lucene/:data { time: 1413189793297 }
/oak:index/lucene/:dataInProgress { time: 1413189822022 }
/quickstart/repo/lucenIndex/1413189793297
/quickstart/repo/lucenIndex/1413189822022

Once the index is done, in one transaction the old :data node is removed and the node :dataInProgress is renamed to :data. Then the old directories are removed. You can only reindex once per millisecond, but I guess this isn't a problem.

Regards,
Thomas

On 13/10/14 10:29, Alex Parvulescu alex.parvule...@gmail.com wrote:
> [...]
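The timestamp-prefix scheme proposed above can be sketched outside of Oak with plain java.nio.file operations. The class and method names below are hypothetical; a real implementation would keep the :data/:dataInProgress pointers in repository nodes and flip them in a single commit rather than in plain fields:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class IndexDirectorySwap {
    private final Path baseDir;
    private Long current;    // timestamp of the live index (the :data node)
    private Long inProgress; // timestamp of the reindex (:dataInProgress)

    IndexDirectorySwap(Path baseDir) {
        this.baseDir = baseDir;
    }

    // Start a reindex: create a fresh directory named after the creation
    // time in milliseconds since 1970, leaving the live index untouched.
    Path beginReindex() throws IOException {
        inProgress = System.currentTimeMillis();
        return Files.createDirectories(baseDir.resolve(Long.toString(inProgress)));
    }

    // Finish: promote the in-progress directory to live (in the repo this
    // would be the one-transaction :dataInProgress -> :data rename) and
    // hand the old directory back to the caller for deletion.
    Path finishReindex() {
        Path old = current == null ? null : baseDir.resolve(Long.toString(current));
        current = inProgress;
        inProgress = null;
        return old;
    }

    // Readers always resolve the directory through the :data pointer.
    Path readDirectory() {
        return current == null ? null : baseDir.resolve(Long.toString(current));
    }
}
```

Because readers only ever follow the current pointer, the old index stays fully usable while the new directory is being filled, which matches the "no downtime during reindex" property discussed earlier in the thread.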
[RESULT] [VOTE] Release Apache Jackrabbit Oak 1.0.7
Hi,

On 09/10/14 16:56, Marcel Reutegger mreut...@adobe.com wrote:
> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.7.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.

The vote passes as follows:

+1 Marcel Reutegger
+1 Alex Parvulescu
+1 Michael Dürig
+1 Amit Jain
+1 Julian Reschke
+1 Thomas Müller

Thanks for voting. I'll push the release out.

Regards
Marcel
Re: Reindex and external indexes - Possibility of stale index data
Hi,

> Then would use that UUID as the prefix ...

Sorry, that should be "Then would use that _time_ as the prefix ..." - I thought about using a UUID first, but then changed to milliseconds since 1970, as that's easier (you immediately see which one is the latest directory). But a UUID would work as well.

Regards,
Thomas

On 13/10/14 10:45, Thomas Mueller muel...@adobe.com wrote:
> [...]
Release 1.1.1
Hello Team,

Today at 12PM BST I will cut Oak 1.1.1 (or at least try to): https://issues.apache.org/jira/browse/OAK-2184

All the issues are resolved/closed except https://issues.apache.org/jira/browse/OAK-2173. If you wish to delay the cut for some time, please contact me directly; otherwise I will reschedule the unresolved one.

Cheers
Davide
Re: Release 1.1.1
Hi Davide,

It would be good to have a bit more warning before cutting a release. 30 or so minutes doesn't give anybody enough time to fix pending issues.

best,
alex

On Mon, Oct 13, 2014 at 11:50 AM, Davide Giannella dav...@apache.org wrote:
> [...]
Re: Release 1.1.1
Correction: looking at my daily schedule, it'll be 2.30pm BST.

On 13/10/2014 10:50, Davide Giannella wrote:
> [...]
buildbot failure in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a new failure on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

http://ci.apache.org/builders/oak-trunk-win7/builds/679

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1631334
Blamelist: angela,chetanm

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
Re: buildbot failure in ASF Buildbot on oak-trunk-win7
This one looks like a Windows-specific issue:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean) on project oak-mk-api: Failed to clean project: Failed to delete E:\slave14\oak-trunk-win7\build\oak-mk-api\target\oak-mk-api-1.1-SNAPSHOT.jar -> [Help 1]

Chetan Mehrotra

On Mon, Oct 13, 2014 at 4:09 PM, build...@apache.org wrote:
> [...]
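For context, a delete that fails like this on Windows is usually caused by another process (antivirus, indexer, a lingering JVM) still holding the jar open. One common workaround, sketched below only as an illustration and not something the buildslave or the clean plugin is known to do here, is to retry the delete a few times before giving up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RetryingDelete {
    // Try to delete a file several times, sleeping between attempts, to
    // ride out short-lived Windows file locks. Returns true once the file
    // is gone (or was never there), false if every attempt failed.
    static boolean deleteWithRetry(Path file, int attempts, long sleepMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(file);
                return true;
            } catch (IOException e) {
                Thread.sleep(sleepMillis); // still locked; wait and retry
            }
        }
        return false;
    }
}
```

In practice, simply re-running the build once the locking process has exited also clears this kind of failure, which is why such errors often disappear on the next buildbot run.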
Re: Jira: Oak Fix Versions
On 08/10/2014 13:36, Marcel Reutegger wrote:
> On 08/10/14 14:20, Marcel Reutegger mreut...@adobe.com wrote:
>> because we don't have a 1.2 version ;) I'll create one in JIRA and
>> update unresolved issues.
>
> Done. Please change the fix version back to 1.1.x when you resolve an
> issue currently scheduled for 1.2. Also feel free to set it back to
> 1.1.1 when you intend to fix an issue for the next unstable release.

A question to clarify some doubts I have. Everything in trunk will be 1.2 (the next stable release), and when we cut an unstable one we use 1.1.x. Is that right?

Now, should we leave multiple versions on the issue? For example, we create an issue in trunk for the next iteration, so its fix version is 1.2. We resolve it and release it in the unstable cut, let's say 1.1.1; now the issue is resolved in 1.2, 1.1.1. We then backport it to the current stable, as it was a bug, and therefore have it resolved in 1.2, 1.1.1, 1.0.9. When we cut the next stable, the versions will be 1.2.0, 1.1.1, 1.0.9. Is this correct?

Thank you
Davide
buildbot failure in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a new failure on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

http://ci.apache.org/builders/oak-trunk-win7/builds/681

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1631353
Blamelist: reschke

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
Re: Jira: Oak Fix Versions
Hi,

On 13/10/14 12:55, Davide Giannella dav...@apache.org wrote:
> Everything in trunk will be 1.2 (next stable release) and when we cut
> an unstable one we use 1.1.x. Is that right?

Yes. The idea of the 1.2 fix version is to indicate which issues we intend to fix or implement before the 1.2 release.

> Now should we leave multiple versions on the issue? [...] When we'll
> cut the next new stable the versions will be 1.2.0, 1.1.1, 1.0.9. Is
> this correct?

I wouldn't keep the 1.2 fix version when we release an unstable 1.1.x with the resolved issues.

Regards
Marcel
HTTP API
Aloha,

Just wondering if there is any progress on the new REST API and what the relevant tickets are for its completion. Also, of course, hoping that the old HTTP API will be made available somehow in the default distribution again (at least in the meantime).

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org
buildbot success in ASF Buildbot on oak-trunk
The Buildbot has detected a restored build on builder oak-trunk while building ASF Buildbot. Full details are available at:

http://ci.apache.org/builders/oak-trunk/builds/628

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-vm_ubuntu
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1631353
Blamelist: angela,chetanm,reschke

Build succeeded!

sincerely,
 -The Buildbot
Re: Release 1.1.1
On 13/10/2014 11:25, Alex Parvulescu wrote:
> It would be good to have a bit more warning before cutting a release.
> 30 or so mins doesn't give anybody enough time to fix pending issues.

Hello Alex,

I agree, but I sent out the announcement today at 10.50am BST for the 2.30pm BST cut, which is roughly 3 hours :) Anyhow, as we don't have to stick to a precise time, I'm happy to delay if needed. Do you need more time?

Cheers
Davide

PS: next time I'll send it out early in the morning (8am or so). I just came back today and was overwhelmed by the amount of email in my inbox.
RE: oak-run public distribution
Hello folks,

Did the discussion eventually lead to this public release? I still get requests for access to the binary from customers and even from the support organization. Where can I get oak-run 1.0.7? https://issues.adobe.com/secure/attachment/282322/oak-run-1.0.7-SNAPSHOT.jar is a snapshot version; I am not sure it is the final release. The goal is still the same: add a Download link for oak-run-1.0.x to the public Oak HFs package share pages as well.

Geoffroy Schneck
Enterprise Support Engineer - Team Lead
Marketing Cloud Customer Care
T: +41 61 226 55 70  M: +41 79 207 45 04
email: gschn...@adobe.com
Barfuesserplatz 6, CH-4001 Basel, Switzerland
www.adobe.com
For CQ support and tips, follow us on Twitter: @AdobeMktgCare

-----Original Message-----
From: Jukka Zitting [mailto:ju...@zitting.name]
Sent: Friday, 29 August 2014 21:29
To: oak-dev@jackrabbit.apache.org
Subject: Re: oak-run public distribution

Hi,

On Thu, Aug 28, 2014 at 8:42 PM, Michael Dürig mdue...@apache.org wrote:
> So far we didn't deploy oak-run. I'm not sure why, but I think there
> were concerns regarding making developer tooling available to end users.

During 0.x we figured that anyone working with Oak should be able to build the jar directly from sources, so making the pre-built binary available as a download wasn't too important. Now with 1.x I think it would make sense to include the oak-run jar as a download, just like we do with jackrabbit-standalone.

2014-08-28 11:53 GMT-04:00 Chetan Mehrotra chetan.mehro...@gmail.com:
> This was discussed earlier [1] and Jukka mentioned that there were some
> restrictions on deployment size. I tried pushing a snapshot version
> sometime back and that got deployed fine. So I think we should try to
> deploy artifacts again.

That's a bit orthogonal, as the deployment question is about making oak-run available on the central Maven repository, which we can do regardless of whether we also post the jar on the Jackrabbit downloads page.

BR,
Jukka Zitting
updating creating-releases.html
Good morning,

I'd like to update http://jackrabbit.apache.org/creating-releases.html#CreatingReleases-ReleaseArtifacts to add some more details here and there for first-timers on the release process. What/where should I update?

Cheers
Davide