[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641579#comment-13641579 ] Thomas Mueller commented on JCR-3534:

Well, this message is an access token. The message data must not be a general access token. I didn't say general access token, but it's still an access token: it grants access to read a certain binary. Sure, we can argue now what an access token exactly is.

Add JackrabbitSession.getValueByContentId method
Key: JCR-3534
URL: https://issues.apache.org/jira/browse/JCR-3534
Project: Jackrabbit Content Repository
Issue Type: New Feature
Components: jackrabbit-api, jackrabbit-core
Affects Versions: 2.6
Reporter: Felix Meschberger
Attachments: JCR-3534.patch

We have a couple of use cases where we would like to leverage the global data store to avoid sending and copying large binary data unnecessarily: we have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other, we don't want to send potentially large binary data (e.g. video files) if not needed.

The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether such content already exists and would reuse it if so:

    String ci = contentIdentity_from_sender;
    try {
        Value v = session.getValueByContentIdentity(ci);
        Property p = targetNode.setProperty(propName, v);
    } catch (ItemNotFoundException ie) {
        // unknown or invalid content identity
    } catch (RepositoryException re) {
        // some other exception
    }

Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round-tripping the JackrabbitValue.getContentIdentity(), preventing superfluous copying and moving of binary data.
See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640355#comment-13640355 ] Thomas Mueller commented on JCR-3534:

I was chatting with Tommaso Teofili about the basic data structures, features, and security protocols. There are still a few open questions regarding the API, but here is what we have so far:

DataIdentifier: The (unencrypted and unsigned) identifier of the binary, as already used by the Jackrabbit DataStore. Please note it could be a reference to a file, or, for small binaries, contain the data itself.

DataStoreSecret: A secret value that needs to be configured to be the same value in all repositories that share the same physical data store. It is used as the basis to encrypt and decrypt the DataIdentifier, and to sign and verify the signature. This could be a configuration parameter of the DataStore element in the repository.xml, but then we would probably need to change each DataStore implementation where we want support for the new feature. To avoid that, should we add a new element to the repository.xml? Not sure what is the easiest.

BinaryReferenceMessage: The encrypted DataIdentifier, the random salt, and the expiry time, plus the signature of all of that. The encryption key for the DataIdentifier is the (SHA-1) hash of the random salt combined with the DataStoreSecret (this is to avoid re-using the same encryption key for all BinaryReferenceMessages). The random salt is per message. The expiry time is the maximum system time up to which the BinaryReferenceMessage is accepted (same as for time-limited S3 URLs), for example the system time the message was generated plus 2 hours or so. The signature is the HMAC of the rest of the message, with the DataStoreSecret as the key.

To simplify development/support, the message should be readable, for example JSON or a URL. Example (shortened): {encryptedDataId:0123456789abcd, salt:1234, expiry:3456, signature:4567}. This will also allow us to change the algorithms in the future.

For now, we could use the following algorithms / formats: 128 bit DataStoreSecret and salt (generated with a SecureRandom); AES-256 encryption / AES-CTR mode; expiry: milliseconds since 1970 UTC; signature: HMAC-SHA-1.
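The BinaryReferenceMessage layout described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the class `BinaryReferenceMessageSketch`, its methods, and the colon-separated hex format are hypothetical. Because SHA-1 yields only 160 bits, the sketch truncates the hash to a 128-bit AES key instead of the AES-256 mentioned above, and it uses a zero IV on the assumption that each salted key encrypts exactly one message.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for illustration; not part of the Jackrabbit API. */
final class BinaryReferenceMessageSketch {

    /** Builds "encryptedDataId:salt:expiry:signature" (hex fields, decimal expiry). */
    static String create(String dataId, byte[] secret, long expiryMillis) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] encrypted = crypt(Cipher.ENCRYPT_MODE, dataId.getBytes(StandardCharsets.UTF_8), salt, secret);
        String payload = hex(encrypted) + ":" + hex(salt) + ":" + expiryMillis;
        return payload + ":" + hex(hmac(payload, secret));
    }

    /** Returns the DataIdentifier, or null if the message is malformed, forged, or expired. */
    static String open(String message, byte[] secret, long nowMillis) {
        String[] parts = message.split(":");
        if (parts.length != 4) {
            return null;
        }
        String payload = parts[0] + ":" + parts[1] + ":" + parts[2];
        byte[] signature = unhex(parts[3]);
        // Verify the signature before trusting any other field.
        if (signature == null || !MessageDigest.isEqual(hmac(payload, secret), signature)) {
            return null;
        }
        if (nowMillis > Long.parseLong(parts[2])) {
            return null; // past the expiry time
        }
        byte[] plain = crypt(Cipher.DECRYPT_MODE, unhex(parts[0]), unhex(parts[1]), secret);
        return new String(plain, StandardCharsets.UTF_8);
    }

    /** AES-CTR with a per-message key: first 128 bits of SHA-1(salt + secret). */
    private static byte[] crypt(int mode, byte[] input, byte[] salt, byte[] secret) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(salt);
            byte[] key = Arrays.copyOf(sha1.digest(secret), 16);
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            // Assumption: a zero IV is acceptable because each key is used for one message only.
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(new byte[16]));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** HMAC-SHA-1 of the payload, keyed with the shared secret. */
    private static byte[] hmac(String payload, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Returns null for input that is not valid hex. */
    private static byte[] unhex(String s) {
        if (s.length() % 2 != 0) {
            return null;
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                return null;
            }
            out[i] = (byte) (hi * 16 + lo);
        }
        return out;
    }
}
```

A receiver would call `open(message, secret, System.currentTimeMillis())` and treat a null result the same way as an unknown identifier. Checking the signature before decrypting means a forged or altered message is rejected without touching the encrypted identifier.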
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640369#comment-13640369 ] Thomas Mueller commented on JCR-3534:

Well, we wanted to make it secure, right?

Expiry: this is to avoid replay attacks. It is the same mechanism as used for S3, see http://stackoverflow.com/questions/5419264/best-practice-amazons3-url-sharing

The identifier may be the data itself, so if it's not encrypted then the data would be included in the message. Without encryption, the message would no longer have the meaning of "you have access to this binary", but it would sometimes mean "this is the data".
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640461#comment-13640461 ] Thomas Mueller commented on JCR-3534:

Jukka, I understand now what your attack scenario is. However, this is not the only scenario. See the comment above from Chetan Mehrotra:

> So may be we can have some service provided by DataStore which can provide such safe ids which can be passed around and still be secure.

Having expiry and encrypting the identifier would prevent further damage in case the BinaryReferenceMessage leaks. It basically allows using the message within an email, embedding it in a web site, and so on, as described in http://stackoverflow.com/questions/5419264/best-practice-amazons3-url-sharing
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640533#comment-13640533 ] Thomas Mueller commented on JCR-3534:

To avoid further delay, we could already start implementing what we seem to agree on (well, let's see :-). That is, we need a secret that is used as the key to sign the message and verify the signature. One solution is a shared secret, configured in the repository.xml, in a new tag. Or it could be configured within the data store, but then each data store where we want to support this feature would need to be changed. Are there any other (simpler) solutions? The simplest to implement would probably be a system property, but that feels wrong :-)
[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
[ https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626313#comment-13626313 ] Thomas Mueller commented on JCR-3547:

The patch looks good, thanks! Because it's quite a big change, I think somebody else should have a look as well, especially at the changes in the RepositoryContext and RepositoryImpl.

I also think it's a good idea to run the GC tests serially, but I wonder why they didn't fail before when running concurrently, if they actually accessed the same repository concurrently? Or don't they access the same repository? But then they should still be able to run concurrently, right?

Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
Key: JCR-3547
URL: https://issues.apache.org/jira/browse/JCR-3547
Project: Jackrabbit Content Repository
Issue Type: Bug
Components: jackrabbit-core
Affects Versions: 2.4, 2.5
Reporter: Shashank Gupta
Attachments: GarbageCollector.java.patch, GC_prevent_concurrent_run_app2.patch, GC_prevent_concurrnet_run_app1.patch

In the mark phase, GC updates store.updateModifiedDateOnAccess with the current time, so that the datastore updates a record’s lastModified timestamp upon subsequent read/scan. But GC doesn't reset it to 0. So even after GC completes, the datastore will continue updating the lastModified timestamp on read invocations, which has a performance impact.
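The fix the issue description asks for, resetting the setting once collection is over, can be illustrated with a try/finally. This is a minimal sketch with hypothetical stand-in classes (`SketchDataStore`, `SketchGarbageCollector`), not the attached patch; in particular, resetting right after the sweep phase is an assumption about where the reset belongs.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a data store; only the timestamp setting is modeled. */
final class SketchDataStore {
    /** 0 means "do not touch lastModified on read". */
    final AtomicLong updateModifiedDateOnAccess = new AtomicLong(0);
}

/** Hypothetical garbage collector skeleton; not Jackrabbit's GarbageCollector. */
final class SketchGarbageCollector {
    private final SketchDataStore store;

    SketchGarbageCollector(SketchDataStore store) {
        this.store = store;
    }

    /** Runs mark and sweep, resetting the store setting even if a phase fails. */
    void collect(Runnable markPhase, Runnable sweepPhase) {
        // From now on, reads refresh lastModified so that live records survive the sweep.
        store.updateModifiedDateOnAccess.set(System.currentTimeMillis());
        try {
            markPhase.run();
            sweepPhase.run();
        } finally {
            // Without this reset, every later read keeps paying for a timestamp update.
            store.updateModifiedDateOnAccess.set(0);
        }
    }
}
```

Putting the reset in `finally` also covers the case where the mark or sweep phase throws, which a reset at the end of the normal path would miss.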
[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
[ https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626431#comment-13626431 ] Thomas Mueller commented on JCR-3547:

Approach 2 is much better I think. If serializing the tests is not needed, then I think we shouldn't do it. That way there is less risk of introducing bugs or changing behavior. Do you want to submit a new patch where the tests are not moved? I can do it myself if you want (it would just take a bit more time I guess).

> Should I go for RTC?

No, I think voting is not needed, just somebody else reviewing the (final, smaller) patch.
[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
[ https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625221#comment-13625221 ] Thomas Mueller commented on JCR-3547:

Sorry I didn't see this issue before. Yes, the value should be reset to 0. There is one exception, and I'm not sure if that's a possible / common use case: it shouldn't be reset if another garbage collection is running. I wonder if we could detect this reliably. Maybe only reset if the current value matches the value in the GarbageCollector class?
[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
[ https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625281#comment-13625281 ] Thomas Mueller commented on JCR-3547:

> imo repository should forbid user to run two simultaneous gc.

That's true. I guess it would be the best solution. I can't currently think of a reason to run two GCs concurrently.

> Maybe only reset if the current value matches the value in the GarbageCollector class?
> can you explain it more? ds interface doesn't expose a method to retrieve this value.

We could add a getter method. But thinking about it again, it wouldn't be good enough. Just comparing the current value wouldn't be enough, as the second run (if we allow it) could be started at the exact same time, or started right after calling the getter and before resetting the value. So I guess we should add code to disallow running GC concurrently. Do you want to do that, or should I?
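One way to disallow concurrent runs, as the comment concludes, is an atomic flag that is claimed before the collection starts and released afterwards. The sketch below is a hypothetical illustration (`SingleGcGuard` and `runExclusively` are invented names) and only covers garbage collections started within a single JVM; repositories in separate processes that share one data store would need some other form of coordination.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard; the class and method names are not from Jackrabbit. */
final class SingleGcGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /**
     * Runs the given garbage collection unless one is already in progress.
     *
     * @return true if the task ran, false if it was rejected
     */
    boolean runExclusively(Runnable gcTask) {
        // compareAndSet is atomic, so two callers cannot both claim the flag.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            gcTask.run();
            return true;
        } finally {
            // Release the flag even if the task throws.
            running.set(false);
        }
    }
}
```

Unlike the getter-and-compare idea, the check and the claim happen in one atomic step, so there is no window between reading the state and changing it.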
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13608971#comment-13608971 ] Thomas Mueller commented on JCR-3534:

Sounds good. In addition, I think we should consider having a mechanism to expire the identifier, similar to Amazon S3. I found some information here: http://stackoverflow.com/questions/14414455/amazon-s3-generating-an-expiring-link-using-ruby-1-9-3 What do you think about the limited lifetime of an identifier? I think it might be overkill, but I'm not sure.
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607707#comment-13607707 ] Thomas Mueller commented on JCR-3534:

(d) Content ids could expire after some time (for example one minute). One way to do that is to add the number of minutes since 1970 to the hash, then encrypt this using a datastore-wide secret key, and use the result as the content identifier. The receiving repository would decrypt the identifier and check the time.
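Option (d) could look roughly like the sketch below. The comment does not name a cipher, so AES-GCM with a random nonce is an assumption made here (it also detects tampering); the class `ExpiringContentId`, its methods, the 16-byte key, and the Base64 encoding are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for illustration; not part of Jackrabbit. */
final class ExpiringContentId {
    private static final int NONCE_LENGTH = 12;

    /** Encrypts "minutesSince1970:hash" under the 16-byte datastore-wide key. */
    static String issue(String hash, byte[] key, long nowMillis) {
        try {
            byte[] nonce = new byte[NONCE_LENGTH];
            new SecureRandom().nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, nonce));
            String plain = (nowMillis / 60_000L) + ":" + hash;
            byte[] sealed = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            // The identifier is nonce followed by ciphertext and tag.
            byte[] out = Arrays.copyOf(nonce, NONCE_LENGTH + sealed.length);
            System.arraycopy(sealed, 0, out, NONCE_LENGTH, sealed.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the hash, or null if the id is invalid or older than maxAgeMinutes. */
    static String resolve(String contentId, byte[] key, long nowMillis, long maxAgeMinutes) {
        try {
            byte[] in = Base64.getUrlDecoder().decode(contentId);
            if (in.length <= NONCE_LENGTH) {
                return null;
            }
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(128, in, 0, NONCE_LENGTH));
            byte[] opened = cipher.doFinal(in, NONCE_LENGTH, in.length - NONCE_LENGTH);
            String plain = new String(opened, StandardCharsets.UTF_8);
            int sep = plain.indexOf(':');
            long issuedMinutes = Long.parseLong(plain.substring(0, sep));
            long age = nowMillis / 60_000L - issuedMinutes;
            return (age >= 0 && age <= maxAgeMinutes) ? plain.substring(sep + 1) : null;
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return null; // tampered, truncated, or not valid Base64
        }
    }
}
```

Because the timestamp has minute resolution, an identifier issued late in one minute counts as one minute old shortly afterwards, so a `maxAgeMinutes` of 1 grants between one and two minutes of validity.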
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604961#comment-13604961 ] Thomas Mueller commented on JCR-3534:

The patch looks good to me.
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605186#comment-13605186 ] Thomas Mueller commented on JCR-3534:

> I would prefer the simpler return of null for the not-found case instead of the ItemNotFoundException

I agree.

> leaky abstraction that may come back to haunt us for example if someone who doesn't realize the security implications

I think what Felix described is a valid use case. If there are better ways to solve the problem, that would be great, but I also currently don't see other solutions that would work well.

> adjust the deployment configuration if you want to make those repositories share data more intimately

Could you provide more details? How could you reference a binary stored in one repository in the other repository, if the repositories are not running in the same process?

> the implementation may well be something like hash(revision + path) that can't be reversed for use in something like getValueByContentId().

This is an idea that is new to me, could you tell us more about it? I believe we can and should support getValueByContentId() in Oak in the same way as in Jackrabbit 2.x. I don't see a reason to use hash(revision + path).
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605201#comment-13605201 ] Thomas Mueller commented on JCR-3534:

I think the general problem here is: how do you avoid sending the binary if the binary is already there? The repositories don't necessarily need to share the data store. Of course there are security questions, but then all operations have security questions (uploading a huge binary can fill the disk, so we would need quotas in theory; a rogue remote client might already access binaries it is not allowed to read).
[jira] [Created] (JCR-3529) Property2Index: node type is ignored
Thomas Mueller created JCR-3529: --- Summary: Property2Index: node type is ignored Key: JCR-3529 URL: https://issues.apache.org/jira/browse/JCR-3529 Project: Jackrabbit Content Repository Issue Type: Bug Reporter: Thomas Mueller The Property2Index filters by node type, so each index only contains data for a list of node types. But the getCost and query methods ignore the node type. Because of that, if there are multiple indexes for the same property, each filtering on certain node types, then running a query might pick the wrong index. The result is that the query returns no data where it should, because the index doesn't return the right nodes.
[jira] [Commented] (JCR-3509) Workspace maxIdleTime parameter not working
[ https://issues.apache.org/jira/browse/JCR-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13567687#comment-13567687 ] Thomas Mueller commented on JCR-3509: - I wonder, do you want to change the maxIdleTime in the workspace.xml after the repository and workspace were created? It seems the workspace.xml file is stored in the database in your case, so I guess you would have to change it in the database as well. But I'm not completely sure. Workspace maxIdleTime parameter not working --- Key: JCR-3509 URL: https://issues.apache.org/jira/browse/JCR-3509 Project: Jackrabbit Content Repository Issue Type: Bug Components: config Affects Versions: 2.4.3 Environment: JSF, SPRING Reporter: Sarfaraaz ASLAM Attachments: derby.jackrabbit.repository.xml, JcrConfigurer.java I would like to set the maximum number of seconds that a workspace can remain unused before it is automatically closed, using the maxIdleTime parameter, but this does not seem to work.
[jira] [Commented] (JCR-3493) OUTER JOIN tests expect incorrect results
[ https://issues.apache.org/jira/browse/JCR-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561528#comment-13561528 ] Thomas Mueller commented on JCR-3493: - The patch looks good! There is one typo in the issue number: !--JCR-3493, JCR-2498-- should be !--JCR-3493, JCR-3498-- OUTER JOIN tests expect incorrect results - Key: JCR-3493 URL: https://issues.apache.org/jira/browse/JCR-3493 Project: Jackrabbit Content Repository Issue Type: Bug Components: jackrabbit-jcr-tests Affects Versions: 2.5.2 Reporter: Randall Hauch Fix For: 2.5.3, 2.6, 2.7 Attachments: jcr-3493-tests.patch
Two of the OUTER JOIN tests appear to expect incorrect results:
- org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testRightOuterJoin1
- org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testLeftOuterJoin2
Both tests are set up the same way: two nodes are created:
/testroot/workarea/node1 {jcr:primaryType=nt:unstructured, prop1=yikqysrwur}
/testroot/workarea/node1/node2 {jcr:primaryType=nt:unstructured, prop1=yikqysrwur, prop2=yikqysrwur, jcr:mixinTypes=[mix:referenceable], jcr:uuid=c9118bb2-922e-4612-acd7-7152105f5684}
A single string is randomly generated and used for the values of prop1 and prop2, and only the second node is made mix:referenceable. The testRightOuterJoin1 test runs this query:
SELECT * FROM [nt:unstructured] AS left RIGHT OUTER JOIN [nt:unstructured] AS right ON left.prop1 = right.prop2 WHERE ISDESCENDANTNODE(right,'/testroot/workarea')
The left side of the join has at least two tuples (one for node1, one for node2, plus other nodes that do not have a 'prop1' value), and the only column of interest is the prop1 column. Thus the left side tuples (or the parts we care about for the join) look like:
[ node1, yikqysrwur ]
[ node2, yikqysrwur ]
[ …, null ]
The right side of the join has only two tuples (node1 and node2) because of the ISDESCENDANTNODE criterion, and the only column of interest is the prop2 column. Thus, the right side tuples (or the parts we care about for the join) look like:
[ node1, null ]
[ node2, yikqysrwur ]
When we perform a RIGHT OUTER JOIN, we have to **include all the tuples on the right** even if they don't match a value on the left tuples. Thus, node1 must be included in the results, and because it has a null value for the prop2 column, it will not match any of the tuples on the left (since a null value is not equal to another null value in join criteria). So the result set should contain these combinations of nodes:
[ null, node1 ]
[ node1, node2 ]
[ node2, node2 ]
However, the test expects the following result:
[ node1, node2 ]
[ node2, node2 ]
This looks incorrect to me, because it is missing the [node1, null] tuple that was on the right side of the join.
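The join reasoning in this issue can be checked with a small, repository-independent sketch. The class RightOuterJoinDemo and its Tuple type are hypothetical names introduced only for illustration; this is a plain nested-loop join, not the Jackrabbit query engine, and the string "s" stands in for the randomly generated value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RightOuterJoinDemo {
    /** A (node name, join column value) pair; the value may be null. */
    static final class Tuple {
        final String node;
        final String value;

        Tuple(String node, String value) {
            this.node = node;
            this.value = value;
        }
    }

    /** Nested-loop RIGHT OUTER JOIN on left.value = right.value. */
    static List<String> rightOuterJoin(List<Tuple> left, List<Tuple> right) {
        List<String> rows = new ArrayList<>();
        for (Tuple r : right) {
            boolean matched = false;
            for (Tuple l : left) {
                // SQL-style equality: null never matches anything, not even another null
                if (l.value != null && l.value.equals(r.value)) {
                    rows.add("[ " + l.node + ", " + r.node + " ]");
                    matched = true;
                }
            }
            if (!matched) {
                // every right tuple must appear, padded with null on the left
                rows.add("[ null, " + r.node + " ]");
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Tuple> left = Arrays.asList(
                new Tuple("node1", "s"),    // prop1 of node1
                new Tuple("node2", "s"),    // prop1 of node2
                new Tuple("other", null));  // some node without prop1
        List<Tuple> right = Arrays.asList(
                new Tuple("node1", null),   // node1 has no prop2
                new Tuple("node2", "s"));   // prop2 of node2
        for (String row : rightOuterJoin(left, right)) {
            System.out.println(row);
        }
    }
}
```

With the tuples from the issue, the sketch yields the three rows the reporter expects, including the null-padded row for node1.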
[jira] [Commented] (JCR-3493) OUTER JOIN tests expect incorrect results
[ https://issues.apache.org/jira/browse/JCR-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561684#comment-13561684 ] Thomas Mueller commented on JCR-3493: - Looks good to me, thanks! OUTER JOIN tests expect incorrect results - Key: JCR-3493 URL: https://issues.apache.org/jira/browse/JCR-3493 Project: Jackrabbit Content Repository Issue Type: Bug Components: jackrabbit-jcr-tests Affects Versions: 2.5.2 Reporter: Randall Hauch Fix For: 2.5.3, 2.6, 2.7 Attachments: jcr-3493-tests-2.patch, jcr-3493-tests.patch
[jira] [Created] (JCR-3460) PropertyIndex uses TraversingCursor but should not
Thomas Mueller created JCR-3460: --- Summary: PropertyIndex uses TraversingCursor but should not Key: JCR-3460 URL: https://issues.apache.org/jira/browse/JCR-3460 Project: Jackrabbit Content Repository Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller The org.apache.jackrabbit.oak.plugins.index.property.PropertyIndex uses the traversing cursor (which traverses the whole repository) when there is no index. This is not how the index mechanism is supposed to work: if there is no property index, then the cost function of the property index should return infinity or the maximum value, so that the property index isn't used. According to my test, the PropertyIndex never really falls back to traversing, so this might just be defensive programming. However, in this case it would be better if the code threw an exception; otherwise we risk not seeing a bug in the PropertyIndex cost method.
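The intended contract can be sketched as follows. This is only an illustration under assumed names: the QueryIndex interface, the two index classes, and the cost values are hypothetical and do not reproduce the actual Oak interfaces. It shows a cost function that returns infinity when no matching index exists, and a query method that fails loudly instead of silently traversing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IndexCostDemo {
    /** Hypothetical, simplified index contract. */
    interface QueryIndex {
        String name();
        double getCost(String propertyName);
        List<String> query(String propertyName, String value);
    }

    /** Answers only for properties it has an index on. */
    static final class PropertyIndexSketch implements QueryIndex {
        private final Set<String> indexed;

        PropertyIndexSketch(String... indexedProperties) {
            this.indexed = new HashSet<>(Arrays.asList(indexedProperties));
        }

        public String name() { return "property"; }

        public double getCost(String propertyName) {
            // No index for this property: make sure this index is never chosen.
            return indexed.contains(propertyName) ? 10.0 : Double.POSITIVE_INFINITY;
        }

        public List<String> query(String propertyName, String value) {
            if (!indexed.contains(propertyName)) {
                // Fail fast rather than traverse, so a wrong cost estimate becomes visible.
                throw new IllegalStateException("No index for property " + propertyName);
            }
            return Collections.emptyList(); // the actual lookup is omitted in this sketch
        }
    }

    /** Fallback that can always answer, at a high (but finite) cost. */
    static final class TraversingIndexSketch implements QueryIndex {
        public String name() { return "traverse"; }
        public double getCost(String propertyName) { return 1_000_000.0; }
        public List<String> query(String propertyName, String value) {
            return Collections.emptyList(); // full traversal omitted in this sketch
        }
    }

    /** Picks the index with the lowest cost for the given property. */
    static QueryIndex selectIndex(List<QueryIndex> indexes, String propertyName) {
        QueryIndex best = null;
        for (QueryIndex index : indexes) {
            if (best == null || index.getCost(propertyName) < best.getCost(propertyName)) {
                best = index;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<QueryIndex> indexes = Arrays.asList(
                new PropertyIndexSketch("title"), new TraversingIndexSketch());
        System.out.println(selectIndex(indexes, "title").name());
        System.out.println(selectIndex(indexes, "author").name());
    }
}
```

Because the traversal is a separate index with its own finite cost, the property index never needs a traversing fallback of its own.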
[jira] [Commented] (JCR-3460) PropertyIndex uses TraversingCursor but should not
[ https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502734#comment-13502734 ] Thomas Mueller commented on JCR-3460: - Revision 1412513 PropertyIndex uses TraversingCursor but should not -- Key: JCR-3460 URL: https://issues.apache.org/jira/browse/JCR-3460 Project: Jackrabbit Content Repository Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller
[jira] [Commented] (JCR-3460) PropertyIndex uses TraversingCursor but should not
[ https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502736#comment-13502736 ] Thomas Mueller commented on JCR-3460: - The NodeTypeIndex also currently uses the TraversingCursor PropertyIndex uses TraversingCursor but should not -- Key: JCR-3460 URL: https://issues.apache.org/jira/browse/JCR-3460 Project: Jackrabbit Content Repository Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller
[jira] [Resolved] (JCR-3460) PropertyIndex uses TraversingCursor but should not
[ https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3460. - Resolution: Fixed Revision 1412514 PropertyIndex uses TraversingCursor but should not -- Key: JCR-3460 URL: https://issues.apache.org/jira/browse/JCR-3460 Project: Jackrabbit Content Repository Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller
[jira] [Commented] (JCR-3461) SQL2 query returns no results because @ in path is ignored
[ https://issues.apache.org/jira/browse/JCR-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502753#comment-13502753 ] Thomas Mueller commented on JCR-3461: - Could you provide an example query please? SQL2 query returns no results because @ in path is ignored -- Key: JCR-3461 URL: https://issues.apache.org/jira/browse/JCR-3461 Project: Jackrabbit Content Repository Issue Type: Bug Components: jackrabbit-jcr-commons Affects Versions: 2.5 Reporter: Joel Richard Priority: Minor Labels: queryparser, sql2 If you search for nodes under a given path with ISDESCENDANTNODE and the path contains an @, no results are returned because the @ is removed from the path and then the path cannot be found. The @ gets lost in the org.apache.jackrabbit.commons.query.sql2.Parser#readName method. The reason is that the initialize method assigns the wrong type for @. Probably the problem exists for other special characters as well.
[jira] [Created] (JCR-3462) Documentation for the PropertyIndex
Thomas Mueller created JCR-3462: --- Summary: Documentation for the PropertyIndex Key: JCR-3462 URL: https://issues.apache.org/jira/browse/JCR-3462 Project: Jackrabbit Content Repository Issue Type: Bug Reporter: Thomas Mueller Priority: Minor This ticket is to improve the documentation of the PropertyIndex implementation.
[jira] [Commented] (JCR-3462) Documentation for the PropertyIndex
[ https://issues.apache.org/jira/browse/JCR-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502848#comment-13502848 ] Thomas Mueller commented on JCR-3462: - Initial documentation in revision 1412613 Documentation for the PropertyIndex --- Key: JCR-3462 URL: https://issues.apache.org/jira/browse/JCR-3462 Project: Jackrabbit Content Repository Issue Type: Bug Reporter: Thomas Mueller Priority: Minor
[jira] [Commented] (OAK-28) Query implementation
[ https://issues.apache.org/jira/browse/OAK-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445785#comment-13445785 ] Thomas Mueller commented on OAK-28: --- Thanks Chetan! Fixed in revision 1379373. Query implementation Key: OAK-28 URL: https://issues.apache.org/jira/browse/OAK-28 Project: Jackrabbit Oak Issue Type: New Feature Components: core, jcr Reporter: Thomas Mueller Assignee: Thomas Mueller Labels: query Attachments: OakToJcrQueryTreeConverter.java
A query engine needs to be implemented. A query parser in oak-core should be able to handle xpath, sql2 and optionally other query languages. The jcr component must generate a valid query in one of those languages from JQOM queries and pass that statement along with value bindings, limit, offset, and namespace mappings to oak-core. We need to:
* Define the oak-core API for handling queries. How do we handle namespace mappings, limit, and offset?
* Implement a query builder in the jcr component which takes care of translating JQOM queries to statements in string form
* Implement a query parser in oak-core and decide on a versatile AST representation which works with all query languages and which is extensible to future query languages.
* Implement the actual query execution engine which interprets the query AST
[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis
[ https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445804#comment-13445804 ] Thomas Mueller commented on OAK-288:
> util class
That looks good to me.
> bypassing the CommitHook and going directly to the mk level while ignoring the oak-core layer is poor separation of concerns.
You already wrote that, and I already wrote that I used the MicroKernel API because it is a stable API, while the oak-core API was not stable when I wrote the tests. Actually, the oak-core API didn't exist yet. Now that the oak-core API is ready, it does make sense to use it.
> the current property index implementation doesn't play nice with existing notification mechanisms (like the CommitHook).
Sorry, I don't understand: what do you mean by 'doesn't play nice'?
> the query tests pass if I update them to use the NodeStore, except the 'explain' ones.
Hm, they should work if the same indexes are available... could you post the result you get?
QueryTests should use the NodeStore apis Key: OAK-288 URL: https://issues.apache.org/jira/browse/OAK-288 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Alex Parvulescu Attachments: OAK-288-jsop-util.patch
Currently the existing oak query tests come in the form of a script file [0] that contains
- commit commands which will be executed directly against the mk.
- select commands
- expected results
While this was good for fast prototyping, we should refactor the tests to use proper unit tests. Arguments for refactoring:
- overall java style unit tests, reduce the complexity of running this setup
- proper reporting unit test failures
- executing the commit commands directly against the mk breaks the {{CommitHook}} mechanism because the commits will pass unnoticed
- proper separation of concerns - oak core should not directly reference the mk, it should pass through existing apis like the {{NodeStore}}
[0] http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/test/resources/org/apache/jackrabbit/oak/query/sql2.txt?view=markup
[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis
[ https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444797#comment-13444797 ] Thomas Mueller commented on OAK-288: I'd like to keep the internal DSL style for those tests, as it has proven very useful to me (very easy to add tests, very easy to verify and change the expected results). I used the MicroKernel API because it is a stable API, while the oak-core API was not stable when I wrote the tests. Those tests are testing the parser and the internals of the query engine. The tests are useful even if the changes don't go through the CommitHook, as the tests are not really meant to test the CommitHook and index implementations. If we want the changes to go through the oak-core API instead of using the MicroKernel API directly, the MicroKernel API DSL could be replaced with an oak-core DSL. Or even simpler: we could add a static node structure plus custom indexes (using the oak-core API) before executing the script. That way we could test index implementations that require the CommitHook to index data. QueryTests should use the NodeStore apis Key: OAK-288 URL: https://issues.apache.org/jira/browse/OAK-288 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Alex Parvulescu
[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis
[ https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444801#comment-13444801 ] Thomas Mueller commented on OAK-288:
> proper reporting unit test failures
If there is a test failure, what I usually do is compare the expected result (in the src/test directory) with the result in the target directory.
> the test in sql2.txt are somewhat hard to debug individually
What I usually do to debug a test individually is move the test case to the top of the script, so it is run first.
QueryTests should use the NodeStore apis Key: OAK-288 URL: https://issues.apache.org/jira/browse/OAK-288 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Alex Parvulescu
[jira] [Resolved] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown
[ https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3406. - Resolution: Fixed Fix Version/s: 2.6 Journal doUnlock sometimes not called on repository shutdown Key: JCR-3406 URL: https://issues.apache.org/jira/browse/JCR-3406 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Fix For: 2.6 When the repository is shut down, the method AbstractJournal.doUnlock(boolean successful) is sometimes not called. The method Journal.close is called, but when the journal implementation uses a reentrant lock, it can't unlock because close is called from a different thread. The reason for not calling doUnlock is that ClusterNode.stop() sets the status to stopped, which causes all WorkspaceUpdateChannel methods to not work, including updateCommitted and updateCancelled. Therefore, it is possible that an operation is started but never completed or cancelled. To solve the issue, I found that it is enough to let updateCommitted and updateCancelled complete, so that operations that are in progress can finish.
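Why a close call from another thread cannot release a reentrant lock can be seen with the standard java.util.concurrent.locks.ReentrantLock alone. The sketch below is not Jackrabbit code; the class ReentrantUnlockDemo and its method are hypothetical and only demonstrate the ownership rule that the issue description relies on.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

final class ReentrantUnlockDemo {
    /** Locks in the calling thread, then tries to unlock from a second thread. */
    static String unlockFromOtherThread() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // the calling thread is now the owner

        AtomicReference<String> outcome = new AtomicReference<>("unlocked");
        Thread closer = new Thread(() -> {
            try {
                lock.unlock(); // this thread is not the owner
            } catch (IllegalMonitorStateException e) {
                outcome.set("IllegalMonitorStateException");
            }
        });
        closer.start();
        try {
            closer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        boolean stillLocked = lock.isLocked();
        lock.unlock(); // only the owning thread can release the lock
        return outcome.get() + ", stillLocked=" + stillLocked;
    }

    public static void main(String[] args) {
        System.out.println(unlockFromOtherThread());
    }
}
```

The lock stays held until the owning thread releases it, which is why letting the in-progress update finish (and unlock from its own thread) addresses the problem described above.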
[jira] [Created] (OAK-264) MicroKernel.diff for depth limited, unspecified changes
Thomas Mueller created OAK-264: -- Summary: MicroKernel.diff for depth limited, unspecified changes Key: OAK-264 URL: https://issues.apache.org/jira/browse/OAK-264 Project: Jackrabbit Oak Issue Type: Bug Components: mk Reporter: Thomas Mueller Currently the MicroKernel API specifies for the method diff, if the depth parameter is used, that unspecified changes below a certain path can be returned as: ^ /some/path I would prefer the slightly more verbose: ^ /some/path: {} Reason: It is similar to how getNode() returns node names if the depth is limited: some:{path:{}}, and it makes parsing unambiguous: there is always a ':' after the path, whether a property was changed or a node was changed. Without the colon, the parser needs to look ahead to decide whether a node was changed or a property was changed (the token after the path could be the start of the next operation). And we could never support ':' as an operation, because that would make parsing ambiguous.
[jira] [Commented] (OAK-264) MicroKernel.diff for depth limited, unspecified changes
[ https://issues.apache.org/jira/browse/OAK-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438565#comment-13438565 ] Thomas Mueller commented on OAK-264: Here is the proposed patch. The project oak-it/mk doesn't need to be changed (there is a test, but it doesn't check for the end of the diff).
{code}
Index: src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java
===
--- src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java (revision 1375443)
+++ src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java (working copy)
@@ -144,7 +144,8 @@
 super.childNodeChanged(name, before, after);
 } else {
 buff.tag('^');
-buff.value(p);
+buff.key(p);
+buff.object().endObject();
 buff.newline();
 }
 ++levels;
Index: src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java
===
--- src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java (revision 1375021)
+++ src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java (working copy)
@@ -193,7 +193,7 @@
 * The {@code depth} limit applies to the subtree rooted at {@code path}.
 * It allows to limit the depth of the diff, i.e. only changes up to the
 * specified depth will be included in full detail. changes at paths exceeding
- * the specified depth limit will be reported as {@code ^/some/path},
+ * the specified depth limit will be reported as {@code ^ /some/path: {}},
 * indicating that there are unspecified changes below that path.
 * table border=1
 * tr
{code}
MicroKernel.diff for depth limited, unspecified changes --- Key: OAK-264 URL: https://issues.apache.org/jira/browse/OAK-264 Project: Jackrabbit Oak Issue Type: Improvement Components: mk Reporter: Thomas Mueller Priority: Minor
[jira] [Created] (OAK-260) Avoid the Turkish Locale Problem
Thomas Mueller created OAK-260: -- Summary: Avoid the Turkish Locale Problem Key: OAK-260 URL: https://issues.apache.org/jira/browse/OAK-260 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller We currently use String.toUpperCase() and String.toLowerCase() in some cases where it is not appropriate. When running with the Turkish locale, this will not work as expected. See also http://mattryall.net/blog/2009/02/the-infamous-turkish-locale-bug The problematic methods are String.toUpperCase() and String.toLowerCase(). String.equalsIgnoreCase(..) isn't a problem.
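The effect is easy to reproduce with the JDK alone. The sketch below passes the Turkish locale explicitly instead of changing the default locale, which has the same effect on case conversion as running under a Turkish default; the class name TurkishLocaleDemo is hypothetical, and using Locale.ENGLISH as the locale-independent choice is an assumption for illustration, not necessarily what the actual fix does.

```java
import java.util.Locale;

final class TurkishLocaleDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    /** What the no-argument String.toUpperCase() does when the default locale is Turkish. */
    static String upperTurkish(String s) {
        return s.toUpperCase(TURKISH);
    }

    /** What the no-argument String.toLowerCase() does when the default locale is Turkish. */
    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }

    /** Locale-independent conversion, suitable for keywords and identifiers. */
    static String upperFixed(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        // 'i' becomes dotted capital I (U+0130), so the result is not "TITLE"
        System.out.println(upperTurkish("title").equals("TITLE"));
        // 'I' becomes dotless small i (U+0131), so the result is not "title"
        System.out.println(lowerTurkish("TITLE").equals("title"));
        // An explicit, fixed locale gives the expected ASCII result
        System.out.println(upperFixed("title").equals("TITLE"));
        // equalsIgnoreCase compares per character and does not consult the locale
        System.out.println("TITLE".equalsIgnoreCase("title"));
    }
}
```

A keyword comparison such as checking a parsed token against "SELECT" after upper-casing would therefore fail for any token containing 'i' when the default locale is Turkish.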
[jira] [Commented] (OAK-260) Avoid the Turkish Locale Problem
[ https://issues.apache.org/jira/browse/OAK-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437877#comment-13437877 ] Thomas Mueller commented on OAK-260: The main problems should be fixed in r1375026, r1375028, and r1375030. However, we should have a way to ensure that no new bugs of this kind are introduced. One option is to run the test cases using the Turkish locale; another might be to configure Checkstyle to detect such problems. Avoid the Turkish Locale Problem -- Key: OAK-260 URL: https://issues.apache.org/jira/browse/OAK-260 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller
[jira] [Created] (OAK-262) Query: support pseudo properties like jcr:score() and rep:excerpt()
Thomas Mueller created OAK-262: Summary: Query: support pseudo properties like jcr:score() and rep:excerpt() Key: OAK-262 URL: https://issues.apache.org/jira/browse/OAK-262 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller The query engine currently only supports properties that are stored within a node. It doesn't currently support pseudo-properties that are provided by an index, for example jcr:score() and rep:excerpt(). To support such properties, I suggest changing the Cursor interface to return an IndexRow (a new class that can return such pseudo-properties as well as the path) instead of just the path. This may also speed up queries that don't need to load the node itself (if access rights can be checked efficiently or don't need to be checked for a given query).
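The proposal in OAK-262 could look roughly like the sketch below. Only the names Cursor and IndexRow come from the issue; the method names (next, currentRow, getPath, getValue) and the two small implementations are hypothetical and are not the Oak API.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class IndexRowDemo {
    public static void main(String[] args) {
        List<IndexRow> rows = List.of(
                new SimpleIndexRow("/content/a", Map.of("jcr:score", "0.75")));
        Cursor cursor = new ListCursor(rows);
        while (cursor.next()) {
            IndexRow row = cursor.currentRow();
            // The pseudo-property comes from the index; no node is loaded here.
            System.out.println(row.getPath() + " " + row.getValue("jcr:score"));
        }
    }
}

// Hypothetical shape of the proposed IndexRow: a path plus index-provided values.
interface IndexRow {
    String getPath();

    // Returns the pseudo-property value, or null if the index does not provide it.
    String getValue(String columnName);
}

// Hypothetical cursor that returns rows instead of bare paths.
interface Cursor {
    boolean next();

    IndexRow currentRow();
}

class SimpleIndexRow implements IndexRow {
    private final String path;
    private final Map<String, String> pseudoProperties;

    SimpleIndexRow(String path, Map<String, String> pseudoProperties) {
        this.path = path;
        this.pseudoProperties = pseudoProperties;
    }

    public String getPath() {
        return path;
    }

    public String getValue(String columnName) {
        return pseudoProperties.get(columnName);
    }
}

class ListCursor implements Cursor {
    private final Iterator<IndexRow> rows;
    private IndexRow current;

    ListCursor(List<IndexRow> rows) {
        this.rows = rows.iterator();
    }

    public boolean next() {
        if (!rows.hasNext()) {
            current = null;
            return false;
        }
        current = rows.next();
        return true;
    }

    public IndexRow currentRow() {
        return current;
    }
}
```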
[jira] [Commented] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown
[ https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434890#comment-13434890 ] Thomas Mueller commented on JCR-3406: - Please note the patch only applies to updateCancelled and updateCommitted. As far as I see, those methods are called after updateCreated / updatePrepared, and those two methods do test for status == STARTED (I didn't change that). So I don't see how this could affect startup, but I might be wrong of course. I'm just afraid we might create some race condition on startup. The code seems to be designed to first trigger a sync() call before any other operations are done which seems correct to me. Sorry I don't understand, the patch I made doesn't affect sync() as far as I see. It is only supposed to ensure that the journal is unlocked if it was locked. It's quite hard to say if the patch would break something, just on a theoretical basis (without test cases). I do have an upstream test case that shows the current behavior is problematic, and the patch fixes that, so I suggest I will commit my patch next week, unless somebody can come up with a better patch, or a test case that shows my patch is problematic. Journal doUnlock sometimes not called on repository shutdown Key: JCR-3406 URL: https://issues.apache.org/jira/browse/JCR-3406 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller When the repository is shut down, the method AbstractJournal.doUnlock(boolean successful) is sometimes not called. The method Journal.close is called, but when the journal implementation uses a reentrant lock it can't unlock because close is called from a different thread. The reason for not calling doUnlock is that ClusterNode.stop() sets the status to stopped, which causes all WorkspaceUpdateChannel methods to not work, including updateCommitted and updateCancelled. 
Therefore, it is possible that an operation is started but never completed nor cancelled. To solve the issue, I found that it is enough to let updateCommitted and updateCancelled complete, so that operations that are in progress can finish.
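The idea behind the fix can be sketched in isolation as follows. This is a simplified, hypothetical class and not the Jackrabbit ClusterNode code: the field names, the boolean return value of updateCreated, and the use of a ReentrantLock as the journal lock are assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

class UpdateChannelSketch {
    static final int STARTED = 1;
    static final int STOPPED = 2;

    private volatile int status = STARTED;
    // Stand-in for the journal lock; a reentrant lock must be released by the thread that took it.
    private final ReentrantLock journalLock = new ReentrantLock();

    // New operations are still refused once the node has been stopped.
    boolean updateCreated() {
        if (status != STARTED) {
            return false;
        }
        journalLock.lock();
        return true;
    }

    // No status check here: an operation that is in progress must be able to finish
    // so that the lock taken in updateCreated() is released by the same thread.
    void updateCommitted() {
        releaseIfHeld();
    }

    // Same reasoning as updateCommitted(): cancelling must also release the lock.
    void updateCancelled() {
        releaseIfHeld();
    }

    void stop() {
        status = STOPPED;
    }

    boolean isJournalLocked() {
        return journalLock.isLocked();
    }

    private void releaseIfHeld() {
        if (journalLock.isHeldByCurrentThread()) {
            journalLock.unlock();
        }
    }

    public static void main(String[] args) {
        UpdateChannelSketch channel = new UpdateChannelSketch();
        channel.updateCreated();   // operation starts and takes the lock
        channel.stop();            // shutdown happens while the operation is in progress
        channel.updateCommitted(); // the operation can still finish
        System.out.println("locked after shutdown: " + channel.isJournalLocked());
    }
}
```

With a status check in updateCommitted and updateCancelled, the call after stop() would return early and the lock would stay held.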
[jira] [Commented] (OAK-245) Add import for org.h2 in oak-mk bundle
[ https://issues.apache.org/jira/browse/OAK-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435156#comment-13435156 ] Thomas Mueller commented on OAK-245: Class.forName("org.h2.Driver") This would constitute a bug in itself. I don't consider this a bug. Let's say it doesn't work well with OSGi. But I believe neither Class.forName nor org.h2.Driver.load() is required here, as the H2 connection pool is used anyway. I would simply remove the line. Add import for org.h2 in oak-mk bundle Key: OAK-245 URL: https://issues.apache.org/jira/browse/OAK-245 Project: Jackrabbit Oak Issue Type: Bug Components: mk Reporter: Chetan Mehrotra Labels: osgi Attachments: import-h2.patch, OAK-245-load-driver.patch The oak-mk bundle depends on the H2 database. It internally uses Class.forName("org.h2.Driver") to load the H2 driver. Due to the usage of Class.forName, Bnd is not able to add the org.h2 package to the Import-Package list. So it should have an explicit entry in the maven-bundle-plugin config as shown below {code:xml} <Import-Package>org.h2;resolution:=optional, *</Import-Package> {code} Without this MicroKernalService loading would fail with a CNFE
[jira] [Created] (OAK-239) MicroKernel.getRevisionHistory: maxEntries behavior should be documented
Thomas Mueller created OAK-239: Summary: MicroKernel.getRevisionHistory: maxEntries behavior should be documented Key: OAK-239 URL: https://issues.apache.org/jira/browse/OAK-239 Project: Jackrabbit Oak Issue Type: Improvement Components: mk Reporter: Thomas Mueller The method MicroKernel.getRevisionHistory uses a parameter maxEntries to limit the number of returned entries. If the implementation has to limit the entries, it is not clear from the documentation which entries to return (the oldest entries, the newest entries, or any x entries).
[jira] [Updated] (OAK-239) MicroKernel.getRevisionHistory: maxEntries behavior should be documented
[ https://issues.apache.org/jira/browse/OAK-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-239: Priority: Minor (was: Major)
[jira] [Created] (OAK-241) QueryEngine.executeQuery needs a session parameter
Thomas Mueller created OAK-241: Summary: QueryEngine.executeQuery needs a session parameter Key: OAK-241 URL: https://issues.apache.org/jira/browse/OAK-241 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor The method QueryEngine.executeQuery currently needs a ContentSession parameter, even though the instance was retrieved from the ContentSession using ContentSession.getQueryEngine(). This is a bit confusing. To solve this, we could rename the QueryEngine interface to SessionQueryEngine, change QueryEngineImpl so it no longer implements any interface, and add a class SessionQueryEngineImpl that calls the QueryEngineImpl methods (1:1, except for executeQuery, where it adds the session parameter). An alternative would be to change the existing QueryEngineImpl so a new instance is created for each session. But I prefer not to do this, as conceptually there is only one query engine.
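The first option in OAK-241 can be sketched as a thin per-session wrapper around a single shared engine. The type names SessionQueryEngine, QueryEngineImpl, SessionQueryEngineImpl and ContentSession come from the issue; the method signatures, the String result and the getUserId placeholder are simplifying assumptions and do not reflect the real Oak interfaces.

```java
class SessionQueryEngineDemo {
    public static void main(String[] args) {
        QueryEngineImpl sharedEngine = new QueryEngineImpl(); // conceptually only one engine
        ContentSession session = () -> "admin";
        SessionQueryEngine engine = new SessionQueryEngineImpl(sharedEngine, session);
        // The caller no longer passes the session it obtained the engine from.
        System.out.println(engine.executeQuery("select * from [nt:base]"));
    }
}

// Placeholder for the real ContentSession; only what the sketch needs.
interface ContentSession {
    String getUserId();
}

// Session-bound view of the query engine (hypothetical signature).
interface SessionQueryEngine {
    String executeQuery(String statement);
}

// The shared engine keeps the explicit session parameter and implements no interface.
class QueryEngineImpl {
    String executeQuery(String statement, ContentSession session) {
        return session.getUserId() + ": " + statement;
    }
}

// Delegates 1:1 to the shared engine and supplies the session parameter.
class SessionQueryEngineImpl implements SessionQueryEngine {
    private final QueryEngineImpl engine;
    private final ContentSession session;

    SessionQueryEngineImpl(QueryEngineImpl engine, ContentSession session) {
        this.engine = engine;
        this.session = session;
    }

    public String executeQuery(String statement) {
        return engine.executeQuery(statement, session);
    }
}
```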
[jira] [Created] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown
Thomas Mueller created JCR-3406: Summary: Journal doUnlock sometimes not called on repository shutdown Key: JCR-3406 URL: https://issues.apache.org/jira/browse/JCR-3406 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller When the repository is shut down, the method AbstractJournal.doUnlock(boolean successful) is sometimes not called. The method Journal.close is called, but when the journal implementation uses a reentrant lock it can't unlock because close is called from a different thread. The reason for not calling doUnlock is that ClusterNode.stop() sets the status to stopped, which causes all WorkspaceUpdateChannel methods to not work, including updateCommitted and updateCancelled. Therefore, it is possible that an operation is started but never completed nor cancelled. To solve the issue, I found that it is enough to let updateCommitted and updateCancelled complete, so that operations that are in progress can finish.
[jira] [Commented] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown
[ https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431101#comment-13431101 ] Thomas Mueller commented on JCR-3406: Proposed patch:
{code}
Index: src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java
===
--- src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java (revision 1370130)
+++ src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java (working copy)
@@ -665,10 +665,6 @@
          * {@inheritDoc}
          */
         public void updateCommitted(Update update, String path) {
-            if (status != STARTED) {
-                log.info("not started: update commit ignored.");
-                return;
-            }
             Record record = (Record) update.getAttribute(ATTRIBUTE_RECORD);
             if (record == null) {
                 String msg = "No record prepared.";
@@ -705,10 +701,6 @@
          * {@inheritDoc}
          */
         public void updateCancelled(Update update) {
-            if (status != STARTED) {
-                log.info("not started: update cancel ignored.");
-                return;
-            }
             Record record = (Record) update.getAttribute(ATTRIBUTE_RECORD);
             if (record != null) {
                 record.cancelUpdate();
{code}
[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak
[ https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430992#comment-13430992 ] Thomas Mueller commented on OAK-225: In revision 1370294 the query is converted to a SQL-2 query, so executing the query no longer fails. However the conversion is not correct (as far as I know): //element(*,mix:language)[fn:lower-case(@jcr:language)='en'] //element(*,sling:Message)[@sling:message] /(@sling:key|@sling:message) is currently converted to: select [jcr:path], [jcr:score], [sling:key], [sling:message] from [sling:Message] where (lower([jcr:language]) = 'en') and ([sling:message] is not null) I'm not sure if it's worth the effort to support such XPath queries. Sling I18N queries not supported by Oak --- Key: OAK-225 URL: https://issues.apache.org/jira/browse/OAK-225 Project: Jackrabbit Oak Issue Type: Bug Components: core Affects Versions: 0.3 Reporter: Jukka Zitting Priority: Minor Labels: sling, xpath The Sling I18N component issues XPath queries like the following: {code:none} //element(*,mix:language)[fn:lower-case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message) {code} Such queries currently fail with the following exception: {code:none} javax.jcr.query.InvalidQueryException: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: ( at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:115) at org.apache.jackrabbit.oak.jcr.query.QueryImpl.execute(QueryImpl.java:85) at org.apache.sling.jcr.resource.JcrResourceUtil.query(JcrResourceUtil.java:52) at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.queryResources(JcrResourceProvider.java:262) ... 
54 more Caused by: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: ( at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.getSyntaxError(XPathToSQL2Converter.java:704) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.read(XPathToSQL2Converter.java:410) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseExpression(XPathToSQL2Converter.java:336) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseCondition(XPathToSQL2Converter.java:279) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseAnd(XPathToSQL2Converter.java:252) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseConstraint(XPathToSQL2Converter.java:244) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.convert(XPathToSQL2Converter.java:153) at org.apache.jackrabbit.oak.query.QueryEngineImpl.parseQuery(QueryEngineImpl.java:86) at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:99) at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:39) at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:110) {code}
[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak
[ https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431099#comment-13431099 ] Thomas Mueller commented on OAK-225: I tried to add the comment on OAK-225 yesterday, but Jira was not working properly. It seems even today it sometimes doesn't work, let's see. The fix I did so far was to support fn:lower-case. This wasn't supported before (even for very very simple queries). The query is not converted correctly, I know, and this needs to be either fixed or it has to fail (throw an exception). For XPath queries of this form, a join would be needed I believe. I didn't look into that so far, I'm not sure how easy it is to support it (all I know is that it's not trivial). I guess it's not just a question on whether we 'want' to support it (I want :-) but also if it's worth the effort, and this I'm not convinced yet. It sounds like, as a short term solution, it would be relatively easy to change the query in Sling, but in the long term I guess it would be better to support such queries. 
[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak
[ https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431133#comment-13431133 ] Thomas Mueller commented on OAK-225: Let me rephrase it: I'm not convinced it's worth supporting such queries *right now* (in the near term). If there are many such queries, then yes; if it's the only one, I would rather postpone support until the end of the year, and make such queries throw an exception for now.
[jira] [Resolved] (OAK-209) BlobStore: use SHA-256 instead of SHA-1, and use two directory levels for FileBlobStore
[ https://issues.apache.org/jira/browse/OAK-209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-209. Resolution: Fixed Revision 1368520 and revision 1368542. Some additional changes are included as some of the tests had to be changed in order to use SHA-256. Also I documented and changed the internal BlobStore interface a bit. BlobStore: use SHA-256 instead of SHA-1, and use two directory levels for FileBlobStore Key: OAK-209 URL: https://issues.apache.org/jira/browse/OAK-209 Project: Jackrabbit Oak Issue Type: Bug Components: mk Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Currently we use SHA-1 as the hash algorithm for the blob store (same as with Jackrabbit 2.x). I think it makes sense if we use SHA-256 instead: Advantages: - SHA-1 is considered broken by some experts: http://www.schneier.com/blog/archives/2005/02/sha1_broken.html - SHA-256 belongs to the SHA-2 family, which is recommended by NIST for new applications: http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html Disadvantages: - Longer file name - Longer content hash - Not compatible with Jackrabbit 2.x For the FileBlobStore, the current implementation uses only one directory level while Jackrabbit 2.x uses 3 levels. I think we should use two levels for Oak, to avoid too many files in the same directory.
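A minimal sketch of content-addressed storage with SHA-256 and two directory levels, as discussed in OAK-209. The issue does not state how the two levels are derived; using the first two hex-encoded bytes of the digest as directory names is an assumption, as are the class and method names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class BlobPathSketch {

    // Computes the SHA-256 digest of the data as a lowercase hex string (64 characters).
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Assumed layout: <first byte>/<second byte>/<full hash>, i.e. two directory levels.
    static String relativePath(String hex) {
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/" + hex;
    }

    public static void main(String[] args) {
        String hex = sha256Hex("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(relativePath(hex));
    }
}
```

Two levels of 256 entries each spread the files over 65536 directories, which is the kind of fan-out the issue asks for to avoid too many files in one directory.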
[jira] [Updated] (JCR-3369) Garbage collector improvements
[ https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated JCR-3369: Fix Version/s: 2.4.3 Garbage collector improvements Key: JCR-3369 URL: https://issues.apache.org/jira/browse/JCR-3369 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-core Reporter: Mete Atamel Assignee: Thomas Mueller Fix For: 2.2.13, 2.4.3, 2.5.1 Attachments: JCR-3369-2.2.patch, JCR-3369-2.4.patch, JCR-3369-trunk.patch Original Estimate: 48h Remaining Estimate: 48h We identified a number of improvements to garbage collector related code to make it more robust, specifically: 1- As discussed in JCR-3340, when GC goes through nodes, it can encounter a lot of ItemStateExceptions. Currently, the stack traces of these exceptions are not logged, and this makes debugging difficult. Instead, ItemStateExceptions should at least be logged with a full stack trace every 1 minute or so. 2- As discussed in JCR-3341, GC does not fail fast if there is a problem and it should. 3- Session usage in the GC is problematic. The session in GC is used for traversing the content and marking the binaries, but the listener in that class uses the same session as well, when a node is added. GC should rather use a separate session in onEvent() to avoid concurrent use. 4- GC listens for the NODE_ADDED event for moved nodes, but it should listen for NODE_MOVED instead.
[jira] [Commented] (JCR-3369) Garbage collector improvements
[ https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425573#comment-13425573 ] Thomas Mueller commented on JCR-3369: Merged into the 2.4 branch in revision 1367435
[jira] [Resolved] (JCR-3369) Garbage collector improvements
[ https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3369. Resolution: Fixed
[jira] [Commented] (OAK-189) Swallowed exceptions
[ https://issues.apache.org/jira/browse/OAK-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422310#comment-13422310 ] Thomas Mueller commented on OAK-189: Well, this is not about checked versus unchecked exceptions, but about catching an exception and then simply returning null, without logging the exception and without re-throwing a different exception. The exception is silently ignored, and the code behaves in a different way. Swallowed exceptions Key: OAK-189 URL: https://issues.apache.org/jira/browse/OAK-189 Project: Jackrabbit Oak Issue Type: Bug Components: jcr Reporter: Thomas Mueller Exceptions should not be silently swallowed. This is currently done in SessionDelegate$SessionNameMapper, methods getOakPrefix(), getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch RepositoryException, don't log by default (only when using debug level), and don't log the exception stack trace or throw an exception. Catching a very wide band of exceptions (RepositoryException) and then simply returning null is not an acceptable solution in my view.
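The pattern criticised in OAK-189 and one possible alternative can be sketched as below. This is not the SessionDelegate code: the nested RepositoryException is a stand-in for javax.jcr.RepositoryException so the example needs no JCR dependency, and NamespaceLookup, prefixSwallowing and prefixPropagating are hypothetical names.

```java
class SwallowedExceptionDemo {

    // Stand-in for javax.jcr.RepositoryException (assumption, to keep the sketch self-contained).
    static class RepositoryException extends Exception {
        RepositoryException(String message) {
            super(message);
        }
    }

    // Hypothetical lookup that may fail for reasons unrelated to the prefix itself.
    interface NamespaceLookup {
        String getPrefix(String uri) throws RepositoryException;
    }

    // Problematic: a failure is indistinguishable from "no such prefix".
    static String prefixSwallowing(NamespaceLookup lookup, String uri) {
        try {
            return lookup.getPrefix(uri);
        } catch (RepositoryException e) {
            return null; // exception silently ignored, no log, no re-throw
        }
    }

    // Alternative: keep the cause and let the caller see that something went wrong.
    static String prefixPropagating(NamespaceLookup lookup, String uri) {
        try {
            return lookup.getPrefix(uri);
        } catch (RepositoryException e) {
            throw new IllegalStateException("Failed to look up prefix for " + uri, e);
        }
    }

    public static void main(String[] args) {
        NamespaceLookup failing = uri -> {
            throw new RepositoryException("repository unavailable");
        };
        System.out.println("swallowing returns: " + prefixSwallowing(failing, "urn:example"));
        try {
            prefixPropagating(failing, "urn:example");
        } catch (IllegalStateException e) {
            System.out.println("propagating reports: " + e.getCause().getMessage());
        }
    }
}
```

Whether to re-throw, as here, or to log with the full stack trace is a design decision the issue leaves open; the sketch only shows that the failure should not disappear.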
[jira] [Resolved] (JCR-3396) Simplify the code when possible
[ https://issues.apache.org/jira/browse/JCR-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3396. Resolution: Fixed Simplify the code when possible Key: JCR-3396 URL: https://issues.apache.org/jira/browse/JCR-3396 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Priority: Minor Sometimes it's possible to simplify the code, for example: - making methods static when possible, so a reader knows the method doesn't change the state of an object - the else is unnecessary if the if block always returns
[jira] [Commented] (OAK-182) Support for invisible internal content
[ https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13421322#comment-13421322 ] Thomas Mueller commented on OAK-182: I think the ValidatingEditor shouldn't ignore hidden content... the MergingNodeStateDiff class probably shouldn't ignore hidden content. For index content, I don't agree. I see no reason to validate or merge changes in the index. The only hidden content I am aware of is index content. But the tests seem to work now, so I will not add the filtering to ValidatingEditor and MergingNodeStateDiff. I guess we will come back to this once it's a performance problem. Also, I will change NodeStateUtils.isHidden() to support simple names only. Support for invisible internal content Key: OAK-182 URL: https://issues.apache.org/jira/browse/OAK-182 Project: Jackrabbit Oak Issue Type: New Feature Components: core, jcr Reporter: Jukka Zitting Attachments: OAK-182-b.patch As discussed on the mailing list (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a naming pattern like {{:name}} for internal content that we don't want to directly expose to JCR clients. JCR-related functionality like the namespace and node type validators and the observation dispatcher (see also OAK-181) should know to ignore such content and the JCR binding in oak-jcr should automatically filter out such internal content. Such internal content should probably also not be indexed for search.
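A minimal sketch of the {{:name}} convention, assuming that "hidden" means a simple name (not a path) that starts with a colon. The class HiddenNames and its methods are hypothetical stand-ins written for illustration; they are not the actual NodeStateUtils.isHidden() implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HiddenNames {

    // Assumption: a name is hidden if it starts with ':'.
    // Only simple names are supported here, not paths such as "/a/:b".
    static boolean isHidden(String name) {
        return name.startsWith(":");
    }

    // Returns the names a JCR client would be allowed to see.
    static List<String> visible(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!isHidden(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("jcr:system", ":data", "content");
        System.out.println(visible(children)); // prints "[jcr:system, content]"
    }
}
```

Note that a prefixed name such as jcr:system contains a colon but does not start with one, so it stays visible.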
[jira] [Created] (JCR-3396) Simplify the code when possible
Thomas Mueller created JCR-3396: --- Summary: Simplify the code when possible Key: JCR-3396 URL: https://issues.apache.org/jira/browse/JCR-3396 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Priority: Minor Sometimes it's possible to simplify the code, for example: - making methods static when possible, so a reader knows the method doesn't change the state of an object - the else is unnecessary if the if block always returns
[jira] [Created] (OAK-201) NamespaceRegistry is very slow
Thomas Mueller created OAK-201: -- Summary: NamespaceRegistry is very slow Key: OAK-201 URL: https://issues.apache.org/jira/browse/OAK-201 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller The methods NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example by NamePathMapperImpl.getOakName. They don't do any caching, which is a problem because the mapping has to be read from the repository each time. Even if they did cache, it wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates a new NamespaceRegistryImpl each time it is called. To allow caching of known mappings, the instance needs to be cached as well.
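The point about caching both the mappings and the instance can be sketched as follows. This is an illustration under stated assumptions, not the Oak implementation: CachingNamespaceLookup is a hypothetical class, the slow repository read is modelled as a plain function, and cache invalidation when mappings change is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class CachingNamespaceLookup {

    // Stand-in for the expensive read from the repository (assumption).
    private final Function<String, String> slowLookup;

    // Known prefix-to-URI mappings. They survive only as long as this instance.
    private final Map<String, String> cache = new HashMap<>();

    // Counts how often the slow path was taken, for demonstration only.
    int slowCalls;

    CachingNamespaceLookup(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    String getURI(String prefix) {
        return cache.computeIfAbsent(prefix, p -> {
            slowCalls++;
            return slowLookup.apply(p);
        });
    }

    public static void main(String[] args) {
        // One shared instance: the second call is answered from the cache.
        CachingNamespaceLookup shared = new CachingNamespaceLookup(p -> "http://example.org/" + p);
        shared.getURI("jcr");
        shared.getURI("jcr");
        System.out.println(shared.slowCalls); // prints "1"
    }
}
```

If a new CachingNamespaceLookup were created for every call, as the issue describes for WorkspaceImpl.getNamespaceRegistry, each instance would start with an empty map and the cache would never be hit.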
[jira] [Created] (OAK-202) Simplify the code when possible
Thomas Mueller created OAK-202: -- Summary: Simplify the code when possible Key: OAK-202 URL: https://issues.apache.org/jira/browse/OAK-202 Project: Jackrabbit Oak Issue Type: Improvement Reporter: Thomas Mueller Priority: Minor Sometimes it's possible to simplify the code, for example: - making methods static when possible, so a reader knows the method doesn't change the state of an object - the else is unnecessary if the if block always returns
[jira] [Commented] (OAK-201) NamespaceRegistry is very slow
[ https://issues.apache.org/jira/browse/OAK-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13420578#comment-13420578 ] Thomas Mueller commented on OAK-201: Yes, that makes sense. It seems sometimes a one-way conversion is needed currently, for example in NodeImpl.setPrimaryType: String jcrPrimaryType = sessionDelegate.getOakPathOrThrow(Property.JCR_PRIMARY_TYPE); Property.JCR_PRIMARY_TYPE is the expanded form ({http://...}primaryType). I guess we could create constants with the short form (jcr:primaryType), and then check if there are remappings. Or use a hardcoded list for known remappings (either just the prefixes or the complete names). NamespaceRegistry is very slow -- Key: OAK-201 URL: https://issues.apache.org/jira/browse/OAK-201 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller The methods NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example by NamePathMapperImpl.getOakName. They don't do any caching, which is a problem because the mapping has to be read from the repository each time. Even if they did cache, it wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates a new NamespaceRegistryImpl each time it is called. To allow caching of known mappings, the instance needs to be cached as well.
[jira] [Commented] (OAK-201) NamespaceRegistry is very slow
[ https://issues.apache.org/jira/browse/OAK-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13420597#comment-13420597 ] Thomas Mueller commented on OAK-201: A stack trace:
SessionImpl.refresh(boolean) line: 156
WorkspaceImpl$1.refresh() line: 153
WorkspaceImpl$1(NamespaceRegistryImpl).getURI(String) line: 162
SessionImpl(AbstractSession).getNamespaceURI(String) line: 132
SessionDelegate$SessionNameMapper.getOakPrefix(String) line: 453
SessionDelegate$SessionNameMapper(AbstractNameMapper).getOakName(String) line: 61
NamePathMapperImpl.getOakName(String) line: 46
NodeTypeManagerImpl.getNodeType(String) line: 83
NodeImpl.addNode(String, String) line: 217
So for each addNode(String, String), currently there is a Session.refresh(true). NamespaceRegistry is very slow -- Key: OAK-201 URL: https://issues.apache.org/jira/browse/OAK-201 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller The methods NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example by NamePathMapperImpl.getOakName. They don't do any caching, which is a problem because the mapping has to be read from the repository each time. Even if they did cache, it wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates a new NamespaceRegistryImpl each time it is called. To allow caching of known mappings, the instance needs to be cached as well.
[jira] [Commented] (OAK-181) Observation / indexing: don't create events for index updates
[ https://issues.apache.org/jira/browse/OAK-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414944#comment-13414944 ] Thomas Mueller commented on OAK-181: Revision 1361938: the index content, as well as the internal index data, is now stored in a child node. The name of that child node is currently :data, but that can be changed later if required. There is one such node per index, and one for the internal index data and temporary storage (used for move operations). The internal index data is currently just the revision id of the latest indexed revision. the index should be part of the repository (e.g. as binary nt:files), so you can easily back them up and copy over using the JCR API (and package systems on top of it) IIUC, that is one of the major reasons to put indexes into the repository. How visible the index data should be is a good question. I think we should leave it somewhat open currently, and decide once we have more experience. I think the main reasons to put the index data in the repository are: - to simplify backup / storage / maintenance - scalability (so the index can scale in the same way the repository can scale) - reduce complexity associated with separate storage for indexes But making the index accessible over the JCR API wasn't a goal so far (as far as I'm aware). What you describe are use cases I didn't think about so far. Within relational databases, I never heard about a use case to copy index data from one database to another. You generally just copy the data, and then let the database reindex it. If you want to copy the index data, then you do a full database backup.
Observation / indexing: don't create events for index updates - Key: OAK-181 URL: https://issues.apache.org/jira/browse/OAK-181 Project: Jackrabbit Oak Issue Type: New Feature Reporter: Thomas Mueller If index data is stored in the repository (for example under jcr:system/oak:indexes), then each change in the content might result in one or multiple changes in the affected indexes. Observation events should only be created for content changes, not for index changes.
[jira] [Commented] (OAK-182) Support for invisible internal content
[ https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414964#comment-13414964 ] Thomas Mueller commented on OAK-182: When enabling the IndexWrapper in the ContentRepositoryImpl default constructor, the org.apache.jackrabbit.oak.jcr.RepositoryTest.observation test fails with the exception stack trace below. I have a patch that fixed the issue, but I'm not familiar with the code and wonder if that is really the best way to fix it:
Index: src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java
===================================================================
--- src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java (revision 1361947)
+++ src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java (working copy)
@@ -97,6 +97,10 @@
         Set<String> baseChildNodes = new HashSet<String>();
         for (ChildNodeEntry beforeCNE : base.getChildNodeEntries()) {
             String name = beforeCNE.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             NodeState beforeChild = beforeCNE.getNodeState();
             NodeState afterChild = getChildNode(name);
             if (afterChild == null) {
@@ -110,6 +114,10 @@
         }
         for (ChildNodeEntry afterChild : getChildNodeEntries()) {
             String name = afterChild.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             if (!baseChildNodes.contains(name)) {
                 diff.childNodeAdded(name, afterChild.getNodeState());
             }
Stack trace without the patch:
Exception in thread "Observation" java.lang.IllegalArgumentException: '/jcr:system/indexes/:data' is not a valid path. Prefix must not be empty
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl$1.error(NamePathMapperImpl.java:108)
    at org.apache.jackrabbit.oak.namepath.JcrPathParser.parse(JcrPathParser.java:151)
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.getJcrPath(NamePathMapperImpl.java:122)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.jcrPath(ChangeProcessor.java:104)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.propertyChanged(ChangeProcessor.java:117)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:87)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.core.RootImpl$1.getChanges(RootImpl.java:187)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor.run(ChangeProcessor.java:70)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Support for invisible internal content Key: OAK-182 URL: https://issues.apache.org/jira/browse/OAK-182 Project: Jackrabbit Oak Issue Type: New Feature Components: core, jcr Reporter: Jukka Zitting As discussed on the mailing list (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a naming pattern like {{:name}} for internal content that we don't want to directly expose to JCR clients. JCR-related functionality like the namespace and node type validators and the observation dispatcher (see also OAK-181) should know to ignore such content and the JCR binding in oak-jcr should automatically
[jira] [Comment Edited] (OAK-182) Support for invisible internal content
[ https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414964#comment-13414964 ] Thomas Mueller edited comment on OAK-182 at 7/16/12 9:54 AM: - When enabling the IndexWrapper in the ContentRepositoryImpl default constructor, the org.apache.jackrabbit.oak.jcr.RepositoryTest.observation test fails with the exception stack trace below. I have a patch that fixed the issue, but I'm not familiar with the code and wonder if that is really the best way to fix it:
{code}
Index: src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java
===================================================================
--- src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java (revision 1361947)
+++ src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java (working copy)
@@ -97,6 +97,10 @@
         Set<String> baseChildNodes = new HashSet<String>();
         for (ChildNodeEntry beforeCNE : base.getChildNodeEntries()) {
             String name = beforeCNE.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             NodeState beforeChild = beforeCNE.getNodeState();
             NodeState afterChild = getChildNode(name);
             if (afterChild == null) {
@@ -110,6 +114,10 @@
         }
         for (ChildNodeEntry afterChild : getChildNodeEntries()) {
             String name = afterChild.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             if (!baseChildNodes.contains(name)) {
                 diff.childNodeAdded(name, afterChild.getNodeState());
             }
{code}
Stack trace without the patch:
{code}
Exception in thread "Observation" java.lang.IllegalArgumentException: '/jcr:system/indexes/:data' is not a valid path. Prefix must not be empty
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl$1.error(NamePathMapperImpl.java:108)
    at org.apache.jackrabbit.oak.namepath.JcrPathParser.parse(JcrPathParser.java:151)
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.getJcrPath(NamePathMapperImpl.java:122)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.jcrPath(ChangeProcessor.java:104)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.propertyChanged(ChangeProcessor.java:117)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:87)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.core.RootImpl$1.getChanges(RootImpl.java:187)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor.run(ChangeProcessor.java:70)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
{code}
[jira] [Created] (OAK-189) Swallowed exceptions
Thomas Mueller created OAK-189: -- Summary: Swallowed exceptions Key: OAK-189 URL: https://issues.apache.org/jira/browse/OAK-189 Project: Jackrabbit Oak Issue Type: Bug Components: jcr Reporter: Thomas Mueller Exceptions should not be silently swallowed. This is currently done in SessionDelegate$SessionNameMapper, methods getOakPrefix(), getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch RepositoryException, don't log by default (only when using debug level), and don't log the exception stack trace or throw an exception. Catching a very wide band of exceptions (RepositoryException) and then simply returning null is not an acceptable solution in my view.
[jira] [Commented] (OAK-189) Swallowed exceptions
[ https://issues.apache.org/jira/browse/OAK-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13415201#comment-13415201 ] Thomas Mueller commented on OAK-189: The reason why I think it's not acceptable is that the exception could be anything, for example out of disk space, or some internal error. Just silently returning null, without logging, makes it hard to find the root cause of the problem, because everything else might just look fine. Some code might accept null as a correct answer, so that the program just behaves somewhat differently. Swallowed exceptions Key: OAK-189 URL: https://issues.apache.org/jira/browse/OAK-189 Project: Jackrabbit Oak Issue Type: Bug Components: jcr Reporter: Thomas Mueller Exceptions should not be silently swallowed. This is currently done in SessionDelegate$SessionNameMapper, methods getOakPrefix(), getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch RepositoryException, don't log by default (only when using debug level), and don't log the exception stack trace or throw an exception. Catching a very wide band of exceptions (RepositoryException) and then simply returning null is not an acceptable solution in my view.
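The difference between swallowing an exception and surfacing it can be sketched as below. Everything in this example is hypothetical: LookupException stands in for a broad checked exception such as RepositoryException, and PrefixLookup is not the SessionNameMapper code. Wrapping in an unchecked exception is only one possible remedy; logging the stack trace or re-throwing a more specific checked exception would address the same concern.

```java
class PrefixLookup {

    // Hypothetical checked exception, standing in for a broad repository failure.
    static class LookupException extends Exception {
        LookupException(String message) {
            super(message);
        }
    }

    // Simulated backend: an empty URI triggers a failure (assumption for the demo).
    static String lookup(String uri) throws LookupException {
        if (uri.isEmpty()) {
            throw new LookupException("backend failure");
        }
        return "ex";
    }

    // The pattern criticised in the issue: the cause is lost and
    // callers cannot tell "no prefix" from "something went wrong".
    static String getPrefixSwallowing(String uri) {
        try {
            return lookup(uri);
        } catch (LookupException e) {
            return null;
        }
    }

    // One alternative: keep the original exception as the cause so
    // the root problem shows up in the stack trace.
    static String getPrefix(String uri) {
        try {
            return lookup(uri);
        } catch (LookupException e) {
            throw new IllegalStateException("Failed to resolve prefix for '" + uri + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getPrefixSwallowing("")); // prints "null", failure is invisible
        try {
            getPrefix("");
        } catch (IllegalStateException e) {
            System.out.println(e.getCause().getMessage()); // prints "backend failure"
        }
    }
}
```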
[jira] [Commented] (OAK-182) Support for invisible internal content
[ https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413580#comment-13413580 ] Thomas Mueller commented on OAK-182: Such internal content should probably also not be indexed for search. That's certainly possible. Later on we can still allow indexing such 'hidden' properties once we have a use case. Support for invisible internal content Key: OAK-182 URL: https://issues.apache.org/jira/browse/OAK-182 Project: Jackrabbit Oak Issue Type: New Feature Components: core, jcr Reporter: Jukka Zitting As discussed on the mailing list (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a naming pattern like {{:name}} for internal content that we don't want to directly expose to JCR clients. JCR-related functionality like the namespace and node type validators and the observation dispatcher (see also OAK-181) should know to ignore such content and the JCR binding in oak-jcr should automatically filter out such internal content. Such internal content should probably also not be indexed for search.
[jira] [Resolved] (OAK-179) Tests should not fail if there is a jcr:system node
[ https://issues.apache.org/jira/browse/OAK-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-179. Resolution: Fixed Fixed in revision 1360591 Tests should not fail if there is a jcr:system node --- Key: OAK-179 URL: https://issues.apache.org/jira/browse/OAK-179 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Some of the tests fail if there is a node /jcr:system. The tests should be able to deal with such a node.
[jira] [Commented] (JCR-3385) DbClusterTest fails when port is already in use
[ https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412775#comment-13412775 ] Thomas Mueller commented on JCR-3385: - The problem is that the ports are also hardcoded in the repository-h2.xml file. The easiest solution is probably to use a non-clustered, embedded database stored in the parent directory of the cluster nodes. DbClusterTest fails when port is already in use --- Key: JCR-3385 URL: https://issues.apache.org/jira/browse/JCR-3385 Project: Jackrabbit Content Repository Issue Type: Bug Components: clustering, jackrabbit-core Affects Versions: 2.5 Reporter: Jukka Zitting Assignee: Thomas Mueller Priority: Minor Attachments: 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port numbers 9001 and 9002 which make the tests fail whenever there already is some process listening on those ports. It would be better if the classes automatically looked for unused ports for the tests.
[jira] [Updated] (JCR-3385) DbClusterTest fails when port is already in use
[ https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated JCR-3385: Attachment: JCR-3385-embedded-shared-db.patch Alternative patch using an embedded database - a bit less real world but results in simpler test cases DbClusterTest fails when port is already in use --- Key: JCR-3385 URL: https://issues.apache.org/jira/browse/JCR-3385 Project: Jackrabbit Content Repository Issue Type: Bug Components: clustering, jackrabbit-core Affects Versions: 2.5 Reporter: Jukka Zitting Assignee: Thomas Mueller Priority: Minor Attachments: 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch, JCR-3385-embedded-shared-db.patch The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port numbers 9001 and 9002 which make the tests fail whenever there already is some process listening on those ports. It would be better if the classes automatically looked for unused ports for the tests.
[jira] [Resolved] (JCR-3385) DbClusterTest fails when port is already in use
[ https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3385. - Resolution: Fixed I have committed my patch now - not because it's better than Jukka's patch, but because I hope the tests will be easier to understand and maintain in the future. Revision 1360692 DbClusterTest fails when port is already in use --- Key: JCR-3385 URL: https://issues.apache.org/jira/browse/JCR-3385 Project: Jackrabbit Content Repository Issue Type: Bug Components: clustering, jackrabbit-core Affects Versions: 2.5 Reporter: Jukka Zitting Assignee: Thomas Mueller Priority: Minor Attachments: 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch, JCR-3385-embedded-shared-db.patch The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port numbers 9001 and 9002 which make the tests fail whenever there already is some process listening on those ports. It would be better if the classes automatically looked for unused ports for the tests.
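For the "look for unused ports" idea in the issue description, one common approach in Java is to bind a ServerSocket to port 0 and let the operating system pick a free port. The sketch below is only an illustration of that technique; it is not taken from either attached patch, and the class name FreePort is hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;

class FreePort {

    // Asks the operating system for a currently unused TCP port.
    // Port 0 means "any free port"; the socket is closed again right away.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("free port: " + findFreePort());
    }
}
```

There is a small race: another process could grab the port between the close and the moment the test binds it. An embedded database, as in the committed patch, avoids TCP ports altogether.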
[jira] [Commented] (OAK-178) Query: index definition documentation and tooling
[ https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412591#comment-13412591 ] Thomas Mueller commented on OAK-178: and define an Oak namespace Sure! How would I do that? Query: index definition documentation and tooling - Key: OAK-178 URL: https://issues.apache.org/jira/browse/OAK-178 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined, that means data is only indexed if there is a matching index. Those indexes are then automatically used for the appropriate queries. The current plan is to define indexes as nodes within a repository. An index is created if an index metadata node is created, and the index is removed if the index metadata node is removed. The index content is automatically updated if the content changes (either synchronously or asynchronously). The location and structure of the index metadata needs to be defined and documented. Also, to simplify defining and managing indexes, it may make sense to write a utility (helper class) for managing indexes. Internally, this utility uses the regular JCR API and accesses the documented index metadata nodes.
[jira] [Commented] (OAK-178) Query: index definition documentation and tooling
[ https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412594#comment-13412594 ] Thomas Mueller commented on OAK-178: Like this?
#P oak-jcr
Index: src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd
===================================================================
--- src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd (revision 1360166)
+++ src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd (working copy)
@@ -19,6 +19,7 @@
 <jcr='http://www.jcp.org/jcr/1.0'>
 <nt='http://www.jcp.org/jcr/nt/1.0'>
 <mix='http://www.jcp.org/jcr/mix/1.0'>
+<oak='http://jackrabbit.apache.org/oak/1.0'>
 
 //------------------------------------------------------------------------------
 // B A S E  T Y P E
Query: index definition documentation and tooling - Key: OAK-178 URL: https://issues.apache.org/jira/browse/OAK-178 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined, that means data is only indexed if there is a matching index. Those indexes are then automatically used for the appropriate queries. The current plan is to define indexes as nodes within a repository. An index is created if an index metadata node is created, and the index is removed if the index metadata node is removed. The index content is automatically updated if the content changes (either synchronously or asynchronously). The location and structure of the index metadata needs to be defined and documented. Also, to simplify defining and managing indexes, it may make sense to write a utility (helper class) for managing indexes. Internally, this utility uses the regular JCR API and accesses the documented index metadata nodes.
[jira] [Commented] (OAK-178) Query: index definition documentation and tooling
[ https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412609#comment-13412609 ] Thomas Mueller commented on OAK-178: I saw the following change is also needed:
#P oak-jcr
Index: src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java
===================================================================
--- src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java (revision 1360577)
+++ src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java (working copy)
@@ -47,6 +47,7 @@
             tmp.put("nt", "http://www.jcp.org/jcr/nt/1.0");
             tmp.put("mix", "http://www.jcp.org/jcr/mix/1.0");
             tmp.put("xml", "http://www.w3.org/XML/1998/namespace");
+            tmp.put("oak", "http://jackrabbit.apache.org/oak/1.0");
             nsdefaults = Collections.unmodifiableMap(tmp);
         }
With those two changes, the tests seem to work. Query: index definition documentation and tooling - Key: OAK-178 URL: https://issues.apache.org/jira/browse/OAK-178 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined, that means data is only indexed if there is a matching index. Those indexes are then automatically used for the appropriate queries. The current plan is to define indexes as nodes within a repository. An index is created if an index metadata node is created, and the index is removed if the index metadata node is removed. The index content is automatically updated if the content changes (either synchronously or asynchronously). The location and structure of the index metadata needs to be defined and documented. Also, to simplify defining and managing indexes, it may make sense to write a utility (helper class) for managing indexes. Internally, this utility uses the regular JCR API and accesses the documented index metadata nodes.
[jira] [Created] (OAK-179) Tests should not fail if there is a jcr:system node
Thomas Mueller created OAK-179: -- Summary: Tests should not fail if there is a jcr:system node Key: OAK-179 URL: https://issues.apache.org/jira/browse/OAK-179 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Some of the tests fail if there is a node /jcr:system. The tests should be able to deal with such a node.
[jira] [Created] (OAK-180) More real world benchmarks
Thomas Mueller created OAK-180: -- Summary: More real world benchmarks Key: OAK-180 URL: https://issues.apache.org/jira/browse/OAK-180 Project: Jackrabbit Oak Issue Type: New Feature Reporter: Thomas Mueller While the tests in oak-bench are good, they are not very close to real-world scenarios. Specifically, we need tests with more nodes (for example 15 million nodes), a more complex node structure, and more complex operations (read operations, write operations, fulltext index, queries, and access rights). It doesn't need to be very complex, but at least closer to reality. I'm thinking about something like what TPC-C is for databases, but with content management operations instead of order-entry. But that's a longer term goal. The goal of the test is to detect problem areas in our implementation (so this isn't just about scalability).
[jira] [Created] (OAK-181) Observation / indexing: don't create events for index updates
Thomas Mueller created OAK-181: -- Summary: Observation / indexing: don't create events for index updates Key: OAK-181 URL: https://issues.apache.org/jira/browse/OAK-181 Project: Jackrabbit Oak Issue Type: New Feature Reporter: Thomas Mueller If index data is stored in the repository (for example under jcr:system/oak:indexes), then each change in the content might result in one or multiple changes in the affected indexes. Observation events should only be created for content changes, not for index changes.
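The rule in OAK-181 can be illustrated with a small path filter. This is only a sketch under assumptions: the class and method names are hypothetical, and the index location is taken from the "for example" in the issue text, not from any decided Oak layout.

```java
/** Hypothetical filter; not part of Oak. */
final class IndexEventFilterSketch {
    // Assumed index location, taken from the example in the issue text.
    static final String INDEX_ROOT = "/jcr:system/oak:indexes";

    /** Returns true if a change at this path should produce an observation event. */
    static boolean isContentChange(String path) {
        boolean underIndexRoot = path.equals(INDEX_ROOT) || path.startsWith(INDEX_ROOT + "/");
        return !underIndexRoot;
    }
}
```

A change at /content/a would produce an event, while a change at /jcr:system/oak:indexes/foo would be suppressed.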
[jira] [Created] (OAK-178) Query: index definition documentation and tooling
Thomas Mueller created OAK-178: -- Summary: Query: index definition documentation and tooling Key: OAK-178 URL: https://issues.apache.org/jira/browse/OAK-178 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Unlike Jackrabbit 2.x, indexes in the Oak query engine are user-defined; that is, data is only indexed if there is a matching index. Those indexes are then automatically used for the appropriate queries. The current plan is to define indexes as nodes within a repository. An index is created if an index metadata node is created, and the index is removed if the index metadata node is removed. The index content is automatically updated if the content changes (either synchronously or asynchronously). The location and structure of the index metadata need to be defined and documented. Also, to simplify defining and managing indexes, it may make sense to write a utility (helper class) for managing indexes. Internally, this utility uses the regular JCR API and accesses the documented index metadata nodes.
[jira] [Commented] (OAK-169) Support orderable nodes
[ https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13410207#comment-13410207 ] Thomas Mueller commented on OAK-169: Again about linked list: Jukka has a point that a regular linked list would be quite slow, because to display the child node names in order you would need to load all child nodes. An alternative would be to use a grouped linked list. The parent node would keep the child node names of the first 100 (or whatever number) child nodes in a multi-value property. The last of those 100 nodes (if there are that many) would contain another multi-value property of the next 100 child node names, and so on. This is a special case of a skip list, just as a regular linked list is, and as a normal multi-value property with all (ordered) child node names is. If a node in the middle is removed, just one group list would shrink, let's say from 100 to 99. If two groups combined have fewer than 100 elements, those two groups could be merged. Support orderable nodes --- Key: OAK-169 URL: https://issues.apache.org/jira/browse/OAK-169 Project: Jackrabbit Oak Issue Type: New Feature Components: jcr Reporter: Jukka Zitting There are JCR clients that depend on the ability to explicitly specify the order of child nodes. That functionality is not included in the MicroKernel tree model, so we need to implement it either in oak-core or oak-jcr using something like an extra (hidden) {{oak:childOrder}} property that records the specified ordering of child nodes. A multi-valued string property is probably good enough for this.
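The grouped linked list described in this comment could be modeled roughly as follows. This is a minimal in-memory sketch, not Oak code: the class and method names are hypothetical, each Group object stands in for one multi-value property, and the link to the next group is held in the Group itself rather than in the last child node of the previous group.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of a grouped linked list of child names. */
final class GroupedChildOrder {
    /** One group: up to maxGroupSize names plus a link to the next group. */
    static final class Group {
        final List<String> names = new ArrayList<>();
        Group next;
    }

    private final int maxGroupSize;
    private final Group head = new Group(); // stands in for the property on the parent node

    GroupedChildOrder(int maxGroupSize) {
        this.maxGroupSize = maxGroupSize;
    }

    /** Appends a child name, opening a new group when the last one is full. */
    void add(String name) {
        Group g = head;
        while (g.next != null) {
            g = g.next;
        }
        if (g.names.size() == maxGroupSize) {
            g.next = new Group();
            g = g.next;
        }
        g.names.add(name);
    }

    /** Removes a name; only its own group shrinks, then a merge with a neighbour is tried. */
    boolean remove(String name) {
        Group prev = null;
        for (Group g = head; g != null; prev = g, g = g.next) {
            if (g.names.remove(name)) {
                if (prev == null || !tryMerge(prev)) {
                    tryMerge(g);
                }
                return true;
            }
        }
        return false;
    }

    /** Merges g.next into g if both together hold fewer than maxGroupSize names. */
    private boolean tryMerge(Group g) {
        Group n = g.next;
        if (n == null || g.names.size() + n.names.size() >= maxGroupSize) {
            return false;
        }
        g.names.addAll(n.names);
        g.next = n.next;
        return true;
    }

    /** Returns all child names in order by walking the groups. */
    List<String> orderedNames() {
        List<String> all = new ArrayList<>();
        for (Group g = head; g != null; g = g.next) {
            all.addAll(g.names);
        }
        return all;
    }

    /** Number of groups, i.e. how many list properties would have to be read. */
    int groupCount() {
        int count = 0;
        for (Group g = head; g != null; g = g.next) {
            count++;
        }
        return count;
    }
}
```

With a group size of 100, listing n child names in order would touch about n/100 groups instead of n nodes, which is the point of the grouping.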
[jira] [Commented] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result
[ https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409317#comment-13409317 ] Thomas Mueller commented on JCR-3376: - Thanks Jukka! I didn't consider the possibility of such a limitation and so didn't run the Jackrabbit test. I will try to find an alternative solution. TCK: SQLPathTest.testChildAxisRoot expected root node not in result --- Key: JCR-3376 URL: https://issues.apache.org/jira/browse/JCR-3376 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 2.6 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expected the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. 
In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets. Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. The specification says: In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... but it doesn't specify it needs to do so. To allow an implementation to return the root node, I suggest changing the test as follows: ... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'
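The two readings of '%' discussed in this issue can be compared with a small sketch. Everything here is an assumption for illustration: the class and method names are hypothetical, the general SQL reading is modeled as the regular expression `.*`, and the narrower jcr:path reading is modeled simply as "at least one character" (`.+`), which is only one possible interpretation of the quoted specification text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper comparing two interpretations of '%' in jcr:path LIKE. */
final class PathLikeSketch {
    /** Translates a LIKE pattern to a regex; '%' becomes percentRegex, '_' matches one character. */
    static Pattern compile(String like, String percentRegex) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(percentRegex);
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Evaluates: jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' over the given paths. */
    static List<String> childAxisRoot(List<String> paths, String percentRegex) {
        Pattern child = compile("/%", percentRegex);
        Pattern deeper = compile("/%/%", percentRegex);
        List<String> result = new ArrayList<>();
        for (String p : paths) {
            if (child.matcher(p).matches() && !deeper.matcher(p).matches()) {
                result.add(p);
            }
        }
        return result;
    }
}
```

For the paths /, /jcr:system, /testroot, /testdata and /testroot/a, the `.*` reading returns the root node together with the three top-level nodes, while the `.+` reading returns only the three top-level nodes, which is the result the TCK test expects.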
[jira] [Updated] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expects root node not in result
[ https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated JCR-3376: Description: The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expects the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. the specification says In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... but it doesn't specify it needs to do so. 
was: The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expected the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. the specification says In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... but it doesn't specify it needs to do so. To allow an implementation to return the root node, I suggest to change the test as follows: ... 
AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/' Summary: TCK: SQLPathTest.testChildAxisRoot expects root node not in result (was: TCK: SQLPathTest.testChildAxisRoot expected root node not in result) TCK: SQLPathTest.testChildAxisRoot expects root node not in result -- Key: JCR-3376 URL: https://issues.apache.org/jira/browse/JCR-3376 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 2.6 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expects the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch -
[jira] [Resolved] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expects root node not in result
[ https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3376. - Resolution: Fixed Revision 1359213 allows for optional nodes to be in the result. For the test SQLPathTest.testChildAxisRoot, the root node is now optional. I tested with Jackrabbit Core (-PintegrationTesting) and with Oak (oak-jcr), with the test removed from the known.issues list in the pom.xml. TCK: SQLPathTest.testChildAxisRoot expects root node not in result -- Key: JCR-3376 URL: https://issues.apache.org/jira/browse/JCR-3376 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 2.6 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expects the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. 
In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets. Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. The specification says: In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... but it doesn't specify it needs to do so.
[jira] [Commented] (OAK-169) Support orderable nodes
[ https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409228#comment-13409228 ] Thomas Mueller commented on OAK-169: Iterating over all child nodes is always an O(n) operation (n = number of child nodes). Of course if you store all child node names in the parent node you only have to access the parent node, but that parent node is then n times larger. We would be back to the behavior of Jackrabbit 2.x, where adding many child nodes is an O(n^2) operation, for orderable child node lists. This might or might not be acceptable - I don't think we decided this when we defined our goals. With the linked list approach, iterating over all child nodes will have to read all child nodes (also an O(n) operation); on the other hand, it will make it possible to support many child nodes without limitations. It is true that this approach is probably slower if there are few child nodes (compared to storing the complete child node name list in the parent). I guess to decide which approach works best in practice we first need to define which use cases we care about and which are the most common ones. Support orderable nodes --- Key: OAK-169 URL: https://issues.apache.org/jira/browse/OAK-169 Project: Jackrabbit Oak Issue Type: New Feature Components: jcr Reporter: Jukka Zitting There are JCR clients that depend on the ability to explicitly specify the order of child nodes. That functionality is not included in the MicroKernel tree model, so we need to implement it either in oak-core or oak-jcr using something like an extra (hidden) {{oak:childOrder}} property that records the specified ordering of child nodes. A multi-valued string property is probably good enough for this.
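The O(n^2) cost mentioned in this comment can be made concrete with a simple counting model. This is a sketch under stated assumptions, not a measurement of Jackrabbit: it assumes that the parent stores the complete ordered name list and rewrites it on every add, and that a linked-list add writes a constant number of entries (counted as one here). The class and method names are hypothetical.

```java
/** Hypothetical cost model: counts list entries written while adding n child nodes. */
final class ChildListCost {
    /** Full list in the parent: the k-th add rewrites a list of k names, n(n+1)/2 in total. */
    static long entriesWrittenFullList(int n) {
        long total = 0;
        for (int size = 1; size <= n; size++) {
            total += size;
        }
        return total;
    }

    /** Linked list: each add writes a constant amount, modeled here as one entry. */
    static long entriesWrittenLinkedList(int n) {
        return n;
    }
}
```

Under this model, adding 1000 child nodes writes 500500 entries with the full list and 1000 with the linked list; the price of the linked list is that reading the order has to visit every child node.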
[jira] [Comment Edited] (OAK-169) Support orderable nodes
[ https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409228#comment-13409228 ] Thomas Mueller edited comment on OAK-169 at 7/9/12 7:05 AM: Iterating over all child nodes is always an O( n ) operation (n = number of child nodes). Of course if you store all child node names in the parent node you only have to access the parent node, but that parent node is then n times larger. We would be back to the behavior of Jackrabbit 2.x, where adding many child nodes is an O( n^2 ) operation, for orderable child node lists. This might or might not be acceptable - I don't think we decided this when we defined our goals. With the linked list approach, iterating over all child nodes will have to read all child nodes (also an O( n ) operation); on the other hand, it will make it possible to support many child nodes without limitations. It is true that this approach is probably slower if there are few child nodes (compared to storing the complete child node name list in the parent). I guess to decide which approach works best in practice we first need to define which use cases we care about and which are the most common ones. was (Author: tmueller): Iterating over all child nodes is always an O(n) operation (n = number of child nodes). Of course if you store all child node names in the parent node you only have to access the parent node, but that parent node is then n times larger. We would be back to the behavior of Jackrabbit 2.x, where adding many child nodes is an O(n^2) operation, for orderable child node lists. This might or might not be acceptable - I don't think we decided this when we defined our goals. With the linked list approach, iterating over all child nodes will have to read all child nodes (also an O(n) operation); on the other hand, it will make it possible to support many child nodes without limitations. 
It is true that this approach is probably slower if there are few child nodes (compared to storing the complete child node name list in the parent). I guess to decide which approach works best in practice we first need to define which use cases we care about and which are the most common ones. Support orderable nodes --- Key: OAK-169 URL: https://issues.apache.org/jira/browse/OAK-169 Project: Jackrabbit Oak Issue Type: New Feature Components: jcr Reporter: Jukka Zitting There are JCR clients that depend on the ability to explicitly specify the order of child nodes. That functionality is not included in the MicroKernel tree model, so we need to implement it either in oak-core or oak-jcr using something like an extra (hidden) {{oak:childOrder}} property that records the specified ordering of child nodes. A multi-valued string property is probably good enough for this.
[jira] [Created] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result
Thomas Mueller created JCR-3376: --- Summary: TCK: SQLPathTest.testChildAxisRoot expected root node not in result Key: JCR-3376 URL: https://issues.apache.org/jira/browse/JCR-3376 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expected the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. the specification says In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... 
but it doesn't specify it needs to do so. To allow an implementation to return the root node, I suggest changing the test as follows: ... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'
[jira] [Resolved] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result
[ https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3376. - Resolution: Fixed Fix Version/s: 2.6 TCK: SQLPathTest.testChildAxisRoot expected root node not in result --- Key: JCR-3376 URL: https://issues.apache.org/jira/browse/JCR-3376 Project: Jackrabbit Content Repository Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 2.6 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query: SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' It expected the result to be /jcr:system, /testroot, /testdata It does not allow the implementation to return the root node ('/'). According to the specification, a JCR implementation may filter the root node, as noted by Randall Hauch - http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html - quote: Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard characters as generally used within LIKE predicates (and jcr:like in XPath): As in SQL, the character '%' represents any string of zero or more characters, and the character '_' (underscore) represents any single character. while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics jcr:path pseudo column and narrows the semantics of using LIKE with jcr:path in the second-to-last bullet point: Predicates in the WHERE clause that test jcr:path are only required to support the operators =, and LIKE. In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters) or within index brackets Because the '%' matches only a whole path segment, the /% literal only matches paths that have at least one path segment, which means that it matches all descendants of the root node. 
The specification says: In the case of LIKE predicates, support is only required for tests using the % wildcard character as a match for a whole path segment (the part between two / characters)... but it doesn't specify it needs to do so. To allow an implementation to return the root node, I suggest changing the test as follows: ... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'
[jira] [Assigned] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL
[ https://issues.apache.org/jira/browse/OAK-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller reassigned OAK-155: -- Assignee: Thomas Mueller Query: limited support for the deprecated JCR 1.0 query language Query.SQL -- Key: OAK-155 URL: https://issues.apache.org/jira/browse/OAK-155 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Assignee: Thomas Mueller Existing applications (as well as the TCK) use the JCR 1.0 query language sql. As far as I know, there are only a few differences between JCR 1.0 SQL and JCR 2.0 SQL-2. To make old applications work with Oak, I suggest we provide support for JCR 1.0 SQL as well. An additional advantage is that more of the existing TCK tests can be run. I currently don't know if the full JCR 1.0 SQL syntax can be supported (similar to XPath); if we find supporting certain features is too complicated we will document those limitations instead.
[jira] [Created] (JCR-3363) DataStore garbage collection: test case GarbageCollectorTest.testGC() is too lenient
Thomas Mueller created JCR-3363: --- Summary: DataStore garbage collection: test case GarbageCollectorTest.testGC() is too lenient Key: JCR-3363 URL: https://issues.apache.org/jira/browse/JCR-3363 Project: Jackrabbit Content Repository Issue Type: Bug Reporter: Thomas Mueller The test case GarbageCollectorTest.testGC() is supposed to test that binaries of nodes that are moved while garbage collection is running are not deleted. However, the test doesn't fail if the event listener is disabled. It should.
[jira] [Created] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL
Thomas Mueller created OAK-155: -- Summary: Query: limited support for the deprecated JCR 1.0 query language Query.SQL Key: OAK-155 URL: https://issues.apache.org/jira/browse/OAK-155 Project: Jackrabbit Oak Issue Type: Bug Reporter: Thomas Mueller Existing applications (as well as the TCK) use the JCR 1.0 query language sql. As far as I know, there are only a few differences between JCR 1.0 SQL and JCR 2.0 SQL-2. To make old applications work with Oak, I suggest we provide support for JCR 1.0 SQL as well. An additional advantage is that more of the existing TCK tests can be run. I currently don't know if the full JCR 1.0 SQL syntax can be supported (similar to XPath); if we find supporting certain features is too complicated we will document those limitations instead.
[jira] [Commented] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL
[ https://issues.apache.org/jira/browse/OAK-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402996#comment-13402996 ] Thomas Mueller commented on OAK-155: My current plan is to use a strict parser for both SQL-1 and XPath, so that unsupported syntax is rejected. For existing applications that no longer work because of that, we can decide whether to support a certain (XPath / SQL-1) feature in Oak by patching the respective parser, or to change the application, whichever is less work. An implementation detail: I plan to use the SQL-2 parser for SQL-1 as well, but with an (internal) switch so that the SQL-1 features are really only supported when using SQL-1, and rejected when using SQL-2. If this turns out to be a problem we can still split the parser. As a side effect, SQL-2 syntax is supported for SQL-1 queries. But SQL-1 is clearly deprecated, not just within the JCR spec, but also within Oak. Unlike XPath, which will very likely be supported within Oak for a longer time, SQL-1 queries should really be converted to SQL-2.
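The "one parser, internal switch" idea from the OAK-155 comment could look roughly like the sketch below. This is not Oak's parser: the class `StrictNameParser`, the `Language` enum, and the chosen example feature (an unbracketed prefixed node type name such as nt:base, assumed here to be SQL-1-only syntax) are all hypothetical and only illustrate how a strict parser can accept a feature in one mode and reject it in the other.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of a strict parser with a language switch; not Oak code. */
final class StrictNameParser {

    enum Language { SQL_1, SQL_2 }

    // [nt:base] style: accepted in both modes (SQL-2 syntax also works for SQL-1).
    private static final Pattern BRACKETED = Pattern.compile("\\[[A-Za-z_][\\w:]*\\]");
    // Plain identifier without a prefix: accepted in both modes.
    private static final Pattern PLAIN = Pattern.compile("[A-Za-z_]\\w*");
    // nt:base style without brackets: assumed to be a SQL-1-only feature.
    private static final Pattern PREFIXED = Pattern.compile("[A-Za-z_]\\w*:[A-Za-z_]\\w*");

    private final Language language;

    StrictNameParser(Language language) {
        this.language = language;
    }

    /** Returns the name without brackets, or throws if the syntax is not allowed. */
    String parseNodeTypeName(String token) {
        if (BRACKETED.matcher(token).matches()) {
            return token.substring(1, token.length() - 1);
        }
        if (PLAIN.matcher(token).matches()) {
            return token;
        }
        // The internal switch: the legacy form is only accepted in SQL-1 mode.
        if (language == Language.SQL_1 && PREFIXED.matcher(token).matches()) {
            return token;
        }
        // Strict behaviour: anything else is rejected instead of being guessed at.
        throw new IllegalArgumentException(
                "Unsupported syntax for " + language + ": " + token);
    }
}
```

With this shape, splitting the parser later only means moving the `SQL_1` branch into its own class.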
[jira] [Resolved] (JCR-3321) TCK: Strange XPath query in OrderByMultiTypeTest.testMultipleOrder
[ https://issues.apache.org/jira/browse/JCR-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-3321. Resolution: Fixed Fix Version/s: 2.6 TCK: Strange XPath query in OrderByMultiTypeTest.testMultipleOrder Key: JCR-3321 URL: https://issues.apache.org/jira/browse/JCR-3321 Project: Jackrabbit Content Repository Issue Type: Sub-task Components: jackrabbit-jcr-tests Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 2.6 The test org.apache.jackrabbit.test.api.query.OrderByMultiTypeTest.testMultipleOrder currently runs a query of the form: //testroot/*[@jcr:primaryType='nt:unstructured'] I believe there is a typo in the test, and the query should be: /jcr:root/testroot/*[@jcr:primaryType='nt:unstructured']
[jira] [Resolved] (JCR-2686) Data store garbage collection: interrupt mark
[ https://issues.apache.org/jira/browse/JCR-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-2686. Resolution: Fixed I just saw we already have a GarbageCollector.close() method which does stop the mark, so I think I can resolve this issue as fixed. Data store garbage collection: interrupt mark Key: JCR-2686 URL: https://issues.apache.org/jira/browse/JCR-2686 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-core Affects Versions: 2.1 Reporter: Stephan Huttenhuis Assignee: Thomas Mueller It would be nice if the DataStore GarbageCollector could be interrupted during a mark. This would allow applications that use Jackrabbit to shut down without having to wait for the mark to complete.
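How close() stops a running mark is not described in the message. One common way to get that behaviour is a volatile flag that the mark loop checks before each item. The sketch below assumes exactly that; `InterruptibleMark` and its `mark` method are hypothetical names, not the Jackrabbit GarbageCollector API.

```java
import java.util.function.Consumer;

/** Hypothetical sketch of a mark phase that can be stopped by close(). */
final class InterruptibleMark implements AutoCloseable {

    // volatile so that a close() from another thread is seen by the mark loop.
    private volatile boolean closed;

    /**
     * Visits ids until the input is exhausted or close() has been called.
     * Returns the number of ids that were actually visited.
     */
    int mark(Iterable<String> ids, Consumer<String> visitor) {
        int marked = 0;
        for (String id : ids) {
            if (closed) {
                break; // stop promptly instead of finishing the whole mark
            }
            visitor.accept(id);
            marked++;
        }
        return marked;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

An application shutting down would call close() from its shutdown path; the mark then ends after the item currently being visited.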
[jira] [Resolved] (JCR-2286) Implement Value.toString
[ https://issues.apache.org/jira/browse/JCR-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-2286. Resolution: Won't Fix I don't plan to fix it in Jackrabbit 2.x. Implement Value.toString Key: JCR-2286 URL: https://issues.apache.org/jira/browse/JCR-2286 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-spi-commons Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Currently QValueValue.toString() is not implemented. It would help if the method returned something human readable, both for debugging and for generating error messages. It's a bit tricky, because we need to make sure toString() never fails and has no side effects (doesn't read from files, doesn't change state), otherwise it breaks debugging. Changing the state when throwing an exception is not such a big problem in general, but it is a problem for debugging.
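The two constraints named in JCR-2286 (toString() never fails, and has no side effects) can be met by building the text only from fields that are already in memory. The sketch below is an illustration under that assumption; `DescribedValue`, its factory methods, and the 40-character limit are hypothetical and unrelated to the real QValueValue class.

```java
import java.util.Objects;

/** Hypothetical value holder whose toString() uses only in-memory fields. */
final class DescribedValue {

    private static final int MAX_SHOWN = 40; // arbitrary limit for this sketch

    private final String type;
    private final String text;   // null for binary values
    private final long length;   // known up front, never computed by reading data

    private DescribedValue(String type, String text, long length) {
        this.type = type;
        this.text = text;
        this.length = length;
    }

    static DescribedValue ofString(String s) {
        Objects.requireNonNull(s, "s");
        return new DescribedValue("STRING", s, s.length());
    }

    static DescribedValue ofBinary(long length) {
        return new DescribedValue("BINARY", null, length);
    }

    /** No I/O, no state change, no exception: only final fields are read. */
    @Override
    public String toString() {
        if (text == null) {
            // Do not open the stream just to print the value.
            return type + "[length=" + length + "]";
        }
        String shown = text.length() > MAX_SHOWN
                ? text.substring(0, MAX_SHOWN) + "..."
                : text;
        return type + "[" + shown + "]";
    }
}
```

Because the binary length is stored when the value is created, a debugger can call toString() at any time without touching the underlying file.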
[jira] [Resolved] (JCR-2998) Option to log the path for Session.save() calls
[ https://issues.apache.org/jira/browse/JCR-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved JCR-2998. Resolution: Won't Fix We have SessionState debug logging options, I don't think we need anything else. Option to log the path for Session.save() calls Key: JCR-2998 URL: https://issues.apache.org/jira/browse/JCR-2998 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-core Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor It would be nice to be able to log the path for Session.save() calls, so that it's easier to analyze if the repository is slow because of many write operations.
[jira] [Commented] (JCR-3340) GarbageCollector should ignore all NoSuchItemStateExceptions
[ https://issues.apache.org/jira/browse/JCR-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294941#comment-13294941 ] Thomas Mueller commented on JCR-3340: As discussed offline with Mete we came to the conclusion not to apply the patch now. There is a risk we could hide a bigger problem. FYI the underlying exception is: javax.jcr.RepositoryException: failed to retrieve state of intermediary node

javax.jcr.RepositoryException: javax.jcr.RepositoryException: failed to retrieve state of intermediary node
    at org.apache.jackrabbit.core.data.GarbageCollector.stopScan(GarbageCollector.java:240)
    at org.apache.jackrabbit.core.data.GarbageCollector.sweep(GarbageCollector.java:258)
Caused by: javax.jcr.RepositoryException: failed to retrieve state of intermediary node
    at org.apache.jackrabbit.core.CachingHierarchyManager.resolvePath(CachingHierarchyManager.java:156)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.resolvePath(HierarchyManagerImpl.java:365)
    at org.apache.jackrabbit.core.ItemManager.getItem(ItemManager.java:550)
    at org.apache.jackrabbit.core.session.SessionItemOperation$4.perform(SessionItemOperation.java:97)
    at org.apache.jackrabbit.core.session.SessionItemOperation$4.perform(SessionItemOperation.java:93)
    at org.apache.jackrabbit.core.session.SessionItemOperation.perform(SessionItemOperation.java:187)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:355)
    at org.apache.jackrabbit.core.SessionImpl.getItem(SessionImpl.java:743)
    at org.apache.jackrabbit.core.data.GarbageCollector$Listener.onEvent(GarbageCollector.java:421)
    at org.apache.jackrabbit.core.observation.EventConsumer.consumeEvents(EventConsumer.java:248)
    at org.apache.jackrabbit.core.observation.ObservationDispatcher.dispatchEvents(ObservationDispatcher.java:214)
    at org.apache.jackrabbit.core.observation.EventStateCollection.dispatch(EventStateCollection.java:475)
    at org.apache.jackrabbit.core.state.SharedItemStateManager$Update.end(SharedItemStateManager.java:798)
    at org.apache.jackrabbit.core.state.SharedItemStateManager.update(SharedItemStateManager.java:1498)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:398)
    at org.apache.jackrabbit.core.state.XAItemStateManager.update(XAItemStateManager.java:354)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:373)
    at org.apache.jackrabbit.core.state.SessionItemStateManager.update(SessionItemStateManager.java:274)
    at org.apache.jackrabbit.core.ItemSaveOperation.perform(ItemSaveOperation.java:258)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.ItemImpl.perform(ItemImpl.java:91)
    at org.apache.jackrabbit.core.ItemImpl.save(ItemImpl.java:329)
    at org.apache.jackrabbit.core.session.SessionSaveOperation.perform(SessionSaveOperation.java:42)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:355)
    at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:758)
Caused by: org.apache.jackrabbit.core.state.NoSuchItemStateException: 1b94274f-431c-4dcd-aac6-b238527fc276
    at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:282)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.getNodeState(LocalItemStateManager.java:109)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.getItemState(LocalItemStateManager.java:174)
    at org.apache.jackrabbit.core.state.SessionItemStateManager.getItemState(SessionItemStateManager.java:161)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.getItemState(HierarchyManagerImpl.java:152)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.resolvePath(HierarchyManagerImpl.java:115)
    at org.apache.jackrabbit.core.CachingHierarchyManager.resolvePath(CachingHierarchyManager.java:152)

GarbageCollector should ignore all NoSuchItemStateExceptions Key: JCR-3340 URL: https://issues.apache.org/jira/browse/JCR-3340 Project: Jackrabbit Content Repository Issue Type: Bug Components: jackrabbit-core Affects Versions: 2.5 Reporter: Mete Atamel Attachments: JCR-3340.patch Original Estimate: 24h Remaining Estimate: 24h When GarbageCollector goes through nodes, it can encounter NoSuchItemStateException or PathNotFoundException if a node has been deleted or moved in the meantime. GarbageCollector can safely ignore these exceptions. It tries to do so in some cases but not all. For example, Listener#onEvent method in GarbageCollector catches PathNotFoundException and it also catches the
[jira] [Created] (JCR-3341) GarbageCollector should fail fast if there is a problem
Thomas Mueller created JCR-3341: Summary: GarbageCollector should fail fast if there is a problem Key: JCR-3341 URL: https://issues.apache.org/jira/browse/JCR-3341 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-core Affects Versions: 2.5 Reporter: Thomas Mueller Priority: Minor The GarbageCollector installs an ObservationListener to ensure moved nodes are scanned as well. If there is an exception in the ObservationListener, this exception is captured (lastException), but only evaluated at the very end of the GC cycle, in Listener.stop() / GarbageCollector.stopScan() which is called as part of sweep(), before deleting unused items. This is quite late. For a large repository, scanning might take a few hours. If such an exception occurs, scanning should stop within a reasonable time (fail fast), and the exception should be thrown there. This is related to JCR-3340
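The fail-fast behaviour asked for in JCR-3341 amounts to checking the captured listener exception during the scan instead of only at the end. The sketch below shows one way to do that. It is not the attached patch: `FailFastScanner`, `recordListenerFailure` and `scan` are hypothetical names; only the idea of a captured lastException comes from the issue text.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical sketch of a scan that aborts soon after a listener failure. */
final class FailFastScanner {

    // Written by the listener thread, read by the scanning thread.
    private final AtomicReference<Exception> lastException = new AtomicReference<>();

    /** Called by the (hypothetical) observation listener when an event fails. */
    void recordListenerFailure(Exception e) {
        lastException.compareAndSet(null, e); // keep the first failure only
    }

    /**
     * Scans the given node ids and returns how many were scanned.
     * The captured failure is checked before every node, so the scan stops
     * after at most one more node instead of running for hours.
     */
    int scan(Iterable<String> nodeIds, Consumer<String> visitor) {
        int scanned = 0;
        for (String id : nodeIds) {
            Exception failure = lastException.get();
            if (failure != null) {
                throw new IllegalStateException(
                        "Scan aborted after " + scanned + " nodes", failure);
            }
            visitor.accept(id);
            scanned++;
        }
        return scanned;
    }
}
```

The end-of-cycle check in stopScan() would still be needed, since a failure can also arrive after the last node has been scanned.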
[jira] [Updated] (JCR-3341) GarbageCollector should fail fast if there is a problem
[ https://issues.apache.org/jira/browse/JCR-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated JCR-3341: Attachment: JCR-3341.patch
[jira] [Updated] (JCR-3341) GarbageCollector should fail fast if there is a problem
[ https://issues.apache.org/jira/browse/JCR-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated JCR-3341: Assignee: Thomas Mueller Status: Patch Available (was: Open) A patch for the 2.2 branch is attached.
[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project
[ https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294318#comment-13294318 ] Thomas Mueller commented on OAK-138: The data store that is currently in oak-mk (org.apache.jackrabbit.mk.blobs) can be re-used in other mk implementations. So I guess we should also create (at least) oak-mk-datastore or oak-mk-blob. Similarly, the jsop part shouldn't be part of oak-mk. We could move it to oak-commons or create a new project oak-commons-jsop. I guess the same goes for the cache implementation (oak-commons or oak-commons-cache). Move client/server package in oak-mk to separate project Key: OAK-138 URL: https://issues.apache.org/jira/browse/OAK-138 Project: Jackrabbit Oak Issue Type: Improvement Components: core, it, mk, run Affects Versions: 0.3 Reporter: Dominique Pfister Assignee: Dominique Pfister As a further cleanup step in OAK-13, I'd like to move the packages o.a.j.mk.client and o.a.j.mk.server and referenced classes in oak-mk to a separate project, e.g. oak-mk-remote. This new project will then be added as a dependency to: oak-core oak-run oak-it-mk
[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project
[ https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294428#comment-13294428 ] Thomas Mueller commented on OAK-138: OK, my personal opinion is still to use as few projects as possible, but if everybody else thinks we need to do it then I can live with that. We can later decide if we want to create additional projects for the data store implementation and the log wrapper.
[jira] [Created] (OAK-140) PropertyState: data type of empty array property
Thomas Mueller created OAK-140: Summary: PropertyState: data type of empty array property Key: OAK-140 URL: https://issues.apache.org/jira/browse/OAK-140 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Thomas Mueller Priority: Minor Currently, there seems to be no way to retrieve the data type of an org.apache.jackrabbit.oak.api.PropertyState for empty arrays.
[jira] [Commented] (JCR-3333) The binary file entities are stored twice in the DB
[ https://issues.apache.org/jira/browse/JCR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293408#comment-13293408 ] Thomas Mueller commented on JCR-: Hi, Well, I still don't know which version of Jackrabbit you are using. Anyway, I just saw the data store should be supported in Jackrabbit 1.6 (by the way, this is the Jackrabbit version, not the version of the JCR API, sorry I also got that wrong). As documented, you should be able to add the data store config to the repository.xml as follows: <DataStore class="org.apache.jackrabbit.core.data.FileDataStore"/> "could u let me know any big changes or release note of JCR about API update?" See the documentation. The binary file entities are stored twice in the DB Key: JCR- URL: https://issues.apache.org/jira/browse/JCR- Project: Jackrabbit Content Repository Issue Type: Bug Components: JCR 2.0 Environment: Windows 7, Linux Reporter: P.C.Sun Attachments: repository.xml We are using JCR in Liferay to store documents, which means all documents are stored in the DB as binary. Recently, we found the size of the DB is increasing very fast, so we ran SQL to get the size of the documents. The SQLs are like: 1. select sum(size_) from dlfileentry; (dlfileentry is the Liferay table that stores file metadata, such as name and size) - All documents size recorded in dlentry table: The result is: 43330765874, which means around 40.36 GB 2. The DB size report is: around 95.97 GB. 3. Within these tables, there are two very big tables: j_pm_liferay_binval - 52.07GB j_v_pm_binval - 43.65 GB So the question is: if the documents themselves are only around 40.36 GB, what are those two tables storing? Judging from the table names, they are both binval tables. Does it mean every document is stored twice or something? What's inside those tables? In this case, the DB grew by around 30 GB within 3 months, which is really fast. Any suggestion to improve this? As replied from Liferay: the table j_v_pm_binaval is to store the file version. However, it is also written for a new document, whereas we think it should be written only when a new version is generated. They also mentioned that to solve this we need to change repository.xml; however, we don't know how to deal with the old files, or whether they will get lost if we change the config file. Please let me know whether it is possible to clean them in the DB. Thank you very much and looking forward to your reply. Best Regards. P.C.(JACK) SUN
[jira] [Commented] (JCR-3333) The binary file entities are stored twice in the DB
[ https://issues.apache.org/jira/browse/JCR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293452#comment-13293452 ] Thomas Mueller commented on JCR-: Hi, I suggest you first read all the documentation about this (data store, persistence manager, migration). You should migrate your data, that is, create a new repository, migrate the data, and delete the old repository. Regards, Thomas
[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project
[ https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293499#comment-13293499 ] Thomas Mueller commented on OAK-138: The log wrapper is a somewhat similar implementation. What about oak-mk-common, and use it for both the log wrapper and the remoting? (possibly, if this will ever be needed, other implementations could be added, for example a virtual repository wrapper which currently isn't needed).