[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-04-25 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641579#comment-13641579
 ] 

Thomas Mueller commented on JCR-3534:
-

  Well, this message is an access token.
 The message data must not be a general access token.

I didn't say "general access token", but it's still an access token: it grants 
access to read a certain binary. Sure, we can argue now what an access token 
exactly is.




 Add JackrabbitSession.getValueByContentId method
 

 Key: JCR-3534
 URL: https://issues.apache.org/jira/browse/JCR-3534
 Project: Jackrabbit Content Repository
  Issue Type: New Feature
  Components: jackrabbit-api, jackrabbit-core
Affects Versions: 2.6
Reporter: Felix Meschberger
 Attachments: JCR-3534.patch


 We have a couple of use cases where we would like to leverage the global 
 data store to avoid sending and copying large binary data 
 unnecessarily: we have two separate Jackrabbit instances configured to use 
 the same DataStore (for the sake of this discussion assume we have the 
 problems of concurrent access and garbage collection under control). When 
 sending content from one instance to the other, we don't want to send 
 potentially large binary data (e.g. video files) if not needed.
 The idea is for the sender to just send the content identity from 
 JackrabbitValue.getContentIdentity(). The receiver would then check whether 
 such content already exists and would reuse it if so:
     String ci = contentIdentity_from_sender;
     try {
         Value v = session.getValueByContentIdentity(ci);
         Property p = targetNode.setProperty(propName, v);
     } catch (ItemNotFoundException ie) {
         // unknown or invalid content identity
     } catch (RepositoryException re) {
         // some other exception
     }
 Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method 
 would allow round-tripping the JackrabbitValue.getContentIdentity() value, 
 preventing superfluous copying and moving of binary data. 
 See also the dev@ thread 
 http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-04-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640355#comment-13640355
 ] 

Thomas Mueller commented on JCR-3534:
-

I was chatting with Tommaso Teofili about the basic data structures, features, 
and security protocols. There are still a few open questions regarding the API, 
but here is what we have so far:

DataIdentifier: The (unencrypted and unsigned) identifier of the binary, as 
already used by the Jackrabbit DataStore. Please note it could be a reference 
to a file, or, for small binaries, contain the data itself.

DataStoreSecret: a secret value that needs to be configured to be the same 
value in all repositories that share the same physical data store. It is used 
as the basis to encrypt and decrypt the DataIdentifier, and to sign and verify 
the signature. This could be a configuration parameter of the DataStore element 
in the repository.xml, but then we would probably need to change each DataStore 
implementation where we want to support the new feature. To avoid that, should 
we add a new element to the repository.xml? Not sure which is easiest.

BinaryReferenceMessage: The encrypted DataIdentifier, the random salt, and the 
expiry time, plus the signature of all of that. The encryption key for the 
DataIdentifier is the (SHA-1) hash of the random salt combined with the 
DataStoreSecret (this is to avoid re-using the same encryption key for all 
BinaryReferenceMessages). The random salt is per message. The expiry time is 
the maximum system time up to which the BinaryReferenceMessage is accepted 
(same as for time-limited S3 URLs), for example the system time the message 
was generated plus 2 hours or so. The signature is the HMAC of the rest of the 
message, with the DataStoreSecret as the key. To simplify development/support, 
the message should be readable, for example JSON or a URL. Example (shortened): 
{encryptedDataId:0123456789abcd, salt:1234, expiry:3456, signature:4567}. 
This will also allow changing the algorithms in the future. For now, we could 
use the following algorithms / formats: 128 bit DataStoreSecret and salt 
(generated with a SecureRandom); AES-256 encryption / AES-CTR mode; expiry: 
milliseconds since 1970 UTC; signature: HMAC-SHA-1.
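
The message format described above can be sketched roughly as follows. This is 
only an illustration, not the Jackrabbit implementation: the class and method 
names are hypothetical, the AES key is the SHA-1 hash of salt and secret 
truncated to 128 bits (an assumption, since SHA-1 yields only 160 bits), and 
the CTR counter starts at zero on the assumption that the per-message salt 
already makes every key unique. Serialization to JSON or a URL is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of a BinaryReferenceMessage; not the Jackrabbit implementation. */
final class BinaryReferenceMessage {
    final byte[] encryptedDataId;
    final byte[] salt;
    final long expiry;       // milliseconds since 1970 UTC
    final byte[] signature;  // HMAC-SHA-1 over expiry, salt and encryptedDataId

    BinaryReferenceMessage(byte[] encryptedDataId, byte[] salt, long expiry, byte[] signature) {
        this.encryptedDataId = encryptedDataId;
        this.salt = salt;
        this.expiry = expiry;
        this.signature = signature;
    }

    /** Encrypts and signs a DataIdentifier using a fresh random salt. */
    static BinaryReferenceMessage create(String dataId, byte[] secret, long expiry) {
        try {
            byte[] salt = new byte[16];
            new SecureRandom().nextBytes(salt);
            byte[] encrypted = crypt(Cipher.ENCRYPT_MODE,
                    dataId.getBytes(StandardCharsets.UTF_8), salt, secret);
            return new BinaryReferenceMessage(encrypted, salt, expiry,
                    sign(encrypted, salt, expiry, secret));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the DataIdentifier, or null if the signature is wrong or the message expired. */
    String open(byte[] secret, long now) {
        try {
            byte[] expected = sign(encryptedDataId, salt, expiry, secret);
            if (salt.length != 16 || !MessageDigest.isEqual(expected, signature) || now > expiry) {
                return null;
            }
            byte[] plain = crypt(Cipher.DECRYPT_MODE, encryptedDataId, salt, secret);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] crypt(int mode, byte[] data, byte[] salt, byte[] secret)
            throws GeneralSecurityException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(salt);
        // Assumption: SHA-1(salt + secret) truncated to 128 bits serves as the AES key.
        byte[] key = Arrays.copyOf(sha1.digest(secret), 16);
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        // Assumption: a zero counter block is acceptable because the key is per message.
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(new byte[16]));
        return cipher.doFinal(data);
    }

    private static byte[] sign(byte[] encrypted, byte[] salt, long expiry, byte[] secret)
            throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        // Fixed-length fields go first so the concatenation cannot be re-split.
        mac.update(ByteBuffer.allocate(8).putLong(expiry).array());
        mac.update(salt);
        mac.update(encrypted);
        return mac.doFinal();
    }
}
```

A receiver would call open() with its own copy of the secret and its current 
system time, and treat a null result as "no access".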




[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-04-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640369#comment-13640369
 ] 

Thomas Mueller commented on JCR-3534:
-

Well, we wanted to make it secure, right?

Expiry: this is to avoid replay attacks. It is the same mechanism as used for 
S3, see 
http://stackoverflow.com/questions/5419264/best-practice-amazons3-url-sharing

The identifier may be the data itself, so if it's not encrypted then the data 
would be included in the message. Without encryption, the message would no 
longer have the meaning of "you have access to this binary" but it would 
sometimes mean "this is the data".



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-04-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640461#comment-13640461
 ] 

Thomas Mueller commented on JCR-3534:
-

Jukka, I understand now what your attack scenario is. However, this is not the 
only scenario. See the comment above from Chetan Mehrotra: "So maybe we can 
have some service provided by DataStore which can provide such safe ids which 
can be passed around and still be secure."

Having expiry and encrypting the identifier would prevent further damage in 
case the BinaryReferenceMessage leaks. It basically allows using it within an 
email, embedding it in a web site, and so on, as described in 
http://stackoverflow.com/questions/5419264/best-practice-amazons3-url-sharing



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-04-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640533#comment-13640533
 ] 

Thomas Mueller commented on JCR-3534:
-

To avoid further delay, we could already start implementing what we seem to 
agree on (well, let's see :-). That is, we need a secret that is used as the 
key to sign the message and verify the signature. One solution is a shared 
secret, configured in the repository.xml, in a new tag. Or it could be 
configured within the data store, but then each data store where we want to 
support this feature would need to be changed. Are there any other (simpler) 
solutions? The simplest to implement would probably be a system property, but 
that feels wrong :-)
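
For illustration, the system property variant could look roughly like the 
sketch below. The property name "datastore.sharedSecret", the class name and 
the Base64 encoding of secret and signature are assumptions made up for this 
example. HMAC-SHA-1 is used because that is the signature algorithm proposed 
elsewhere in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper that signs and verifies messages with a shared secret. */
final class SharedSecretSigner {
    /** Hypothetical property name, chosen for this sketch only. */
    static final String SECRET_PROPERTY = "datastore.sharedSecret";

    private final byte[] secret;

    SharedSecretSigner(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Reads the Base64-encoded secret from the system property. */
    static SharedSecretSigner fromSystemProperty() {
        String encoded = System.getProperty(SECRET_PROPERTY);
        if (encoded == null) {
            throw new IllegalStateException("Missing system property " + SECRET_PROPERTY);
        }
        return new SharedSecretSigner(Base64.getDecoder().decode(encoded));
    }

    /** Returns the Base64-encoded HMAC-SHA-1 of the message. */
    String sign(String message) {
        return Base64.getEncoder().encodeToString(hmac(message));
    }

    /** Recomputes the HMAC and compares it with the given signature. */
    boolean verify(String message, String signature) {
        try {
            return MessageDigest.isEqual(hmac(message), Base64.getDecoder().decode(signature));
        } catch (IllegalArgumentException e) {
            return false; // signature is not valid Base64
        }
    }

    private byte[] hmac(String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Reading the secret from repository.xml instead would only change where the 
constructor argument comes from; signing and verification stay the same.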



[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore

2013-04-09 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626313#comment-13626313
 ] 

Thomas Mueller commented on JCR-3547:
-

The patch looks good, thanks!

Because it's quite a big change, I think somebody else should have a look as 
well, especially at the changes in the RepositoryContext and RepositoryImpl.

I also think it's a good idea to run the GC tests serially, but I wonder why 
they didn't fail before when running concurrently, if they actually accessed 
the same repository concurrently? Or don't they access the same repository? But 
then they should still be able to run concurrently, right?


 Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
 --

 Key: JCR-3547
 URL: https://issues.apache.org/jira/browse/JCR-3547
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-core
Affects Versions: 2.4, 2.5
Reporter: Shashank Gupta
 Attachments: GarbageCollector.java.patch, 
 GC_prevent_concurrent_run_app2.patch, GC_prevent_concurrnet_run_app1.patch


 In the mark phase, GC sets store.updateModifiedDateOnAccess to the current 
 time, so that the datastore updates a record’s lastModified timestamp upon 
 subsequent read/scan.
 But GC doesn't reset it to 0. So even after GC completes, the datastore will 
 continue updating the lastModified timestamp on read invocations, which has a 
 performance impact. 
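
 A minimal sketch of the reset described above, assuming a try/finally around 
 the whole collection is acceptable. The interface below is a hypothetical 
 stand-in for the one DataStore method involved, and the class name is made 
 up; the attached patches may solve this differently.

```java
/** Hypothetical stand-in for the single DataStore method used in this sketch. */
interface ModifiedDateTracking {
    void updateModifiedDateOnAccess(long before);
}

/** Hypothetical wrapper around one garbage collection run. */
final class GarbageCollectionRun {
    /** Enables timestamp updates for the run and always switches them off again. */
    static void collect(ModifiedDateTracking store, Runnable mark, Runnable sweep) {
        // From now on, reads update the lastModified timestamp of older records.
        store.updateModifiedDateOnAccess(System.currentTimeMillis());
        try {
            mark.run();
            sweep.run();
        } finally {
            // Reset to 0 so reads after the collection no longer touch timestamps,
            // even if the mark or sweep phase failed.
            store.updateModifiedDateOnAccess(0);
        }
    }
}
```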



[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore

2013-04-09 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626431#comment-13626431
 ] 

Thomas Mueller commented on JCR-3547:
-

Approach 2 is much better I think. If serializing the tests is not needed, then 
I think we shouldn't do it. That way there is less risk of introducing bugs or 
changing behavior.

Do you want to submit a new patch where the tests are not moved? I can do it 
myself if you want (it would just take a bit more time I guess).

 Should I go for RTC?

No, I think voting is not needed, just somebody else reviewing the (final, 
smaller) patch.



[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore

2013-04-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625221#comment-13625221
 ] 

Thomas Mueller commented on JCR-3547:
-

Sorry I didn't see this issue before.

Yes, the value should be reset to 0. 

There is one exception, and I'm not sure if that's a possible / common use 
case: it shouldn't be reset if another garbage collection is running. I wonder 
if we could detect this reliably. Maybe only reset if the current value matches 
the value in the GarbageCollector class?

 Datastore GC doesn't reset updateModifiedDateOnAccess on datastore
 --

 Key: JCR-3547
 URL: https://issues.apache.org/jira/browse/JCR-3547
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-core
Affects Versions: 2.4, 2.5
Reporter: Shashank Gupta
 Attachments: GarbageCollector.java.patch


 In mark phase, GC updates store.updateModifiedDateOnAccess with current time, 
 so that datastore updates record’s lastModified timestamp upon subsequent 
 read/scan.
  But  GC doesn't reset it to 0. So even after GC completes, datastore will 
 continue updating lastModified timestamp on read invocations and it will have 
 performance impact. 



[jira] [Commented] (JCR-3547) Datastore GC doesn't reset updateModifiedDateOnAccess on datastore

2013-04-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625281#comment-13625281
 ] 

Thomas Mueller commented on JCR-3547:
-

 imo repository should forbid user to run two simultaneous gc.

That's true. I guess it would be the best solution. I can't currently think of 
a reason to run two GCs concurrently.

 "Maybe only reset if the current value matches the value in the 
 GarbageCollector class?" can you explain it more? ds interface doesn't expose a 
 method to retrieve this value. 

We could add a getter method. But thinking about it again, it wouldn't be good 
enough. Just comparing the current value wouldn't be enough, as the second run 
(if we allow it) could be started at the exact same time, or started right 
after calling the getter and before resetting the value.

So I guess we should add code to disallow running GC concurrently. Do you want 
to do that, or should I?
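
One way to disallow concurrent runs could be an atomic flag, sketched below. 
The class and method names are hypothetical. Unlike reading the value with a 
getter and resetting it afterwards, compareAndSet checks and claims the flag 
in one step, so a second run cannot start in between. Note that this only 
guards runs within one JVM.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard that lets at most one garbage collection run at a time. */
final class GcGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Runs the task unless another run is in progress; returns false if it was rejected. */
    boolean runExclusively(Runnable gc) {
        // Atomically claim the flag; fails if another run already holds it.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            gc.run();
            return true;
        } finally {
            running.set(false);
        }
    }
}
```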



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-03-21 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13608971#comment-13608971
 ] 

Thomas Mueller commented on JCR-3534:
-

Sounds good. In addition, I think we should consider having a mechanism to 
expire the identifier, similar to Amazon S3. I found some information here: 
http://stackoverflow.com/questions/14414455/amazon-s3-generating-an-expiring-link-using-ruby-1-9-3

What do you think about the limited lifetime of an identifier? I think it might 
be overkill, but I'm not sure.



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-03-20 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607707#comment-13607707
 ] 

Thomas Mueller commented on JCR-3534:
-

(d) Content ids could expire after some time (for example one minute). One way 
to do that is to add the number of minutes since 1970 to the hash, then encrypt 
this using a datastore-wide secret key, and use this as the content identifier. 
The receiver repository would decrypt the identifier and check the time.
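
A rough sketch of option (d), under several assumptions that are not part of 
the proposal: the class and method names are hypothetical, AES-GCM is chosen 
as the cipher so that forged identifiers fail to decrypt, a random nonce is 
prepended to the ciphertext, and the result is Base64 URL encoded. The key 
must be 16, 24 or 32 bytes long.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of content identifiers that expire after some minutes. */
final class ExpiringContentId {
    private static final int NONCE_LENGTH = 12;

    /** Encrypts "minutes since 1970" plus the content hash into an opaque identifier. */
    static String issue(String hash, byte[] key, long nowMillis) {
        try {
            byte[] nonce = new byte[NONCE_LENGTH];
            new SecureRandom().nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(128, nonce));
            long minutes = nowMillis / 60_000L;
            byte[] encrypted = cipher.doFinal(
                    (minutes + ":" + hash).getBytes(StandardCharsets.UTF_8));
            // Identifier layout: nonce followed by ciphertext (including the GCM tag).
            byte[] out = Arrays.copyOf(nonce, NONCE_LENGTH + encrypted.length);
            System.arraycopy(encrypted, 0, out, NONCE_LENGTH, encrypted.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the hash, or null if the identifier is malformed, forged or too old. */
    static String resolve(String id, byte[] key, long nowMillis, long maxAgeMinutes) {
        try {
            byte[] in = Base64.getUrlDecoder().decode(id);
            if (in.length <= NONCE_LENGTH) {
                return null;
            }
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(128, in, 0, NONCE_LENGTH));
            String plain = new String(
                    cipher.doFinal(in, NONCE_LENGTH, in.length - NONCE_LENGTH),
                    StandardCharsets.UTF_8);
            int separator = plain.indexOf(':');
            long issuedMinutes = Long.parseLong(plain.substring(0, separator));
            long age = nowMillis / 60_000L - issuedMinutes;
            return age >= 0 && age <= maxAgeMinutes ? plain.substring(separator + 1) : null;
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return null; // wrong key, tampered identifier, or not valid Base64
        }
    }
}
```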



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-03-18 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604961#comment-13604961
 ] 

Thomas Mueller commented on JCR-3534:
-

The patch looks good to me.



[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-03-18 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605186#comment-13605186
 ] 

Thomas Mueller commented on JCR-3534:
-

 I would prefer the simpler return of null for the not-found case instead of 
 the ItemNotFoundException

I agree.

 leaky abstraction that may come back to haunt us for example if someone who 
 doesn't realize the security implications

I think what Felix described is a valid use case. If there are better ways to 
solve the problem, that would be great, but I also currently don't see other 
solutions that would work well.

 adjust the deployment configuration if you want to make those repositories 
 share data more intimately

Could you provide more details? How could you reference a binary stored in one 
repository in the other repository, if the repositories are not running in the 
same process?

 the implementation may well be something like hash(revision + path) that 
 can't be reversed for use in something like getValueByContentId().

This is an idea that is new to me, could you tell us more about it? I believe 
we can and should support getValueByContentId() in Oak in the same way as in 
Jackrabbit 2.x. I don't see a reason to use hash(revision + path).




[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method

2013-03-18 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605201#comment-13605201
 ] 

Thomas Mueller commented on JCR-3534:
-

I think the general problem here is: how do you avoid sending the binary if the 
binary is already there? The repositories don't necessarily need to share the 
data store. 

Of course there are security questions, but then all operations raise security 
questions (uploading a huge binary can fill the disk, so in theory we would need 
quotas; a rogue remote client might already be able to access binaries it is not 
allowed to read).
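
To make the question concrete, here is a minimal sketch of one way a receiver could skip the transfer when it already holds the binary. It assumes the content identity is the hex SHA-256 digest of the bytes; the ContentStoreSketch class and its methods are hypothetical and are not part of the Jackrabbit API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store keyed by a content digest (not a Jackrabbit class).
class ContentStoreSketch {
    private final Map<String, byte[]> byId = new HashMap<>();

    // Assumption: the content identity is the hex SHA-256 digest of the bytes.
    static String contentId(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Stores the bytes (once per identity) and returns their identity.
    String put(byte[] data) {
        String id = contentId(data);
        byId.putIfAbsent(id, data.clone());
        return id;
    }

    // The receiver's check: true means the sender need not transfer the bytes.
    boolean contains(String id) {
        return byId.containsKey(id);
    }

    public static void main(String[] args) {
        ContentStoreSketch receiver = new ContentStoreSketch();
        byte[] video = "large binary".getBytes(StandardCharsets.UTF_8);
        receiver.put(video);                 // the receiver already has the content
        String id = contentId(video);        // computed on the sender side
        System.out.println(receiver.contains(id) ? "skip transfer" : "send bytes");
    }
}
```

In this sketch, anyone who knows an identifier can learn whether the content exists, which is why the security questions above still apply.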


[jira] [Created] (JCR-3529) Property2Index: node type is ignored

2013-03-05 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3529:
---

 Summary: Property2Index: node type is ignored
 Key: JCR-3529
 URL: https://issues.apache.org/jira/browse/JCR-3529
 Project: Jackrabbit Content Repository
  Issue Type: Bug
Reporter: Thomas Mueller


The Property2Index filters by node type, so each index only contains data for a 
list of node types.

But the getCost and query methods ignore the node type.

Because of that, if there are multiple indexes for the same property, each 
index filtering on certain node types, then running a query might pick the 
wrong index.

The result is that the query doesn't return any data where it should, because 
the index doesn't return the right nodes.
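
A minimal sketch of the intended behavior, assuming a simplified model in which each index covers one property and a fixed set of node types. The TypedIndexSketch class, its cost values, and the index names are hypothetical; node type inheritance is ignored here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a property index restricted to some node types.
class TypedIndexSketch {
    final String name;
    final String property;
    final Set<String> nodeTypes;

    TypedIndexSketch(String name, String property, String... nodeTypes) {
        this.name = name;
        this.property = property;
        this.nodeTypes = new HashSet<>(Arrays.asList(nodeTypes));
    }

    // The cost is infinite unless the index covers both the property and the node type.
    double getCost(String queryProperty, String queryNodeType) {
        boolean usable = property.equals(queryProperty) && nodeTypes.contains(queryNodeType);
        return usable ? 1.0 : Double.POSITIVE_INFINITY;
    }

    // Picks the cheapest usable index, or returns null if none applies.
    static TypedIndexSketch select(List<TypedIndexSketch> indexes, String prop, String nodeType) {
        TypedIndexSketch best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (TypedIndexSketch index : indexes) {
            double cost = index.getCost(prop, nodeType);
            if (cost < bestCost) {
                best = index;
                bestCost = cost;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<TypedIndexSketch> indexes = Arrays.asList(
                new TypedIndexSketch("filesByTitle", "title", "nt:file"),
                new TypedIndexSketch("foldersByTitle", "title", "nt:folder"));
        // Both indexes cover "title", but only one of them holds nt:folder nodes.
        System.out.println(select(indexes, "title", "nt:folder").name);
    }
}
```

If getCost ignored the node type, both indexes would report the same cost and the first one would win, which is the wrong-index case described above.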



[jira] [Commented] (JCR-3509) Workspace maxIdleTime parameter not working

2013-01-31 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13567687#comment-13567687
 ] 

Thomas Mueller commented on JCR-3509:
-

I wonder, do you want to change the maxIdleTime in the workspace.xml after the 
repository and workspace were created? It seems the workspace.xml file is stored 
in the database in your case, so I guess you would have to change it in the 
database as well. But I'm not completely sure.

 Workspace maxIdleTime parameter not working
 ---

 Key: JCR-3509
 URL: https://issues.apache.org/jira/browse/JCR-3509
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: config
Affects Versions: 2.4.3
 Environment: JSF, SPRING
Reporter: Sarfaraaz ASLAM
 Attachments: derby.jackrabbit.repository.xml, JcrConfigurer.java


 I would like to set the maximum number of seconds that a workspace can remain 
 unused before it is automatically closed, using the maxIdleTime parameter, but 
 this does not seem to work. 



[jira] [Commented] (JCR-3493) OUTER JOIN tests expect incorrect results

2013-01-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561528#comment-13561528
 ] 

Thomas Mueller commented on JCR-3493:
-

The patch looks good! There is one typo in the issue number: <!--JCR-3493, 
JCR-2498--> should be <!--JCR-3493, JCR-3498-->


 OUTER JOIN tests expect incorrect results
 -

 Key: JCR-3493
 URL: https://issues.apache.org/jira/browse/JCR-3493
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-jcr-tests
Affects Versions: 2.5.2
Reporter: Randall Hauch
 Fix For: 2.5.3, 2.6, 2.7

 Attachments: jcr-3493-tests.patch


 Two of the OUTER JOIN tests appear to expect incorrect results:
 - 
 org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testRightOuterJoin1
 - 
 org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testLeftOuterJoin2
 Both tests are set up the same way: two nodes are created:
   /testroot/workarea/node1 {jcr:primaryType=nt:unstructured, prop1=yikqysrwur}
   /testroot/workarea/node1/node2 {jcr:primaryType=nt:unstructured, 
 prop1=yikqysrwur, prop2=yikqysrwur, jcr:mixinTypes=[mix:referenceable], 
 jcr:uuid=c9118bb2-922e-4612-acd7-7152105f5684}
 A single string is randomly generated and used for the values for prop1 and 
 prop2, and only the second node is made to be mix:referenceable. 
 The testRightOuterJoin1 test runs this query:
   SELECT * FROM [nt:unstructured] AS left 
   RIGHT OUTER JOIN [nt:unstructured] AS right 
   ON left.prop1 = right.prop2 
   WHERE ISDESCENDANTNODE(right,'/testroot/workarea')
 The left side of the join has at least two tuples (one for node1, one for 
 node2, and others for nodes which do not have a 'prop1' value), and the column 
 of interest is the prop1 column. Thus the left side tuples (or the parts we 
 care about for the join) look like:
   [ node1, yikqysrwur ]
   [ node2, yikqysrwur ]
   [ …, null ]
 The right side of the join has only two tuples (node1 and node2) because 
 of the ISDESCENDANTNODE criteria, and the only column of interest is the 
 prop2 column. Thus, the right side tuples (or the parts we care about for 
 the join) look like:
   [ node1, null ]
   [ node2, yikqysrwur ]
 When we perform a RIGHT OUTER JOIN, we have to **include all the tuples on 
 the right** even if they don't match a value on the left tuples. Thus, 
 node1 must be included in the results, and because it has a null value for 
 the prop2 column, it will not match any of the tuples on the left (since a null 
 value is not equal to another null value in the case of join criteria). So 
 the result set should contain these combinations of nodes:
   [ null, node1 ]
   [ node1, node2 ]
   [ node2, node2 ]
 However, the test expects the following result:
   [ node1, node2 ]
   [ node2, node2 ]
 This is incorrect to me, because it is missing the [node1, null] tuple that 
 was on the right side of the join.
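
 The reasoning above can be sketched as a small nested-loop join. This is only an illustration of the expected semantics, not the Jackrabbit query engine; the RightOuterJoinSketch class and the "other" placeholder row are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative nested-loop RIGHT OUTER JOIN over (nodeName, joinValue) pairs.
class RightOuterJoinSketch {

    // Each input row is {nodeName, value}; value may be null.
    // Each output row is {leftNodeName or null, rightNodeName}.
    static List<String[]> rightOuterJoin(List<String[]> left, List<String[]> right) {
        List<String[]> result = new ArrayList<>();
        for (String[] r : right) {
            boolean matched = false;
            for (String[] l : left) {
                // A null never equals anything in a join condition, not even another null.
                if (l[1] != null && r[1] != null && l[1].equals(r[1])) {
                    result.add(new String[] {l[0], r[0]});
                    matched = true;
                }
            }
            if (!matched) {
                // Keep the unmatched right-hand row, padded with null on the left.
                result.add(new String[] {null, r[0]});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> left = Arrays.asList(
                new String[] {"node1", "yikqysrwur"},   // prop1 of node1
                new String[] {"node2", "yikqysrwur"},   // prop1 of node2
                new String[] {"other", null});          // a node without prop1
        List<String[]> right = Arrays.asList(
                new String[] {"node1", null},           // node1 has no prop2
                new String[] {"node2", "yikqysrwur"});  // prop2 of node2
        for (String[] row : rightOuterJoin(left, right)) {
            System.out.println(Arrays.toString(row));
        }
        // prints:
        // [null, node1]
        // [node1, node2]
        // [node2, node2]
    }
}
```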



[jira] [Commented] (JCR-3493) OUTER JOIN tests expect incorrect results

2013-01-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561684#comment-13561684
 ] 

Thomas Mueller commented on JCR-3493:
-

Looks good to me, thanks!

 OUTER JOIN tests expect incorrect results
 -

 Key: JCR-3493
 URL: https://issues.apache.org/jira/browse/JCR-3493
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-jcr-tests
Affects Versions: 2.5.2
Reporter: Randall Hauch
 Fix For: 2.5.3, 2.6, 2.7

 Attachments: jcr-3493-tests-2.patch, jcr-3493-tests.patch




[jira] [Created] (JCR-3460) PropertyIndex uses TraversingCursor but should not

2012-11-22 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3460:
---

 Summary: PropertyIndex uses TraversingCursor but should not
 Key: JCR-3460
 URL: https://issues.apache.org/jira/browse/JCR-3460
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: query
Reporter: Thomas Mueller
Assignee: Thomas Mueller


The org.apache.jackrabbit.oak.plugins.index.property.PropertyIndex uses the 
traversing cursor (that traverses over the whole repository) when there is no 
index. This is not how the index mechanism is supposed to work: if there is no 
property index, then the cost function of the property index should return 
infinity or max value, so that the property index isn't used.

According to my test, the PropertyIndex never really falls back to traversing, 
so this might just be defensive programming. However, in this case it would 
be better if the code threw an exception; otherwise we risk not seeing 
the bug in the PropertyIndex cost method.
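
As a rough sketch of that behavior (not the actual Oak code), the following hypothetical StrictPropertyIndexSketch reports an infinite cost when it has no index for a property and throws if it is queried anyway, instead of silently traversing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a property index that refuses to fall back to traversal.
class StrictPropertyIndexSketch {
    // property name -> (value -> matching paths); a missing key means "no index".
    private final Map<String, Map<String, List<String>>> indexes = new HashMap<>();

    void addIndex(String property, Map<String, List<String>> entries) {
        indexes.put(property, entries);
    }

    // An infinite cost tells the query engine never to pick this index.
    double getCost(String property) {
        return indexes.containsKey(property) ? 1.0 : Double.POSITIVE_INFINITY;
    }

    List<String> query(String property, String value) {
        Map<String, List<String>> index = indexes.get(property);
        if (index == null) {
            // Fail loudly: reaching this point means the cost function was ignored or is wrong.
            throw new IllegalStateException("No index for property: " + property);
        }
        List<String> paths = index.get(value);
        return paths == null ? Collections.<String>emptyList() : paths;
    }

    public static void main(String[] args) {
        StrictPropertyIndexSketch index = new StrictPropertyIndexSketch();
        index.addIndex("title", Collections.singletonMap("hello", Arrays.asList("/a", "/b")));
        System.out.println(index.getCost("title"));
        System.out.println(index.getCost("author"));
        System.out.println(index.query("title", "hello"));
    }
}
```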



[jira] [Commented] (JCR-3460) PropertyIndex uses TraversingCursor but should not

2012-11-22 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502734#comment-13502734
 ] 

Thomas Mueller commented on JCR-3460:
-

Revision 1412513



[jira] [Commented] (JCR-3460) PropertyIndex uses TraversingCursor but should not

2012-11-22 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502736#comment-13502736
 ] 

Thomas Mueller commented on JCR-3460:
-

The NodeTypeIndex also currently uses the TraversingCursor



[jira] [Resolved] (JCR-3460) PropertyIndex uses TraversingCursor but should not

2012-11-22 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3460.
-

Resolution: Fixed

Revision 1412514



[jira] [Commented] (JCR-3461) SQL2 query returns no results because @ in path is ignored

2012-11-22 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502753#comment-13502753
 ] 

Thomas Mueller commented on JCR-3461:
-

Could you provide an example query please?

 SQL2 query returns no results because @ in path is ignored
 --

 Key: JCR-3461
 URL: https://issues.apache.org/jira/browse/JCR-3461
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-jcr-commons
Affects Versions: 2.5
Reporter: Joel Richard
Priority: Minor
  Labels: queryparser, sql2

 If you search for nodes under a given path with ISDESCENDANTNODE and the path 
 contains an @, no results are returned because the @ is removed from the path 
 and then the path cannot be found. 
 The @ gets lost in the 
 org.apache.jackrabbit.commons.query.sql2.Parser#readName method. The reason 
 is that the initialize method assigns the wrong type for @.
 Probably the problem exists for other special characters as well.



[jira] [Created] (JCR-3462) Documentation for the PropertyIndex

2012-11-22 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3462:
---

 Summary: Documentation for the PropertyIndex
 Key: JCR-3462
 URL: https://issues.apache.org/jira/browse/JCR-3462
 Project: Jackrabbit Content Repository
  Issue Type: Bug
Reporter: Thomas Mueller
Priority: Minor


This ticket is to improve the documentation of the PropertyIndex implementation.



[jira] [Commented] (JCR-3462) Documentation for the PropertyIndex

2012-11-22 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502848#comment-13502848
 ] 

Thomas Mueller commented on JCR-3462:
-

Initial documentation in revision 1412613



[jira] [Commented] (OAK-28) Query implementation

2012-08-31 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445785#comment-13445785
 ] 

Thomas Mueller commented on OAK-28:
---

Thanks Chetan! Fixed in revision 1379373.

 Query implementation
 

 Key: OAK-28
 URL: https://issues.apache.org/jira/browse/OAK-28
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, jcr
Reporter: Thomas Mueller
Assignee: Thomas Mueller
  Labels: query
 Attachments: OakToJcrQueryTreeConverter.java


 A query engine needs to be implemented. 
 A query parser in oak-core should be able to handle xpath, sql2 and 
 optionally other query languages. The jcr component must generate a valid 
 query in one of those languages from JQOM queries and pass that statement 
 along with value bindings, limit, offset, and name space mappings to the 
 oak-core. 
 We need to:
 * Define the oak-core API for handling queries. How do we handle name 
 space mappings, limit and offset?
 * Implement a query builder in the jcr component which takes care of 
 translating JQOM queries to statements in string form 
 * Implement a query parser in oak-core and decide on a versatile AST 
 representation which works with all query languages and which is extensible 
 to future query languages.
 * Implement the actual query execution engine which interprets the query AST



[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis

2012-08-31 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445804#comment-13445804
 ] 

Thomas Mueller commented on OAK-288:


 util class

That looks good to me.

 bypassing the CommitHook and going directly to the mk level 
 while ignoring the oak-core layer is poor separation of concerns.

You already wrote that, and I already wrote that I used the MicroKernel API because 
it is a stable API, while the oak-core API was not stable when I wrote the 
tests. Actually the oak-core API didn't exist yet. Now that the oak-core API is 
ready, it does make sense to use it.

 the current property index implementation doesn't play nice 
 with existing notification mechanisms (like the CommitHook).

Sorry, I don't understand: what do you mean by 'doesn't play nice'?

 the query tests pass if I update them to use the NodeStore, except the 
 'explain' ones.

Hm, they should work if the same indexes are available... could you post the 
result you get?



 QueryTests should use the NodeStore apis
 

 Key: OAK-288
 URL: https://issues.apache.org/jira/browse/OAK-288
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core
Reporter: Alex Parvulescu
 Attachments: OAK-288-jsop-util.patch


 Currently the existing oak query tests come in form of a script file [0] 
 that contains 
  - commit commands which will be executed directly against the mk.
  - select commands
  - expected results
 While this was good for fast prototyping, we should refactor the tests to use 
 proper unit tests.
 Arguments for refactoring:
  - overall java style unit tests, reduce the complexity of running this setup
  - proper reporting of unit test failures
  - executing the commit commands directly against the mk breaks the 
 {{CommitHook}} mechanism because the commits will pass unnoticed
  - proper separation of concerns - oak core should not directly reference the 
 mk, it should pass through existing apis like the {{NodeStore}}
 [0] 
 http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/test/resources/org/apache/jackrabbit/oak/query/sql2.txt?view=markup



[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis

2012-08-30 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444797#comment-13444797
 ] 

Thomas Mueller commented on OAK-288:


I'd like to keep the internal DSL style for those tests, as it has proven very 
useful to me (very easy to add tests, very easy to verify and change the 
expected results). 

I used the MicroKernel API because it is a stable API, while the oak-core API 
was not stable when I wrote the tests. Those tests are testing the parser and 
the internals of the query engine. The tests are useful even if the changes 
don't go through the CommitHook, as the tests are not really meant to test the 
CommitHook and index implementations. If we want the changes to go through the 
oak-core API instead of using the MicroKernel API directly, the MicroKernel API 
DSL could be replaced with an oak-core DSL. Or even simpler:

We could add a static node structure plus custom indexes (using the oak-core 
API) before executing the script. That way we could test index implementations 
that require the CommitHook to index data.
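
To illustrate the script style, here is a rough sketch of a runner for such a file. The format is an assumption made for the example (a query line followed by its expected result lines, with cases separated by blank lines) and is not the actual sql2.txt syntax; QueryScriptRunnerSketch and the engine function are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical runner for a script-style query test (the format is assumed).
class QueryScriptRunnerSketch {

    // Runs each case and returns one message per mismatch; an empty list means all passed.
    static List<String> run(List<String> script, Function<String, List<String>> engine) {
        List<String> failures = new ArrayList<>();
        int i = 0;
        while (i < script.size()) {
            String query = script.get(i++).trim();
            if (query.isEmpty() || query.startsWith("#")) {
                continue; // skip blank lines and comments between cases
            }
            // Every non-blank line after the query is an expected result row.
            List<String> expected = new ArrayList<>();
            while (i < script.size() && !script.get(i).trim().isEmpty()) {
                expected.add(script.get(i++).trim());
            }
            List<String> actual = engine.apply(query);
            if (!expected.equals(actual)) {
                failures.add(query + ": expected " + expected + " but got " + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> script = Arrays.asList(
                "# static node structure is assumed to be set up before the script runs",
                "select a",
                "/x",
                "/y",
                "",
                "select b",
                "/z");
        // Stand-in for the query engine: returns fixed rows per statement.
        Function<String, List<String>> engine = q ->
                q.equals("select a") ? Arrays.asList("/x", "/y") : Arrays.asList("/z");
        System.out.println(run(script, engine).isEmpty() ? "all passed" : "failures found");
    }
}
```

Adding a test case then means appending a few lines to the script, which is the property I would like to keep.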




[jira] [Commented] (OAK-288) QueryTests should use the NodeStore apis

2012-08-30 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444801#comment-13444801
 ] 

Thomas Mueller commented on OAK-288:


 proper reporting unit test failures

If there is a test failure, what I usually do is compare the expected result 
(in the src/test directory) with the result in the target directory.

 the test in sql2.txt are somewhat hard to debug individually

What I usually do to debug a test individually is move the test case to the top 
of the script, so it is run first.



[jira] [Resolved] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown

2012-08-28 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3406.
-

   Resolution: Fixed
Fix Version/s: 2.6

 Journal doUnlock sometimes not called on repository shutdown
 

 Key: JCR-3406
 URL: https://issues.apache.org/jira/browse/JCR-3406
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
 Fix For: 2.6


 When the repository is shut down, the method AbstractJournal.doUnlock(boolean 
 successful) is sometimes not called. The method Journal.close is called, but 
 when the journal implementation uses a reentrant lock it can't unlock because 
 close is called from a different thread.
 The reason for not calling doUnlock is that ClusterNode.stop() sets the 
 status to stopped, which causes all WorkspaceUpdateChannel methods to not 
 work, including updateCommitted and updateCancelled. Therefore, it is 
 possible that an operation is started but never completed nor cancelled.
 To solve the issue, I found that it is enough to let updateCommitted and 
 updateCancelled complete, so that operations that are in progress can 
 finish.
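
 The underlying constraint can be reproduced with plain java.util.concurrent, independent of Jackrabbit: a ReentrantLock can only be released by the thread that holds it. The ForeignUnlockSketch class below is made up for this demonstration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Demonstrates that ReentrantLock.unlock() fails on a thread that does not own the lock.
class ForeignUnlockSketch {

    // Locks on the calling thread, then attempts unlock() on a second thread.
    // Returns the exception the second thread saw, or null if unlock succeeded.
    static Throwable unlockFromOtherThread() {
        ReentrantLock lock = new ReentrantLock();
        AtomicReference<Throwable> seen = new AtomicReference<>();
        lock.lock();
        try {
            Thread closer = new Thread(() -> {
                try {
                    lock.unlock(); // this thread is not the owner
                } catch (IllegalMonitorStateException e) {
                    seen.set(e);
                }
            });
            closer.start();
            closer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock(); // the owning thread can still release it
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable t = unlockFromOtherThread();
        System.out.println(t == null ? "unlock succeeded" : "unlock failed: " + t.getClass().getSimpleName());
    }
}
```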



[jira] [Created] (OAK-264) MicroKernel.diff for depth limited, unspecified changes

2012-08-21 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-264:
--

 Summary: MicroKernel.diff for depth limited, unspecified changes
 Key: OAK-264
 URL: https://issues.apache.org/jira/browse/OAK-264
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: mk
Reporter: Thomas Mueller


Currently the MicroKernel API specifies for the method diff, if the depth 
parameter is used, that unspecified changes below a certain path can be 
returned as:

  ^ /some/path

I would prefer the slightly more verbose:

  ^ /some/path: {}

Reason: It is similar to how getNode() returns node names if the depth is limited: 
some:{path:{}}, and it makes parsing unambiguous: there is always a ':' 
after the path, whether a property was changed or a node was changed. Without 
the colon, the parser needs to look ahead to decide whether a node was changed 
or a property was changed (the token after the path could be the start of the 
next operation). And we could never ever support ':' as an operation because 
that would make parsing ambiguous.
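
To show what "no look-ahead" means here, a minimal parser sketch follows. It assumes a simplified stream in which every operation has the shape op "path": value, with the path in double quotes and without escape handling; that simplification and the DiffParserSketch class are assumptions for the example, not the MicroKernel implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for a simplified diff stream:
//   op "path" : value   (repeated), where op is one character and value is {} or a literal.
class DiffParserSketch {
    private final String text;
    private int pos;

    DiffParserSketch(String text) {
        this.text = text;
    }

    // Returns one {op, path, value} triple per operation.
    List<String[]> parse() {
        List<String[]> ops = new ArrayList<>();
        skipSpaces();
        while (pos < text.length()) {
            String op = String.valueOf(text.charAt(pos++));
            skipSpaces();
            String path = readQuoted();
            skipSpaces();
            expect(':'); // always present, so the parser never has to peek at the next token
            skipSpaces();
            ops.add(new String[] {op, path, readValue()});
            skipSpaces();
        }
        return ops;
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
    }

    private void expect(char c) {
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at offset " + pos);
        }
        pos++;
    }

    // Reads a double-quoted path; escape sequences are not handled in this sketch.
    private String readQuoted() {
        expect('"');
        int end = text.indexOf('"', pos);
        if (end < 0) {
            throw new IllegalArgumentException("Unterminated path at offset " + pos);
        }
        String s = text.substring(pos, end);
        pos = end + 1;
        return s;
    }

    // Reads either an empty object or a run of non-whitespace characters.
    private String readValue() {
        int start = pos;
        if (text.startsWith("{}", pos)) {
            pos += 2;
        } else {
            while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
        }
        if (pos == start) {
            throw new IllegalArgumentException("Missing value at offset " + pos);
        }
        return text.substring(start, pos);
    }

    public static void main(String[] args) {
        for (String[] op : new DiffParserSketch("^ \"/some/path\": {} ^ \"/a/prop\": 1").parse()) {
            System.out.println(op[0] + " " + op[1] + " = " + op[2]);
        }
    }
}
```

With the current format, the expect(':') step would have to become a peek that decides between a property change and the start of the next operation.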






[jira] [Commented] (OAK-264) MicroKernel.diff for depth limited, unspecified changes

2012-08-21 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438565#comment-13438565
 ] 

Thomas Mueller commented on OAK-264:


Here is the proposed patch. The project oak-it/mk doesn't need to be changed 
(there is a test, but it doesn't check for the end of the diff).

{code}
Index: src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java
===
--- src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java  (revision 1375443)
+++ src/main/java/org/apache/jackrabbit/mk/model/tree/DiffBuilder.java  (working copy)
@@ -144,7 +144,8 @@
 super.childNodeChanged(name, before, after);
 } else {
 buff.tag('^');
-buff.value(p);
+buff.key(p);
+buff.object().endObject();
 buff.newline();
 }
 ++levels;
Index: src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java
===
--- src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java (revision 1375021)
+++ src/main/java/org/apache/jackrabbit/mk/api/MicroKernel.java (working copy)
@@ -193,7 +193,7 @@
  * The {@code depth} limit applies to the subtree rooted at {@code path}.
  * It allows to limit the depth of the diff, i.e. only changes up to the
  * specified depth will be included in full detail. changes at paths exceeding
- * the specified depth limit will be reported as {@code ^/some/path},
+ * the specified depth limit will be reported as {@code ^ /some/path: {}},
  * indicating that there are unspecified changes below that path.
  * table border=1
  *   tr
{code}

 MicroKernel.diff for depth limited, unspecified changes
 ---

 Key: OAK-264
 URL: https://issues.apache.org/jira/browse/OAK-264
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mk
Reporter: Thomas Mueller
Priority: Minor





[jira] [Created] (OAK-260) Avoid the Turkish Locale Problem

2012-08-20 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-260:
--

 Summary: Avoid the Turkish Locale Problem
 Key: OAK-260
 URL: https://issues.apache.org/jira/browse/OAK-260
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller


We currently use String.toUpperCase() and String.toLowerCase() in some cases 
where it is not appropriate. When running with the Turkish locale, this will 
not work as expected. See also 

http://mattryall.net/blog/2009/02/the-infamous-turkish-locale-bug

The problematic methods are String.toUpperCase() and String.toLowerCase() 
when called without an explicit Locale. String.equalsIgnoreCase(..) isn't a 
problem.
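
A minimal sketch of one common way to avoid the problem: always pass an 
explicit Locale when converting case for identifiers and keywords. The helper 
class name is hypothetical, and the choice of Locale.ENGLISH is an assumption 
(Locale.ROOT would work as well); it is not necessarily what the Oak code 
does.

```java
import java.util.Locale;

/** Hypothetical helper: case conversion that ignores the default locale. */
final class LocaleSafeCase {

    private LocaleSafeCase() {
    }

    /** Upper-cases using fixed English rules, regardless of Locale.getDefault(). */
    static String upper(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    /** Lower-cases using fixed English rules, regardless of Locale.getDefault(). */
    static String lower(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish rules, 'I' lower-cases to the dotless U+0131.
        System.out.println("TITLE".toLowerCase(turkish));
        // With an explicit English locale the result is plain ASCII.
        System.out.println(lower("TITLE"));
    }
}
```

Under Turkish rules "TITLE" becomes "t\u0131tle" (dotless i) and "title" 
becomes "T\u0130TLE" (dotted capital I), which is why comparisons against 
ASCII keywords fail.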





[jira] [Commented] (OAK-260) Avoid the Turkish Locale Problem

2012-08-20 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437877#comment-13437877
 ] 

Thomas Mueller commented on OAK-260:


The main problems should be fixed in r1375026, r1375028, and r1375030. However, 
we should have a way to ensure no new such bugs are introduced. One option is 
to run the test case using the Turkish locale, another might be to configure 
Checkstyle to detect such problems.
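
The first option could be sketched as follows: a small helper that switches 
the JVM default locale to Turkish while a piece of test code runs and restores 
it afterwards. The class and method names are hypothetical; this is not part 
of the Oak test suite. Note that Locale.setDefault affects the whole JVM, so 
such tests should not run in parallel with locale-sensitive code.

```java
import java.util.Locale;
import java.util.function.Supplier;

/** Hypothetical test helper: runs code with Turkish as the default locale. */
final class TurkishLocaleRunner {

    private TurkishLocaleRunner() {
    }

    static <T> T callWithTurkishLocale(Supplier<T> body) {
        Locale previous = Locale.getDefault();
        // JVM-wide setting; restored in the finally block below.
        Locale.setDefault(Locale.forLanguageTag("tr-TR"));
        try {
            return body.get();
        } finally {
            Locale.setDefault(previous);
        }
    }
}
```

A test would wrap the code under test, for example 
callWithTurkishLocale(() -> "ID".toLowerCase()), and assert that the result 
is still the expected ASCII string; with the no-argument toLowerCase() that 
assertion fails, which exposes the bug.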





[jira] [Created] (OAK-262) Query: support pseudo properties like jcr:score() and rep:excerpt()

2012-08-20 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-262:
--

 Summary: Query: support pseudo properties like jcr:score() and 
rep:excerpt()
 Key: OAK-262
 URL: https://issues.apache.org/jira/browse/OAK-262
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller


The query engine currently only supports properties that are stored within a 
node. It doesn't currently support pseudo-properties that are provided by an 
index, for example jcr:score() and rep:excerpt(). 

To support such properties, I suggest changing the Cursor interface to return 
an IndexRow (a new class that can return such pseudo-properties as well as 
the path) instead of just the path.

This may also speed up queries that don't need to load the node itself (if 
access rights can be checked efficiently or don't need to be checked for a 
given query).
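
A rough sketch of what such an IndexRow could look like. Only the name 
IndexRow comes from the proposal; the method names, the use of String values, 
and the SimpleIndexRow class are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a row returned by an index cursor (method names are assumptions). */
interface IndexRow {

    /** The path of the matching node. */
    String getPath();

    /** A pseudo-property provided by the index, or null if not available. */
    String getValue(String columnName);
}

/** Hypothetical map-backed implementation. */
final class SimpleIndexRow implements IndexRow {

    private final String path;
    private final Map<String, String> pseudoProperties;

    SimpleIndexRow(String path, Map<String, String> pseudoProperties) {
        this.path = path;
        // Defensive copy so later changes to the caller's map are not visible.
        this.pseudoProperties = new HashMap<String, String>(pseudoProperties);
    }

    @Override
    public String getPath() {
        return path;
    }

    @Override
    public String getValue(String columnName) {
        return pseudoProperties.get(columnName);
    }
}
```

A query that only selects the path and jcr:score() could then be answered 
from the row alone, without loading the node.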





[jira] [Commented] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown

2012-08-15 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434890#comment-13434890
 ] 

Thomas Mueller commented on JCR-3406:
-

Please note the patch only applies to updateCancelled and updateCommitted. As 
far as I see, those methods are called after updateCreated / updatePrepared, 
and those two methods do test for status == STARTED (I didn't change that). So 
I don't see how this could affect startup, but I might be wrong of course.

 I'm just afraid we might create some race condition on startup. The code 
 seems to be designed to first trigger a sync() call before any other 
 operations are done which seems correct to me. 

Sorry, I don't understand: the patch I made doesn't affect sync() as far as I 
can see. It is only supposed to ensure that the journal is unlocked if it was 
locked.

It's quite hard to say whether the patch would break something on a purely 
theoretical basis (without test cases). I do have an upstream test case that 
shows the current behavior is problematic, and the patch fixes that, so I 
plan to commit my patch next week, unless somebody can come up with a 
better patch, or a test case that shows my patch is problematic.






[jira] [Commented] (OAK-245) Add import for org.h2 in oak-mk bundle

2012-08-15 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435156#comment-13435156
 ] 

Thomas Mueller commented on OAK-245:


 Class.forName("org.h2.Driver")
 This would constitute a bug in itself.

I don't consider this a bug. Let's say it doesn't work well with OSGi.

But I believe neither Class.forName nor org.h2.Driver.load() is required 
here, because the H2 connection pool is used anyway. I would simply remove the 
line.

 Add import for org.h2 in oak-mk bundle
 --

 Key: OAK-245
 URL: https://issues.apache.org/jira/browse/OAK-245
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: mk
Reporter: Chetan Mehrotra
  Labels: osgi
 Attachments: import-h2.patch, OAK-245-load-driver.patch


 The oak-mk bundle depends on the H2 database. It internally uses 
 Class.forName("org.h2.Driver") to load the H2 driver. Because of the 
 Class.forName call, Bnd is not able to add the org.h2 package to the 
 Import-Package list. So it should have an explicit entry in the 
 maven-bundle-plugin config as shown below
 {code:xml}
 <Import-Package>
   org.h2;resolution:=optional,
   *
 </Import-Package>
 {code}
 Without this, loading MicroKernelService would fail with a 
 ClassNotFoundException (CNFE).





[jira] [Created] (OAK-239) MicroKernel.getRevisionHistory: maxEntries behavior should be documented

2012-08-13 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-239:
--

 Summary: MicroKernel.getRevisionHistory: maxEntries behavior 
should be documented
 Key: OAK-239
 URL: https://issues.apache.org/jira/browse/OAK-239
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mk
Reporter: Thomas Mueller


The method MicroKernel.getRevisionHistory uses a parameter maxEntries to limit 
the number of returned entries. If the implementation has to limit the entries, 
it is not clear from the documentation which entries to return (the oldest 
entries, the newest entries, or any x entries).
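
To make the ambiguity concrete, here is a small sketch of two equally 
plausible readings of maxEntries. The class and method names are hypothetical, 
and the sketch assumes the history is held as a list ordered from oldest to 
newest; neither reading is claimed to be what an existing implementation does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing two possible meanings of a maxEntries limit. */
final class HistoryLimit {

    private HistoryLimit() {
    }

    /** Keeps the first (oldest) maxEntries entries of an oldest-first list. */
    static <T> List<T> oldest(List<T> entries, int maxEntries) {
        int n = Math.min(Math.max(maxEntries, 0), entries.size());
        return new ArrayList<T>(entries.subList(0, n));
    }

    /** Keeps the last (newest) maxEntries entries of an oldest-first list. */
    static <T> List<T> newest(List<T> entries, int maxEntries) {
        int n = Math.min(Math.max(maxEntries, 0), entries.size());
        return new ArrayList<T>(entries.subList(entries.size() - n, entries.size()));
    }
}
```

For a history r1, r2, r3, r4 and maxEntries = 2, the first reading returns 
r1, r2 and the second returns r3, r4, so callers cannot rely on either without 
documentation.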





[jira] [Updated] (OAK-239) MicroKernel.getRevisionHistory: maxEntries behavior should be documented

2012-08-13 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-239:
---

Priority: Minor  (was: Major)





[jira] [Created] (OAK-241) QueryEngine.executeQuery needs a session parameter

2012-08-13 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-241:
--

 Summary: QueryEngine.executeQuery needs a session parameter
 Key: OAK-241
 URL: https://issues.apache.org/jira/browse/OAK-241
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor


The method QueryEngine.executeQuery currently needs a ContentSession parameter, 
even though the instance was retrieved from the ContentSession using 
ContentSession.getQueryEngine(). This is a bit confusing.

To solve this, we could rename the QueryEngine interface to SessionQueryEngine, 
change QueryEngineImpl so it no longer implements any interface, and add a 
class SessionQueryEngineImpl that calls the QueryEngineImpl methods (1:1, 
except for executeQuery where it adds the session parameter).

An alternative would be to change the existing QueryEngineImpl so a new 
instance is created for each session. But I prefer not to do this, as 
conceptually there is only one query engine.
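
The first approach could look roughly like this. It is a simplified sketch, 
not the Oak code: SessionStub and SharedQueryEngine are hypothetical stand-ins 
for ContentSession and QueryEngineImpl, and the query result is reduced to a 
String to keep the example self-contained.

```java
/** Hypothetical stand-in for ContentSession; only carries a user id here. */
final class SessionStub {
    final String userId;

    SessionStub(String userId) {
        this.userId = userId;
    }
}

/** Stand-in for the single shared engine; it implements no interface. */
final class SharedQueryEngine {
    String executeQuery(String statement, SessionStub session) {
        return session.userId + " ran: " + statement;
    }
}

/** What callers obtain from their session: no session parameter needed. */
interface SessionQueryEngine {
    String executeQuery(String statement);
}

/** Thin per-session wrapper that supplies the session to the shared engine. */
final class SessionQueryEngineImpl implements SessionQueryEngine {

    private final SharedQueryEngine engine;
    private final SessionStub session;

    SessionQueryEngineImpl(SharedQueryEngine engine, SessionStub session) {
        this.engine = engine;
        this.session = session;
    }

    @Override
    public String executeQuery(String statement) {
        return engine.executeQuery(statement, session);
    }
}
```

Each session gets its own lightweight wrapper, while there is still only one 
engine instance behind all of them.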





[jira] [Created] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown

2012-08-08 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3406:
---

 Summary: Journal doUnlock sometimes not called on repository 
shutdown
 Key: JCR-3406
 URL: https://issues.apache.org/jira/browse/JCR-3406
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller


When the repository is shut down, the method AbstractJournal.doUnlock(boolean 
successful) is sometimes not called. The method Journal.close is called, but 
when the journal implementation uses a reentrant lock it can't unlock because 
close is called from a different thread.

The reason for not calling doUnlock is that ClusterNode.stop() sets the status 
to stopped, which causes all WorkspaceUpdateChannel methods to do nothing, 
including updateCommitted and updateCancelled. Therefore, it is possible that 
an operation is started but never completed or cancelled.

To solve the issue, I found that it is enough to let updateCommitted and 
updateCancelled complete, so that operations that are in progress can finish.
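
The underlying restriction is easy to reproduce in isolation. The following 
sketch (hypothetical class name, unrelated to the Jackrabbit classes) shows 
that a java.util.concurrent.locks.ReentrantLock can only be released by the 
thread that holds it, which is why close() called from a different thread 
cannot unlock the journal.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Demonstrates that only the owning thread can release a ReentrantLock. */
final class ReentrantUnlockDemo {

    private ReentrantUnlockDemo() {
    }

    /** Returns true if a thread other than the owner managed to unlock. */
    static boolean otherThreadCanUnlock() {
        final ReentrantLock lock = new ReentrantLock();
        final boolean[] unlocked = { false };
        lock.lock(); // the "update" thread takes the lock
        try {
            Thread closer = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        lock.unlock(); // the "close" thread tries to release it
                        unlocked[0] = true;
                    } catch (IllegalMonitorStateException expected) {
                        unlocked[0] = false;
                    }
                }
            });
            closer.start();
            closer.join(); // join makes the write to unlocked[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock(); // only the owner can do this
        }
        return unlocked[0];
    }
}
```

The second thread gets an IllegalMonitorStateException, so the lock stays 
held until the owning thread releases it.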





[jira] [Commented] (JCR-3406) Journal doUnlock sometimes not called on repository shutdown

2012-08-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431101#comment-13431101
 ] 

Thomas Mueller commented on JCR-3406:
-

Proposed patch:

{code}
Index: src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java
===
--- src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java   (revision 1370130)
+++ src/main/java/org/apache/jackrabbit/core/cluster/ClusterNode.java   (working copy)
@@ -665,10 +665,6 @@
  * {@inheritDoc}
  */
 public void updateCommitted(Update update, String path) {
-if (status != STARTED) {
-log.info("not started: update commit ignored.");
-return;
-}
 Record record = (Record) update.getAttribute(ATTRIBUTE_RECORD);
 if (record == null) {
 String msg = "No record prepared.";
@@ -705,10 +701,6 @@
  * {@inheritDoc}
  */
 public void updateCancelled(Update update) {
-if (status != STARTED) {
-log.info("not started: update cancel ignored.");
-return;
-}
 Record record = (Record) update.getAttribute(ATTRIBUTE_RECORD);
 if (record != null) {
 record.cancelUpdate();
{code}





[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak

2012-08-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430992#comment-13430992
 ] 

Thomas Mueller commented on OAK-225:


In revision 1370294 the query is converted to a SQL-2 query, so executing the 
query no longer fails. However, the conversion is not correct (as far as I 
know):

//element(*,mix:language)[fn:lower-case(@jcr:language)='en']
//element(*,sling:Message)[@sling:message]
/(@sling:key|@sling:message)

is currently converted to:

select [jcr:path], [jcr:score], [sling:key], [sling:message] 
from [sling:Message] 
where (lower([jcr:language]) = 'en') 
and ([sling:message] is not null)

I'm not sure if it's worth the effort to support such XPath queries.


 Sling I18N queries not supported by Oak
 ---

 Key: OAK-225
 URL: https://issues.apache.org/jira/browse/OAK-225
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core
Affects Versions: 0.3
Reporter: Jukka Zitting
Priority: Minor
  Labels: sling, xpath

 The Sling I18N component issues XPath queries like the following:
 {code:none}
 //element(*,mix:language)[fn:lower-case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
 {code}
 Such queries currently fail with the following exception:
 {code:none}
 javax.jcr.query.InvalidQueryException: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: (
     at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:115)
     at org.apache.jackrabbit.oak.jcr.query.QueryImpl.execute(QueryImpl.java:85)
     at org.apache.sling.jcr.resource.JcrResourceUtil.query(JcrResourceUtil.java:52)
     at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.queryResources(JcrResourceProvider.java:262)
     ... 54 more
 Caused by: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: (
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.getSyntaxError(XPathToSQL2Converter.java:704)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.read(XPathToSQL2Converter.java:410)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseExpression(XPathToSQL2Converter.java:336)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseCondition(XPathToSQL2Converter.java:279)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseAnd(XPathToSQL2Converter.java:252)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseConstraint(XPathToSQL2Converter.java:244)
     at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.convert(XPathToSQL2Converter.java:153)
     at org.apache.jackrabbit.oak.query.QueryEngineImpl.parseQuery(QueryEngineImpl.java:86)
     at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:99)
     at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:39)
     at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:110)
 {code}





[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak

2012-08-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431099#comment-13431099
 ] 

Thomas Mueller commented on OAK-225:


I tried to add the comment on OAK-225 yesterday, but Jira was not working 
properly. It seems even today it sometimes doesn't work, let's see.

The fix I made so far was to support fn:lower-case. This wasn't supported 
before (even for very simple queries).

The query is not converted correctly, I know, and this either needs to be 
fixed or the conversion has to fail (throw an exception). 

For XPath queries of this form, a join would be needed, I believe. I haven't 
looked into that so far, and I'm not sure how easy it is to support (all I 
know is that it's not trivial). I guess it's not just a question of whether 
we 'want' to support it (I want :-) but also of whether it's worth the effort, 
and of that I'm not yet convinced. It sounds like, as a short-term solution, 
it would be relatively easy to change the query in Sling, but in the long term 
I guess it would be better to support such queries.






[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak

2012-08-08 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431133#comment-13431133
 ] 

Thomas Mueller commented on OAK-225:


Let me rephrase it: I'm not convinced it's worth supporting such queries *right 
now* (in the near term). If there are many such queries, then yes; if it's the 
only one, I would rather postpone support until the end of the year, and make 
such queries throw an exception for now.





[jira] [Resolved] (OAK-209) BlobStore: use SHA-256 instead of SHA-1, and use two directory levels for FileBlobStore

2012-08-02 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved OAK-209.


Resolution: Fixed

Fixed in revision 1368520 and revision 1368542.

Some additional changes are included, as some of the tests had to be changed 
in order to use SHA-256. I also documented and slightly changed the internal 
BlobStore interface.

 BlobStore: use SHA-256 instead of SHA-1, and use two directory levels for 
 FileBlobStore
 ---

 Key: OAK-209
 URL: https://issues.apache.org/jira/browse/OAK-209
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: mk
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor

 Currently we use SHA-1 as the hash algorithm for the blob store (same as with 
 Jackrabbit 2.x). I think it makes sense if we use SHA-256 instead:
 Advantages:
 - SHA-1 is considered broken by some experts:
   http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
 - SHA-256 belongs to the SHA-2 family, which is recommended by NIST
   for new applications:
   http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html
 Disadvantages:
 - Longer file name
 - Longer content hash
 - Not compatible with Jackrabbit 2.x
 For the FileBlobStore, the current implementation uses only one directory 
 level while Jackrabbit 2.x uses 3 levels. I think we should use two levels 
 for Oak, to avoid too many files in the same directory.
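
A minimal sketch of the idea, using only the JDK. The class and method names 
are hypothetical, and the exact layout (two hex characters per directory 
level, full hash as the file name) is an assumption for illustration; the 
actual FileBlobStore layout may differ.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: content hash and a two-level directory layout. */
final class BlobPathSketch {

    private BlobPathSketch() {
    }

    /** Returns the SHA-256 digest of the data as 64 lower-case hex characters. */
    static String sha256Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Assumed layout: first two hex chars / next two hex chars / full hash. */
    static String relativePath(String hex) {
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/" + hex;
    }
}
```

With two hex characters per level, the files are spread over up to 
256 * 256 = 65536 directories.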





[jira] [Updated] (JCR-3369) Garbage collector improvements

2012-07-31 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated JCR-3369:


Fix Version/s: 2.4.3

 Garbage collector improvements
 --

 Key: JCR-3369
 URL: https://issues.apache.org/jira/browse/JCR-3369
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Reporter: Mete Atamel
Assignee: Thomas Mueller
 Fix For: 2.2.13, 2.4.3, 2.5.1

 Attachments: JCR-3369-2.2.patch, JCR-3369-2.4.patch, 
 JCR-3369-trunk.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 We identified a number of improvements to garbage collector related code to 
 make it more robust, specifically:
 1- As discussed in JCR-3340, when GC goes through nodes, it can encounter a 
 lot of ItemStateExceptions. Currently, the stack traces of these exceptions 
 are not logged, which makes debugging difficult. Instead, ItemStateExceptions 
 should at least be logged with a full stack trace every minute or so.
 2- As discussed in JCR-3341, GC does not fail fast if there is a problem and 
 it should.
 3- Session usage in the GC is problematic. The session in GC is used for 
 traversing the content and marking the binaries, but the listener in that 
 class uses the same session as well, when a node is added. GC should rather 
 use a separate session in onEvent() to avoid concurrent use.
 4- GC listens for NODE_ADDED event for moved nodes but instead it should 
 listen for NODE_MOVED.
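
 Item 1 above (a full stack trace roughly once per minute) could be sketched as a small throttle. This is only an illustration under assumptions: the class name, the method name, and the idea of passing in the current time are hypothetical and are not taken from the attached patches.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; the names are not from Jackrabbit. */
final class ThrottledStackTraceLog {
    private final long intervalNanos;
    private long lastFullLog;
    private boolean loggedOnce;

    ThrottledStackTraceLog(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    /** Returns true if the caller should log the full stack trace now. */
    synchronized boolean shouldLogFull(long nowNanos) {
        // Always allow the first full log, then at most one per interval.
        if (!loggedOnce || nowNanos - lastFullLog >= intervalNanos) {
            loggedOnce = true;
            lastFullLog = nowNanos;
            return true;
        }
        return false;
    }
}
```

 In a catch block, the caller would pass System.nanoTime() and hand the exception itself to the logger when the method returns true, and log only the message otherwise.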





[jira] [Commented] (JCR-3369) Garbage collector improvements

2012-07-31 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425573#comment-13425573
 ] 

Thomas Mueller commented on JCR-3369:
-

Merged into the 2.4 branch in revision 1367435

 Garbage collector improvements
 --

 Key: JCR-3369
 URL: https://issues.apache.org/jira/browse/JCR-3369
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Reporter: Mete Atamel
Assignee: Thomas Mueller
 Fix For: 2.2.13, 2.4.3, 2.5.1

 Attachments: JCR-3369-2.2.patch, JCR-3369-2.4.patch, 
 JCR-3369-trunk.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 We identified a number of improvements to garbage collector related code to 
 make it more robust, specifically:
 1- As discussed in JCR-3340, when GC goes through nodes, it can encounter a 
 lot of ItemStateExceptions. Currently, the stack traces of these exceptions 
 are not logged, which makes debugging difficult. Instead, an 
 ItemStateException should be logged with its full stack trace at least once 
 every minute or so.
 2- As discussed in JCR-3341, GC does not fail fast if there is a problem and 
 it should.
 3- Session usage in the GC is problematic. The session in GC is used for 
 traversing the content and marking the binaries, but the listener in that 
 class uses the same session as well, when a node is added. GC should rather 
 use a separate session in onEvent() to avoid concurrent use.
 4- GC listens for NODE_ADDED event for moved nodes but instead it should 
 listen for NODE_MOVED.





[jira] [Resolved] (JCR-3369) Garbage collector improvements

2012-07-31 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3369.
-

Resolution: Fixed

 Garbage collector improvements
 --

 Key: JCR-3369
 URL: https://issues.apache.org/jira/browse/JCR-3369
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Reporter: Mete Atamel
Assignee: Thomas Mueller
 Fix For: 2.2.13, 2.4.3, 2.5.1

 Attachments: JCR-3369-2.2.patch, JCR-3369-2.4.patch, 
 JCR-3369-trunk.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 We identified a number of improvements to garbage collector related code to 
 make it more robust, specifically:
 1- As discussed in JCR-3340, when GC goes through nodes, it can encounter a 
 lot of ItemStateExceptions. Currently, the stack traces of these exceptions 
 are not logged, which makes debugging difficult. Instead, an 
 ItemStateException should be logged with its full stack trace at least once 
 every minute or so.
 2- As discussed in JCR-3341, GC does not fail fast if there is a problem and 
 it should.
 3- Session usage in the GC is problematic. The session in GC is used for 
 traversing the content and marking the binaries, but the listener in that 
 class uses the same session as well, when a node is added. GC should rather 
 use a separate session in onEvent() to avoid concurrent use.
 4- GC listens for NODE_ADDED event for moved nodes but instead it should 
 listen for NODE_MOVED.





[jira] [Commented] (OAK-189) Swallowed exceptions

2012-07-25 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422310#comment-13422310
 ] 

Thomas Mueller commented on OAK-189:


Well, this is not about checked versus unchecked exceptions, but about catching 
an exception and then simply returning null, without logging the exception and 
without re-throwing a different exception. The exception is silently ignored, 
and the code behaves in a different way.
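
To make the difference concrete, here is a minimal sketch. All types in it (LookupException, Registry, PrefixLookup) are hypothetical stand-ins so that the example is self-contained; they are not the Oak classes or the JCR API, and rethrowing as IllegalStateException is just one possible choice.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-ins; none of these types are from Oak or the JCR API. */
final class PrefixLookup {
    private static final Logger LOG = Logger.getLogger(PrefixLookup.class.getName());

    /** Stand-in for a checked repository failure. */
    static class LookupException extends Exception {
        LookupException(String message) { super(message); }
    }

    /** Stand-in for the component that may fail. */
    interface Registry {
        String prefixFor(String uri) throws LookupException;
    }

    /** The criticized pattern: the failure disappears and null is returned. */
    static String swallowing(Registry registry, String uri) {
        try {
            return registry.prefixFor(uri);
        } catch (LookupException e) {
            return null;
        }
    }

    /** One alternative: log with the stack trace and rethrow with the cause attached. */
    static String reporting(Registry registry, String uri) {
        try {
            return registry.prefixFor(uri);
        } catch (LookupException e) {
            LOG.log(Level.WARNING, "Prefix lookup failed for " + uri, e);
            throw new IllegalStateException("Prefix lookup failed for " + uri, e);
        }
    }
}
```

With the first variant, a caller cannot tell "no such prefix" apart from "the lookup broke"; with the second, the root cause ends up in the log and in the exception chain.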

 Swallowed exceptions
 

 Key: OAK-189
 URL: https://issues.apache.org/jira/browse/OAK-189
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
Reporter: Thomas Mueller

 Exceptions should not be silently swallowed. This is currently done in 
 SessionDelegate$SessionNameMapper, methods getOakPrefix(), 
 getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch 
 RepositoryException, don't log by default (only when using debug level), and 
 don't log the exception stack trace or throw an exception.
 Catching a very wide band of exceptions (RepositoryException) and then simply 
 returning null is not an acceptable solution in my view.





[jira] [Resolved] (JCR-3396) Simplify the code when possible

2012-07-24 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3396.
-

Resolution: Fixed

 Simplify the code when possible
 ---

 Key: JCR-3396
 URL: https://issues.apache.org/jira/browse/JCR-3396
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Priority: Minor

 Sometimes it's possible to simplify the code, for example:
 - making methods static when possible, so a reader knows the method doesn't 
 change the state of an object
 - the else is unnecessary if the if block always returns
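
 Both points can be shown on one small method. The class below is a made-up example, not code from Jackrabbit.

```java
/** Hypothetical example class, not taken from Jackrabbit. */
final class SimplifyExample {

    // Before: an instance method that touches no instance state,
    // with an else branch after an if block that always returns.
    int signBefore(int value) {
        if (value < 0) {
            return -1;
        } else {
            return value == 0 ? 0 : 1;
        }
    }

    // After: static, so a reader knows no object state is involved,
    // and the redundant else is dropped.
    static int signAfter(int value) {
        if (value < 0) {
            return -1;
        }
        return value == 0 ? 0 : 1;
    }
}
```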





[jira] [Commented] (OAK-182) Support for invisible internal content

2012-07-24 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13421322#comment-13421322
 ] 

Thomas Mueller commented on OAK-182:


 I think the ValidatingEditor shouldn't ignore hidden content... the 
 MergingNodeStateDiff class probably shouldn't ignore hidden content.

For index content, I don't agree. I see no reason to validate or merge changes 
in the index. The only hidden content I am aware of is index content.

But the tests seem to work now, so I will not add the filtering to 
ValidatingEditor and MergingNodeStateDiff. I guess we will come back to this 
once it's a performance problem.

Also, I will change NodeStateUtils.isHidden() to support simple names only.
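
A rough sketch of what "simple names only" could mean, assuming hidden content is marked by a leading colon as proposed in this issue. The class below is hypothetical and is not the actual NodeStateUtils code; rejecting paths with an exception is also just an assumption.

```java
/** Hypothetical sketch; not the actual NodeStateUtils implementation. */
final class HiddenNames {

    /** Returns true if the simple (non-path) name marks internal content. */
    static boolean isHidden(String name) {
        // Paths are rejected here; only single name segments are supported.
        if (name.indexOf('/') >= 0) {
            throw new IllegalArgumentException("Not a simple name: " + name);
        }
        return name.startsWith(":");
    }
}
```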



 Support for invisible internal content
 

 Key: OAK-182
 URL: https://issues.apache.org/jira/browse/OAK-182
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, jcr
Reporter: Jukka Zitting
 Attachments: OAK-182-b.patch


 As discussed on the mailing list 
 (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a 
 naming pattern like {{:name}} for internal content that we don't want to 
 directly expose to JCR clients.
 JCR-related functionality like the namespace and node type validators and the 
 observation dispatcher (see also OAK-181) should know to ignore such content 
 and the JCR binding in oak-jcr should automatically filter out such internal 
 content. Such internal content should probably also not be indexed for search.





[jira] [Created] (JCR-3396) Simplify the code when possible

2012-07-23 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3396:
---

 Summary: Simplify the code when possible
 Key: JCR-3396
 URL: https://issues.apache.org/jira/browse/JCR-3396
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Priority: Minor


Sometimes it's possible to simplify the code, for example:

- making methods static when possible, so a reader knows the method doesn't 
change the state of an object

- the else is unnecessary if the if block always returns





[jira] [Created] (OAK-201) NamespaceRegistry is very slow

2012-07-23 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-201:
--

 Summary: NamespaceRegistry is very slow
 Key: OAK-201
 URL: https://issues.apache.org/jira/browse/OAK-201
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller


The NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example by 
NamePathMapperImpl.getOakName. 

These methods don't do any caching, which is a problem because they have to 
read the mappings from the repository each time. Even if they did cache, it 
wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates a 
new NamespaceRegistryImpl each time it is called. To allow caching of known 
mappings, the instance needs to be cached as well.
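
A minimal sketch of the two parts of that idea: cache the mappings, and keep returning the same caching instance. All names below are hypothetical and this is not the NamespaceRegistryImpl or WorkspaceImpl API. The repository read is modelled as a plain function, and invalidation on remapping is only hinted at.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch; not the NamespaceRegistryImpl API. */
final class CachingPrefixResolver {
    private final Function<String, String> repositoryLookup;
    private final Map<String, String> uriByPrefix = new ConcurrentHashMap<>();

    CachingPrefixResolver(Function<String, String> repositoryLookup) {
        this.repositoryLookup = repositoryLookup;
    }

    /** Reads from the repository only on the first request for a prefix. */
    String getURI(String prefix) {
        return uriByPrefix.computeIfAbsent(prefix, repositoryLookup);
    }

    /** Must be called when mappings change, otherwise stale entries are served. */
    void invalidate() {
        uriByPrefix.clear();
    }
}

/** Hypothetical owner that hands out the same resolver each time. */
final class WorkspaceSketch {
    private final CachingPrefixResolver resolver;

    WorkspaceSketch(Function<String, String> repositoryLookup) {
        this.resolver = new CachingPrefixResolver(repositoryLookup);
    }

    CachingPrefixResolver getResolver() {
        return resolver; // same instance, so its cache survives between calls
    }
}
```

If the owner created a new resolver on every call, the cache would be empty each time and nothing would be gained.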





[jira] [Created] (OAK-202) Simplify the code when possible

2012-07-23 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-202:
--

 Summary: Simplify the code when possible
 Key: OAK-202
 URL: https://issues.apache.org/jira/browse/OAK-202
 Project: Jackrabbit Oak
  Issue Type: Improvement
Reporter: Thomas Mueller
Priority: Minor


Sometimes it's possible to simplify the code, for example: 

- making methods static when possible, so a reader knows the method doesn't 
change the state of an object 

- the else is unnecessary if the if block always returns





[jira] [Commented] (OAK-201) NamespaceRegistry is very slow

2012-07-23 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13420578#comment-13420578
 ] 

Thomas Mueller commented on OAK-201:


Yes, that makes sense. 

It seems sometimes a one-way conversion is needed currently, for example in 
NodeImpl.setPrimaryType:

String jcrPrimaryType = 
sessionDelegate.getOakPathOrThrow(Property.JCR_PRIMARY_TYPE);

Property.JCR_PRIMARY_TYPE is the expanded form ({http://...}primaryType).

I guess we could create constants with the short form (jcr:primaryType), and 
then check if there are remappings. Or use a hardcoded list for known 
remappings (either just the prefixes or the complete names).


 NamespaceRegistry is very slow
 --

 Key: OAK-201
 URL: https://issues.apache.org/jira/browse/OAK-201
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller

 The NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example 
 by NamePathMapperImpl.getOakName. 
 These methods don't do any caching, which is a problem because they have to 
 read the mappings from the repository each time. Even if they did cache, it 
 wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates 
 a new NamespaceRegistryImpl each time it is called. To allow caching of known 
 mappings, the instance needs to be cached as well.





[jira] [Commented] (OAK-201) NamespaceRegistry is very slow

2012-07-23 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13420597#comment-13420597
 ] 

Thomas Mueller commented on OAK-201:


A stack trace:

SessionImpl.refresh(boolean) line: 156  
WorkspaceImpl$1.refresh() line: 153 
WorkspaceImpl$1(NamespaceRegistryImpl).getURI(String) line: 162 
SessionImpl(AbstractSession).getNamespaceURI(String) line: 132  
SessionDelegate$SessionNameMapper.getOakPrefix(String) line: 453
SessionDelegate$SessionNameMapper(AbstractNameMapper).getOakName(String) line: 61
NamePathMapperImpl.getOakName(String) line: 46  
NodeTypeManagerImpl.getNodeType(String) line: 83
NodeImpl.addNode(String, String) line: 217  

So for each addNode(String, String), currently there is a Session.refresh(true).

 NamespaceRegistry is very slow
 --

 Key: OAK-201
 URL: https://issues.apache.org/jira/browse/OAK-201
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller

 The NamespaceRegistryImpl.getURI and getPrefix are called a lot, for example 
 by NamePathMapperImpl.getOakName. 
 These methods don't do any caching, which is a problem because they have to 
 read the mappings from the repository each time. Even if they did cache, it 
 wouldn't help, because the method WorkspaceImpl.getNamespaceRegistry creates 
 a new NamespaceRegistryImpl each time it is called. To allow caching of known 
 mappings, the instance needs to be cached as well.





[jira] [Commented] (OAK-181) Observation / indexing: don't create events for index updates

2012-07-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414944#comment-13414944
 ] 

Thomas Mueller commented on OAK-181:


Revision 1361938: the index content, as well as the internal index data, is now 
stored in a child node. The name of that child node is currently :data, but 
that can be changed later if required. There is one such node per index, and 
one for the internal index data and temporary storage (used for move 
operations). The internal index data is currently just the revision id of the 
latest indexed revision.
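
As an illustration of how an observation component could recognize such index nodes, here is a small sketch. It assumes that internal content is identified by a path segment starting with a colon (as with the :data node above); the class and method names are hypothetical and this is not what revision 1361938 implements.

```java
/** Hypothetical sketch; not code from Oak. */
final class HiddenPaths {

    /** Returns true if any segment of the path starts with ':' (assumed marker). */
    static boolean containsHiddenSegment(String path) {
        for (String segment : path.split("/")) {
            if (segment.startsWith(":")) {
                return true;
            }
        }
        return false;
    }
}
```

An event generator could skip any change whose path makes this method return true, so that only content changes produce events.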

 the index should be part of the repository (e.g. as binary nt:files), 
 so you can easily back them up and copy over using the JCR API 
 (and package systems on top of it)
 IIUC, that is one of the major reasons to put indexes into the repository.

How visible the index data should be is a good question. I think we should 
leave it somewhat open for now, and decide once we have more experience. I 
think the main reasons to put the index data in the repository are:

- to simplify backup / storage / maintenance
- scalability (so the index can scale in the same way the repository can scale)
- reduce complexity associated with separate storage for indexes

But making the index accessible over the JCR API wasn't a goal so far (as far 
as I'm aware). What you describe are use cases I hadn't thought about so far. 
Within relational databases, I never heard about a use case to copy index data 
from one database to another. You generally just copy the data, and then let 
the database reindex it. If you want to copy the index data, then you do a full 
database backup.


 Observation / indexing: don't create events for index updates
 -

 Key: OAK-181
 URL: https://issues.apache.org/jira/browse/OAK-181
 Project: Jackrabbit Oak
  Issue Type: New Feature
Reporter: Thomas Mueller

 If index data is stored in the repository (for example under 
 jcr:system/oak:indexes), then each change in the content might result in one 
 or multiple changes in the affected indexes.
 Observation events should only be created for content changes, not for index 
 changes.





[jira] [Commented] (OAK-182) Support for invisible internal content

2012-07-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414964#comment-13414964
 ] 

Thomas Mueller commented on OAK-182:


When enabling the IndexWrapper in the ContentRepositoryImpl default 
constructor, the org.apache.jackrabbit.oak.jcr.RepositoryTest.observation test 
fails with the exception stack trace below. I have a patch that fixed the 
issue, but I'm not familiar with the code and wonder if that is really the best 
way to fix it:

Index: src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java
===
--- src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java  (revision 1361947)
+++ src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java  (working copy)
@@ -97,6 +97,10 @@
         Set<String> baseChildNodes = new HashSet<String>();
         for (ChildNodeEntry beforeCNE : base.getChildNodeEntries()) {
             String name = beforeCNE.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             NodeState beforeChild = beforeCNE.getNodeState();
             NodeState afterChild = getChildNode(name);
             if (afterChild == null) {
@@ -110,6 +114,10 @@
         }
         for (ChildNodeEntry afterChild : getChildNodeEntries()) {
             String name = afterChild.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             if (!baseChildNodes.contains(name)) {
                 diff.childNodeAdded(name, afterChild.getNodeState());
             }


Stack trace without the patch:

Exception in thread "Observation" java.lang.IllegalArgumentException: '/jcr:system/indexes/:data' is not a valid path. Prefix must not be empty
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl$1.error(NamePathMapperImpl.java:108)
    at org.apache.jackrabbit.oak.namepath.JcrPathParser.parse(JcrPathParser.java:151)
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.getJcrPath(NamePathMapperImpl.java:122)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.jcrPath(ChangeProcessor.java:104)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.propertyChanged(ChangeProcessor.java:117)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:87)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.core.RootImpl$1.getChanges(RootImpl.java:187)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor.run(ChangeProcessor.java:70)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)


 Support for invisible internal content
 

 Key: OAK-182
 URL: https://issues.apache.org/jira/browse/OAK-182
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, jcr
Reporter: Jukka Zitting

 As discussed on the mailing list 
 (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a 
 naming pattern like {{:name}} for internal content that we don't want to 
 directly expose to JCR clients.
 JCR-related functionality like the namespace and node type validators and the 
 observation dispatcher (see also OAK-181) should know to ignore such content 
 and the JCR binding in oak-jcr should automatically 

[jira] [Comment Edited] (OAK-182) Support for invisible internal content

2012-07-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414964#comment-13414964
 ] 

Thomas Mueller edited comment on OAK-182 at 7/16/12 9:54 AM:
-

When enabling the IndexWrapper in the ContentRepositoryImpl default 
constructor, the org.apache.jackrabbit.oak.jcr.RepositoryTest.observation test 
fails with the exception stack trace below. I have a patch that fixed the 
issue, but I'm not familiar with the code and wonder if that is really the best 
way to fix it:

{code}
Index: src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java
===
--- src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java  (revision 1361947)
+++ src/main/java/org/apache/jackrabbit/oak/spi/state/AbstractNodeState.java  (working copy)
@@ -97,6 +97,10 @@
         Set<String> baseChildNodes = new HashSet<String>();
         for (ChildNodeEntry beforeCNE : base.getChildNodeEntries()) {
             String name = beforeCNE.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             NodeState beforeChild = beforeCNE.getNodeState();
             NodeState afterChild = getChildNode(name);
             if (afterChild == null) {
@@ -110,6 +114,10 @@
         }
         for (ChildNodeEntry afterChild : getChildNodeEntries()) {
             String name = afterChild.getName();
+            if (name.startsWith(":")) {
+                // OAK-182: ignore invisible internal content
+                continue;
+            }
             if (!baseChildNodes.contains(name)) {
                 diff.childNodeAdded(name, afterChild.getNodeState());
             }
{code}

Stack trace without the patch:

{code}
Exception in thread "Observation" java.lang.IllegalArgumentException: '/jcr:system/indexes/:data' is not a valid path. Prefix must not be empty
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl$1.error(NamePathMapperImpl.java:108)
    at org.apache.jackrabbit.oak.namepath.JcrPathParser.parse(JcrPathParser.java:151)
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.getJcrPath(NamePathMapperImpl.java:122)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.jcrPath(ChangeProcessor.java:104)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.propertyChanged(ChangeProcessor.java:117)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:87)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor$EventGeneratingNodeStateDiff.childNodeChanged(ChangeProcessor.java:155)
    at org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:107)
    at org.apache.jackrabbit.oak.kernel.KernelNodeState.compareAgainstBaseState(KernelNodeState.java:214)
    at org.apache.jackrabbit.oak.core.RootImpl$1.getChanges(RootImpl.java:187)
    at org.apache.jackrabbit.oak.jcr.observation.ChangeProcessor.run(ChangeProcessor.java:70)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
{code}


[jira] [Created] (OAK-189) Swallowed exceptions

2012-07-16 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-189:
--

 Summary: Swallowed exceptions
 Key: OAK-189
 URL: https://issues.apache.org/jira/browse/OAK-189
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
Reporter: Thomas Mueller


Exceptions should not be silently swallowed. This is currently done in 
SessionDelegate$SessionNameMapper, methods getOakPrefix(), 
getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch 
RepositoryException, don't log by default (only when using debug level), and 
don't log the exception stack trace or throw an exception.

Catching a very wide band of exceptions (RepositoryException) and then simply 
returning null is not an acceptable solution in my view.





[jira] [Commented] (OAK-189) Swallowed exceptions

2012-07-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13415201#comment-13415201
 ] 

Thomas Mueller commented on OAK-189:


The reason why I think it's not acceptable is that the exception could be 
anything, for example out of disk space, or some internal error. Just silently 
returning null, without logging, makes it hard to find the root cause of the 
problem, because everything else might just look fine. Some code might accept 
null as a correct answer, so that the program just behaves somewhat differently.

 Swallowed exceptions
 

 Key: OAK-189
 URL: https://issues.apache.org/jira/browse/OAK-189
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
Reporter: Thomas Mueller

 Exceptions should not be silently swallowed. This is currently done in 
 SessionDelegate$SessionNameMapper, methods getOakPrefix(), 
 getOakPrefixFromURI(), and getJcrPrefix(). Those methods catch 
 RepositoryException, don't log by default (only when using debug level), and 
 don't log the exception stack trace or throw an exception.
 Catching a very wide band of exceptions (RepositoryException) and then simply 
 returning null is not an acceptable solution in my view.





[jira] [Commented] (OAK-182) Support for invisible internal content

2012-07-13 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413580#comment-13413580
 ] 

Thomas Mueller commented on OAK-182:


 Such internal content should probably also not be indexed for search.

That's certainly possible. Later on we can still allow indexing such 'hidden' 
properties once we have a use case.

 Support for invisible internal content
 

 Key: OAK-182
 URL: https://issues.apache.org/jira/browse/OAK-182
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, jcr
Reporter: Jukka Zitting

 As discussed on the mailing list 
 (http://markmail.org/message/kzt7csiz2bd5n3ww), it would be good to have a 
 naming pattern like {{:name}} for internal content that we don't want to 
 directly expose to JCR clients.
 JCR-related functionality like the namespace and node type validators and the 
 observation dispatcher (see also OAK-181) should know to ignore such content 
 and the JCR binding in oak-jcr should automatically filter out such internal 
 content. Such internal content should probably also not be indexed for search.





[jira] [Resolved] (OAK-179) Tests should not fail if there is a jcr:system node

2012-07-12 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved OAK-179.


Resolution: Fixed

Fixed in revision 1360591

 Tests should not fail if there is a jcr:system node
 ---

 Key: OAK-179
 URL: https://issues.apache.org/jira/browse/OAK-179
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor

 Some of the tests fail if there is a node /jcr:system. The tests should be 
 able to deal with such a node.





[jira] [Commented] (JCR-3385) DbClusterTest fails when port is already in use

2012-07-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412775#comment-13412775
 ] 

Thomas Mueller commented on JCR-3385:
-

The problem is that the ports are also hardcoded in the repository-h2.xml file.

The easiest solution is probably to use a non-clustered, embedded database 
stored in the parent directory of the cluster nodes.

 DbClusterTest fails when port is already in use
 ---

 Key: JCR-3385
 URL: https://issues.apache.org/jira/browse/JCR-3385
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: clustering, jackrabbit-core
Affects Versions: 2.5
Reporter: Jukka Zitting
Assignee: Thomas Mueller
Priority: Minor
 Attachments: 
 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch


 The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port 
 numbers 9001 and 9002 which make the tests fail whenever there already is some 
 process listening on those ports.
 It would be better if the classes automatically looked for unused ports for 
 the tests.
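
 For illustration, one common way to look for an unused port in Java is to 
 bind a ServerSocket to port 0 and ask which port the operating system 
 picked. This is only a sketch with a hypothetical class name, not the code 
 from either attached patch. Note the inherent race: another process could 
 take the port between the probe and the moment the test binds it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper for illustration only. */
final class FreePorts {

    private FreePorts() {
    }

    /** Asks the operating system for a currently unused TCP port. */
    static int findFreePort() {
        // Port 0 tells the OS to choose any free port.
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("free port: " + findFreePort());
    }
}
```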





[jira] [Updated] (JCR-3385) DbClusterTest fails when port is already in use

2012-07-12 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated JCR-3385:


Attachment: JCR-3385-embedded-shared-db.patch

Alternative patch using an embedded database - a bit less real world but 
results in simpler test cases

 DbClusterTest fails when port is already in use
 ---

 Key: JCR-3385
 URL: https://issues.apache.org/jira/browse/JCR-3385
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: clustering, jackrabbit-core
Affects Versions: 2.5
Reporter: Jukka Zitting
Assignee: Thomas Mueller
Priority: Minor
 Attachments: 
 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch, 
 JCR-3385-embedded-shared-db.patch


 The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port 
 numbers 9001 and 9002 which make the tests fail whenever there already is some 
 process listening on those ports.
 It would be better if the classes automatically looked for unused ports for 
 the tests.





[jira] [Resolved] (JCR-3385) DbClusterTest fails when port is already in use

2012-07-12 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3385.
-

Resolution: Fixed

I have committed my patch now - not because it's better than Jukka's patch, but 
because I hope the tests will be easier to understand and maintain in the 
future.

Revision 1360692

 DbClusterTest fails when port is already in use
 ---

 Key: JCR-3385
 URL: https://issues.apache.org/jira/browse/JCR-3385
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: clustering, jackrabbit-core
Affects Versions: 2.5
Reporter: Jukka Zitting
Assignee: Thomas Mueller
Priority: Minor
 Attachments: 
 0001-JCR-3385-DbClusterTest-fails-when-port-is-already-in.patch, 
 JCR-3385-embedded-shared-db.patch


 The DbClusterTest and DbClusterTestJCR3162 classes use hard-coded TCP port 
 numbers 9001 and 9002 which make the tests fail whenever there already is some 
 process listening on those ports.
 It would be better if the classes automatically looked for unused ports for 
 the tests.





[jira] [Commented] (OAK-178) Query: index definition documentation and tooling

2012-07-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412591#comment-13412591
 ] 

Thomas Mueller commented on OAK-178:


> and define an Oak namespace

Sure! How would I do that?

 Query: index definition documentation and tooling
 -

 Key: OAK-178
 URL: https://issues.apache.org/jira/browse/OAK-178
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller

 Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined; that 
 means data is only indexed if there is a matching index. Those indexes are 
 then automatically used for the appropriate queries. The current plan is to 
 define indexes as nodes within a repository. An index is created if an index 
 metadata node is created, and the index is removed if the index metadata node 
 is removed. The index content is automatically updated if the content changes 
 (either synchronously or asynchronously).
 The location and structure of the index metadata needs to be defined and 
 documented.
 Also, to simplify defining and managing indexes, it may make sense to write a 
 utility (helper class) for managing indexes. Internally, this utility uses 
 the regular JCR API and accesses the documented index metadata nodes.
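
 The lifecycle described above can be sketched with a small in-memory model. 
 This is not Oak code and does not use the JCR API: the class, its methods 
 and the "one index per property name" layout are assumptions made for the 
 example. It only shows that data is indexed when a matching definition 
 exists and that removing the definition removes the index.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model of user-defined property indexes. */
final class PropertyIndexModel {

    // property name -> (property value -> paths of nodes with that value)
    private final Map<String, Map<String, Set<String>>> indexes = new HashMap<>();

    /** Stands in for creating an index metadata node. */
    void createIndexDefinition(String propertyName) {
        indexes.putIfAbsent(propertyName, new HashMap<>());
    }

    /** Stands in for removing the index metadata node. */
    void removeIndexDefinition(String propertyName) {
        indexes.remove(propertyName);
    }

    /** Content change: only indexed if a matching definition exists. */
    void propertyAdded(String path, String propertyName, String value) {
        Map<String, Set<String>> index = indexes.get(propertyName);
        if (index != null) {
            index.computeIfAbsent(value, v -> new TreeSet<>()).add(path);
        }
    }

    /** Returns matching paths, or null if no index covers the property. */
    Set<String> lookup(String propertyName, String value) {
        Map<String, Set<String>> index = indexes.get(propertyName);
        if (index == null) {
            return null;
        }
        Set<String> paths = index.get(value);
        return paths == null ? Collections.<String>emptySet() : paths;
    }
}
```

 The model is simplified: it does not reindex existing content when a 
 definition is created, and it does not handle changed or removed values.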





[jira] [Commented] (OAK-178) Query: index definition documentation and tooling

2012-07-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412594#comment-13412594
 ] 

Thomas Mueller commented on OAK-178:


Like this?

#P oak-jcr
Index: src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd
===
--- src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd (revision 1360166)
+++ src/main/resources/org/apache/jackrabbit/oak/jcr/nodetype/builtin_nodetypes.cnd (working copy)
@@ -19,6 +19,7 @@
 <jcr='http://www.jcp.org/jcr/1.0'>
 <nt='http://www.jcp.org/jcr/nt/1.0'>
 <mix='http://www.jcp.org/jcr/mix/1.0'>
+<oak='http://jackrabbit.apache.org/oak/1.0'>
 
 //--
 // B A S E  T Y P E


 Query: index definition documentation and tooling
 -

 Key: OAK-178
 URL: https://issues.apache.org/jira/browse/OAK-178
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller

 Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined; that 
 means data is only indexed if there is a matching index. Those indexes are 
 then automatically used for the appropriate queries. The current plan is to 
 define indexes as nodes within a repository. An index is created if an index 
 metadata node is created, and the index is removed if the index metadata node 
 is removed. The index content is automatically updated if the content changes 
 (either synchronously or asynchronously).
 The location and structure of the index metadata needs to be defined and 
 documented.
 Also, to simplify defining and managing indexes, it may make sense to write a 
 utility (helper class) for managing indexes. Internally, this utility uses 
 the regular JCR API and accesses the documented index metadata nodes.





[jira] [Commented] (OAK-178) Query: index definition documentation and tooling

2012-07-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412609#comment-13412609
 ] 

Thomas Mueller commented on OAK-178:


I saw the following change is also needed:

#P oak-jcr
Index: src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java
===
--- src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java (revision 1360577)
+++ src/main/java/org/apache/jackrabbit/oak/jcr/nodetype/NodeTypeManagerDelegate.java (working copy)
@@ -47,6 +47,7 @@
 tmp.put("nt",  "http://www.jcp.org/jcr/nt/1.0");
 tmp.put("mix", "http://www.jcp.org/jcr/mix/1.0");
 tmp.put("xml", "http://www.w3.org/XML/1998/namespace");
+tmp.put("oak", "http://jackrabbit.apache.org/oak/1.0");
 nsdefaults = Collections.unmodifiableMap(tmp);
 }
 
With those two changes, the tests seem to work

 Query: index definition documentation and tooling
 -

 Key: OAK-178
 URL: https://issues.apache.org/jira/browse/OAK-178
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller

 Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined; that 
 means data is only indexed if there is a matching index. Those indexes are 
 then automatically used for the appropriate queries. The current plan is to 
 define indexes as nodes within a repository. An index is created if an index 
 metadata node is created, and the index is removed if the index metadata node 
 is removed. The index content is automatically updated if the content changes 
 (either synchronously or asynchronously).
 The location and structure of the index metadata needs to be defined and 
 documented.
 Also, to simplify defining and managing indexes, it may make sense to write a 
 utility (helper class) for managing indexes. Internally, this utility uses 
 the regular JCR API and accesses the documented index metadata nodes.





[jira] [Created] (OAK-179) Tests should not fail if there is a jcr:system node

2012-07-12 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-179:
--

 Summary: Tests should not fail if there is a jcr:system node
 Key: OAK-179
 URL: https://issues.apache.org/jira/browse/OAK-179
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor


Some of the tests fail if there is a node /jcr:system. The tests should be able 
to deal with such a node.





[jira] [Created] (OAK-180) More real world benchmarks

2012-07-12 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-180:
--

 Summary: More real world benchmarks
 Key: OAK-180
 URL: https://issues.apache.org/jira/browse/OAK-180
 Project: Jackrabbit Oak
  Issue Type: New Feature
Reporter: Thomas Mueller


While the tests in oak-bench are good, they are not very close to real world 
scenarios. Specifically, we need tests with more nodes (for example 15 million 
nodes), and a more complex node structure, and more complex operations (read 
operations, write operations, fulltext index, queries, and access rights). It 
doesn't need to be very complex, but at least closer to the reality. 

I'm thinking about something like what TPC-C is for databases, but with content 
management operations instead of order-entry. But that's a longer term goal.

The goal of the test is to detect problem areas in our implementation (so this 
isn't just about scalability).





[jira] [Created] (OAK-181) Observation / indexing: don't create events for index updates

2012-07-12 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-181:
--

 Summary: Observation / indexing: don't create events for index 
updates
 Key: OAK-181
 URL: https://issues.apache.org/jira/browse/OAK-181
 Project: Jackrabbit Oak
  Issue Type: New Feature
Reporter: Thomas Mueller


If index data is stored in the repository (for example under 
jcr:system/oak:indexes), then each change in the content might result in one or 
multiple changes in the affected indexes.

Observation events should only be created for content changes, not for index 
changes.
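
A minimal sketch of such a filter, assuming index data lives under 
/jcr:system/oak:indexes as in the example above. The class and method names 
are hypothetical and this is not the Oak observation code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical filter for illustration only. */
final class IndexEventFilter {

    // Assumption: all index data is stored below this path.
    static final String INDEX_ROOT = "/jcr:system/oak:indexes";

    private IndexEventFilter() {
    }

    /** True if the path is the index root or lies below it. */
    static boolean isIndexPath(String path) {
        return path.equals(INDEX_ROOT) || path.startsWith(INDEX_ROOT + "/");
    }

    /** Keeps only the changed paths for which events should be created. */
    static List<String> contentChanges(List<String> changedPaths) {
        List<String> result = new ArrayList<>();
        for (String path : changedPaths) {
            if (!isIndexPath(path)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(contentChanges(Arrays.asList(
                "/content/page", "/jcr:system/oak:indexes/title/x")));
    }
}
```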





[jira] [Created] (OAK-178) Query: index definition documentation and tooling

2012-07-11 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-178:
--

 Summary: Query: index definition documentation and tooling
 Key: OAK-178
 URL: https://issues.apache.org/jira/browse/OAK-178
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller


Unlike Jackrabbit 2.x, indexes in the Oak query engine are user defined; that 
means data is only indexed if there is a matching index. Those indexes are then 
automatically used for the appropriate queries. The current plan is to define 
indexes as nodes within a repository. An index is created if an index metadata 
node is created, and the index is removed if the index metadata node is 
removed. The index content is automatically updated if the content changes 
(either synchronously or asynchronously).

The location and structure of the index metadata needs to be defined and 
documented.

Also, to simplify defining and managing indexes, it may make sense to write a 
utility (helper class) for managing indexes. Internally, this utility uses the 
regular JCR API and accesses the documented index metadata nodes.






[jira] [Commented] (OAK-169) Support orderable nodes

2012-07-10 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13410207#comment-13410207
 ] 

Thomas Mueller commented on OAK-169:


Again about linked list: Jukka has a point that a regular linked list would be 
quite slow, because to display the child node names in order you would need to 
load all child nodes.

An alternative would be to use a grouped linked list. The parent node would 
keep the child node names of the first 100 (or whatever number) child nodes in 
a multi-value property. The last of those 100 nodes (if there are that many) 
would contain another multi-value property of the next 100 child node names, 
and so on. This is a special case of a skip list, as a regular linked list is, 
as a normal multi-value property with all (ordered) child node names is. If a 
node in the middle is removed, just one group list would shrink, lets say from 
100 to 99. If two groups combined have less than 100 elements, those two groups 
could be merged.
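
To make the idea concrete, here is a rough in-memory sketch. It is not Oak 
code: the class name is hypothetical, and the groups are kept in a plain list 
instead of being stored as multi-value properties on the parent and on the 
last node of each group, so moving the "next group" pointer when that last 
node is removed is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a grouped linked list of child node names. */
final class GroupedChildOrder {

    private final int groupSize;
    private final List<List<String>> groups = new ArrayList<>();

    GroupedChildOrder(int groupSize) {
        if (groupSize < 1) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        this.groupSize = groupSize;
    }

    /** Appends a child name; only the last group is touched. */
    void append(String name) {
        if (groups.isEmpty() || groups.get(groups.size() - 1).size() >= groupSize) {
            groups.add(new ArrayList<>());
        }
        groups.get(groups.size() - 1).add(name);
    }

    /** Removes a child name; only one group shrinks, then a merge is tried. */
    boolean remove(String name) {
        for (int i = 0; i < groups.size(); i++) {
            List<String> group = groups.get(i);
            if (group.remove(name)) {
                if (group.isEmpty()) {
                    groups.remove(i);
                } else {
                    mergeWithNeighbour(i);
                }
                return true;
            }
        }
        return false;
    }

    /** Merges two adjacent groups if together they hold fewer than groupSize names. */
    private void mergeWithNeighbour(int i) {
        if (i + 1 < groups.size()
                && groups.get(i).size() + groups.get(i + 1).size() < groupSize) {
            groups.get(i).addAll(groups.remove(i + 1));
        } else if (i > 0
                && groups.get(i - 1).size() + groups.get(i).size() < groupSize) {
            groups.get(i - 1).addAll(groups.remove(i));
        }
    }

    /** All child names in order, read group by group. */
    List<String> names() {
        List<String> all = new ArrayList<>();
        for (List<String> group : groups) {
            all.addAll(group);
        }
        return all;
    }

    int groupCount() {
        return groups.size();
    }
}
```

With a group size of 100, listing the first 100 child names only needs the 
first group, which is the point of the proposal.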

 Support orderable nodes
 ---

 Key: OAK-169
 URL: https://issues.apache.org/jira/browse/OAK-169
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: jcr
Reporter: Jukka Zitting

 There are JCR clients that depend on the ability to explicitly specify the 
 order of child nodes. That functionality is not included in the MicroKernel 
 tree model, so we need to implement it either in oak-core or oak-jcr using 
 something like an extra (hidden) {{oak:childOrder}} property that records the 
 specified ordering of child nodes. A multi-valued string property is probably 
 good enough for this.





[jira] [Commented] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result

2012-07-09 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409317#comment-13409317
 ] 

Thomas Mueller commented on JCR-3376:
-

Thanks Jukka! I didn't consider the possibility of such a limitation and so 
didn't run the Jackrabbit test. I will try to find an alternative solution.

 TCK: SQLPathTest.testChildAxisRoot expected root node not in result
 ---

 Key: JCR-3376
 URL: https://issues.apache.org/jira/browse/JCR-3376
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Fix For: 2.6


 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:
 SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE 
 '/%/%'
 It expected the result to be
 /jcr:system, /testroot, /testdata
 It does not allow the implementation to return the root node ('/'). According 
 to the specification, a JCR implementation may filter the root node, as noted 
 by Randall Hauch - 
 http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
  - quote:
 
 Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
 characters as generally used within LIKE predicates (and jcr:like in XPath):
   As in SQL, the character '%' represents any string of zero or more 
   characters, and the character '_' (underscore) represents any 
   single character.
 while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics 
 jcr:path pseudo column and narrows the semantics of using LIKE with 
 jcr:path in the second-to-last bullet point:
   Predicates in the WHERE clause that test jcr:path are only required to 
   support the operators =, <> and LIKE. In the case of LIKE predicates, 
   support is only required for tests using the % wildcard character as a 
   match for a whole path segment (the part between two / characters) 
   or within index brackets
 Because the '%' matches only a whole path segment, the /% literal only 
 matches paths that have at least one path segment, which means that it 
 matches all descendants of the root node.
 
 The specification says "In the case of LIKE predicates, support is only 
 required for tests using the % wildcard character as a match for a whole path 
 segment (the part between two / characters)..." but it does not require an 
 implementation to restrict % in this way.
 To allow an implementation to return the root node, I suggest to change the 
 test as follows:
 ... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'
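
 The two readings can be compared with a small sketch that translates LIKE 
 patterns to regular expressions. The class is hypothetical and is neither 
 TCK nor Jackrabbit code. The "narrow" variant is only an assumed model of 
 the stricter reading, in which % has to match at least one character.

```java
import java.util.regex.Pattern;

/** Hypothetical comparison of two readings of LIKE on jcr:path. */
final class PathLike {

    private PathLike() {
    }

    /** General SQL reading: % is zero or more characters, _ is one character. */
    static boolean likeGeneral(String path, String pattern) {
        return Pattern.matches(toRegex(pattern, ".*"), path);
    }

    /** Assumed narrow reading: % must match at least one character. */
    static boolean likeNarrow(String path, String pattern) {
        return Pattern.matches(toRegex(pattern, ".+"), path);
    }

    private static String toRegex(String pattern, String percent) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(percent);
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    /** jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%' */
    static boolean matchesTestQuery(String path, boolean narrow) {
        boolean first = narrow ? likeNarrow(path, "/%") : likeGeneral(path, "/%");
        boolean second = narrow ? likeNarrow(path, "/%/%") : likeGeneral(path, "/%/%");
        return first && !second;
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/", "/testroot", "/testroot/a"}) {
            System.out.println(path
                    + " general=" + matchesTestQuery(path, false)
                    + " narrow=" + matchesTestQuery(path, true));
        }
    }
}
```

 Under the general reading the root node '/' satisfies the query, under the 
 narrow one it does not. Both readings agree on /testroot and /testroot/a.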





[jira] [Updated] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expects root node not in result

2012-07-09 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated JCR-3376:


Description: 
The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:

SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%'

It expects the result to be

/jcr:system, /testroot, /testdata

It does not allow the implementation to return the root node ('/'). According 
to the specification, a JCR implementation may filter the root node, as noted 
by Randall Hauch - 
http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
 - quote:


Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
characters as generally used within LIKE predicates (and jcr:like in XPath):

As in SQL, the character '%' represents any string of zero or more 
characters, and the character '_' (underscore) represents any 
single character.

while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics 
jcr:path pseudo column and narrows the semantics of using LIKE with 
jcr:path in the second-to-last bullet point:

Predicates in the WHERE clause that test jcr:path are only required to 
support the operators =, <> and LIKE. In the case of LIKE predicates, 
support is only required for tests using the % wildcard character as a 
match for a whole path segment (the part between two / characters) 
or within index brackets

Because the '%' matches only a whole path segment, the /% literal only 
matches paths that have at least one path segment, which means that it matches 
all descendants of the root node.


The specification says "In the case of LIKE predicates, support is only 
required for tests using the % wildcard character as a match for a whole path 
segment (the part between two / characters)..." but it does not require an 
implementation to restrict % in this way.




  was:
The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:

SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%'

It expected the result to be

/jcr:system, /testroot, /testdata

It does not allow the implementation to return the root node ('/'). According 
to the specification, a JCR implementation may filter the root node, as noted 
by Randall Hauch - 
http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
 - quote:


Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
characters as generally used within LIKE predicates (and jcr:like in XPath):

As in SQL, the character '%' represents any string of zero or more 
characters, and the character '_' (underscore) represents any 
single character.

while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics 
jcr:path pseudo column and narrows the semantics of using LIKE with 
jcr:path in the second-to-last bullet point:

Predicates in the WHERE clause that test jcr:path are only required to 
support the operators =, <> and LIKE. In the case of LIKE predicates, 
support is only required for tests using the % wildcard character as a 
match for a whole path segment (the part between two / characters) 
or within index brackets

Because the '%' matches only a whole path segment, the /% literal only 
matches paths that have at least one path segment, which means that it matches 
all descendants of the root node.


the specification says In the case of LIKE predicates, support is only 
required for tests using the % wildcard character as a match for a whole path 
segment (the part between two / characters)... but it doesn't specify it needs 
to do so.

To allow an implementation to return the root node, I suggest to change the 
test as follows:

... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'



Summary: TCK: SQLPathTest.testChildAxisRoot expects root node not in 
result  (was: TCK: SQLPathTest.testChildAxisRoot expected root node not in 
result)

 TCK: SQLPathTest.testChildAxisRoot expects root node not in result
 --

 Key: JCR-3376
 URL: https://issues.apache.org/jira/browse/JCR-3376
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Fix For: 2.6


 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:
 SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE 
 '/%/%'
 It expects the result to be
 /jcr:system, /testroot, /testdata
 It does not allow the implementation to return the root node ('/'). According 
 to the specification, a JCR implementation may filter the root node, as noted 
 by Randall Hauch - 

[jira] [Resolved] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expects root node not in result

2012-07-09 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3376.
-

Resolution: Fixed

Revision 1359213 allows for optional nodes to be in the result. For the test 
SQLPathTest.testChildAxisRoot, the root node is now optional.

I tested with Jackrabbit Core (-PintegrationTesting) and with Oak (oak-jcr), 
with the test removed from the known.issues list in the pom.xml.
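
The check can be pictured roughly as follows. This is a sketch with 
hypothetical names, not the code from revision 1359213: a result is accepted 
if it contains every required path and nothing beyond the required and 
optional paths.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical result check with optional nodes; illustration only. */
final class OptionalResultCheck {

    private OptionalResultCheck() {
    }

    static boolean accepts(Set<String> actual, Set<String> required, Set<String> optional) {
        // Every required path has to be present.
        if (!actual.containsAll(required)) {
            return false;
        }
        // Anything else in the result has to be explicitly optional.
        for (String path : actual) {
            if (!required.contains(path) && !optional.contains(path)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> required = new HashSet<>(
                Arrays.asList("/jcr:system", "/testroot", "/testdata"));
        Set<String> optional = new HashSet<>(Arrays.asList("/"));
        Set<String> withRoot = new HashSet<>(required);
        withRoot.add("/");
        System.out.println(accepts(required, required, optional));
        System.out.println(accepts(withRoot, required, optional));
    }
}
```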

 TCK: SQLPathTest.testChildAxisRoot expects root node not in result
 --

 Key: JCR-3376
 URL: https://issues.apache.org/jira/browse/JCR-3376
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Fix For: 2.6


 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:
 SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE 
 '/%/%'
 It expects the result to be
 /jcr:system, /testroot, /testdata
 It does not allow the implementation to return the root node ('/'). According 
 to the specification, a JCR implementation may filter the root node, as noted 
 by Randall Hauch - 
 http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
  - quote:
 
 Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
 characters as generally used within LIKE predicates (and jcr:like in XPath):
   As in SQL, the character '%' represents any string of zero or more 
   characters, and the character '_' (underscore) represents any 
   single character.
 while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics 
 jcr:path pseudo column and narrows the semantics of using LIKE with 
 jcr:path in the second-to-last bullet point:
   Predicates in the WHERE clause that test jcr:path are only required to 
   support the operators =, <> and LIKE. In the case of LIKE predicates, 
   support is only required for tests using the % wildcard character as a 
   match for a whole path segment (the part between two / characters) 
   or within index brackets
 Because the '%' matches only a whole path segment, the /% literal only 
 matches paths that have at least one path segment, which means that it 
 matches all descendants of the root node.
 
 The specification says "In the case of LIKE predicates, support is only 
 required for tests using the % wildcard character as a match for a whole path 
 segment (the part between two / characters)..." but it does not require an 
 implementation to restrict % in this way.





[jira] [Commented] (OAK-169) Support orderable nodes

2012-07-09 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409228#comment-13409228
 ] 

Thomas Mueller commented on OAK-169:


Iterating over all child nodes is always an O(n) operation (n = number of child 
nodes). Of course if you store all child node names in the parent node you 
only have to access the parent node, but that parent node is then n times 
larger. We would be back to the behavior of Jackrabbit 2.x, where adding many 
child nodes is an O(n^2) operation, for orderable child node lists. This might 
or might not be acceptable - I don't think we decided this when we defined our 
goals.

With the linked list approach, iterating over all child nodes will have to read 
all child nodes (also an O(n) operation), on the other hand it will make it 
possible to support many child nodes without limitations. It is true that this 
approach is probably slower if there are few child nodes (compared to storing 
the complete child node name list in the parent).

I guess to decide which approach works best in practice we first have to 
define which use cases we care about and which are the most common ones.
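
To illustrate where the O(n^2) comes from, here is a small cost model. It is 
a sketch under a simplifying assumption: when the complete child name list is 
stored in the parent, adding the i-th child rewrites a list of i names, so n 
additions write 1 + 2 + ... + n = n(n+1)/2 names in total. The class name is 
hypothetical.

```java
/** Hypothetical cost model for adding n children one at a time. */
final class ChildListCost {

    private ChildListCost() {
    }

    /** Names written if the parent stores the full ordered name list. */
    static long namesWrittenWithFullList(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i; // the i-th add rewrites a list of i names
        }
        return total;
    }

    /** Closed form of the same sum: n(n+1)/2. */
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(namesWrittenWithFullList(1000)); // prints 500500
    }
}
```

A linked list, in contrast, only touches a constant number of nodes per added 
child, which is why it scales to many child nodes at the price of slower 
iteration.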

 Support orderable nodes
 ---

 Key: OAK-169
 URL: https://issues.apache.org/jira/browse/OAK-169
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: jcr
Reporter: Jukka Zitting

 There are JCR clients that depend on the ability to explicitly specify the 
 order of child nodes. That functionality is not included in the MicroKernel 
 tree model, so we need to implement it either in oak-core or oak-jcr using 
 something like an extra (hidden) {{oak:childOrder}} property that records the 
 specified ordering of child nodes. A multi-valued string property is probably 
 good enough for this.





[jira] [Comment Edited] (OAK-169) Support orderable nodes

2012-07-09 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409228#comment-13409228
 ] 

Thomas Mueller edited comment on OAK-169 at 7/9/12 7:05 AM:


Iterating over all child nodes is always an O( n ) operation (n = number of 
child nodes). Of course if you store all child node names in the parent node 
you only have to access the parent node, but that parent node is then n times 
larger. We would be back to the behavior of Jackrabbit 2.x, where adding 
many child nodes is an O( n^2 ) operation, for orderable child node lists. This 
might or might not be acceptable - I don't think we decided this when we 
defined our goals.

With the linked list approach, iterating over all child nodes will have to read 
all child nodes (also an O( n ) operation), on the other hand it will make it 
possible to support many child nodes without limitations. It is true that this 
approach is probably slower if there are few child nodes (compared to storing 
the complete child node name list in the parent).

I guess to decide which approach works best in practice we first have to 
define which use cases we care about and which are the most common ones.

 Support orderable nodes
 ---

 Key: OAK-169
 URL: https://issues.apache.org/jira/browse/OAK-169
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: jcr
Reporter: Jukka Zitting

 There are JCR clients that depend on the ability to explicitly specify the 
 order of child nodes. That functionality is not included in the MicroKernel 
 tree model, so we need to implement it either in oak-core or oak-jcr using 
 something like an extra (hidden) {{oak:childOrder}} property that records the 
 specified ordering of child nodes. A multi-valued string property is probably 
 good enough for this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result

2012-07-05 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3376:
---

 Summary: TCK: SQLPathTest.testChildAxisRoot expected root node not 
in result
 Key: JCR-3376
 URL: https://issues.apache.org/jira/browse/JCR-3376
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor


The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:

SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE '/%/%'

It expected the result to be

/jcr:system, /testroot, /testdata

It does not allow the implementation to return the root node ('/'). According 
to the specification, a JCR implementation may filter the root node, as noted 
by Randall Hauch - 
http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
 - quote:


Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
characters as generally used within LIKE predicates (and jcr:like in XPath):

As in SQL, the character '%' represents any string of zero or more 
characters, and the character '_' (underscore) represents any 
single character.

while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics of the 
jcr:path pseudo column and narrows the semantics of using LIKE with 
jcr:path in the second-to-last bullet point:

Predicates in the WHERE clause that test jcr:path are only required to 
support the operators =, <> and LIKE. In the case of LIKE predicates, 
support is only required for tests using the % wildcard character as a 
match for a whole path segment (the part between two / characters) 
or within index brackets

Because the '%' matches only a whole path segment, the /% literal only 
matches paths that have at least one path segment, which means that it matches 
all descendants of the root node.


The specification says "In the case of LIKE predicates, support is only 
required for tests using the % wildcard character as a match for a whole path 
segment (the part between two / characters)...", but it does not say that an 
implementation must restrict '%' to whole path segments.

To allow an implementation to return the root node, I suggest changing the 
test as follows:

... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'
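
The following Java sketch shows why the root node can end up in the result 
when '%' is evaluated with plain SQL semantics (zero or more characters), and 
how an extra condition that excludes '/' removes it again. It is only an 
illustration: PathLikeDemo and its methods are hypothetical names, and the 
LIKE-to-regex translation is an assumption about how an implementation might 
evaluate the pattern, not code from Jackrabbit or the TCK.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jackrabbit or the TCK.
final class PathLikeDemo {

    // Translate a LIKE pattern using plain SQL semantics:
    // '%' = any string of zero or more characters, '_' = exactly one character.
    static Pattern likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    static boolean like(String path, String pattern) {
        return likeToRegex(pattern).matcher(path).matches();
    }

    // Condition used by the test today.
    static boolean original(String path) {
        return like(path, "/%") && !like(path, "/%/%");
    }

    // Condition with the root node excluded explicitly.
    static boolean suggested(String path) {
        return original(path) && !path.equals("/");
    }

    public static void main(String[] args) {
        for (String p : new String[] {"/", "/jcr:system", "/testroot", "/testroot/a"}) {
            System.out.println(p + " original=" + original(p)
                    + " suggested=" + suggested(p));
        }
    }
}
```

With these semantics, '/' passes the original condition because '%' may match 
the empty string, while the direct children of the root pass both conditions.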







[jira] [Resolved] (JCR-3376) TCK: SQLPathTest.testChildAxisRoot expected root node not in result

2012-07-05 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3376.
-

   Resolution: Fixed
Fix Version/s: 2.6

 TCK: SQLPathTest.testChildAxisRoot expected root node not in result
 ---

 Key: JCR-3376
 URL: https://issues.apache.org/jira/browse/JCR-3376
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Fix For: 2.6


 The TCK test SQLPathTest.testChildAxisRoot runs the following SQL-1 query:
 SELECT * FROM nt:base WHERE jcr:path LIKE '/%' AND NOT jcr:path LIKE 
 '/%/%'
 It expected the result to be
 /jcr:system, /testroot, /testdata
 It does not allow the implementation to return the root node ('/'). According 
 to the specification, a JCR implementation may filter the root node, as noted 
 by Randall Hauch - 
 http://jackrabbit.510166.n4.nabble.com/TCK-SQLPathTest-testChildAxisRoot-td4655670.html
  - quote:
 
 Section 6.6.5.1 (jcr:like function) defines the semantics of the wildcard 
 characters as generally used within LIKE predicates (and jcr:like in XPath):
   As in SQL, the character '%' represents any string of zero or more 
   characters, and the character '_' (underscore) represents any 
   single character.
 while Section 8.5.2.2 (Pseudo-property jcr:path) specifies the semantics of the 
 jcr:path pseudo column and narrows the semantics of using LIKE with 
 jcr:path in the second-to-last bullet point:
   Predicates in the WHERE clause that test jcr:path are only required to 
   support the operators =, <> and LIKE. In the case of LIKE predicates, 
   support is only required for tests using the % wildcard character as a 
   match for a whole path segment (the part between two / characters) 
   or within index brackets
 Because the '%' matches only a whole path segment, the /% literal only 
 matches paths that have at least one path segment, which means that it 
 matches all descendants of the root node.
 
 The specification says "In the case of LIKE predicates, support is only 
 required for tests using the % wildcard character as a match for a whole path 
 segment (the part between two / characters)...", but it does not say that an 
 implementation must restrict '%' to whole path segments.
 To allow an implementation to return the root node, I suggest changing the 
 test as follows:
 ... AND NOT jcr:path LIKE '/%/%' AND jcr:path <> '/'





[jira] [Assigned] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL

2012-07-05 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller reassigned OAK-155:
--

Assignee: Thomas Mueller

 Query: limited support for the deprecated JCR 1.0 query language Query.SQL
 --

 Key: OAK-155
 URL: https://issues.apache.org/jira/browse/OAK-155
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller
Assignee: Thomas Mueller

 Existing applications (as well as the TCK) use the JCR 1.0 query language 
 sql. As far as I know, there are only a few differences between JCR 1.0 SQL 
 and JCR 2.0 SQL-2. To make old applications work with Oak, I suggest we 
 provide support for JCR 1.0 SQL as well. An additional advantage is that more 
 of the existing TCK tests can be run.
 I currently don't know if the full JCR 1.0 SQL syntax can be supported 
 (similar to XPath); if we find supporting certain features is too complicated, 
 we will document those limitations instead.





[jira] [Created] (JCR-3363) DataStore garbage collection: test case GarbageCollectorTest.testGC() is to lenient

2012-06-28 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3363:
---

 Summary: DataStore garbage collection: test case 
GarbageCollectorTest.testGC() is to lenient
 Key: JCR-3363
 URL: https://issues.apache.org/jira/browse/JCR-3363
 Project: Jackrabbit Content Repository
  Issue Type: Bug
Reporter: Thomas Mueller


The test case GarbageCollectorTest.testGC() is supposed to test that binaries of 
nodes that are moved while garbage collection is running are not deleted.

However the test doesn't fail if the event listener is disabled. It should.





[jira] [Created] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL

2012-06-28 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-155:
--

 Summary: Query: limited support for the deprecated JCR 1.0 query 
language Query.SQL
 Key: OAK-155
 URL: https://issues.apache.org/jira/browse/OAK-155
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller


Existing applications (as well as the TCK) use the JCR 1.0 query language 
sql. As far as I know, there are only a few differences between JCR 1.0 SQL and 
JCR 2.0 SQL-2. To make old applications work with Oak, I suggest we provide 
support for JCR 1.0 SQL as well. An additional advantage is that more of the 
existing TCK tests can be run.

I currently don't know if the full JCR 1.0 SQL syntax can be supported (similar 
to XPath); if we find supporting certain features is too complicated, we will 
document those limitations instead.





[jira] [Commented] (OAK-155) Query: limited support for the deprecated JCR 1.0 query language Query.SQL

2012-06-28 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402996#comment-13402996
 ] 

Thomas Mueller commented on OAK-155:


My current plan is to use a strict parser for both SQL-1 and XPath, so that 
unsupported syntax is rejected. For existing applications that no longer work 
because of that, we can decide whether we want to support a certain (XPath / 
SQL-1) feature in Oak, patching the respective parser, or change the 
application. Whatever is less work.

An implementation detail: I plan to use the SQL-2 parser for SQL-1 as well, but 
with an (internal) switch so that the SQL-1 features are really only supported 
when using SQL-1, and rejected when using SQL-2. If this turns out to be a 
problem, we can still split the parser. As a side effect, SQL-2 syntax is 
supported for SQL-1 queries.

But SQL-1 is clearly deprecated, not just within the JCR spec, but also within 
Oak. Unlike XPath, which will very likely be supported within Oak for a longer 
time, SQL-1 queries should really be converted to SQL-2.
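
A minimal Java sketch of the "one strict parser with an internal switch" idea 
described above. This is not the Oak parser: SelectorNameParser, allowSql1 and 
parseNodeTypeName are hypothetical names, and the only SQL-1 feature modelled 
here is the unbracketed node type name (nt:base instead of [nt:base]), chosen 
just as an example.

```java
// Hypothetical sketch; not the actual Oak query parser.
final class SelectorNameParser {

    // Internal switch: true when the query was submitted as SQL-1.
    private final boolean allowSql1;

    SelectorNameParser(boolean allowSql1) {
        this.allowSql1 = allowSql1;
    }

    // Returns the node type name for a FROM clause token,
    // or throws if the syntax is not supported in the current mode.
    String parseNodeTypeName(String token) {
        if (token.length() > 2 && token.startsWith("[") && token.endsWith("]")) {
            // SQL-2 style: accepted in both modes, so SQL-2 syntax
            // also works for SQL-1 queries (the side effect noted above).
            return token.substring(1, token.length() - 1);
        }
        if (token.indexOf(':') < 0) {
            // A simple name without a namespace prefix: accepted in both modes.
            return token;
        }
        if (allowSql1) {
            // SQL-1 style: prefixed name without brackets.
            return token;
        }
        // Strict: reject instead of guessing what the caller meant.
        throw new IllegalArgumentException(
                "Unbracketed name '" + token + "' is only supported for SQL-1");
    }

    public static void main(String[] args) {
        System.out.println(new SelectorNameParser(true).parseNodeTypeName("nt:base"));
        System.out.println(new SelectorNameParser(false).parseNodeTypeName("[nt:base]"));
    }
}
```

If the switch turns out to be awkward, the same checks can move into a 
separate SQL-1 parser without changing what callers see.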

 Query: limited support for the deprecated JCR 1.0 query language Query.SQL
 --

 Key: OAK-155
 URL: https://issues.apache.org/jira/browse/OAK-155
 Project: Jackrabbit Oak
  Issue Type: Bug
Reporter: Thomas Mueller

 Existing applications (as well as the TCK) use the JCR 1.0 query language 
 sql. As far as I know, there are only a few differences between JCR 1.0 SQL 
 and JCR 2.0 SQL-2. To make old applications work with Oak, I suggest we 
 provide support for JCR 1.0 SQL as well. An additional advantage is that more 
 of the existing TCK tests can be run.
 I currently don't know if the full JCR 1.0 SQL syntax can be supported 
 (similar to XPath); if we find supporting certain features is too complicated, 
 we will document those limitations instead.





[jira] [Resolved] (JCR-3321) TCK: Strange XPath query in OrderByMultiTypeTest.testMultipleOrder

2012-06-19 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-3321.
-

   Resolution: Fixed
Fix Version/s: 2.6

 TCK: Strange XPath query in OrderByMultiTypeTest.testMultipleOrder
 --

 Key: JCR-3321
 URL: https://issues.apache.org/jira/browse/JCR-3321
 Project: Jackrabbit Content Repository
  Issue Type: Sub-task
  Components: jackrabbit-jcr-tests
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Fix For: 2.6


 The test 
 org.apache.jackrabbit.test.api.query.OrderByMultiTypeTest.testMultipleOrder 
 currently runs a query of the form:
 //testroot/*[@jcr:primaryType='nt:unstructured']
 I believe there is a typo in the test, and the query should be:
 /jcr:root/testroot/*[@jcr:primaryType='nt:unstructured']





[jira] [Resolved] (JCR-2686) Data store garbage collection: interrupt mark

2012-06-19 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-2686.
-

Resolution: Fixed

I just saw we already have a GarbageCollector.close() method which does stop 
the mark, so I think I can resolve this issue as fixed.

 Data store garbage collection: interrupt mark
 -

 Key: JCR-2686
 URL: https://issues.apache.org/jira/browse/JCR-2686
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Affects Versions: 2.1
Reporter: Stephan Huttenhuis
Assignee: Thomas Mueller

 It would be nice if the DataStore GarbageCollector could be interrupted during 
 a mark. This would allow applications that use Jackrabbit to shut down without 
 having to wait for the mark to complete.





[jira] [Resolved] (JCR-2286) Implement Value.toString

2012-06-19 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-2286.
-

Resolution: Won't Fix

I don't plan to fix it in Jackrabbit 2.x

 Implement Value.toString
 

 Key: JCR-2286
 URL: https://issues.apache.org/jira/browse/JCR-2286
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-spi-commons
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor

 Currently QValueValue.toString() is not implemented. It would help if the 
 method returns something human readable, both for debugging and for 
 generating error messages.
 It's a bit tricky, because we need to make sure toString() never fails and 
 has no side effects (doesn't read from files, doesn't change state), 
 otherwise it breaks debugging. Changing the state when throwing an exception 
 is usually not a big problem, but it is a problem for debugging.
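
 A small Java sketch of what a "safe" toString() along these lines could look 
 like. It is an illustration only: ValueToStringSketch and its fields are 
 hypothetical and are not the QValueValue implementation. The assumptions are 
 that string-like values are already in memory and that for binaries only 
 metadata (the length) is reported, so no stream is ever opened.

```java
// Hypothetical value holder; not the Jackrabbit QValueValue class.
final class ValueToStringSketch {

    private static final int MAX_SHOWN = 40;

    private final String type;   // for example "STRING" or "BINARY"
    private final String text;   // in-memory value, null for binaries
    private final long length;   // known length of a binary, -1 if unknown

    ValueToStringSketch(String type, String text, long length) {
        this.type = type;
        this.text = text;
        this.length = length;
    }

    @Override
    public String toString() {
        try {
            if (text != null) {
                // Cap the output so very long values do not flood logs.
                String shown = text.length() > MAX_SHOWN
                        ? text.substring(0, MAX_SHOWN) + "..."
                        : text;
                return type + "(" + shown + ")";
            }
            // Never open the stream of a binary here; report metadata only.
            return type + "(length=" + length + ")";
        } catch (RuntimeException e) {
            // toString() must never fail, so fall back to a fixed marker.
            return type + "(?)";
        }
    }

    public static void main(String[] args) {
        System.out.println(new ValueToStringSketch("STRING", "hello", -1));
        System.out.println(new ValueToStringSketch("BINARY", null, 1024));
    }
}
```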





[jira] [Resolved] (JCR-2998) Option to log the path for Session.save() calls

2012-06-19 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller resolved JCR-2998.
-

Resolution: Won't Fix

We have SessionState debug logging options, I don't think we need anything else.

 Option to log the path for Session.save() calls
 ---

 Key: JCR-2998
 URL: https://issues.apache.org/jira/browse/JCR-2998
 Project: Jackrabbit Content Repository
  Issue Type: New Feature
  Components: jackrabbit-core
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor

 It would be nice to be able to log the path for Session.save() calls, so that 
 it's easier to analyze if the repository is slow because of many write 
 operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (JCR-3340) GarbageCollector should ignore all NoSuchItemStateExceptions

2012-06-14 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294941#comment-13294941
 ] 

Thomas Mueller commented on JCR-3340:
-

As discussed offline with Mete we came to the conclusion not to apply the patch 
now.
There is a risk we could hide a bigger problem. FYI the underlying exception is:

javax.jcr.RepositoryException: failed to retrieve state of intermediary node
javax.jcr.RepositoryException: javax.jcr.RepositoryException: failed to retrieve state of intermediary node
    at org.apache.jackrabbit.core.data.GarbageCollector.stopScan(GarbageCollector.java:240)
    at org.apache.jackrabbit.core.data.GarbageCollector.sweep(GarbageCollector.java:258)

Caused by: javax.jcr.RepositoryException: failed to retrieve state of intermediary node
    at org.apache.jackrabbit.core.CachingHierarchyManager.resolvePath(CachingHierarchyManager.java:156)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.resolvePath(HierarchyManagerImpl.java:365)
    at org.apache.jackrabbit.core.ItemManager.getItem(ItemManager.java:550)
    at org.apache.jackrabbit.core.session.SessionItemOperation$4.perform(SessionItemOperation.java:97)
    at org.apache.jackrabbit.core.session.SessionItemOperation$4.perform(SessionItemOperation.java:93)
    at org.apache.jackrabbit.core.session.SessionItemOperation.perform(SessionItemOperation.java:187)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:355)
    at org.apache.jackrabbit.core.SessionImpl.getItem(SessionImpl.java:743)
    at org.apache.jackrabbit.core.data.GarbageCollector$Listener.onEvent(GarbageCollector.java:421)
    at org.apache.jackrabbit.core.observation.EventConsumer.consumeEvents(EventConsumer.java:248)
    at org.apache.jackrabbit.core.observation.ObservationDispatcher.dispatchEvents(ObservationDispatcher.java:214)
    at org.apache.jackrabbit.core.observation.EventStateCollection.dispatch(EventStateCollection.java:475)
    at org.apache.jackrabbit.core.state.SharedItemStateManager$Update.end(SharedItemStateManager.java:798)
    at org.apache.jackrabbit.core.state.SharedItemStateManager.update(SharedItemStateManager.java:1498)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:398)
    at org.apache.jackrabbit.core.state.XAItemStateManager.update(XAItemStateManager.java:354)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:373)
    at org.apache.jackrabbit.core.state.SessionItemStateManager.update(SessionItemStateManager.java:274)
    at org.apache.jackrabbit.core.ItemSaveOperation.perform(ItemSaveOperation.java:258)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.ItemImpl.perform(ItemImpl.java:91)
    at org.apache.jackrabbit.core.ItemImpl.save(ItemImpl.java:329)
    at org.apache.jackrabbit.core.session.SessionSaveOperation.perform(SessionSaveOperation.java:42)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:200)
    at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:355)
    at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:758)

Caused by: org.apache.jackrabbit.core.state.NoSuchItemStateException: 1b94274f-431c-4dcd-aac6-b238527fc276
    at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:282)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.getNodeState(LocalItemStateManager.java:109)
    at org.apache.jackrabbit.core.state.LocalItemStateManager.getItemState(LocalItemStateManager.java:174)
    at org.apache.jackrabbit.core.state.SessionItemStateManager.getItemState(SessionItemStateManager.java:161)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.getItemState(HierarchyManagerImpl.java:152)
    at org.apache.jackrabbit.core.HierarchyManagerImpl.resolvePath(HierarchyManagerImpl.java:115)
    at org.apache.jackrabbit.core.CachingHierarchyManager.resolvePath(CachingHierarchyManager.java:152)



 GarbageCollector should ignore all NoSuchItemStateExceptions
 

 Key: JCR-3340
 URL: https://issues.apache.org/jira/browse/JCR-3340
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: jackrabbit-core
Affects Versions: 2.5
Reporter: Mete Atamel
 Attachments: JCR-3340.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 When GarbageCollector goes through nodes, it can encounter 
 NoSuchItemStateException or PathNotFoundException if a node has been deleted 
 or moved in the meantime. GarbageCollector can safely ignore these 
 exceptions. It tries to do so in some cases but not all.
 For example, Listener#onEvent method in GarbageCollector catches 
 PathNotFoundException and it also catches the 

[jira] [Created] (JCR-3341) GarbageCollector should fail fast if there is a problem

2012-06-14 Thread Thomas Mueller (JIRA)
Thomas Mueller created JCR-3341:
---

 Summary: GarbageCollector should fail fast if there is a problem
 Key: JCR-3341
 URL: https://issues.apache.org/jira/browse/JCR-3341
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Affects Versions: 2.5
Reporter: Thomas Mueller
Priority: Minor


The GarbageCollector installs an ObservationListener to ensure moved nodes are 
scanned as well.

If there is an exception in the ObservationListener, this exception is captured 
(lastException), but only evaluated at the very end of the GC cycle, in 
Listener.stop() / GarbageCollector.stopScan() which is called as part of 
sweep(), before deleting unused items.

This is quite late. For a large repository, scanning might take a few hours.

If such an exception occurs, scanning should stop within a reasonable time 
(fail fast), and the exception should be thrown there.

This is related to JCR-3340
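
The fail-fast behavior described above could look roughly like the following 
Java sketch. It is not the attached patch and not the GarbageCollector code: 
FailFastScan, onListenerFailure and scan are hypothetical names, the node 
paths are plain strings, and the only point shown is that the scan loop checks 
the captured exception on every step instead of only at the end.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; not the Jackrabbit GarbageCollector.
final class FailFastScan {

    // Set by the listener (possibly from another thread) when it fails.
    private final AtomicReference<Exception> lastException =
            new AtomicReference<>();

    // Called by the listener when handling an event throws.
    void onListenerFailure(Exception e) {
        lastException.compareAndSet(null, e); // keep the first failure
    }

    // Scans the given nodes and stops as soon as a listener failure is seen.
    int scan(Iterable<String> nodePaths) {
        int scanned = 0;
        for (String path : nodePaths) {
            Exception e = lastException.get();
            if (e != null) {
                throw new IllegalStateException(
                        "Scan aborted after " + scanned + " nodes", e);
            }
            // Real code would mark the binaries of the node at 'path' here.
            scanned++;
        }
        return scanned;
    }

    public static void main(String[] args) {
        FailFastScan gc = new FailFastScan();
        System.out.println(gc.scan(Arrays.asList("/a", "/b", "/c")));
        gc.onListenerFailure(new Exception("listener failed"));
        try {
            gc.scan(Arrays.asList("/a", "/b", "/c"));
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

Checking on every node keeps the delay between the failure and the abort 
small even when the whole scan would take hours.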





[jira] [Updated] (JCR-3341) GarbageCollector should fail fast if there is a problem

2012-06-14 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated JCR-3341:


Attachment: JCR-3341.patch

 GarbageCollector should fail fast if there is a problem
 ---

 Key: JCR-3341
 URL: https://issues.apache.org/jira/browse/JCR-3341
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Affects Versions: 2.5
Reporter: Thomas Mueller
Priority: Minor
 Attachments: JCR-3341.patch


 The GarbageCollector installs an ObservationListener to ensure moved nodes 
 are scanned as well.
 If there is an exception in the ObservationListener, this exception is 
 captured (lastException), but only evaluated at the very end of the GC cycle, 
 in Listener.stop() / GarbageCollector.stopScan() which is called as part of 
 sweep(), before deleting unused items.
 This is quite late. For a large repository, scanning might take a few hours.
 If such an exception occurs, scanning should stop within a reasonable time 
 (fail fast), and the exception should be thrown there.
 This is related to JCR-3340





[jira] [Updated] (JCR-3341) GarbageCollector should fail fast if there is a problem

2012-06-14 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated JCR-3341:


Assignee: Thomas Mueller
  Status: Patch Available  (was: Open)

A patch for the 2.2 branch is attached.

 GarbageCollector should fail fast if there is a problem
 ---

 Key: JCR-3341
 URL: https://issues.apache.org/jira/browse/JCR-3341
 Project: Jackrabbit Content Repository
  Issue Type: Improvement
  Components: jackrabbit-core
Affects Versions: 2.5
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
 Attachments: JCR-3341.patch


 The GarbageCollector installs an ObservationListener to ensure moved nodes 
 are scanned as well.
 If there is an exception in the ObservationListener, this exception is 
 captured (lastException), but only evaluated at the very end of the GC cycle, 
 in Listener.stop() / GarbageCollector.stopScan() which is called as part of 
 sweep(), before deleting unused items.
 This is quite late. For a large repository, scanning might take a few hours.
 If such an exception occurs, scanning should stop within a reasonable time 
 (fail fast), and the exception should be thrown there.
 This is related to JCR-3340





[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project

2012-06-13 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294318#comment-13294318
 ] 

Thomas Mueller commented on OAK-138:


The data store that is currently in oak-mk (org.apache.jackrabbit.mk.blobs) 
can be re-used in other mk implementations. 
So I guess we should also create (at least) oak-mk-datastore or
oak-mk-blob. Similarly, the jsop part shouldn't be part of oak-mk.
We could move it to oak-commons or create a new project oak-commons-jsop.
I guess the same goes for the cache implementation (oak-commons or 
oak-commons-cache).

 Move client/server package in oak-mk to separate project
 

 Key: OAK-138
 URL: https://issues.apache.org/jira/browse/OAK-138
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, it, mk, run
Affects Versions: 0.3
Reporter: Dominique Pfister
Assignee: Dominique Pfister

 As a further cleanup step in OAK-13, I'd like to move the packages 
 o.a.j.mk.client and o.a.j.mk.server and referenced classes in oak-mk to a 
 separate project, e.g. oak-mk-remote.
 This new project will then be added as a dependency to:
 oak-core
 oak-run
 oak-it-mk





[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project

2012-06-13 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294428#comment-13294428
 ] 

Thomas Mueller commented on OAK-138:


OK, my personal opinion is still to use as few projects as possible, but if 
everybody else thinks we need to do it then I can live with that.

We can later decide if we want to create additional projects for the data store 
implementation and the log wrapper.

 Move client/server package in oak-mk to separate project
 

 Key: OAK-138
 URL: https://issues.apache.org/jira/browse/OAK-138
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, it, mk, run
Affects Versions: 0.3
Reporter: Dominique Pfister
Assignee: Dominique Pfister

 As a further cleanup step in OAK-13, I'd like to move the packages 
 o.a.j.mk.client and o.a.j.mk.server and referenced classes in oak-mk to a 
 separate project, e.g. oak-mk-remote.
 This new project will then be added as a dependency to:
 oak-core
 oak-run
 oak-it-mk





[jira] [Created] (OAK-140) PropertyState: data type of empty array property

2012-06-13 Thread Thomas Mueller (JIRA)
Thomas Mueller created OAK-140:
--

 Summary: PropertyState: data type of empty array property
 Key: OAK-140
 URL: https://issues.apache.org/jira/browse/OAK-140
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core
Reporter: Thomas Mueller
Priority: Minor


Currently, there seems to be no way to retrieve the data type of an 
org.apache.jackrabbit.oak.api.PropertyState for empty arrays.





[jira] [Commented] (JCR-3333) The binary file entities are stored twice in the DB

2012-06-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293408#comment-13293408
 ] 

Thomas Mueller commented on JCR-3333:
-

Hi,

Well, I still don't know which version of Jackrabbit you are using. Anyway, I 
just saw the data store should be supported in Jackrabbit 1.6 (by the way, this 
is the Jackrabbit version, not the version of the JCR API, sorry I also got 
that wrong).

As documented, you should be able to add the data store config to the 
repository.xml as follows:

<DataStore class="org.apache.jackrabbit.core.data.FileDataStore"/>

 could u let me know any big changes or release note of JCR about API update? 

See the documentation.

 The binary file entities are stored twice in the DB
 ---

 Key: JCR-3333
 URL: https://issues.apache.org/jira/browse/JCR-3333
 Project: Jackrabbit Content Repository
  Issue Type: Bug
  Components: JCR 2.0
 Environment: Windows 7, Linux
Reporter: P.C.Sun
 Attachments: repository.xml


 We are using JCR in Liferay to store documents, which means all documents 
 are stored in the DB as binaries. Recently we found that the size of the DB 
 is increasing very fast, so we ran SQL queries to get the size of the 
 documents: 
 1. select sum(size_) from dlfileentry (the Liferay table that stores file 
 metadata, such as name and size) - the size of all documents recorded in the 
 dlentry table. The result is 43330765874, which means around 40.36 GB. 
 2. The DB size report is around 95.97 GB. 
 3. Within these tables, there are two very big tables: 
 j_pm_liferay_binval - 52.07 GB
 j_v_pm_binval - 43.65 GB
 So the question is: if the documents themselves are only around 40.36 GB, 
 what are those two tables storing? Judging by their names, they are both 
 binval tables. Does this mean every document is stored twice? What's 
 inside those tables? 
 In this case, the DB grew by around 30 GB within 3 months, which is really 
 fast. Any suggestion to improve this? 
 Liferay replied that the table j_v_pm_binaval stores the file versions. 
 However, new documents are also stored there, whereas we think an entry 
 should only be created when a new version is generated. They also mentioned 
 that to solve this we need to change repository.xml; however, we don't know 
 how to deal with the old files, or whether they will get lost if we 
 change the config file.
 Please let me know whether it is possible to clean them up in the DB. 
 Thank you very much; I am looking forward to your reply. 
 Best Regards.
 P.C.(JACK) SUN





[jira] [Commented] (JCR-3333) The binary file entities are stored twice in the DB

2012-06-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293452#comment-13293452
 ] 

Thomas Mueller commented on JCR-:
-

Hi,

I suggest you first read all the documentation about this (data store, 
persistence manager, migration). You should migrate your data, that is, create 
a new repository, migrate the data, and delete the old repository.

Regards,
Thomas






[jira] [Commented] (OAK-138) Move client/server package in oak-mk to separate project

2012-06-12 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293499#comment-13293499
 ] 

Thomas Mueller commented on OAK-138:


The log wrapper is a somewhat similar implementation.

What about a project called oak-mk-common, used for both the log wrapper and 
the remoting? If it is ever needed, other implementations could be added 
there as well, for example a virtual repository wrapper (which currently 
isn't needed).

 Move client/server package in oak-mk to separate project
 

 Key: OAK-138
 URL: https://issues.apache.org/jira/browse/OAK-138
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, it, mk, run
Affects Versions: 0.3
Reporter: Dominique Pfister
Assignee: Dominique Pfister

 As a further cleanup step in OAK-13, I'd like to move the packages 
 o.a.j.mk.client and o.a.j.mk.server and referenced classes in oak-mk to a 
 separate project, e.g. oak-mk-remote.
 This new project will then be added as a dependency to:
 oak-core
 oak-run
 oak-it-mk




