stefan-egli commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2236322219

   Turns out that it might look slightly different altogether:
   
   In the latest test from @ionutzpi where it passes a broken surrogate through 
NodeBuilder (except it shouldn't also go via DocumentPropertyState there), it 
can be seen that when it applies a property to the commit, as part of 
[persist](https://github.com/apache/jackrabbit-oak/blob/a5300d8088eacf7d5662ef5b8a7a64abad1ce7bf/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java#L313)
 / 
[compareAgainstBaseState](https://github.com/apache/jackrabbit-oak/blob/a5300d8088eacf7d5662ef5b8a7a64abad1ce7bf/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java#L289),
 it does go through 
[JsonSerializer](https://github.com/apache/jackrabbit-oak/blob/a5300d8088eacf7d5662ef5b8a7a64abad1ce7bf/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/CommitDiff.java#L182-L183),
 which handles broken surrogates properly.
   
   So in short : DocumentNodeStore does actually handle broken surrogates in 
property values fine.
   
   What I think happened in that original 
[testInterestingStrings](https://github.com/apache/jackrabbit-oak/blob/0e327a27eb988a0f7656535275f3735ef8474e5d/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/BasicDocumentStoreTest.java#L750),
 is that it the DocumentStore API directly - instead of the DocumentNodeStore 
API...
   
   Wrt this PR I think there are a few things still open though :
   * beautify the newly introduced tests - they have a few issue, namely : 
     * `getBrokenSurrogateAndInitializeDifferentStores` calls 
`createPropAndCheckValue` 3 times dependent on the DocumentStore used, except 
there's no difference between the 3 calls, can do that in 1 call. 
     * the way it uses the fixture is different from how it is classically done 
through a parameterized test. either I'd suggest to consider a parameterized 
test (where it would dispose the fixture at the end) or make sure to dispose 
the store/documentNodeStore created. The way it is done now leaves the 
DocumentNodeStore undisposed, which means follow-up tests hit an active lease 
and wait for lease expiration, which extends the test unnecessarily
     * `createPropAndCheckValue` as mentioned should go via 
`DocumentPropertyState` when setting the property value. that is not how it 
happens in existing code. `DocumentPropertyState` is only used for reading. 
Writing happens via builder which uses things like ModifiedDocumentNodeState, 
goes via jcr ValueImpl, StringPropertyState etc. 
   * additionally though I believe there is a bigger issue : also based on the 
new test, when debugging through compression/decompression it can be seen that 
immediately after a `DocumentPropertyState` is created and with compression the 
value is compressed, it goes ahead and wants to put it into the cache and for 
that wants to know its size, so it goes an calls `parsed()` - several times - 
as part of which it calls decompression - several times. I think this 
highlights a bigger problem, that we should see if/how compression is 
performant. It seems the way it is now would add a lot of decompression 
overhead. This needs further improvement in my view.
   
   wdyt?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to