Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


ionutzpi commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225449508

   @reschke I vote for your proposal(fixing segment store)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


reschke commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225379200

   bq. For RDB it could be different: it could mean that if you want to use 
compression for RDB you'd have to accept that broken surrogates are not 
supported (unless we find a fast way to support it in compression).
   
   The elephant in the room is segment persistence, which IMHO allows *any* 
Java string. I don't believe discussing the differences of document store 
backends is helpful here; we should come to an agreement what the expections 
for Oak in *general* are. Having document store and segment store behave 
differently IMHO is a problem we should solve (by fixing segment store, would 
be my preference).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


ionutzpi commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225354781

   @reschke @reschke I created methods to check the behavior with broken 
surrogate before and after introducing compression. After compression the test 
failed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-11 Thread via GitHub


stefan-egli commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-991078

   With the default set to disable compression, in theory this PR shouldn't do 
no harm and could be merged, as changing the default is a conscious decision 
that could be done based on the requirement of the system and which 
DocumentStore is used.
   
   Having said that, I think the current state of `testInterestingStrings` 
isn't ideal, as it lists a broken surrogate in a test string, but then doesn't 
test it. So I'd say either it uses that broken surrogate in a test, or then 
shouldn't list it at all. Except, I think now that this problem with broken 
surrogate was identified, we need to address it, so we probably should have it 
in the test indeed.
   
   What we might do is have the test use different DocumentStore fixtures and 
test on memory, mongo and rdb *with* the broken surrogate property and verify 
that the behavior with or without compression is the same.
   
   That could mean that for mongo the fact that compression doesn't support 
broken surrogates turns out as a non-issue as mongo already doesn't support it. 
It would just be good to have a test that verifies that.
   
   For RDB it could be different: it could mean that if you want to use 
compression for RDB you'd have to accept that broken surrogates are not 
supported (unless we find a fast way to support it in compression).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-11 Thread via GitHub


reschke commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-299402

   > Why is this a MongoDB specific aspect? For MongoDB we know it is not 
supported (see [this skip 
here](https://github.com/apache/jackrabbit-oak/blob/0e327a27eb988a0f7656535275f3735ef8474e5d/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/BasicDocumentStoreTest.java#L760))
 - but for RDB it is supported. So shouldn't this PR keep the support for RDB?
   
   I believe we should come up with consistent behaviour for all types of 
persistences. My preference would be to disallow Strings with broken UTF-8 
serializations, but that would be a change for segment persistence, and that's 
why I faced resistence when I proposed that back then.
   
   For *this* PR, it will have to handle what the current system allows.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644787553


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -60,6 +73,11 @@ public void before() throws Exception {
 ns = builderProvider.newBuilder().setBlobStore(bs).getNodeStore();
 }
 
+@After
+public void tearDown() {
+ns.dispose();

Review Comment:
   ```suggestion
   try {
   ns.dispose();
   } finally {
   
DocumentPropertyState.setCompressionThreshold(DISABLED_COMPRESSION);
   }
   ```
   
   to avoid side-effects, the test should ensure it doesn't leak a non default 
compression threshold to other tests. DISABLED_COMPRESSION is for illustration 
to refer to -1 - ideally the -1 would only be defined in the source class...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


stefan-egli commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2176272288

   
[looks](https://github.com/apache/jackrabbit-oak/blob/14c357120e49d83281dc473d810ca485deb94501/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/memory/AbstractPropertyState.java#L103-L106)
 like the original intention was that properties shouldn't or wouldn't be used 
as keys in hash maps... if that's the case then it should be fine..


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644558711


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -127,8 +188,25 @@ public boolean equals(Object object) {
 return true;
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
-return this.name.equals(other.name)
-&& this.value.equals(other.value);
+// Compare names and raw un-parsed values
+if (!this.name.equals(other.name) || 
!this.getValue().equals(other.getValue())) {

Review Comment:
   Yes, but only then, not always.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644549857


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -127,8 +188,25 @@ public boolean equals(Object object) {
 return true;
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
-return this.name.equals(other.name)
-&& this.value.equals(other.value);
+// Compare names and raw un-parsed values
+if (!this.name.equals(other.name) || 
!this.getValue().equals(other.getValue())) {

Review Comment:
   "getValue()" will uncompress when the value is compressed, no?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644545109


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -127,8 +188,25 @@ public boolean equals(Object object) {
 return true;
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
-return this.name.equals(other.name)
-&& this.value.equals(other.value);
+// Compare names and raw un-parsed values
+if (!this.name.equals(other.name) || 
!this.getValue().equals(other.getValue())) {

Review Comment:
   Not necessarily.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644545109


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -127,8 +188,25 @@ public boolean equals(Object object) {
 return true;
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
-return this.name.equals(other.name)
-&& this.value.equals(other.value);
+// Compare names and raw un-parsed values
+if (!this.name.equals(other.name) || 
!this.getValue().equals(other.getValue())) {

Review Comment:
   Not necessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644499232


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -127,8 +188,25 @@ public boolean equals(Object object) {
 return true;
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
-return this.name.equals(other.name)
-&& this.value.equals(other.value);
+// Compare names and raw un-parsed values
+if (!this.name.equals(other.name) || 
!this.getValue().equals(other.getValue())) {

Review Comment:
   This will always decompress, no?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644492664


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,225 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644490150


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,225 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644489555


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -116,7 +163,20 @@ public int count() {
  */
 @NotNull
 String getValue() {
-return value;
+return value != null ? value : decompress(this.compressedValue);
+}
+
+private String decompress(byte[] value) {
+try {
+return new String(compression.getInputStream(new 
ByteArrayInputStream(value)).readAllBytes(), StandardCharsets.UTF_8);
+} catch (IOException e) {
+LOG.error("Failed to decompress property {} value: ", getName(), 
e);
+return "\"{}\"";
+}
+}
+
+private byte[] getCompressedValue() {

Review Comment:
   Done



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,225 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644489190


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -18,26 +18,45 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.OutputStream;
+import java.lang.reflect.Field;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.lang.reflect.Modifier;
 import java.util.List;
+import java.util.Objects;
 import java.util.Set;
 
+import org.apache.commons.lang3.RandomStringUtils;
 import org.apache.jackrabbit.oak.api.Blob;
 import org.apache.jackrabbit.oak.api.PropertyState;
 import org.apache.jackrabbit.oak.api.Type;
+import org.apache.jackrabbit.oak.commons.Compression;
 import org.apache.jackrabbit.oak.spi.blob.BlobStore;
 import org.apache.jackrabbit.oak.spi.blob.MemoryBlobStore;
 import org.apache.jackrabbit.oak.spi.state.NodeBuilder;
-import org.junit.Before;
-import org.junit.Rule;
-import org.junit.Test;
+import org.junit.*;
+import org.junit.runners.MethodSorters;
 
 import static org.apache.jackrabbit.guava.common.collect.Lists.newArrayList;
 import static org.apache.jackrabbit.guava.common.collect.Sets.newHashSet;
 import static org.junit.Assert.assertEquals;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertTrue;
 
+@FixMethodOrder(MethodSorters.NAME_ASCENDING)
 public class DocumentPropertyStateTest {
 
 private static final int BLOB_SIZE = 16 * 1024;
+public static final int COMPRESSED_SIZE = 543;

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644488604


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
SystemPropertySupplier
+.create("oak.documentMK.stringCompressionThreshold ", 
-1).loggingTo(LOG).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (compression != null && DEFAULT_COMPRESSION_THRESHOLD == -1) {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644487143


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644486368


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644485936


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
SystemPropertySupplier
+.create("oak.documentMK.stringCompressionThreshold ", 
-1).loggingTo(LOG).get();

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644395621


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -128,7 +188,7 @@ public boolean equals(Object object) {
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
 return this.name.equals(other.name)
-&& this.value.equals(other.value);
+&& this.getValue().equals(other.getValue());

Review Comment:
   Right, I think there should be distinction between non-compressed, 
compressed, mixed cases. Only the compressed one is slightly more involved. For 
that one we could use a combination of length comparison, hashcode (calculated 
when needed) and/or plain Arrays.equals(byte[],byte[])..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644353388


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -128,7 +188,7 @@ public boolean equals(Object object) {
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
 return this.name.equals(other.name)
-&& this.value.equals(other.value);
+&& this.getValue().equals(other.getValue());

Review Comment:
   We could special-case the compressed case, and compare the byte arrays. 
Still expensive.
   
   Maybe we could keep the String's hashCode in order to avoid comparing the 
byte arrays in most cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644348052


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -116,7 +163,20 @@ public int count() {
  */
 @NotNull
 String getValue() {
-return value;
+return value != null ? value : decompress(this.compressedValue);
+}
+
+private String decompress(byte[] value) {
+try {
+return new String(compression.getInputStream(new 
ByteArrayInputStream(value)).readAllBytes(), StandardCharsets.UTF_8);
+} catch (IOException e) {
+LOG.error("Failed to decompress property {} value: ", getName(), 
e);
+return "\"{}\"";
+}
+}
+
+private byte[] getCompressedValue() {

Review Comment:
   Yes. I did the same idea as Stefan. But Julian insisted to make it 
private(it was package-protected). Also, this approach complicated many 
tests(trying to solve with reflection)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644336103


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -116,7 +163,20 @@ public int count() {
  */
 @NotNull
 String getValue() {
-return value;
+return value != null ? value : decompress(this.compressedValue);
+}
+
+private String decompress(byte[] value) {
+try {
+return new String(compression.getInputStream(new 
ByteArrayInputStream(value)).readAllBytes(), StandardCharsets.UTF_8);
+} catch (IOException e) {
+LOG.error("Failed to decompress property {} value: ", getName(), 
e);
+return "\"{}\"";
+}
+}
+
+private byte[] getCompressedValue() {

Review Comment:
   ah, now I remember. Your comment wasn't visible in this particular review 
anymore, so I didn't connect the dots.
   
   I'm actually a fan of facilitating testing, and that can mean to introduce 
methods in productive classes that are there solely for testing purpose. That 
method can still be non-public but package-protected for example. That in my 
view makes things more explicit than using reflection in tests which is 
brittle. Moving this method to the test class I assume implies opening up the 
actual variable, which seems less favourable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644328794


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -116,7 +163,20 @@ public int count() {
  */
 @NotNull
 String getValue() {
-return value;
+return value != null ? value : decompress(this.compressedValue);
+}
+
+private String decompress(byte[] value) {
+try {
+return new String(compression.getInputStream(new 
ByteArrayInputStream(value)).readAllBytes(), StandardCharsets.UTF_8);
+} catch (IOException e) {
+LOG.error("Failed to decompress property {} value: ", getName(), 
e);
+return "\"{}\"";
+}
+}
+
+private byte[] getCompressedValue() {

Review Comment:
   It actually was my proposal not to expose it. Maybe we can just *copy* or 
*move* it to the test class?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-18 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1644291222


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
SystemPropertySupplier
+.create("oak.documentMK.stringCompressionThreshold ", 
-1).loggingTo(LOG).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (compression != null && DEFAULT_COMPRESSION_THRESHOLD == -1) {

Review Comment:
   if `compression` is null, then it would go into `else` and run into a NPE



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642878592


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642813805


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642766461


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642746647


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642740716


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.compression = compression;
+int size = value.length();
+String localValue = value;
+byte[] localCompressedValue = null;
+if (compression != null && size > DEFAULT_COMPRESSION_THRESHOLD) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642740716


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.compression = compression;
+int size = value.length();
+String localValue = value;
+byte[] localCompressedValue = null;
+if (compression != null && size > DEFAULT_COMPRESSION_THRESHOLD) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642736298


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642735284


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.compression = compression;
+int size = value.length();
+String localValue = value;
+byte[] localCompressedValue = null;
+if (compression != null && size > DEFAULT_COMPRESSION_THRESHOLD) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642734832


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642734255


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -128,7 +188,7 @@ public boolean equals(Object object) {
 } else if (object instanceof DocumentPropertyState) {
 DocumentPropertyState other = (DocumentPropertyState) object;
 return this.name.equals(other.name)
-&& this.value.equals(other.value);
+&& this.getValue().equals(other.getValue());

Review Comment:
   Yes.



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642721700


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenReturn(mockOutputStream);
+
when(mockCompression.getInputStream(any(InputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", STRING_HUGEVALUE, 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), "{}");
+
+verify(mockCompression, 
times(1)).getInputStream(any(InputStream.class));
+
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642713859


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
 this.store = store;
 this.name = name;
-this.value = value;
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.compression = compression;
+int size = value.length();
+String localValue = value;
+byte[] localCompressedValue = null;
+if (compression != null && size > DEFAULT_COMPRESSION_THRESHOLD) {

Review Comment:
   I think the `compression != null` part could be move to the `if` above?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


stefan-egli commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642708995


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =
+SystemPropertySupplier.create("oak.mongo.compressionThreshold", 
-1).get();

Review Comment:
   the property isn't mongo specific. this isn't handled entirely consistent in 
the existing code base neither - perhaps it should be `oak.documentMK.` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-17 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1642564726


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +45,64 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD =

Review Comment:
   I'd set the logger here as well ("logTo()")



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -81,4 +100,179 @@ public void multiValuedBinarySize() throws Exception {
 assertEquals(0, reads.size());
 }
 
-}
+@Test
+public void multiValuedAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+List blobs = newArrayList();
+for (int i = 0; i < 13; i++) {
+blobs.add(builder.createBlob(new RandomStream(BLOB_SIZE, i)));
+}
+builder.child(TEST_NODE).setProperty("p", blobs, Type.BINARIES);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.BINARIES, Objects.requireNonNull(p).getType());
+assertEquals(13, p.count());
+
+reads.clear();
+assertEquals(BLOB_SIZE, p.size(0));
+// must not read the blob via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringBelowThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", "dummy", Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(5, p.size(0));
+// must not read the string via stream
+assertEquals(0, reads.size());
+}
+
+@Test
+public void stringAboveThresholdSize() throws Exception {
+NodeBuilder builder = ns.getRoot().builder();
+builder.child(TEST_NODE).setProperty("p", STRING_HUGEVALUE, 
Type.STRING);
+TestUtils.merge(ns, builder);
+
+PropertyState p = 
ns.getRoot().getChildNode(TEST_NODE).getProperty("p");
+assertEquals(Type.STRING, Objects.requireNonNull(p).getType());
+assertEquals(1, p.count());
+
+reads.clear();
+assertEquals(10050, p.size(0));
+// must not read the string via streams
+assertEquals(0, reads.size());
+}
+
+@Test
+public void compressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+
when(mockCompression.getOutputStream(any(OutputStream.class))).thenThrow(new 
IOException("Compression failed"));
+
+Field compressionThreshold = 
DocumentPropertyState.class.getDeclaredField("DEFAULT_COMPRESSION_THRESHOLD");
+compressionThreshold.setAccessible(true);
+Field modifiersField = Field.class.getDeclaredField("modifiers");
+modifiersField.setAccessible(true);
+modifiersField.setInt(compressionThreshold, 
compressionThreshold.getModifiers() & ~Modifier.FINAL);
+
+compressionThreshold.set(null, DEFAULT_COMPRESSION_THRESHOLD);
+
+DocumentPropertyState documentPropertyState = new 
DocumentPropertyState(mockDocumentStore, "p", "\"" + STRING_HUGEVALUE + "\"", 
mockCompression);
+
+assertEquals(documentPropertyState.getValue(Type.STRING), 
STRING_HUGEVALUE);
+
+verify(mockCompression, 
times(1)).getOutputStream(any(OutputStream.class));
+
+compressionThreshold.set(null, -1);
+
+}
+
+@Test
+public void uncompressValueThrowsException() throws IOException, 
NoSuchFieldException, IllegalAccessException {
+
+DocumentNodeStore mockDocumentStore = mock(DocumentNodeStore.class);
+Compression mockCompression = mock(Compression.class);
+OutputStream mockOutputStream= mock(OutputStream.class);
+

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1639713253


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
-this.store = store;
-this.name = name;
-this.value = value;
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.store = store;

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1639707803


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -18,26 +18,43 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.OutputStream;
+import java.lang.reflect.Field;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.lang.reflect.Modifier;
 import java.util.List;
+import java.util.Objects;
 import java.util.Set;
 
+import org.apache.commons.lang3.RandomStringUtils;
 import org.apache.jackrabbit.oak.api.Blob;
 import org.apache.jackrabbit.oak.api.PropertyState;
 import org.apache.jackrabbit.oak.api.Type;
+import org.apache.jackrabbit.oak.commons.Compression;
 import org.apache.jackrabbit.oak.spi.blob.BlobStore;
 import org.apache.jackrabbit.oak.spi.blob.MemoryBlobStore;
 import org.apache.jackrabbit.oak.spi.state.NodeBuilder;
-import org.junit.Before;
-import org.junit.Rule;
-import org.junit.Test;
+import org.junit.*;
+import org.junit.runners.MethodSorters;
 
 import static org.apache.jackrabbit.guava.common.collect.Lists.newArrayList;
 import static org.apache.jackrabbit.guava.common.collect.Sets.newHashSet;
-import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.*;

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


ionutzpi commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1639707557


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);

Review Comment:
   Done
   



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
-this.store = store;
-this.name = name;
-this.value = value;
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.store = store;
+this.name = name;
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.store = store;
+this.name = name;
+this.compression = compression;
+
+int size = value.getBytes().length;

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1639705992


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
-this.store = store;
-this.name = name;
-this.value = value;
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.store = store;

Review Comment:
   things that are the same should be moved out of the condition



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


reschke commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2167814526

   I believe the main concern is that we don't know yet exactly why we are 
doing this. IMHO it would be better to first collect data on the size of the 
properties.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-06-14 Thread via GitHub


reschke commented on code in PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#discussion_r1639674435


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
-this.store = store;
-this.name = name;
-this.value = value;
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.store = store;
+this.name = name;
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.store = store;
+this.name = name;
+this.compression = compression;
+
+int size = value.getBytes().length;

Review Comment:
   That serializes the String into a byte array, obtains the lenght, and then 
throws the byte array away.



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);

Review Comment:
   please use SystemPropertySupplier



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyState.java:
##
@@ -38,24 +44,66 @@
 import org.apache.jackrabbit.oak.plugins.memory.StringPropertyState;
 import org.apache.jackrabbit.oak.plugins.value.Conversions;
 import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * PropertyState implementation with lazy parsing of the JSOP encoded value.
  */
 final class DocumentPropertyState implements PropertyState {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(DocumentPropertyState.class);
+
 private final DocumentNodeStore store;
 
 private final String name;
 
 private final String value;
 
 private PropertyState parsed;
+private final byte[] compressedValue;
+private final Compression compression;
+
+private static final int DEFAULT_COMPRESSION_THRESHOLD = 
Integer.getInteger("oak.mongo.compressionThreshold", -1);
 
 DocumentPropertyState(DocumentNodeStore store, String name, String value) {
-this.store = store;
-this.name = name;
-this.value = value;
+this(store, name, value, Compression.GZIP);
+}
+
+DocumentPropertyState(DocumentNodeStore store, String name, String value, 
Compression compression) {
+if (DEFAULT_COMPRESSION_THRESHOLD == -1) {
+this.store = store;
+this.name = name;
+this.value = value;
+this.compression = null;
+this.compressedValue = null;
+} else {
+this.store = store;
+this.name = name;
+this.compression = compression;
+
+int size = value.getBytes().length;

Review Comment:
   (also it needs to specify the charset)



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentPropertyStateTest.java:
##
@@ -18,26 +18,43 @@
 
 import