[ 
https://issues.apache.org/jira/browse/OAK-12114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18061647#comment-18061647
 ] 

Rishabh Daim edited comment on OAK-12114 at 2/27/26 3:21 PM:
-------------------------------------------------------------

Hi,

I tested this locally; below is the response from Claude:

Now the root cause is clear. RandomStringUtils.secure().next(1024 * 1024) 
generates random Unicode characters, which are multi-byte in UTF-8/BSON:

  {quote}The problem:
  - next() picks from the full Unicode range — chars can be 1–3 bytes in UTF-8
  - 1M chars × ~2.3 bytes avg = ~2.3MB per property in BSON
  - 16 properties × 2.3MB = ~37MB total (matches the error: 37651771 bytes)
  - At 37MB, the MongoDB driver rejects it before sending — different error 
message than before

  Old ('0' chars): 1MB per property (ASCII = 1 byte) → exactly 16MB → rejected 
by MongoDB server with "Resulting document after update is larger than 16777216"

  New (random Unicode): ~2.3MB per property → ~37MB → rejected by MongoDB 
driver with "Payload document size is larger than maximum of 16793600"

  Fix: Use nextAlphanumeric() instead of next(); alphanumeric chars are all 
ASCII (1 byte in UTF-8).{quote}

So the test was essentially correct, since it used a 16MB payload. With the 
proposed change the encoded payload grows to about 37MB, which is why a 
different exception is thrown.
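
To illustrate the gap between character count and encoded size, here is a 
minimal sketch that uses only the JDK. It is not the actual test code: the 
class and method names are hypothetical, and randomAlphanumeric() is a 
hand-rolled stand-in for nextAlphanumeric(), assumed to behave similarly for 
this purpose. It measures UTF-8 length, the encoding BSON uses for strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class PayloadSizeSketch {

    static final int ONE_MB = 1024 * 1024;

    // Size of the string once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // A string consisting of 'chars' copies of the same character.
    static String filled(char c, int chars) {
        char[] buf = new char[chars];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    // Stand-in for nextAlphanumeric(): random ASCII letters and digits,
    // so every char encodes to exactly one byte.
    static String randomAlphanumeric(int chars, Random rnd) {
        String alphabet =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        StringBuilder sb = new StringBuilder(chars);
        for (int i = 0; i < chars; i++) {
            sb.append(alphabet.charAt(rnd.nextInt(alphabet.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Old test content: ASCII '0', one byte per char.
        System.out.println("zeros:        " + utf8Length(filled('0', ONE_MB)));
        // Random but ASCII-only: still one byte per char.
        System.out.println("alphanumeric: "
            + utf8Length(randomAlphanumeric(ONE_MB, new Random())));
        // A non-ASCII char (U+20AC, the euro sign) takes three bytes in UTF-8.
        System.out.println("euro signs:   " + utf8Length(filled('\u20AC', ONE_MB)));
    }
}
```

The first two cases stay at 1048576 bytes per property, while the third 
triples to 3145728 bytes. Random content drawn from the full Unicode range 
lands somewhere in between, which is consistent with the ~2.3 bytes per char 
estimated above.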


> MongoDBExceptionTest uses payloads that compress well
> -----------------------------------------------------
>
>                 Key: OAK-12114
>                 URL: https://issues.apache.org/jira/browse/OAK-12114
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: mongomk
>            Reporter: Julian Reschke
>            Priority: Major
>             Fix For: 2.0.0
>
>
> ...containing only zeros, thus defeating the purpose of testing large 
> payloads.
> (actually, not testing it at all)
> With this change:
> {code}
> diff --git a/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDBExceptionTest.java b/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDBExceptionTest.java
> index d6e7cb264e..934af36627 100644
> --- a/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDBExceptionTest.java
> +++ b/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDBExceptionTest.java
> @@ -36,7 +36,6 @@ import org.junit.Test;
>  import org.slf4j.event.Level;
>  import java.util.ArrayList;
> -import java.util.Arrays;
>  import java.util.List;
>  import static java.util.Collections.singletonList;
> @@ -333,9 +332,5 @@ public class MongoDBExceptionTest {
>      // RED ALERT: OAK-12114
>      private String create1MBContent() {
> -        char[] chars = new char[1024 * 1024];
> -        Arrays.fill(chars, '0');
> -        String content = new String(chars);
> -        return content;
> -    }
> +        return  RandomStringUtils.secure().next(1024 * 1024);    }
>  }
> {code}
> we get
> {code}
> [ERROR] Failures:
> [ERROR]   MongoDBExceptionTest.createOrUpdate16MBDoc:156
> Expected: a string containing "Document to upsert is larger than 16777216"
>      but: was "Document size of 37633707 is larger than maximum of 16793600. [/foo]"
> [ERROR]   MongoDBExceptionTest.findAndUpdate16MBDoc:304
> Expected: a string containing "Resulting document after update is larger than 16777216"
>      but: was "Payload document size is larger than maximum of 16793600. [/foo]"
> [ERROR]   MongoDBExceptionTest.multiCreateOrUpdate16MBDoc:263
> Expected: a string containing "Resulting document after update is larger than 16777216"
>      but: was "Payload document size is larger than maximum of 16793600. [/test]"
> [ERROR]   MongoDBExceptionTest.update16MBDoc:231
> Expected: a string containing "Resulting document after update is larger than 16777216"
>      but: was "Payload document size is larger than maximum of 16793600. [/foo]"
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
