gemini-code-assist[bot] commented on code in PR #36073:
URL: https://github.com/apache/beam/pull/36073#discussion_r2337632022
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJsonUtils.java:
##########
@@ -63,14 +67,41 @@ public static void increaseDefaultStreamReadConstraints(int
newLimit) {
}
static {
- increaseDefaultStreamReadConstraints(100 * 1024 * 1024);
+ increaseDefaultStreamReadConstraints(MAX_STRING_LENGTH);
+ }
+
+ /**
+ * Creates a thread-safe JsonFactory with custom stream read constraints.
+ *
+ * <p>This method encapsulates the logic to increase the default
jackson-databind stream read
+ * constraint to 100MB. This functionality was introduced in Jackson 2.15
causing string > 20MB
+ * (5MB in <2.15.0) parsing failure. This has caused regressions in its
dependencies including
+ * Beam. Here we create a streamReadConstraints minimum size limit set to
100MB and exposing the
+ * factory to higher limits. If needed, call this method during pipeline run
time, e.g. in
+ * DoFn.setup. This avoids a data race caused by modifying the global
default settings.
+ */
+ public static JsonFactory createJsonFactory(int sizeLimit) {
+ sizeLimit = Math.max(sizeLimit, MAX_STRING_LENGTH);
+ JsonFactory jsonFactory = new JsonFactory();
+ try {
+ // Check if StreamReadConstraints is available (Jackson 2.15+)
+ Class.forName("com.fasterxml.jackson.core.StreamReadConstraints");
+ com.fasterxml.jackson.core.StreamReadConstraints streamReadConstraints =
+ com.fasterxml.jackson.core.StreamReadConstraints.builder()
+ .maxStringLength(sizeLimit)
+ .build();
+ jsonFactory.setStreamReadConstraints(streamReadConstraints);
+ } catch (ClassNotFoundException e) {
+ // If the class is not found (i.e., Jackson version < 2.15), do nothing.
+ }
+ return jsonFactory;
}
Review Comment:

For performance and robustness, the check for `StreamReadConstraints`
availability should be done only once, and the code that depends on it should
be isolated to prevent `NoClassDefFoundError`.
The current implementation calls `Class.forName` on every invocation of
`createJsonFactory`, which is inefficient. A better approach is to perform this
check once in a static context and cache the result.
Additionally, using a static nested helper class to encapsulate the logic
that uses `StreamReadConstraints` is a robust pattern for handling optional
dependencies. This ensures the helper class is only loaded when the dependency
is confirmed to be present.
I've provided a suggestion that implements these improvements.
```java
private static final boolean STREAM_READ_CONSTRAINTS_AVAILABLE =
streamReadConstraintsAvailable();
private static boolean streamReadConstraintsAvailable() {
try {
Class.forName("com.fasterxml.jackson.core.StreamReadConstraints");
return true;
} catch (ClassNotFoundException e) {
return false;
}
}
private static class StreamReadConstraintsHelper {
static void setStreamReadConstraints(JsonFactory jsonFactory, int
sizeLimit) {
com.fasterxml.jackson.core.StreamReadConstraints streamReadConstraints
=
com.fasterxml.jackson.core.StreamReadConstraints.builder()
.maxStringLength(sizeLimit)
.build();
jsonFactory.setStreamReadConstraints(streamReadConstraints);
}
}
/**
* Creates a thread-safe JsonFactory with custom stream read constraints.
*
* <p>This method encapsulates the logic to increase the default
jackson-databind stream read
* constraint to 100MB. This functionality was introduced in Jackson 2.15
causing string > 20MB
* (5MB in <2.15.0) parsing failure. This has caused regressions in its
dependencies including
* Beam. Here we create a streamReadConstraints minimum size limit set to
100MB and exposing the
* factory to higher limits. If needed, call this method during pipeline
run time, e.g. in
* DoFn.setup. This avoids a data race caused by modifying the global
default settings.
*/
public static JsonFactory createJsonFactory(int sizeLimit) {
sizeLimit = Math.max(sizeLimit, MAX_STRING_LENGTH);
JsonFactory jsonFactory = new JsonFactory();
if (STREAM_READ_CONSTRAINTS_AVAILABLE) {
StreamReadConstraintsHelper.setStreamReadConstraints(jsonFactory,
sizeLimit);
}
return jsonFactory;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]