KalleOlaviNiemitalo commented on code in PR #3333:
URL: https://github.com/apache/avro/pull/3333#discussion_r1986091908
##########
lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java:
##########
@@ -37,43 +37,72 @@ public class SpecificDatumReader<T> extends
GenericDatumReader<T> {
public static final String[] SERIALIZABLE_PACKAGES;
static {
- SERIALIZABLE_PACKAGES =
System.getProperty("org.apache.avro.SERIALIZABLE_PACKAGES",
-
"java.lang,java.math,java.io,java.net,org.apache.avro.reflect").split(",");
+ String defaultPackages =
"java.lang,java.math,java.io,java.net,org.apache.avro.reflect";
+
+ String userDefinedPackages =
System.getProperty("org.apache.avro.SERIALIZABLE_PACKAGES", "");
+
+ // Combine the user-defined packages (if any) with the default packages.
+ String combined = userProp.isEmpty() ? defaultPackages : userProp + "," +
defaultPackages;
+
+ SERIALIZABLE_PACKAGES = Arrays.stream(combined.split(","))
+ .distinct()
+ .toArray(String[]::new);
}
private final List<String> trustedPackages = new ArrayList<>();
public SpecificDatumReader() {
this(null, null, SpecificData.get());
+ initializeTrustedPackages();
}
/** Construct for reading instances of a class. */
public SpecificDatumReader(Class<T> c) {
this(SpecificData.getForClass(c));
setSchema(getSpecificData().getSchema(c));
+ initializeTrustedPackages();
}
/** Construct where the writer's and reader's schemas are the same. */
public SpecificDatumReader(Schema schema) {
this(schema, schema, SpecificData.getForSchema(schema));
+ initializeTrustedPackages();
}
/** Construct given writer's and reader's schema. */
public SpecificDatumReader(Schema writer, Schema reader) {
this(writer, reader, SpecificData.getForSchema(reader));
+ initializeTrustedPackages();
}
/**
* Construct given writer's schema, reader's schema, and a {@link
SpecificData}.
*/
public SpecificDatumReader(Schema writer, Schema reader, SpecificData data) {
super(writer, reader, data);
- trustedPackages.addAll(Arrays.asList(SERIALIZABLE_PACKAGES));
+ initializeTrustedPackages();
}
/** Construct given a {@link SpecificData}. */
public SpecificDatumReader(SpecificData data) {
super(data);
+ initializeTrustedPackages();
+ }
+
+ /**
+ * Initializes the {@code trustedPackages} list with the package names
considered safe for deserialization.
+ *
+ * <p>This method populates the {@code trustedPackages} list using the
static array {@code SERIALIZABLE_PACKAGES},
+ * which is initialized from the system property {@code
org.apache.avro.SERIALIZABLE_PACKAGES} combined with
+ * default trusted packages. By doing so, it ensures that both user-defined
and default packages are included,
+ * and any duplicate entries are avoided.</p>
+ *
+ * <p>Before adding the packages, the list is cleared to prevent duplicate
entries if this method is invoked
+ * multiple times, ensuring that the list remains consistent and up-to-date
across all instances.</p>
Review Comment:
What does "up-to-date across all instances" mean here?
- AFAICS, SERIALIZABLE\_PACKAGES is not modified after it has been
initialised, so there is no question of the trustedPackages list being
"up-to-date" with SERIALIZABLE\_PACKAGES.
- As the method only modified the trustedPackages list in one instance, the
"across all instances" wording seems misleading.
Out of curiosity, were parts of the PR description and the javadoc comment
created using generative AI? The Apache Software Foundation publishes some
guidelines for that at [ASF GENERATIVE TOOLING
GUIDANCE](https://www.apache.org/legal/generative-tooling.html).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]