[ https://issues.apache.org/jira/browse/BEAM-10265?focusedWorklogId=763857&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763857 ]
ASF GitHub Bot logged work on BEAM-10265: ----------------------------------------- Author: ASF GitHub Bot Created on: 28/Apr/22 20:04 Start Date: 28/Apr/22 20:04 Worklog Time Spent: 10m Work Description: zhoufek commented on code in PR #17477: URL: https://github.com/apache/beam/pull/17477#discussion_r861244041 ########## sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java: ########## @@ -79,30 +80,59 @@ enum MethodType { .put(BigDecimal.class, FieldType.DECIMAL) .build(); + /** Helper method that instantiates HashMap to verify that schemas don't reference themselves. */ Review Comment: The comment on the public method should remain the same. Users want to know what this offers them, not how it is implemented, which is what this comment is about. Even if most of the implementation is in the private method, it is the helper to this one. ########## sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java: ########## @@ -79,30 +80,59 @@ enum MethodType { .put(BigDecimal.class, FieldType.DECIMAL) .build(); + /** Helper method that instantiates HashMap to verify that schemas don't reference themselves. */ + public static Schema schemaFromClass( + Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) { + return schemaFromClass(clazz, fieldValueTypeSupplier, new HashMap<Class, Schema>()); + } + /** * Infer a schema from a Java class. * * <p>Takes in a function to extract a list of field types from a class. Different callers may * have different strategies for extracting this list: e.g. introspecting public member variables, * public getter methods, or special annotations on the class. */ - public static Schema schemaFromClass( - Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) { + private static Schema schemaFromClass( + Class<?> clazz, + FieldValueTypeSupplier fieldValueTypeSupplier, + HashMap<Class, Schema> alreadyVisitedSchemas) { + if (alreadyVisitedSchemas.containsKey(clazz)) { + Schema existingSchema = alreadyVisitedSchemas.get(clazz); + if (existingSchema == null) { + throw new IllegalArgumentException( + "Cannot infer schema with a circular reference. Class: " + clazz.getTypeName()); + } + return existingSchema; + } + alreadyVisitedSchemas.put(clazz, null); Schema.Builder builder = Schema.builder(); for (FieldValueTypeInformation type : fieldValueTypeSupplier.get(clazz)) { - Schema.FieldType fieldType = fieldFromType(type.getType(), fieldValueTypeSupplier); + Schema.FieldType fieldType = + fieldFromType(type.getType(), fieldValueTypeSupplier, alreadyVisitedSchemas); if (type.isNullable()) { builder.addNullableField(type.getName(), fieldType); } else { builder.addField(type.getName(), fieldType); } } - return builder.build(); + alreadyVisitedSchemas.remove(clazz); + Schema generatedSchema = builder.build(); + alreadyVisitedSchemas.put(clazz, generatedSchema); + return generatedSchema; } - /** Map a Java field type to a Beam Schema FieldType. */ + /** Helper method that instantiates HashMap to verify that schemas don't reference themselves. */ Review Comment: Similar to the above. It's the private method that helps this one. This should keep the same doc comment as before. ########## sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java: ########## @@ -156,6 +165,16 @@ static Schema getSchema(Class<? extends Message> clazz) { } static Schema getSchema(Descriptors.Descriptor descriptor) { Review Comment: Sorry, missed this: The method is now no longer thread-safe due to the use of a static hash map. I think that's acceptable for a package-private method, but this should be added to the Javadoc in case anyone in the future tries to use this in a multithreaded context. Issue Time Tracking ------------------- Worklog Id: (was: 763857) Time Spent: 2h 10m (was: 2h) > GetterBasedSchemaProvider#schemaFor stack overflows when given a recursive > schema > --------------------------------------------------------------------------------- > > Key: BEAM-10265 > URL: https://issues.apache.org/jira/browse/BEAM-10265 > Project: Beam > Issue Type: Bug > Components: sdk-java-core > Affects Versions: 2.21.0, 2.22.0 > Reporter: Reza ardeshir rokni > Assignee: Andrei Gurau > Priority: P3 > Labels: Clarified, starter > Time Spent: 2h 10m > Remaining Estimate: 0h > > Proto: > message TSFoo { > string a = 1; > string b = 2; > TSFoo theOlderMe = 3; > } > new ProtoMessageSchema().schemaFor(TypeDescriptor.of(Foo.TSFoo.class)); > Causes a stackoverflow. > > -- This message was sent by Atlassian Jira (v8.20.7#820007)