dossett commented on code in PR #963: URL: https://github.com/apache/parquet-mr/pull/963#discussion_r893846424
########## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java: ########## @@ -115,27 +120,32 @@ public void prepareForWrite(RecordConsumer recordConsumer) { public WriteContext init(Configuration configuration) { // if no protobuf descriptor was given in constructor, load descriptor from configuration (set with setProtobufClass) - if (protoMessage == null) { - Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class); - if (pbClass != null) { - protoMessage = pbClass; - } else { - String msg = "Protocol buffer class not specified."; - String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method."; - throw new BadConfigurationException(msg + hint); + if (descriptor == null) { + if (protoMessage == null) { + Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class); + if (pbClass != null) { + protoMessage = pbClass; + } else { + String msg = "Protocol buffer class or descriptor not specified."; + String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method."; + throw new BadConfigurationException(msg + hint); + } } + descriptor = Protobufs.getMessageDescriptor(protoMessage); + } else { + //Assume no specific Message extending class, so use DynamicMessage + protoMessage = DynamicMessage.class; Review Comment: Should this just be left null? Having it set implies it usable, but it's only used to determine the descriptor and get a protoschemaconverter and then only in cases when we don't have a descriptor. I just worry that setting it could lead to assumptions later on that it's as usable as if it were a generated message and not a dynamic message. ########## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java: ########## @@ -115,27 +120,32 @@ public void prepareForWrite(RecordConsumer recordConsumer) { public WriteContext init(Configuration configuration) { // if no protobuf descriptor was given in constructor, load descriptor from configuration (set with setProtobufClass) - if (protoMessage == null) { - Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class); - if (pbClass != null) { - protoMessage = pbClass; - } else { - String msg = "Protocol buffer class not specified."; - String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method."; - throw new BadConfigurationException(msg + hint); + if (descriptor == null) { + if (protoMessage == null) { + Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class); + if (pbClass != null) { + protoMessage = pbClass; + } else { + String msg = "Protocol buffer class or descriptor not specified."; + String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method."; + throw new BadConfigurationException(msg + hint); + } } + descriptor = Protobufs.getMessageDescriptor(protoMessage); + } else { + //Assume no specific Message extending class, so use DynamicMessage + protoMessage = DynamicMessage.class; Review Comment: Should this just be left null? Having it set implies it usable, but it's only used to determine the descriptor and get a protoschemaconverter and then only in cases when we don't have a descriptor. I just worry that setting it could lead to assumptions later on that it's as usable as if it were a generated message and not a dynamic message. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org