justaparth commented on code in PR #41498: URL: https://github.com/apache/spark/pull/41498#discussion_r1223202896
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(
           updater.setLong(ordinal, micros + TimeUnit.NANOSECONDS.toMicros(nanoSeconds))

       case (MESSAGE, StringType)
-        if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
+          if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
         (updater, ordinal, value) =>
           // Convert 'Any' protobuf message to JSON string.
           val jsonStr = jsonPrinter.print(value.asInstanceOf[DynamicMessage])
           updater.set(ordinal, UTF8String.fromString(jsonStr))

+      // Handle well known wrapper types. We unpack the value field instead of keeping

Review Comment:
   One more point that may not have been clear: these wrapper types for primitives are not inherently useful as "messages". They exist to distinguish an unset field from a field explicitly set to 0 in proto3, which was not possible before `optional` was supported. As an example, given

   ```
   syntax = "proto3";
   message A {
     int64 val = 1;
   }
   ```

   the values

   ```
   A()
   A(0)
   ```

   are not distinguishable from one another, because default values are [not serialized](https://protobuf.dev/programming-guides/field_presence/#presence-in-tag-value-stream-wire-format-serialization) to the wire for primitives in proto3. But if you had

   ```
   syntax = "proto3";
   message A {
     google.protobuf.Int64Value val = 1;
   }
   ```

   then

   ```
   A()
   A(Int64Value(0))
   ```

   become distinguishable from one another. Basically, nesting a primitive inside a message is not useful for its own sake; the purpose is simply to disambiguate the case above (and you're right that `optional` in newer versions of proto3 basically solves this problem).
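To make the wire-level difference concrete, here is a rough, dependency-free sketch that hand-encodes the two variants of `A` described in the comment. It does not use the protobuf library; the helper names (`encode_varint`, `encode_tag`, `encode_plain`, `encode_wrapped`) are hypothetical and exist only for this illustration, and `None` is used as a stand-in for "the caller never set the field". It assumes `Int64Value` is defined as a message with a single `int64 value = 1;` field.

```python
from typing import Optional


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """Encode a field tag: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_plain(val: Optional[int]) -> bytes:
    """Sketch of `message A { int64 val = 1; }` with proto3 implicit presence.

    Both "unset" (None here) and 0 hit the default-value path and emit nothing.
    """
    if val is None or val == 0:
        return b""
    # int64 uses wire type 0 (varint); negatives are sent as 64-bit two's complement.
    return encode_tag(1, 0) + encode_varint(val & 0xFFFFFFFFFFFFFFFF)


def encode_wrapped(val: Optional[int]) -> bytes:
    """Sketch of `message A { google.protobuf.Int64Value val = 1; }`.

    Message fields track presence, so a set-but-empty wrapper is still written.
    """
    if val is None:
        return b""
    inner = encode_plain(val)  # Int64Value body: `int64 value = 1;`
    # Wire type 2 is length-delimited: tag, payload length, payload.
    return encode_tag(1, 2) + encode_varint(len(inner)) + inner


if __name__ == "__main__":
    print("plain   A()  :", encode_plain(None).hex())
    print("plain   A(0) :", encode_plain(0).hex())
    print("wrapped A()  :", encode_wrapped(None).hex())
    print("wrapped A(Int64Value(0)):", encode_wrapped(0).hex())
```

In this sketch the plain `A()` and `A(0)` both serialize to zero bytes, while the wrapped `A(Int64Value(0))` serializes to a tag plus a zero length (`0a00`), which is what lets a reader tell it apart from an absent field.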