[ 
https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614569#comment-17614569
 ] 

ASF GitHub Bot commented on PARQUET-1711:
-----------------------------------------

emkornfield commented on code in PR #995:
URL: https://github.com/apache/parquet-mr/pull/995#discussion_r990725805


##########
parquet-protobuf/src/test/java/org/apache/parquet/proto/ProtoSchemaConverterTest.java:
##########
@@ -82,264 +93,447 @@ public void testConvertAllDatatypes() throws Exception {
    * Tests that all protocol buffer datatypes are converted to correct parquet datatypes.
    */
   @Test
-  public void testProto3ConvertAllDatatypes() throws Exception {
-    String expectedSchema =
-      "message TestProto3.SchemaConverterAllDatatypes {\n" +
-        "  optional double optionalDouble = 1;\n" +
-        "  optional float optionalFloat = 2;\n" +
-        "  optional int32 optionalInt32 = 3;\n" +
-        "  optional int64 optionalInt64 = 4;\n" +
-        "  optional int32 optionalUInt32 = 5;\n" +
-        "  optional int64 optionalUInt64 = 6;\n" +
-        "  optional int32 optionalSInt32 = 7;\n" +
-        "  optional int64 optionalSInt64 = 8;\n" +
-        "  optional int32 optionalFixed32 = 9;\n" +
-        "  optional int64 optionalFixed64 = 10;\n" +
-        "  optional int32 optionalSFixed32 = 11;\n" +
-        "  optional int64 optionalSFixed64 = 12;\n" +
-        "  optional boolean optionalBool = 13;\n" +
-        "  optional binary optionalString (UTF8) = 14;\n" +
-        "  optional binary optionalBytes = 15;\n" +
-        "  optional group optionalMessage = 16 {\n" +
-        "    optional int32 someId = 3;\n" +
-        "  }\n" +
-        "  optional binary optionalEnum (ENUM) = 18;" +
-        "  optional int32 someInt32 = 19;" +
-        "  optional binary someString (UTF8) = 20;" +
-        "  optional group optionalMap (MAP) = 21 {\n" +
-        "    repeated group key_value {\n" +
-        "      required int64 key;\n" +
-        "      optional group value {\n" +
-        "        optional int32 someId = 3;\n" +
-        "      }\n" +
-        "    }\n" +
-        "  }\n" +
-        "}";
+  public void testProto3ConvertAllDatatypes() {
+    String expectedSchema = JOINER.join(

Review Comment:
   Is it possible to separate this type of code style cleanup from functional 
changes?





> [parquet-protobuf] stack overflow when work with well known json type
> ---------------------------------------------------------------------
>
>                 Key: PARQUET-1711
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1711
>             Project: Parquet
>          Issue Type: Bug
>    Affects Versions: 1.10.1
>            Reporter: Lawrence He
>            Priority: Major
>
> Writing the following protobuf message as a Parquet file is not possible:
> {code:java}
> syntax = "proto3";
> import "google/protobuf/struct.proto";
> package test;
> option java_outer_classname = "CustomMessage";
> message TestMessage {
>     map<string, google.protobuf.ListValue> data = 1;
> }
> {code}
> Protobuf introduced "well-known JSON types" such as 
> [ListValue|https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#listvalue]
> to work around JSON schema conversion.
> However, writing the message above traps the Parquet writer in infinite 
> recursion because of the "general type" support in protobuf. The current 
> implementation keeps following the 6 possible value types defined in protobuf 
> (null, bool, number, string, struct, list) and recurses without end when it 
> reaches "struct".
> {code:java}
> java.lang.StackOverflowError
>     at java.base/java.util.Arrays$ArrayItr.<init>(Arrays.java:4418)
>     at java.base/java.util.Arrays$ArrayList.iterator(Arrays.java:4410)
>     at java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1044)
>     at java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1043)
>     at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:64)
>     at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>     at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>     at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>     at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>     at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>     at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>     at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>     at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>     at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>  {code}
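
The alternating convertFields/addField frames in the trace above are what a mutually recursive type graph (Struct -> Value -> Struct / ListValue) produces when a schema converter descends into every message-typed field. The sketch below is a minimal illustration of one way to stop that descent; it is not necessarily what PR #995 does. It does not use the protobuf or parquet-mr APIs: `MessageType`, `Field`, `structLikeTypes` and `convert` are hypothetical stand-ins, and emitting an opaque `binary` field at the cut point is an assumption made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these types are part of protobuf or parquet-mr.
class RecursionGuardSketch {

  /** Stand-in for a protobuf message descriptor. */
  static final class MessageType {
    final String name;
    final List<Field> fields = new ArrayList<>();

    MessageType(String name) {
      this.name = name;
    }

    MessageType add(String fieldName, MessageType type) {
      fields.add(new Field(fieldName, type));
      return this;
    }
  }

  /** A field is a primitive when type is null, otherwise a nested message. */
  static final class Field {
    final String name;
    final MessageType type;

    Field(String name, MessageType type) {
      this.name = name;
      this.type = type;
    }
  }

  /** Builds a simplified copy of the Struct / Value / ListValue cycle. */
  public static MessageType structLikeTypes() {
    MessageType struct = new MessageType("Struct");
    MessageType value = new MessageType("Value");
    MessageType list = new MessageType("ListValue");
    struct.add("fields", value);
    value.add("null_value", null)
        .add("number_value", null)
        .add("string_value", null)
        .add("bool_value", null)
        .add("struct_value", struct)
        .add("list_value", list);
    list.add("values", value);
    return list;
  }

  /** Converts a type to a schema-like string, cutting cycles. */
  public static String convert(MessageType root) {
    StringBuilder out = new StringBuilder();
    convertFields(root, new ArrayDeque<>(), out, 1);
    return out.toString();
  }

  private static void convertFields(
      MessageType type, Deque<String> path, StringBuilder out, int depth) {
    path.push(type.name); // types currently being converted on this branch
    StringBuilder pad = new StringBuilder();
    for (int i = 0; i < depth; i++) {
      pad.append("  ");
    }
    for (Field f : type.fields) {
      if (f.type == null) {
        out.append(pad).append("optional primitive ").append(f.name).append(";\n");
      } else if (path.contains(f.type.name)) {
        // The type is already open higher up this branch, so descending
        // again would never end. Assumption: emit an opaque binary instead.
        out.append(pad).append("optional binary ").append(f.name).append(";\n");
      } else {
        out.append(pad).append("optional group ").append(f.name).append(" {\n");
        convertFields(f.type, path, out, depth + 1);
        out.append(pad).append("}\n");
      }
    }
    path.pop();
  }

  public static void main(String[] args) {
    System.out.print(convert(structLikeTypes()));
  }
}
```

Tracking only the types on the current path, rather than every type seen so far, lets the same message type appear in sibling fields while still terminating on true cycles.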



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
