[ https://issues.apache.org/jira/browse/SPARK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jacklzg updated SPARK-33598:
----------------------------
    Description: 
If the target Java data class has a circular reference, Spark fails fast when creating the Dataset or running Encoders.

For example, a protobuf class has a reference to Descriptor, so there is no way to build a Dataset from the protobuf class.

From this line

{color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color}

it throws immediately:

{quote}Exception in thread "main" java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class com.google.protobuf.Descriptors$Descriptor
{quote}

Can we add a parameter, for example,

{code:java}
Encoders.bean(Class<T> clas, List<Fields> fieldsToIgnore);{code}

or

{code:java}
Encoders.bean(Class<T> clas, boolean skipCircularRefField);{code}

which then, instead of throwing an [exception|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556], skips the field:

{code:scala}
if (seenTypeSet.contains(t)) {
  if (skipCircularRefField)
    println("field skipped") // just skip this field
  else throw new UnsupportedOperationException(
    s"cannot have circular references in class, but got the circular reference of class $t")
}
{code}

  was:
If the target Java data class has a circular reference, Spark fails fast when creating the Dataset or running Encoders.

For example, a protobuf class has a reference to Descriptor, so there is no way to build a Dataset from the protobuf class.

From this line

{color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color}

it throws immediately:

{quote}Exception in thread "main" java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class com.google.protobuf.Descriptors$Descriptor
{quote}

Can we add a parameter, for example,

{code:java}
Encoders.bean(Class<T> clas, List<Fields> fieldsToIgnore);{code}

or

{code:java}
Encoders.bean(Class<T> clas, boolean skipCircularRefField);{code}

which then, instead of throwing an [exception|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556], skips the field:

{code:scala}
if (seenTypeSet.contains(t)) {
  if (skipCircularRefField)
    println("field skipped") // just skip this field
  else throw new UnsupportedOperationException(
    s"cannot have circular references in class, but got the circular reference of class $t")
}
{code}


> Support Java Class with circular references
> -------------------------------------------
>
>                 Key: SPARK-33598
>                 URL: https://issues.apache.org/jira/browse/SPARK-33598
>             Project: Spark
>          Issue Type: Improvement
>          Components: Java API
>    Affects Versions: 2.4.7
>            Reporter: jacklzg
>            Priority: Minor
>
> If the target Java data class has a circular reference, Spark fails fast
> when creating the Dataset or running Encoders.
>
> For example, a protobuf class has a reference to Descriptor, so there is
> no way to build a Dataset from the protobuf class.
>
> From this line
>
> {color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color}
>
> it throws immediately:
>
> {quote}Exception in thread "main" java.lang.UnsupportedOperationException:
> Cannot have circular references in bean class, but got the circular reference
> of class class com.google.protobuf.Descriptors$Descriptor
> {quote}
>
> Can we add a parameter, for example,
>
> {code:java}
> Encoders.bean(Class<T> clas, List<Fields> fieldsToIgnore);{code}
>
> or
>
> {code:java}
> Encoders.bean(Class<T> clas, boolean skipCircularRefField);{code}
>
> which then, instead of throwing an
> [exception|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556],
> skips the field:
>
> {code:scala}
> if (seenTypeSet.contains(t)) {
>   if (skipCircularRefField)
>     println("field skipped") // just skip this field
>   else throw new UnsupportedOperationException(
>     s"cannot have circular references in class, but got the circular reference of class $t")
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
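
To make the requested behaviour concrete, here is a minimal, self-contained sketch of a bean-property walk that either throws on a circular reference or skips the offending field. It does not use Spark and is not Spark's implementation: the class `CircularRefSketch`, the bean `Node`, and the method `fieldPaths` are hypothetical names chosen for illustration, and only the flag name `skipCircularRefField` comes from the proposal above. The walk relies on the standard `java.beans.Introspector`.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CircularRefSketch {

    // Hypothetical bean with a self-reference, standing in for a class
    // such as com.google.protobuf.Descriptors.Descriptor.
    public static class Node {
        private String name;
        private Node parent;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Node getParent() { return parent; }
        public void setParent(Node parent) { this.parent = parent; }
    }

    // Returns the dotted paths of all leaf properties reachable from beanClass.
    // With skipCircularRefField = false, a circular reference throws, as today.
    // With skipCircularRefField = true, the circular field is left out.
    public static List<String> fieldPaths(Class<?> beanClass, boolean skipCircularRefField) {
        List<String> out = new ArrayList<>();
        walk(beanClass, "", new HashSet<>(), skipCircularRefField, out);
        return out;
    }

    private static void walk(Class<?> type, String prefix, Set<Class<?>> seen,
                             boolean skipCircularRefField, List<String> out) {
        BeanInfo info;
        try {
            // Object.class as stop class excludes the getClass() property.
            info = Introspector.getBeanInfo(type, Object.class);
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        // Types on the current path only, so siblings of the same type are allowed.
        Set<Class<?>> seenOnPath = new HashSet<>(seen);
        seenOnPath.add(type);

        for (PropertyDescriptor p : info.getPropertyDescriptors()) {
            Class<?> propType = p.getPropertyType();
            if (p.getReadMethod() == null || propType == null) {
                continue; // not a readable plain property
            }
            String path = prefix + p.getName();
            if (isLeaf(propType)) {
                out.add(path);
            } else if (seenOnPath.contains(propType)) {
                if (!skipCircularRefField) {
                    throw new UnsupportedOperationException(
                        "Cannot have circular references in bean class, "
                        + "but got the circular reference of class " + propType);
                }
                // Flag set: just skip this field.
            } else {
                walk(propType, path + ".", seenOnPath, skipCircularRefField, out);
            }
        }
    }

    // Simplifying assumption: only primitives, boxed values and String are leaves.
    private static boolean isLeaf(Class<?> t) {
        return t.isPrimitive() || t == String.class || t == Boolean.class
            || t == Character.class || Number.class.isAssignableFrom(t);
    }
}
```

With the flag set, `CircularRefSketch.fieldPaths(CircularRefSketch.Node.class, true)` yields only the `name` property and silently drops `parent`; with the flag unset, the same call throws `UnsupportedOperationException`. Silent skipping changes the resulting schema, which is one argument for the explicit `fieldsToIgnore` variant of the proposal instead.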