[ 
https://issues.apache.org/jira/browse/SPARK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jacklzg updated SPARK-33598:
----------------------------
    Description: 
If the target Java data class has a circular reference, Spark fails fast 
when creating the Dataset or running Encoders.

 

For example, a protobuf class holds a reference to Descriptor, so there 
is no way to build a Dataset from the protobuf class.

This line

{color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color}

throws immediately:

 
{quote}Exception in thread "main" java.lang.UnsupportedOperationException: 
Cannot have circular references in bean class, but got the circular reference 
of class class com.google.protobuf.Descriptors$Descriptor
{quote}
 

Can we add a parameter, for example:

 
{code:java}
Encoders.bean(Class<T> clas, List<Fields> fieldsToIgnore);{code}

or

 
{code:java}
Encoders.bean(Class<T> clas, boolean skipCircularRefField);{code}
 

so that, instead of throwing an 
[exception|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556],
Spark skips the field.

 
{code:java}
if (seenTypeSet.contains(t)) {
  if (skipCircularRefField) {
    // Proposed behaviour: skip this field instead of failing
    println("field skipped")
  } else {
    throw new UnsupportedOperationException(
      s"cannot have circular references in class, but got the circular reference of class $t")
  }
}
{code}
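
To make the proposed behaviour concrete, here is a minimal standalone sketch in plain Java. It is not Spark code: the CircularRefSketch class, the Node class and the fieldPaths helper are hypothetical names made up for illustration, and it walks declared fields by reflection rather than bean properties. It only shows how a seen-type set can either fail on a cycle or skip the offending field when skipCircularRefField is set.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CircularRefSketch {

    // Hypothetical self-referencing class, standing in for a protobuf message.
    static class Node {
        int id;
        String label;
        Node parent; // circular reference back to Node
    }

    // Collects the dotted field paths reachable from root.
    static List<String> fieldPaths(Class<?> root, boolean skipCircularRefField) {
        List<String> out = new ArrayList<>();
        walk(root, "", new HashSet<>(), skipCircularRefField, out);
        return out;
    }

    private static void walk(Class<?> type, String prefix, Set<Class<?>> seenTypeSet,
                             boolean skipCircularRefField, List<String> out) {
        // Copy the set so only types on the current path count as "seen".
        Set<Class<?>> seen = new HashSet<>(seenTypeSet);
        seen.add(type);
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            Class<?> t = f.getType();
            String path = prefix + f.getName();
            if (isLeaf(t)) {
                out.add(path);
            } else if (seen.contains(t)) {
                if (skipCircularRefField) {
                    continue; // just skip this field
                }
                throw new UnsupportedOperationException(
                    "cannot have circular references in class, "
                        + "but got the circular reference of class " + t.getName());
            } else {
                walk(t, path + ".", seen, skipCircularRefField, out);
            }
        }
    }

    // Types treated as plain values that are never descended into.
    private static boolean isLeaf(Class<?> t) {
        return t.isPrimitive() || t == String.class || t == Boolean.class
            || t == Character.class || Number.class.isAssignableFrom(t);
    }

    public static void main(String[] args) {
        // With the flag set, the self-referencing field is left out.
        System.out.println(fieldPaths(Node.class, true));
        try {
            fieldPaths(Node.class, false);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the flag set, the walk keeps the plain fields and drops parent; without it, the walk fails the same way the current check does. A fieldsToIgnore variant could presumably be handled at the same point by testing the field name instead of the seen-type set.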
 

 

  was:
If the target Java data class has a circular reference, Spark will fail fast 
from creating the Dataset or running Encoders.

 

For example, with protobuf class, there is a reference with Descriptor, there 
is no way to build a dataset from the protobuf class.

From this line

{color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color}

 

It will throw out immediately

 
{quote}Exception in thread "main" java.lang.UnsupportedOperationException: 
Cannot have circular references in bean class, but got the circular reference 
of class class com.google.protobuf.Descriptors$Descriptor
{quote}
 

Can we add  a parameter, for example, 

 
{code:java}
Encoders.bean(Class<T> clas, List<Fields> fieldsToIgnore);{code}

or

 
{code:java}
Encoders.bean(Class<T> clas, boolean skipCircularRefField);{code}
 

 which subsequently, instead of throwing an [[exception||#L556] 
[https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556]
 []|#L556], it instead skip the field.

 
{code:java}
if (seenTypeSet.contains(t)) {
if(skipCircularRefField)
  println("field skipped") //just skip this field
else throw new UnsupportedOperationException( s"cannot have circular references 
in class, but got the circular reference of class $t")
}
{code}
 

 


> Support Java Class with circular references
> -------------------------------------------
>
>                 Key: SPARK-33598
>                 URL: https://issues.apache.org/jira/browse/SPARK-33598
>             Project: Spark
>          Issue Type: Improvement
>          Components: Java API
>    Affects Versions: 2.4.7
>            Reporter: jacklzg
>            Priority: Minor


