Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20163#discussion_r161362902
  
    --- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
    @@ -144,6 +145,7 @@ object EvaluatePython {
         }
     
         case StringType => (obj: Any) => nullSafeConvert(obj) {
    +      case _: Calendar => null
           case _ => UTF8String.fromString(obj.toString)
    --- End diff --
    
    For the perfectness, I think we should check all the types, 
https://github.com/irmen/Pyrolite,
    
    ```
    PYTHON    ---->     JAVA
    ------              ----
    None                null
    bool                boolean
    int                 int
    long                long or BigInteger  (depending on size)
    string              String
    unicode             String
    complex             net.razorvine.pickle.objects.ComplexNumber
    datetime.date       java.util.Calendar
    datetime.datetime   java.util.Calendar
    datetime.time       net.razorvine.pickle.objects.Time
    datetime.timedelta  net.razorvine.pickle.objects.TimeDelta
    float               double   (float isn't used) 
    array.array         array of appropriate primitive type (char, int, short, 
long, float, double)
    list                java.util.List<Object>
    tuple               Object[]
    set                 java.util.Set
    dict                java.util.Map
    bytes               byte[]
    bytearray           byte[]
    decimal             BigDecimal    
    custom class        Map<String, Object>  (dict with class attributes 
including its name in "__class__")
    Pyro4.core.URI      net.razorvine.pyro.PyroURI
    Pyro4.core.Proxy    net.razorvine.pyro.PyroProxy
    Pyro4.errors.*      net.razorvine.pyro.PyroException
    Pyro4.utils.flame.FlameBuiltin     net.razorvine.pyro.FlameBuiltin 
    Pyro4.utils.flame.FlameModule      net.razorvine.pyro.FlameModule 
    Pyro4.utils.flame.RemoteInteractiveConsole    
net.razorvine.pyro.FlameRemoteConsole 
    ```
    
    and then check if the string conversion looks reasonably consistent by 
`obj.toString`. If not, we add it in the blacklist.
    
    Another possibility is to whitelist `String`, but then I guess this is 
rather a radical change.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to