panbingkun commented on PR #44665:
URL: https://github.com/apache/spark/pull/44665#issuecomment-2010991612

   - Why is the result displayed through `to_csv` inconsistent between Scala 
and Python for this case?
     On the Python side, this case ultimately uses `GenericArrayData`, which 
happens to implement the method `toString`, so `to_csv` displays readable 
text.
   
https://github.com/apache/spark/blob/11247d804cd370aaeb88736a706c587e7f5c83b3/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala#L85
   
   On the Scala side, however, it ultimately uses `UnsafeArrayData`. 
Unfortunately, that class does not implement `toString` and falls back to the 
default `Object.toString`, so `to_csv` ends up displaying an opaque object 
reference (class name and hash code) instead of the array contents.
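
The difference described above can be illustrated with a small, Spark-free sketch. It is only a Python analogue of the JVM behavior: `ReadableArray` and `OpaqueArray` are hypothetical stand-ins for `GenericArrayData` and `UnsafeArrayData`, and `render_field` is an assumed helper that mimics a writer calling the value's string conversion. None of these names exist in Spark.

```python
class ReadableArray:
    """Hypothetical stand-in for a wrapper that overrides its string form."""

    def __init__(self, values):
        self.values = list(values)

    def __str__(self):
        # Custom string form: show the elements themselves.
        return "[" + ", ".join(str(v) for v in self.values) + "]"


class OpaqueArray:
    """Hypothetical stand-in for a wrapper that keeps the default string form."""

    def __init__(self, values):
        self.values = list(values)


def render_field(value):
    # Assumed helper: the writer simply asks the value for its string form.
    return str(value)


readable = render_field(ReadableArray([1, 2, 3]))
opaque = render_field(OpaqueArray([1, 2, 3]))

print(readable)  # element values are visible
print(opaque)    # default form: class name plus an object identifier
```

The same data produces two very different strings purely because one wrapper defines its own string conversion and the other inherits the default one.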
   
   - One commit in this PR makes `to_csv` display non-standard but pretty 
strings, as follows:
     
https://github.com/apache/spark/pull/44665/commits/9695e975f3299556e7c268918ecd51be7a03c157
     <img width="605" alt="image" 
src="https://github.com/apache/spark/assets/15246973/fd07dc0a-4d61-4663-8631-daff518da278">
     The disadvantage is that this output cannot currently be read back 
through `from_csv`.
     If the discussion concludes that this is acceptable, it should be easy 
to bring this feature back.
   
   - Another possible compromise is to add a configuration that is disabled 
by default, so that `to_csv` does not display data of type [Array, Map, 
Struct ...] as non-standard but pretty strings. If the user enables this 
configuration, should we restore the original behavior?
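
The compromise above could look roughly like the following sketch. It is plain Python rather than Spark code, and it rests on assumptions: the flag name `allow_pretty_nested` and the function `to_csv_field` are hypothetical, and raising an error when the flag is off is only one possible reading of "does not support".

```python
def to_csv_field(value, allow_pretty_nested=False):
    """Render one field; nested values are gated behind a flag.

    `allow_pretty_nested` is a hypothetical option name used only for
    illustration. It defaults to off, matching the proposal.
    """
    if isinstance(value, (list, tuple, dict)):
        if not allow_pretty_nested:
            # Assumed default behavior: reject nested values outright.
            raise TypeError(
                "nested values are not supported unless allow_pretty_nested is set"
            )
        # Flag enabled: fall back to the non-standard but readable form.
        return str(value)
    # Scalars are rendered as before.
    return str(value)


print(to_csv_field(42))
print(to_csv_field([1, 2], allow_pretty_nested=True))
```

With the flag off, callers get an explicit failure for nested values instead of output that `from_csv` cannot read back; with it on, they opt in to the pretty but non-standard form.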


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

