[ https://issues.apache.org/jira/browse/SPARK-42237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-42237.
----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 39802
[https://github.com/apache/spark/pull/39802]

> change binary to unsupported dataType in csv format
> ---------------------------------------------------
>
>                 Key: SPARK-42237
>                 URL: https://issues.apache.org/jira/browse/SPARK-42237
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.8, 3.3.1
>            Reporter: Wei Guo
>            Assignee: Wei Guo
>            Priority: Minor
>             Fix For: 3.4.0
>
>         Attachments: image-2023-01-30-17-21-09-212.png
>
>
> When a binary column is written to CSV files, the actual content of the
> column is {*}object.toString(){*}, which is meaningless.
> {code:java}
> val df = Seq(Array[Byte](1,2)).toDF
> df.write.csv("/Users/guowei/Desktop/binary_csv")
> {code}
> The CSV file's content is as follows:
> !image-2023-01-30-17-21-09-212.png|width=141,height=29!
> Meanwhile, if a binary column is saved as a table with the CSV file format,
> the table can't be read back successfully.
> {code:java}
> val df = Seq((1, Array[Byte](1,2))).toDF
> df.write.format("csv").saveAsTable("binaryDataTable")
> spark.sql("select * from binaryDataTable").show()
> {code}
> So I think it's better to make binary an unsupported data type in the CSV
> format, for both datasource v1 (CSVFileFormat) and v2 (CSVTable).
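
The {*}object.toString(){*} behaviour described in the issue can be reproduced without Spark. The sketch below is only an illustration: the object name `BinaryCsvSketch` and its helpers `encode` and `decode` are hypothetical, and the Base64 round trip is an assumed workaround for callers who still need binary data in a text format, not something proposed in the issue or the pull request.

```scala
import java.util.Base64

// Hypothetical helper, not part of Spark or of the fix for SPARK-42237.
object BinaryCsvSketch {
  // Reversible text encoding for a binary value (assumed workaround).
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // Inverse of encode: recovers the original bytes from the text form.
  def decode(text: String): Array[Byte] =
    Base64.getDecoder.decode(text)

  def main(args: Array[String]): Unit = {
    // The same two-byte value used in the issue's reproduction.
    val bytes = Array[Byte](1, 2)

    // JVM arrays do not override toString, so the result is the type tag
    // "[B" followed by "@" and an identity hash code, not the bytes themselves.
    println(bytes.toString)

    // The Base64 form carries the content and can be decoded again.
    val text = encode(bytes)
    println(text)
    println(decode(text).sameElements(bytes))
  }
}
```

Within Spark, the built-in `base64` and `unbase64` column functions in `org.apache.spark.sql.functions` could play the same role before writing to and after reading from CSV, assuming a text encoding of the column is acceptable for the use case.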