It's the same input, except that the headers are also read by the csv reader.

Mvg/Regards
Saurabh Gulati

________________________________
From: Sean Owen <[email protected]>
Sent: 04 January 2023 15:12
To: Saurabh Gulati <[email protected]>
Cc: User <[email protected]>
Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

That does not appear to be the same input you used in your example. What is the contents of test.csv?

On Wed, Jan 4, 2023 at 7:45 AM Saurabh Gulati <[email protected]> wrote:

Hi @Sean Owen
Probably the data is incorrect, and the source needs to fix it.
But using python's csv parser returns the correct results.

    import csv

    with open("/tmp/test.csv") as c_file:
        csv_reader = csv.reader(c_file, delimiter=",")
        for row in csv_reader:
            print(row)

    ['a', 'b', 'c']
    ['1', '', ',see what "I did",\ni am still writing']
    ['2', '', 'abc']

And also, I don't understand why there is a distinction in outputs from df.show() and df.select("c").show()

Mvg/Regards
Saurabh Gulati
Data Platform

________________________________
From: Sean Owen <[email protected]>
Sent: 04 January 2023 14:25
To: Saurabh Gulati <[email protected]>
Cc: Mich Talebzadeh <[email protected]>; User <[email protected]>
Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

That input is just invalid as CSV for any parser. You end a quoted col without following with a col separator. What would the intended parsing be and how would it work?

On Wed, Jan 4, 2023 at 4:30 AM Saurabh Gulati <[email protected]> wrote:

@Sean Owen
Also see the example below with quotes feedback:

    "a","b","c"
    "1","",",see what ""I did"","
    "2","","abc"
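
For reference, the Python csv behaviour described in this thread can be reproduced without a file on disk. The sketch below is only an illustration under an assumption: the SAMPLE string is a guess at what test.csv might have contained, reconstructed from the printed rows (a quoted third column with doubled quotes and an embedded newline), not the actual file. The parse_rows helper is likewise a hypothetical name introduced here.

```python
import csv
import io

# ASSUMPTION: a guess at the contents of test.csv, reconstructed from the
# rows printed in the thread. The real file was not posted in full.
SAMPLE = (
    '"a","b","c"\n'
    '"1","",",see what ""I did"",\ni am still writing"\n'
    '"2","","abc"\n'
)


def parse_rows(text):
    """Parse CSV text with Python's csv module (hypothetical helper)."""
    # newline="" leaves embedded line breaks untouched, so a newline inside
    # a quoted field stays part of that field.
    buffer = io.StringIO(text, newline="")
    # By default csv.reader treats a doubled quote ("") inside a quoted
    # field as one literal quote character.
    return list(csv.reader(buffer, delimiter=","))


rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

With this assumed input, the delimiter and the line break inside the quoted field stay in the third column rather than starting a new column or row.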
