Hi @Sean Owen,
The data is probably malformed, and the source needs to fix it.
However, Python's csv parser returns the expected result:
import csv

# newline="" lets the csv module handle line breaks inside quoted fields
with open("/tmp/test.csv", newline="") as c_file:
    csv_reader = csv.reader(c_file, delimiter=",")
    for row in csv_reader:
        print(row)
['a', 'b', 'c']
['1', '', ',see what "I did",\ni am still writing']
['2', '', 'abc']
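For anyone who wants to reproduce this without the file, here is a minimal self-contained sketch. The SAMPLE text is an assumption reconstructed from the rows printed above (the real /tmp/test.csv may differ), and parse_rows is just an illustrative helper name.

```python
import csv
import io

# Assumed file contents, reconstructed from the printed rows above;
# the real /tmp/test.csv may differ.
SAMPLE = (
    '"a","b","c"\n'
    '"1","",",see what ""I did"",\n'
    'i am still writing"\n'
    '"2","","abc"\n'
)


def parse_rows(text):
    """Parse CSV text with the csv module's default dialect (illustrative helper)."""
    # newline="" keeps the line break inside the quoted field intact.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=","))


rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

With the default dialect, a doubled quote inside a quoted field is read as one literal quote, and the delimiter and line break inside the quotes stay part of the field.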
Also, I don't understand why df.show() and df.select("c").show() produce
different output.
Mvg/Regards
Saurabh Gulati
Data Platform
________________________________
From: Sean Owen <[email protected]>
Sent: 04 January 2023 14:25
To: Saurabh Gulati <[email protected]>
Cc: Mich Talebzadeh <[email protected]>; User <[email protected]>
Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within
the data
That input is simply invalid as CSV for any parser. You end a quoted column
without following it with a column separator. What would the intended parsing
be, and how would it work?
On Wed, Jan 4, 2023 at 4:30 AM Saurabh Gulati
<[email protected]> wrote:
@Sean Owen Also see the example below with quotes
feedback:
"a","b","c"
"1","",",see what ""I did"","
"2","","abc"