That does not appear to be the same input you used in your example. What are the contents of test.csv?

On Wed, Jan 4, 2023 at 7:45 AM Saurabh Gulati <saurabh.gul...@fedex.com> wrote:

> Hi @Sean Owen <sro...@gmail.com>
> Probably the data is incorrect, and the source needs to fix it.
> But using python's csv parser returns the correct results.
>
> import csv
>
> with open("/tmp/test.csv") as c_file:
>
>     csv_reader = csv.reader(c_file, delimiter=",")
>     for row in csv_reader:
>         print(row)
>
> ['a', 'b', 'c']
> ['1', '', ',see what "I did",\ni am still writing']
> ['2', '', 'abc']
>
> And also, I don't understand why there is a distinction in outputs from
> df.show() and df.select("c").show()
>
> Mvg/Regards
> Saurabh Gulati
> Data Platform
> ------------------------------
> *From:* Sean Owen <sro...@gmail.com>
> *Sent:* 04 January 2023 14:25
> *To:* Saurabh Gulati <saurabh.gul...@fedex.com>
> *Cc:* Mich Talebzadeh <mich.talebza...@gmail.com>; User <
> user@spark.apache.org>
> *Subject:* Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used
> within the data
>
> That input is just invalid as CSV for any parser. You end a quoted col
> without following with a col separator. What would the intended parsing be
> and how would it work?
>
> On Wed, Jan 4, 2023 at 4:30 AM Saurabh Gulati <saurabh.gul...@fedex.com>
> wrote:
>
> @Sean Owen <sro...@gmail.com> Also see the example below with quotes
> feedback:
>
> "a","b","c"
> "1","",",see what ""I did"","
> "2","","abc"
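
For comparison, here is a minimal sketch of how the fully quoted sample at the bottom of the quoted thread parses with Python's csv module. The sample is inlined as a string, and `SAMPLE` and `parse_csv_text` are illustrative names only. This is not the contents of /tmp/test.csv, which are not shown in this thread.

```python
import csv
import io

# The quoted sample from the earlier message, inlined as a string.
# Assumption: this is NOT /tmp/test.csv, whose contents are not shown here.
SAMPLE = '"a","b","c"\n"1","",",see what ""I did"","\n"2","","abc"\n'


def parse_csv_text(text):
    """Parse CSV text with Python's csv module and return a list of rows."""
    # newline="" leaves line endings untouched so the csv module handles them.
    buffer = io.StringIO(text, newline="")
    return list(csv.reader(buffer, delimiter=","))


rows = parse_csv_text(SAMPLE)
for row in rows:
    print(row)
```

Because every field is wrapped in quotes and each embedded quote is doubled, the second data row parses into three columns, with the third column holding the text that contains both commas and quotes.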