Hello.

I am using the pyarrow csv module.

from pyarrow import csv

fn = '/home/srruj/cars.csv'
column_names = ['year', 'make', 'model', 'comment', 'blank']

read_options = csv.ReadOptions(column_names=column_names)
convert_options = csv.ConvertOptions(include_columns=column_names,
                                     include_missing_columns=True,
                                     strings_can_be_null=True)

table = csv.read_csv(fn, read_options=read_options,
                     convert_options=convert_options)
table

I am getting the following error:
Csv parse error: Expected 5 columns, got 3

This is how the file looks:

year,make,model,comment,blank
"2012","Tesla","S","No comment",
1997,Ford,E350,"Go get one now they are going fast",
2015,Chevy,Volt

I am able to read this file from Spark using spark.read.csv(..), but not using
pyarrow.
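
As a possible stopgap, here is a minimal sketch that pads the short rows with empty fields before the data reaches pyarrow. It assumes pre-processing the file with Python's standard csv module is acceptable. The helper name pad_short_rows is hypothetical and is not part of pyarrow, and the sample text simply repeats the file contents shown above.

```python
import csv  # Python's standard-library csv module, not pyarrow.csv
import io


def pad_short_rows(text, n_columns):
    """Return CSV text in which every row has at least n_columns fields.

    Hypothetical helper (not part of pyarrow): rows with fewer fields are
    padded with empty strings; longer rows are left unchanged.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip completely empty lines
        # Multiplying a list by a negative number yields [], so long rows pass through.
        writer.writerow(row + [""] * (n_columns - len(row)))
    return out.getvalue()


# Same contents as the sample file, including the short last row.
sample = (
    "year,make,model,comment,blank\n"
    '"2012","Tesla","S","No comment",\n'
    '1997,Ford,E350,"Go get one now they are going fast",\n'
    "2015,Chevy,Volt\n"
)
padded = pad_short_rows(sample, 5)
```

The padded text could then presumably be passed to pyarrow's read_csv as a file-like object, for example io.BytesIO(padded.encode()). Since my snippet imports pyarrow's csv under the same name as the standard module, one of the two would need an alias (e.g. import pyarrow.csv as pacsv). I would still prefer a way to do this in pyarrow directly.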

Can you please help?

Thanks
Sricheta.

