You can use the StructType/StructField approach, or use the inferSchema approach.
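A minimal sketch of the two approaches mentioned above. The Spark 1.6 calls themselves are shown in comments (they need a running SparkContext and, for CSV inference, the Databricks spark-csv package); the part that actually solves the "100+ fields" problem, deriving the column names from the CSV's own header line instead of typing them out, runs standalone. The header string below is a hypothetical stand-in for the first line of Employee.csv.

```python
# Approach 1 -- explicit schema: build the StructType programmatically
# from a list of names instead of hand-typing 100+ fields. In PySpark:
#
#   from pyspark.sql.types import StructType, StructField, StringType
#   schema = StructType([StructField(name, StringType(), True)
#                        for name in column_names])
#   df = sqlContext.createDataFrame(employee_rdd, schema)
#
# Approach 2 -- schema inference: let the spark-csv package (needed on
# Spark 1.6, which has no built-in CSV reader) infer the types:
#
#   df = (sqlContext.read
#         .format("com.databricks.spark.csv")
#         .option("header", "true")
#         .option("inferSchema", "true")
#         .load("Employee.csv"))

# Either way, derive the column names from the file's header line rather
# than listing them by hand; the same list then feeds toDF(column_names)
# or the StructType above.
header = "Employee_ID,Employee_name"          # hypothetical first line of the CSV
column_names = [c.strip() for c in header.split(",")]
print(column_names)
```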
-------- Original message --------
From: "write2sivakumar@gmail" <write2sivaku...@gmail.com>
Date: 12/30/16 10:08 PM (GMT-05:00)
To: Raymond Xie <xie3208...@gmail.com>, user@spark.apache.org
Subject: Re: How to load a big csv to dataframe in Spark 1.6

Hi Raymond,

Is your problem passing those 100 fields to the .toDF() method?

-------- Original message --------
From: Raymond Xie <xie3208...@gmail.com>
Date: 31/12/2016 10:46 (GMT+08:00)
To: user@spark.apache.org
Subject: How to load a big csv to dataframe in Spark 1.6

Hello,

I see there is usually this way to load a csv to a dataframe:

    sqlContext = SQLContext(sc)
    Employee_rdd = sc.textFile("\..\Employee.csv").map(lambda line: line.split(","))
    Employee_df = Employee_rdd.toDF(['Employee_ID', 'Employee_name'])
    Employee_df.show()

However, in my case the csv has 100+ fields, which means the toDF() call will be very lengthy. Can anyone tell me a practical method to load the data?

Thank you very much.

Raymond