Hi, I have a list of OrderedDicts in Python, and I would like to convert it to a Spark DataFrame with the columns in the same order as the keys.

    from collections import OrderedDict

    # Two sample records; the key order is the column order I need.
    # (Named `rows` rather than `str` to avoid shadowing the builtin.)
    rows = [
        OrderedDict([(u'MID', 15784879),
                     (u'START_DATE', u'1983-06-16 00:00:00'),
                     (u'END_DATE', u'1984-01-31 00:00:00'),
                     (u'AUDIT_ID', u'16994174'),
                     (u'AUDIT_TIMESTAMP', u'2011-05-19 14:01:16.761979000 +10:00')]),
        OrderedDict([(u'MID', 15784879),
                     (u'START_DATE', u'1984-02-01 00:00:00'),
                     (u'END_DATE', u'1995-10-09 00:00:00'),
                     (u'AUDIT_ID', u'16994174'),
                     (u'AUDIT_TIMESTAMP', u'2011-05-19 14:01:16.760966000 +10:00')]),
    ]
    print(rows)

    # `spark` is the existing SparkSession.
    df = spark.sparkContext.parallelize(rows).toDF()
    df.printSchema()

Output:

    [OrderedDict([(u'MID', 15784879), (u'START_DATE', u'1983-06-16 00:00:00'), (u'END_DATE', u'1984-01-31 00:00:00'), (u'AUDIT_ID', u'16994174'), (u'AUDIT_TIMESTAMP', u'2011-05-19 14:01:16.761979000 +10:00')]), OrderedDict([(u'MID', 15784879), (u'START_DATE', u'1984-02-01 00:00:00'), (u'END_DATE', u'1995-10-09 00:00:00'), (u'AUDIT_ID', u'16994174'), (u'AUDIT_TIMESTAMP', u'2011-05-19 14:01:16.760966000 +10:00')])]
    root
     |-- AUDIT_ID: string (nullable = true)
     |-- AUDIT_TIMESTAMP: string (nullable = true)
     |-- END_DATE: string (nullable = true)
     |-- MID: long (nullable = true)
     |-- START_DATE: string (nullable = true)

As the schema shows, the columns come out in alphabetical order rather than in the order of the OrderedDict keys. Is there any way to keep the original order? I can choose whether to use OrderedDict or a normal dict, but preserving the column order is a requirement.

Any help would be great!!

--
Best Regards,
Ayan Guha
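
One possible approach, offered as a sketch rather than a confirmed answer: as far as I know, PySpark sorts the keys when it infers a schema from dict-like rows, which would explain the alphabetical columns. Converting each OrderedDict to a plain tuple and passing the column names explicitly should sidestep that. The helper name `to_ordered_records` is hypothetical, and the Spark call is left commented out because it assumes an existing SparkSession named `spark`, as in the post.

```python
from collections import OrderedDict

# Sample data copied from the post (two records, five columns).
rows = [
    OrderedDict([("MID", 15784879),
                 ("START_DATE", "1983-06-16 00:00:00"),
                 ("END_DATE", "1984-01-31 00:00:00"),
                 ("AUDIT_ID", "16994174"),
                 ("AUDIT_TIMESTAMP", "2011-05-19 14:01:16.761979000 +10:00")]),
    OrderedDict([("MID", 15784879),
                 ("START_DATE", "1984-02-01 00:00:00"),
                 ("END_DATE", "1995-10-09 00:00:00"),
                 ("AUDIT_ID", "16994174"),
                 ("AUDIT_TIMESTAMP", "2011-05-19 14:01:16.760966000 +10:00")]),
]


def to_ordered_records(dicts):
    """Hypothetical helper: return (column_names, tuples) in the first dict's key order."""
    columns = list(dicts[0].keys())
    # Look values up by name so every tuple follows the same column order.
    records = [tuple(d[c] for c in columns) for d in dicts]
    return columns, records


columns, records = to_ordered_records(rows)

# Assumption: `spark` is an existing SparkSession.
# Passing a list of column names as `schema` keeps the given order:
# df = spark.createDataFrame(records, schema=columns)
# df.printSchema()
```

If the DataFrame has already been built with sorted columns, reordering it afterwards with `df.select(*columns)` may be a simpler alternative.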