Hi all, I'm trying to create a DataFrame with an enforced schema so that I can write it to a Parquet file. The schema contains timestamps, and PySpark raises an error. The following snippet reproduces the problem:

    from pyspark.sql.types import StructType, StructField, TimestampType

    df = sqlctx.range(1000)
    schema = StructType([StructField('a', TimestampType(), True)])
    df1 = sqlctx.createDataFrame(df.rdd.map(row_gen_func), schema)

row_gen_func is a function that returns timestamp strings of the form "2018-03-21 11:09:44".

When I run this with Spark 2.2 I get the following error:

    raise TypeError("%s can not accept object %r in type %s" % (dataType, obj, type(obj)))
    TypeError: TimestampType can not accept object '2018-03-21 08:06:17' in type <type 'str'>

Regards,
Keith.

http://keith-chapman.com
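
For context on the error: PySpark's TimestampType verifies values against Python datetime.datetime objects, so a plain string is rejected even when it looks like a timestamp. Below is a minimal sketch of one possible workaround, assuming row_gen_func can be changed to parse its strings before the schema is applied. The names TS_FORMAT, parse_timestamp and to_timestamp_row are hypothetical helpers, not part of the original code, and the format string is an assumption based on the example value "2018-03-21 11:09:44".

```python
from datetime import datetime

# Assumed layout of the strings produced by row_gen_func.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(ts_string):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string into a datetime.datetime."""
    return datetime.strptime(ts_string, TS_FORMAT)


def to_timestamp_row(ts_string):
    """Wrap the parsed value in a one-element tuple, matching a schema
    with a single TimestampType field such as 'a'."""
    return (parse_timestamp(ts_string),)


# Hypothetical usage: this is the kind of value the schema check accepts.
row = to_timestamp_row("2018-03-21 11:09:44")
```

If row_gen_func returned tuples like this instead of raw strings, the same createDataFrame call with the schema above should pass type verification. Another option would be to load the column as StringType and cast it to a timestamp afterwards.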