[ https://issues.apache.org/jira/browse/SPARK-23612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Herman van Hovell updated SPARK-23612:
--------------------------------------
    Labels: DataType date spree sql  (was: DataType date sql)

> Specify formats for individual DateType and TimestampType columns in schemas
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-23612
>                 URL: https://issues.apache.org/jira/browse/SPARK-23612
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 2.3.0
>            Reporter: Patrick Young
>            Priority: Minor
>              Labels: DataType, date, spree, sql
>
> [https://github.com/apache/spark/blob/407f67249639709c40c46917700ed6dd736daa7d/python/pyspark/sql/types.py#L162-L200]
> It would be very helpful if it were possible to specify a format for each
> individual date or timestamp column in a schema when reading csv files,
> rather than one format that applies to every column:
> {code:java|title=Bar.python|borderStyle=solid}
> from pyspark.sql.types import StructType, StructField, DateType
>
> # Currently can only do something like:
> spark.read.option("dateFormat", "yyyyMMdd").csv(...)
>
> # Would like to be able to do something like
> # (proposed API: DateType does not accept a format argument today):
> schema = StructType([
>     StructField("date1", DateType(format="MM/dd/yyyy"), True),
>     StructField("date2", DateType(format="yyyyMMdd"), True)
> ])
> spark.read.schema(schema).csv(...)
> {code}
> Thanks for any help or input!
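
Until something like the proposed per-column format exists, one possible workaround is to read the date columns as plain strings and parse each one afterwards with its own pattern. The sketch below is only an illustration under stated assumptions: the helper name date_parse_exprs, the DATE_FORMATS mapping and the file name dates.csv are hypothetical and not part of Spark. It relies on the two-argument SQL function to_date(column, format) and on DataFrame.selectExpr, and it assumes the column names contain no backticks and the patterns contain no single quotes.

```python
# Hypothetical mapping from column name to the date pattern used in that column.
DATE_FORMATS = {
    "date1": "MM/dd/yyyy",
    "date2": "yyyyMMdd",
}


def date_parse_exprs(columns, date_formats):
    """Build one Spark SQL expression per column.

    Columns listed in date_formats are wrapped in to_date(col, fmt) and keep
    their name; every other column is passed through unchanged.
    """
    exprs = []
    for name in columns:
        if name in date_formats:
            exprs.append(
                "to_date(`{0}`, '{1}') AS `{0}`".format(name, date_formats[name])
            )
        else:
            exprs.append("`{0}`".format(name))
    return exprs


# Sketch of the intended use, assuming an active SparkSession named `spark`
# and a CSV file with a header row (no schema given, so columns are strings):
#
#   raw = spark.read.option("header", "true").csv("dates.csv")
#   parsed = raw.selectExpr(*date_parse_exprs(raw.columns, DATE_FORMATS))
```

Building the expressions as strings keeps the per-column formats in one small mapping, which is close in spirit to attaching a format to each field of the schema, but the parsing happens after the read rather than inside the csv reader.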