Maciej Szymkiewicz created SPARK-33054: ------------------------------------------
             Summary: Support interval type in PySpark
                 Key: SPARK-33054
                 URL: https://issues.apache.org/jira/browse/SPARK-33054
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 3.1.0
            Reporter: Maciej Szymkiewicz


At the moment PySpark doesn't support interval types at all. For example, calling the following

{code:python}
spark.sql("SELECT current_date() - current_date()")
{code}

or

{code:python}
from pyspark.sql.functions import current_timestamp

spark.range(1).select(current_timestamp() - current_timestamp())
{code}

results in

{code}
Traceback (most recent call last):
  ...
ValueError: Could not parse datatype: interval
{code}

At minimum, we should support {{CalendarIntervalType}} in the schema, so queries using it don't fail on conversion.

Optionally, we could also provide conversions between the internal and external types. That, however, might be tricky, as {{CalendarInterval}} seems to have different semantics than {{datetime.timedelta}}.

Also see https://issues.apache.org/jira/browse/SPARK-21187?focusedCommentId=16474664&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16474664



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
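The semantic mismatch between {{CalendarInterval}} and {{datetime.timedelta}} can be sketched as follows. A {{CalendarInterval}} carries a (months, days, microseconds) triple, while {{timedelta}} stores only a fixed duration; since months vary in length, only the months-free part converts exactly. The helper below is purely illustrative (its name and the choice to reject a non-zero months component are assumptions, not a proposed PySpark API):

```python
from datetime import timedelta


def interval_to_timedelta(months: int, days: int, microseconds: int) -> timedelta:
    """Approximate a CalendarInterval-like triple as datetime.timedelta.

    Hypothetical helper, not part of PySpark. A months component has no
    exact fixed-length equivalent (28-31 days), so this sketch rejects it
    rather than guessing a conversion factor.
    """
    if months != 0:
        raise ValueError(
            "interval with a months component cannot be represented "
            "exactly as datetime.timedelta"
        )
    return timedelta(days=days, microseconds=microseconds)
```

Note that even the days component is only approximately a fixed duration in timezone-aware arithmetic (DST shifts), which is part of why {{CalendarInterval}} keeps the fields separate.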