Maciej Szymkiewicz created SPARK-33054:
------------------------------------------

             Summary: Support interval type in PySpark
                 Key: SPARK-33054
                 URL: https://issues.apache.org/jira/browse/SPARK-33054
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 3.1.0
            Reporter: Maciej Szymkiewicz


At the moment PySpark doesn't support interval types at all. For example, 
calling the following

{code:python}
spark.sql("SELECT current_date() - current_date()")
{code}

or 

{code:python}
from pyspark.sql.functions import current_timestamp

spark.range(1).select(current_timestamp() - current_timestamp())
{code}

results in

{code}
Traceback (most recent call last):
...
ValueError: Could not parse datatype: interval
{code}

At minimum, we should support {{CalendarIntervalType}} in the schema, so 
queries using it don't fail on conversion.
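A minimal sketch of what such schema support could look like, without any Python-side conversion. The {{DataType}} base class below is a stand-in for {{pyspark.sql.types.DataType}}, and the type-name mapping is hypothetical; this only illustrates the idea that the schema parser would recognize the "interval" type name instead of raising {{ValueError}}:

{code:python}
# Hypothetical sketch: a CalendarIntervalType that a schema parser could
# map to the SQL "interval" type name. DataType here stands in for
# pyspark.sql.types.DataType.
class DataType:
    @classmethod
    def typeName(cls):
        # Mirrors PySpark's convention: class name minus "Type", lowercased.
        return cls.__name__[:-len("Type")].lower()

    def simpleString(self):
        return self.typeName()


class CalendarIntervalType(DataType):
    """Represents Spark SQL's CALENDAR INTERVAL type.

    No conversion to/from external Python types; values stay internal.
    """


# A parser lookup table mapping the SQL type name to the type, so that
# parsing "interval" in a schema no longer fails:
_type_names = {"interval": CalendarIntervalType}
print(_type_names["interval"]().simpleString())  # calendarinterval
{code}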

Optionally, we could also provide conversions between internal and external 
types. That, however, might be tricky, as {{CalendarInterval}} seems to have 
different semantics than {{datetime.timedelta}}.
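The semantic mismatch can be seen with plain stdlib code: {{datetime.timedelta}} stores a fixed duration (days, seconds, microseconds), whereas {{CalendarInterval}} keeps months as a separate component, and a calendar month has no fixed length. A short illustration:

{code:python}
from datetime import date, timedelta

# timedelta can only approximate "1 month" as a fixed number of days;
# the result depends on where in the calendar you start.
thirty_days = timedelta(days=30)

print(date(2020, 1, 31) + thirty_days)  # 2020-03-01
print(date(2020, 2, 1) + thirty_days)   # 2020-03-02
{code}

The same fixed delta lands on different days of the month depending on the start date, so a lossless mapping from a months-bearing {{CalendarInterval}} to {{timedelta}} is not possible in general.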

Also see 
https://issues.apache.org/jira/browse/SPARK-21187?focusedCommentId=16474664&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16474664



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
