[jira] [Commented] (SPARK-27353) PySpark Row __repr__ bug

Bryan Cutler (JIRA) Thu, 04 Apr 2019 09:01:14 -0700


    [ https://issues.apache.org/jira/browse/SPARK-27353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810009#comment-16810009 ]


Bryan Cutler commented on SPARK-27353:
--------------------------------------

This works for me on master. Can you provide a script to reproduce?

In [1]: from pyspark.sql.types import Row

In [2]: import datetime

In [3]: Row(d=datetime.date.today())
Out[3]: Row(d=datetime.date(2019, 4, 4))

In [4]: repr(Row(d=datetime.date.today()))
Out[4]: 'Row(d=datetime.date(2019, 4, 4))'

> PySpark Row __repr__ bug
> ------------------------
>
>                 Key: SPARK-27353
>                 URL: https://issues.apache.org/jira/browse/SPARK-27353
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.0
>            Reporter: Ihor Bobak
>            Priority: Major
>
> Row class has this implementation of __repr__:
>     def __repr__(self):
>         """Printable representation of Row used in Python REPL."""
>         if hasattr(self, "__fields__"):
>             return "Row(%s)" % ", ".join("%s=%r" % (k, v)
>                                          for k, v in zip(self.__fields__, tuple(self)))
>         else:
>             return "<Row(%s)>" % ", ".join(self)
>  
> the last line fails when you have a datetime.date instance in a row:
> TypeError                                 Traceback (most recent call last)
> <ipython-input-41-02c2f5a33c6e> in <module>
>       2     print(*row.values)
>       3     df_row = Row(*row.values)
> ----> 4     print(repr(df_row))
>       5     break
>       6 
> E:\spark\spark-2.3.2-bin-without-hadoop\python\pyspark\sql\types.py in __repr__(self)
>    1579                                          for k, v in zip(self.__fields__, tuple(self)))
>    1580         else:
> -> 1581             return "<Row(%s)>" % ", ".join(self)
>    1582 
>    1583 
> TypeError: sequence item 0: expected str instance, datetime.date found
>  
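
The quoted traceback builds the row positionally (Row(*row.values)), while the
session in the comment uses a keyword argument (Row(d=...)). That difference
may explain why the comment's snippet does not reproduce the error: the
keyword form has __fields__ and takes the first branch, and only the else
branch calls ", ".join(self) on raw values. The sketch below illustrates this
without needing a Spark install. RowSketch is a hypothetical stand-in that
copies only the quoted __repr__, and safe_repr is one possible adjustment
shown as an assumption, not necessarily the change that was applied in Spark.

```python
import datetime


class RowSketch(tuple):
    """Hypothetical stand-in for pyspark.sql.Row; only __repr__ mirrors the quoted code."""

    def __new__(cls, *args, **kwargs):
        if args and kwargs:
            raise ValueError("use either positional or keyword arguments, not both")
        if kwargs:
            # Keyword form: remember the field names alongside the values.
            names = sorted(kwargs)
            row = tuple.__new__(cls, [kwargs[name] for name in names])
            row.__fields__ = names
            return row
        # Positional form: a plain tuple of values with no __fields__ attribute.
        return tuple.__new__(cls, args)

    def __repr__(self):
        if hasattr(self, "__fields__"):
            return "Row(%s)" % ", ".join(
                "%s=%r" % (k, v) for k, v in zip(self.__fields__, tuple(self)))
        # Same as the quoted else branch: str.join needs every item to be a str.
        return "<Row(%s)>" % ", ".join(self)


def safe_repr(row):
    """Assumed alternative for the else branch: format each value with repr() first."""
    if hasattr(row, "__fields__"):
        return repr(row)
    return "<Row(%s)>" % ", ".join(repr(value) for value in row)


today = datetime.date(2019, 4, 4)
named = RowSketch(d=today)        # keyword form, as in the comment's session
positional = RowSketch(today)     # positional form, as in the quoted traceback

print(repr(named))

error_message = None
try:
    repr(positional)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

print(safe_repr(positional))
```

With this stand-in, the keyword form prints normally, the positional form
raises the same kind of TypeError as in the report, and formatting each value
with repr() before joining avoids it for any non-string value, not only
datetime.date.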



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
