[ https://issues.apache.org/jira/browse/SPARK-22010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166097#comment-16166097 ]
Hyukjin Kwon commented on SPARK-22010:
--------------------------------------

I don't think this is worth fixing for now. The improvement looks quite trivial, and it sounds like we would be reinventing the wheel. Do you know of a simple, well-known workaround, or do you have any measurements comparing the custom fix against the current code? Otherwise, I'd close this as {{Won't Fix}}.

> Slow fromInternal conversion for TimestampType
> ----------------------------------------------
>
>                 Key: SPARK-22010
>                 URL: https://issues.apache.org/jira/browse/SPARK-22010
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Maciej Bryński
>
> To convert a TimestampType value to Python we use the following code:
> `datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)`
> {code}
> In [34]: %%timeit
>    ...: datetime.datetime.fromtimestamp(1505383647).replace(microsecond=12344)
>    ...:
> 4.2 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
> {code}
> This is slow because:
> # we look up the timezone (TZ) on every conversion
> # we call the replace method, which builds a second datetime object
> Proposed solution: a custom datetime conversion that moves the TZ calculation to module level.
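
For illustration, here is a minimal sketch of the proposed approach, assuming a hypothetical helper named {{fast_from_internal}} (this is not Spark's actual fix): the local epoch is computed once at module level, and each conversion then reduces to a single timedelta addition, avoiding both the per-call timezone lookup and the extra object created by {{replace()}}.

{code}
import datetime

# Hypothetical sketch (not Spark's implementation): resolve the local
# timezone once, at module import, instead of on every conversion.
_LOCAL_EPOCH = datetime.datetime.fromtimestamp(0)

def fast_from_internal(ts):
    # ts is Spark's internal TimestampType value: microseconds since the
    # Unix epoch. A single timedelta addition replaces both the
    # fromtimestamp() call (which consults the timezone database) and the
    # replace() call (which builds a second datetime object).
    # Caveat: the UTC offset is frozen at import time, so results can
    # differ from fromtimestamp() across DST transitions.
    return _LOCAL_EPOCH + datetime.timedelta(microseconds=ts)

# Example usage:
# fast_from_internal(1505383647012344)
{code}

Note that freezing the UTC offset at import time means this sketch diverges from {{fromtimestamp()}} across DST transitions; measuring it against the current code is exactly the benchmark the comment above asks for.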