[ https://issues.apache.org/jira/browse/SPARK-22010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maciej Bryński updated SPARK-22010:
-----------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: SPARK-22024

> Slow fromInternal conversion for TimestampType
> ----------------------------------------------
>
>                 Key: SPARK-22010
>                 URL: https://issues.apache.org/jira/browse/SPARK-22010
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Maciej Bryński
>            Priority: Minor
>         Attachments: profile_fact_dok.png
>
>
> To convert a TimestampType value to a Python datetime we currently use:
> {code}
> datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)
> {code}
> {code}
> In [34]: %%timeit
>     ...: datetime.datetime.fromtimestamp(1505383647).replace(microsecond=12344)
>     ...:
> 4.58 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
> {code}
> It's slow because:
> # we look up the timezone on every conversion
> # we call the replace method on every conversion
> Proposed solution: a custom datetime conversion that moves the timezone
> calculation to module level.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
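The proposed optimization can be sketched roughly as follows. This is not the patch from the ticket, just a minimal illustration of the idea: precompute the local-time offset once at import time so the hot path avoids both the per-call timezone lookup in `fromtimestamp` and the extra `replace` call. The function name `fast_from_internal` is hypothetical, and the sketch assumes a fixed local offset (it ignores DST transitions, which a real fix would have to handle).

```python
import datetime

# Compute the local-time offset ONCE at module import, instead of letting
# fromtimestamp() consult the timezone database on every conversion.
# Assumption: the offset is constant (this ignores DST changes).
_EPOCH = datetime.datetime(1970, 1, 1)
_LOCAL_OFFSET = datetime.datetime.fromtimestamp(0) - _EPOCH

def fast_from_internal(ts):
    """Convert Spark's internal TimestampType value (microseconds since
    the Unix epoch) to a naive local datetime in a single addition,
    with no per-call timezone lookup and no .replace() call."""
    return _EPOCH + _LOCAL_OFFSET + datetime.timedelta(microseconds=ts)
```

Because the conversion is one `timedelta` addition, the microsecond part comes through directly instead of being patched in afterwards with `replace(microsecond=...)`.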