[ https://issues.apache.org/jira/browse/ARROW-17893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche resolved ARROW-17893.
-------------------------------------------
    Fix Version/s: 11.0.0
       Resolution: Fixed

Issue resolved by pull request 14531
[https://github.com/apache/arrow/pull/14531]

> [Python] Bug: Wrong reading of timedelta
> ----------------------------------------
>
>                 Key: ARROW-17893
>                 URL: https://issues.apache.org/jira/browse/ARROW-17893
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 8.0.0
>         Environment: macOS 12.6 on an Apple M1 Ultra
>            Reporter: Yaser Alraddadi
>            Assignee: Alenka Frim
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 11.0.0
>
>         Attachments: check_timedelta.py
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When a DataFrame has a timedelta column and also a column holding a list of
> dictionaries that contain timedeltas, reading the top-level timedelta back
> from Feather format sometimes gives a wrong value.
> In the example below, the printed results show that the top-level timedelta
> is sometimes read as {color:#00875a}0 days 03:40:23 correct{color}, and
> sometimes as {color:#de350b}153 days 01:03:20 wrong{color}
> Here is the code; it is also attached as check_timedelta.py
>
> {code:java}
> from datetime import datetime, timedelta
> import pandas as pd
> import pyarrow.feather as feather
> time_1 = datetime.fromisoformat("2022-04-21T10:18:12+03:00")
> time_2 = datetime.fromisoformat("2022-04-21T13:58:35+03:00")
> data = [
>     {
>         "waiting_time": timedelta(seconds=12, microseconds=1),
>     },
>     {
>         "waiting_time": timedelta(seconds=1020),
>     },
>     {
>         "waiting_time": timedelta(seconds=960),
>     },
>     {
>         "waiting_time": timedelta(seconds=960),
>     },
>     {
>         "waiting_time": timedelta(seconds=960),
>     },
>     {
>         "waiting_time": timedelta(seconds=815, microseconds=1),
>     },
> ]
> df = pd.DataFrame(
>     [
>         {
>             "time_1": time_1,
>             "time_2": time_2,
>             "data": data,
>             "timedelta_1": time_2 - time_1,
>             "timedelta_2": timedelta(hours=3, minutes=40, seconds=23),
>         },
>     ]
> )
> print("Correct timedelta_1: ", df["timedelta_1"].item())
> print("Correct timedelta_2: ", df["timedelta_2"].item())
> with open(f"records.feather.lz4", "wb") as f:
>     feather.write_feather(df, f, compression="lz4")
> for _ in range(10):
>     with open(f"records.feather.lz4", "rb") as f:
>         print("Reading timedelta_1: ",
>               feather.read_feather(f)["timedelta_1"].item())
>         print("Reading timedelta_2: ",
>               feather.read_feather(f)["timedelta_2"].item())
> {code}
>
>
> Printed Results
>
> {code:java}
> Correct timedelta_1:  0 days 03:40:23
> Correct timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  153 days 01:03:20
> Reading timedelta_2:  153 days 01:03:20
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  153 days 01:03:20
> Reading timedelta_1:  153 days 01:03:20
> Reading timedelta_2:  0 days 03:40:23
> Reading timedelta_1:  0 days 03:40:23
> Reading timedelta_2:  153 days 01:03:20
> Reading timedelta_1:  153 days 01:03:20
> Reading timedelta_2:  153 days 01:03:20
> Reading timedelta_1:  153 days 01:03:20
> Reading timedelta_2:  153 days 01:03:20{code}
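
A quick arithmetic check on the two values in the printed results may help when reading the report. The sketch below uses only the standard library and the two durations quoted in the issue; the variable names are illustrative. It shows that the wrong reading is exactly 1000 times the correct one, which would be consistent with a time-unit scaling problem on read. That is an observation from the numbers only, not a statement of the root cause fixed in the pull request.

```python
from datetime import timedelta

# The two readings reported in the issue's printed results.
correct = timedelta(hours=3, minutes=40, seconds=23)
wrong = timedelta(days=153, hours=1, minutes=3, seconds=20)

# Dividing one timedelta by another yields a plain float ratio.
ratio = wrong / correct

print(correct.total_seconds())  # 13223.0
print(wrong.total_seconds())    # 13223000.0
print(ratio)                    # 1000.0
```

A factor of exactly 1000 matches the step between adjacent time units (for example microseconds and nanoseconds), so comparing a suspicious reading against the expected value in this way is a cheap first test before digging into the file itself.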