[ https://issues.apache.org/jira/browse/ARROW-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-7914:
-----------------------------------------
    Fix Version/s: 8.0.0

> [Python] Allow pandas datetime as index for feather
> ---------------------------------------------------
>
>                 Key: ARROW-7914
>                 URL: https://issues.apache.org/jira/browse/ARROW-7914
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>    Affects Versions: 0.15.1
>         Environment: Windows, python 3.6.7,
>            Reporter: Samuel Jones
>            Priority: Minor
>              Labels: arrow, datetime, feather, pull-request-available, python
>             Fix For: 8.0.0
>
>         Attachments: PEC fine course 1 grid 199001.csv, PEC fine course 1
> grid 199001.feather
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Sorry in advance if I mess anything up. This is my first issue.
> I have hourly data for 3 years using a Pandas datetime as the index. Pandas
> lets me load and save .csv files with the following code (only one month
> with 2 variables shown):
> `
> # Write data to .csv
> jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)
> # Load data from .csv
> jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0,
> parse_dates=True)
> `
> Using .csv works, but it is slow once I get to the full dataset of 26k+ rows
> and 21.6k+ columns (and more columns may be coming if I have to add lags to
> my data). So a more efficient load/save routine is very desirable. I was
> excited when I found feather, but the lost index is a no-go for my use.
> Thanks for your consideration.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)