[ 
https://issues.apache.org/jira/browse/ARROW-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-7914:
----------------------------------
    Labels: arrow datetime feather pull-request-available python  (was: arrow 
datetime feather python)

> [Python] Allow pandas datetime as index for feather
> ---------------------------------------------------
>
>                 Key: ARROW-7914
>                 URL: https://issues.apache.org/jira/browse/ARROW-7914
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>    Affects Versions: 0.15.1
>         Environment: Windows, python 3.6.7,
>            Reporter: Samuel Jones
>            Priority: Minor
>              Labels: arrow, datetime, feather, pull-request-available, python
>         Attachments: PEC fine course 1 grid 199001.csv, PEC fine course 1 
> grid 199001.feather
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sorry in advance if I mess anything up. This is my first issue.
> I have hourly data for 3 years using a Pandas datetime as the index. Pandas
> allows me to load/save .csv with the following code (only one month with 2
> variables shown):
> `
> # Write data to .csv
> jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)
> # Load data from .csv
> jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0, parse_dates=True)
> `
> Using .csv works, but is slow when I get to the full dataset of 26k+ rows and 
> 21.6k+ columns (and more columns may be coming if I have to add lags to my 
> data). So, a more efficient load/save routine is very desirable. I was 
> excited when I found feather, but the lost index is a no-go for my use.
> Thanks for your consideration.
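
One possible workaround for the lost index, sketched here and not taken from the issue itself: move the datetime index into an ordinary column before writing, then set it as the index again after reading. The helper names and the "timestamp" column name are hypothetical, and the sketch assumes pandas with pyarrow installed so that `DataFrame.to_feather` and `pd.read_feather` are available.

```python
import pandas as pd


def index_to_column(df, index_name="timestamp"):
    """Return a copy of df whose index is stored as a regular column.

    The result has a default integer index, so only ordinary columns
    need to be written to the feather file.
    """
    return df.rename_axis(index_name).reset_index()


def column_to_index(df, index_name="timestamp"):
    """Inverse of index_to_column: promote the named column back to the index."""
    return df.set_index(index_name)


def write_feather_with_index(df, path, index_name="timestamp"):
    # Hypothetical helper: writes the index as an ordinary column.
    index_to_column(df, index_name).to_feather(path)


def read_feather_with_index(path, index_name="timestamp"):
    # Hypothetical helper: restores the column written above as the index.
    return column_to_index(pd.read_feather(path), index_name)


# Example usage (requires pyarrow; the file name is a placeholder):
# write_feather_with_index(jan90, "jan90.feather")
# jan90 = read_feather_with_index("jan90.feather")
```

The restored index holds the same timestamps, but it carries the chosen column name and no frequency attribute, so code that relies on `index.freq` would need to set it again.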



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
