Keith Curtis created ARROW-1311:
-----------------------------------

             Summary: Python hangs after writing a few Parquet tables
                 Key: ARROW-1311
                 URL: https://issues.apache.org/jira/browse/ARROW-1311
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.5.0
         Environment: Python 3.5.2, pyarrow 0.5.0
            Reporter: Keith Curtis


I have a program that reads some CSV files (a few million rows each, 9 columns) 
and converts each one to Parquet with:

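```python
import pandas as pd
import pyarrow
import pyarrow.parquet as pq


def to_parquet(output_file, csv_file):
    # Load the whole CSV file into memory as a pandas DataFrame.
    df = pd.read_csv(csv_file)
    # Convert the DataFrame to an Arrow table.
    table = pyarrow.Table.from_pandas(df)
    # Write the table to a single Parquet file.
    pq.write_table(table, output_file)
```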

The first CSV file always completes, but Python hangs on the second or third 
file, or sometimes on a much later one.
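
To help narrow down where the hang occurs, here is a minimal sketch of a driver loop around the conversion step. It is not the original program: the names `convert_all` and `hang_timeout` are hypothetical, and the 600-second default is an arbitrary assumption. It uses only the standard library's `faulthandler` module to dump the stack of every thread to stderr if a single conversion has not returned in time.

```python
import faulthandler
import os


def convert_all(csv_files, convert, hang_timeout=600):
    """Convert each CSV file in turn, dumping thread stacks if a call stalls.

    `convert` is any callable taking (output_file, csv_file), for example
    the to_parquet function from the report. `hang_timeout` is in seconds
    and is an assumed value, not something taken from the report.
    """
    outputs = []
    for csv_file in csv_files:
        # Hypothetical naming scheme: swap the extension for ".parquet".
        output_file = os.path.splitext(csv_file)[0] + ".parquet"
        # If convert() is still running after hang_timeout seconds,
        # faulthandler writes every thread's traceback to stderr.
        faulthandler.dump_traceback_later(hang_timeout, exit=False)
        try:
            convert(output_file, csv_file)
        finally:
            # Disarm the watchdog once the call returns or raises.
            faulthandler.cancel_dump_traceback_later()
        outputs.append(output_file)
    return outputs
```

Calling `convert_all(list_of_csv_paths, to_parquet)` should then show which Python frame each thread is in once the hang is reached. The dump covers Python frames only, so a stall inside native code will appear as the last Python call that entered it.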



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
