Keith Curtis created ARROW-1311:
-----------------------------------

             Summary: python hangs after write a few parquet tables
                 Key: ARROW-1311
                 URL: https://issues.apache.org/jira/browse/ARROW-1311
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.5.0
         Environment: Python 3.5.2, pyarrow 0.5.0
            Reporter: Keith Curtis

I had a program that read some CSV files (a few million rows each, 9 columns) and converted each one to Parquet with:

```python
import os

import pandas as pd
import pyarrow
import pyarrow.parquet as pq


def to_parquet(output_file, csv_file):
    # Load the whole CSV file into a pandas DataFrame.
    df = pd.read_csv(csv_file)
    # Convert the DataFrame to an Arrow table and write it out as Parquet.
    table = pyarrow.Table.from_pandas(df)
    pq.write_table(table, output_file)
```

The first CSV file always completed, but Python would hang on the second or third file, and sometimes only on a much later file.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)