Re: Error when converting csv to parquet in chunks, with the first chunk being all nulls

2017-07-10 Thread Alexey Strokach
OK, awesome! Thanks for the reply.

On Mon, Jul 10, 2017 at 1:42 PM, Uwe L. Korn wrote:
> Hello Alexey,
>
> you discovered a known bug in 0.4.1. If a column is only made up of None
> objects, then writing to Parquet fails. This is fixed upstream and will
> be included in the

Re: Error when converting csv to parquet in chunks, with the first chunk being all nulls

2017-07-10 Thread Uwe L. Korn
Hello Alexey,

you discovered a known bug in 0.4.1. If a column is only made up of None objects, then writing to Parquet fails. This is fixed upstream and will be included in the upcoming 0.5.0 release.

Uwe

On Sat, Jul 8, 2017, at 04:32 AM, Alexey Strokach wrote:
> I am running into a problem

Error when converting csv to parquet in chunks, with the first chunk being all nulls

2017-07-07 Thread Alexey Strokach
I am running into a problem converting a csv file into a parquet file in chunks, where one of the string columns is null for the first several million rows.

Self-contained dummy example:

csv_file = '/tmp/df.csv'
parquet_file = '/tmp/df.parquet'
df = pd.DataFrame([np.nan] * 3 + ['hello'],