Re: Write to Parquet File in Python

2015-04-06 Thread Akriti23
Thank you so much for your reply. We would like to provide a tool that lets users convert a binary file to Avro/Parquet format on their own computers. The tool will parse the binary file in Python and convert the data to Parquet. (BTW, can we append to a Parquet file?) The issue is that we do not

Re: Write to Parquet File in Python

2015-03-23 Thread chuwiey
Hey Akriti23, pyspark gives you a saveAsParquetFile() API to save your RDD as Parquet. You will, however, need to infer the schema or describe it manually before you can do so. Here are some docs about that (v1.2.1; you can search for the other versions, which are relatively similar from 1.1 up):
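
For illustration, the two steps discussed in this thread (parse the binary file in Python, then infer a schema and call saveAsParquetFile()) might look roughly like the sketch below. It assumes the Spark 1.2.x Python API (SQLContext.inferSchema and SchemaRDD.saveAsParquetFile, both changed in later releases). The record layout (a little-endian unsigned 32-bit id followed by a 64-bit float value), the field names, and the helper names parse_records and save_as_parquet are all made up for the example and are not from the original posts.

```python
import struct

# Hypothetical record layout (an assumption, not from the thread):
# little-endian unsigned 32-bit "id" followed by a 64-bit float "value".
RECORD_FORMAT = "<Id"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 12 bytes, no padding


def parse_records(data):
    """Turn a bytes blob of fixed-size records into a list of dicts."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("data length is not a multiple of the record size")
    records = []
    for offset in range(0, len(data), RECORD_SIZE):
        rec_id, value = struct.unpack_from(RECORD_FORMAT, data, offset)
        records.append({"id": rec_id, "value": value})
    return records


def save_as_parquet(records, path):
    """Write the parsed records to `path` as Parquet.

    Assumes Spark 1.2.x. pyspark is imported here, not at module level,
    so the parsing code above can be used without Spark installed.
    """
    from pyspark import SparkContext
    from pyspark.sql import SQLContext, Row

    sc = SparkContext(appName="binary-to-parquet")
    try:
        sql_context = SQLContext(sc)
        # Build an RDD of Row objects so the schema can be inferred.
        rows = sc.parallelize(records).map(lambda r: Row(**r))
        schema_rdd = sql_context.inferSchema(rows)
        schema_rdd.saveAsParquetFile(path)
    finally:
        sc.stop()
```

A caller would read the file in binary mode, pass the bytes to parse_records(), and hand the result to save_as_parquet() together with an output path. Describing the schema manually instead of inferring it is the alternative mentioned above and is not shown here.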