Thank you so much for your reply.
We would like to give users a tool to convert a binary file to a
file in Avro/Parquet format on their own computer. The tool will parse the
binary file in Python and convert the data to Parquet. (BTW, can we append
to a Parquet file?) The issue is that we do not
Hey Akriti23,
PySpark gives you a saveAsParquetFile() API to save your RDD as Parquet.
You will, however, need to infer the schema or describe it manually before
you can do so. Here are some docs about that (v1.2.1; you can search for the
others, which are relatively similar for 1.1 and up):