Vijju, JSON data are not *typed* - Parquet requires types - and Avro is a perfect packaging for this, as it provides the typing that the Parquet format needs.
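To illustrate, an Avro schema declares each field's type up front (the record and field names below are hypothetical, just to show the shape of a hand-written schema):

```json
{
  "type": "record",
  "name": "UserEvent",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "long"}
  ]
}
```

With such a schema in place, there is no ambiguity left about how each JSON value should be stored.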
So you will need to convert your messages into Avro (which is a best practice in *Kafka* anyway). The reason is that, for example, in the following JSON you need to define whether the value is an INT or a LONG:

    "age": 34

How to convert JSON to *Avro*?

- You could get away with using a library like Avro4s (check it on GitHub) that does a best-effort conversion, but that would not be a very robust solution.
- The other way is to actually type the code: read each JSON message -> construct an Avro record (with hard-coded types).

Then all you need to do is store those Avro records into *Parquet*.

The bottom line is that Parquet cannot store JSON messages as-is, because it requires types.

Antonios

On Tue, Sep 27, 2016 at 8:25 PM, VIJJU CH <vijju5...@gmail.com> wrote:
> Hello,
>
> We have a scenario where we currently use Apache Kafka. We have Kafka
> messages in JSON format in a Kafka topic. From the Kafka topic we send
> the JSON messages to Amazon S3.
>
> Can we read the messages from the Apache Kafka topic, which are in JSON,
> convert them to Parquet format, and store them in S3?
>
> Please reply at your earliest convenience.
>
> Thanks,
> Vijju
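P.S. To make the hand-typed conversion step concrete, here is a minimal Python sketch. The field names and types are hypothetical, and a plain dict stands in for a real Avro record; a production pipeline would build actual Avro records with an Avro library and write them out with a Parquet writer (e.g. parquet-avro on the JVM, or pyarrow in Python):

```python
import json

# Hand-written "schema": field name -> target Python type. This is the
# hard-coded typing step the conversion needs, since the JSON text alone
# cannot say whether "age" should be an INT or a LONG.
SCHEMA = {
    "name": str,
    "age": int,      # would be declared as "long" in a real Avro schema
    "score": float,  # "double" in Avro terms
}

def json_to_typed_record(raw: str) -> dict:
    """Parse one JSON message and coerce each field to its declared type."""
    msg = json.loads(raw)
    record = {}
    for field, typ in SCHEMA.items():
        if field not in msg:
            raise ValueError(f"missing field: {field}")
        record[field] = typ(msg[field])
    return record

record = json_to_typed_record('{"name": "Vijju", "age": 34, "score": 9.5}')
print(record)  # {'name': 'Vijju', 'age': 34, 'score': 9.5}
```

The same loop would run per Kafka message; once every record carries explicit types, writing a typed Parquet file to S3 is straightforward.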