Hi Ron,

After reading some of the Apache Drill introduction, I'd say the file
component would be more suitable for writing Parquet files.
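For instance, a route consuming from Kafka and writing with the file
component might look something like this. This is only a sketch on my part:
the topic, broker address, and output directory are placeholders, and the
conversion step is left as a stub for your own processor.

```java
import org.apache.camel.builder.RouteBuilder;

// Hypothetical route sketch: read from Kafka, convert the body in a
// processor, and let the file component persist the result. All endpoint
// URIs here are placeholders.
public class KafkaToParquetRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("kafka:my-topic?brokers=localhost:9092")
            // your custom processor would turn the JSON body
            // into Parquet bytes at this step
            .process(exchange -> { /* JSON -> Parquet conversion */ })
            .to("file:/data/parquet?fileName=${exchangeId}.parquet");
    }
}
```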
Regarding Parquet and Camel, we don't have an example for it, but the way
I see it, you are heading in the right direction by creating a processor
to convert the data to Parquet format.
That said, we do have an open feature request
<https://issues.apache.org/jira/browse/CAMEL-13573> to add a Parquet data
format, and we would love to see some contributions to add this to Camel :)
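As a rough starting point for such a processor, one common path is to go
JSON -> Avro GenericRecord -> Parquet via the parquet-avro module. The
sketch below assumes you have an Avro schema for your records; the class
name, constructor arguments, and file-naming scheme are all my own
assumptions, not a tested implementation.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

// Hypothetical processor sketch: decode the JSON body against an Avro
// schema and write it out as a one-record Parquet file. The schema string
// and output directory are supplied by the caller.
public class JsonToParquetProcessor implements Processor {
    private final Schema schema;
    private final String outputDir;

    public JsonToParquetProcessor(String schemaJson, String outputDir) {
        this.schema = new Schema.Parser().parse(schemaJson);
        this.outputDir = outputDir;
    }

    @Override
    public void process(Exchange exchange) throws Exception {
        String json = exchange.getIn().getBody(String.class);

        // Decode the JSON payload into an Avro record using the schema
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        Decoder decoder = DecoderFactory.get().jsonDecoder(schema, json);
        GenericRecord record = reader.read(null, decoder);

        // Write a single-record Parquet file, named after the exchange id
        Path file = new Path(outputDir, exchange.getExchangeId() + ".parquet");
        try (ParquetWriter<GenericRecord> writer =
                 AvroParquetWriter.<GenericRecord>builder(file)
                     .withSchema(schema)
                     .build()) {
            writer.write(record);
        }
    }
}
```

In practice you would probably batch records into larger files rather than
writing one file per exchange, since Parquet is optimized for large,
columnar files.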

Regards,
Omar


On Tue, Feb 11, 2020 at 11:37 PM Ron Cecchini <roncecch...@comcast.net>
wrote:

> Hi, all.  I'm just looking for quick guidance or confirmation that I'm
> going in the right direction here:
>
> - There's a small Kotlin service that uses Camel to read from Kafka and
> write to Mongo.
> - I need to replace Mongo with Apache Drill and write Parquet files to the
> file system.
>   (I know nothing about Parquet but I know a little bit about Drill.)
>
> This service isn't used to do any queries; it's just for persisting data.
>   So, given that, and the fact that Drill is just a query engine, I really
> can't use the "Drill" component for anything.
>
> - But there is that "HDFS" component that I think I can use?
>   Or maybe the "File" component is better here?
>
> So my thinking is that I just need to:
>
> 1. write a Processor to transform the JSON data into Parquet
>    (and keep in mind that I know nothing about Parquet...)
>
> 2. use the HDFS (or File) component to write it to a file
>    (I think there's some Parquet set up to do (?) outside the scope of
> this service, but that's another matter...)
>
> Seems pretty straightforward.  Does that sound reasonable?
>
> Are there any Camel examples I can look at?  The Google machine seems to
> not find anything related to Camel and Parquet...
>
> Thank you so much!
>
> Ron
>
