+1,

It is always good to have new ways to ingest data into an Iceberg table.

- Ajantha

On Wed, May 22, 2024 at 7:32 PM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Omar,
>
> That's the plan (see the last section in my previous email). I just
> wanted to bring it to the attention of the Iceberg community :)
>
> Regards
> JB
>
> On Wed, May 22, 2024 at 10:01 AM Omar Al-Safi <o...@oalsafi.com> wrote:
> >
> > IMO the Camel Iceberg component should live in the Camel repo. It can
> > be part of the Camel components registry in Camel.
> >
> > On Wed, May 22, 2024 at 9:58 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> >>
> >> Hi Manish
> >>
> >> No, Camel is not an alternative to Spark or Flink: Camel is not a
> >> query engine. It's more of a "complement" to Kafka Connect.
> >>
> >> Regards
> >> JB
> >>
> >> On Wed, May 22, 2024 at 7:09 AM Manish Malhotra
> >> <manish.malhotra.w...@gmail.com> wrote:
> >> >
> >> > Can Camel be used as an alternative to Flink?
> >> >
> >> >
> >> > On Tue, May 21, 2024 at 10:17 AM Ryan Blue <b...@tabular.io> wrote:
> >> >>
> >> >> This is an interesting idea. What is the use case, and where should
> >> >> this live? I'm not familiar with Camel, so I'm not sure what the usual
> >> >> approach is. At least in the Iceberg community, we generally avoid
> >> >> adding connectors unless there is a clear use case and demand for
> >> >> them. We don't want to add code that needs to be maintained but isn't
> >> >> used.
> >> >>
> >> >> On Tue, May 21, 2024 at 10:15 AM Yufei Gu <flyrain...@gmail.com>
> wrote:
> >> >>>
> >> >>> Hi JB,
> >> >>>
> >> >>> Thanks for sharing. Got a few questions:
> >> >>>
> >> >>> Does Apache Camel rely on other engines, e.g., Spark or Flink, for
> >> >>> any processing, or is it fully self-contained?
> >> >>> What are the potential challenges or limitations you foresee? For
> >> >>> example, does it generate too many commits and/or small files
> >> >>> considering its use cases (IoT, event streaming)? Can Camel buffer
> >> >>> the ingested data and write it to the Iceberg table as a batch?
> >> >>> How do you recommend handling schema evolution in Iceberg tables
> >> >>> when integrating with Camel routes?
> >> >>>
> >> >>> Yufei
> >> >>>
> >> >>>
> >> >>> On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <
> j...@nanthrax.net> wrote:
> >> >>>>
> >> >>>> Hi folks,
> >> >>>>
> >> >>>> I'm working on an Iceberg component for Apache Camel:
> >> >>>>
> >> >>>> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
> >> >>>>
> >> >>>> Apache Camel is an integration framework, supporting a large number
> >> >>>> of components and EIPs (Enterprise Integration Patterns, like the
> >> >>>> Content-Based Router, Splitter, Aggregator, Content Enricher, ...).
> >> >>>> Camel is very popular in a lot of use cases, like IoT, system
> >> >>>> integration, event streaming, ...
> >> >>>>
> >> >>>> The component provides:
> >> >>>> - a Camel consumer endpoint (from) to read data from Iceberg
> >> >>>> tables/views (scan) and create a Camel exchange
> >> >>>> - a Camel producer endpoint (to) to write data (from the Camel
> >> >>>> exchange) to Iceberg tables/views
> >> >>>>
> >> >>>> For example, you can write a Camel route like this (using, for
> >> >>>> instance, the Spring/Blueprint XML DSL):
> >> >>>>
> >> >>>> <from uri="jms:queue:foo"/>
> >> >>>> <process ref="#convertToIcebergRecords"/> <!-- optional depending
> on
> >> >>>> the exchange message body -->
> >> >>>> <to uri="iceberg:my_table?catalog=#ref"/>
> >> >>>>
> >> >>>> This route is event-driven: it consumes messages from the foo JMS
> >> >>>> queue (from Apache ActiveMQ for instance) and writes the message
> >> >>>> body to the my_table Iceberg table (it's possible to use the router
> >> >>>> or multicast EIPs to send the exchange to different tables).
> >> >>>> NB: for the from (consumer endpoint), you can use any Camel
> >> >>>> component (https://camel.apache.org/components/4.4.x/).
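> >> >>>>
> >> >>>> As an illustration only, the same route could look roughly like the
> >> >>>> following in Camel's Java DSL (the class and processor names are
> >> >>>> made up for the example, and the iceberg: endpoint options are still
> >> >>>> work in progress):
> >> >>>>
> >> >>>> import org.apache.camel.builder.RouteBuilder;
> >> >>>>
> >> >>>> // hypothetical route class, names invented for the sketch
> >> >>>> public class JmsToIcebergRoute extends RouteBuilder {
> >> >>>>     @Override
> >> >>>>     public void configure() {
> >> >>>>         from("jms:queue:foo")
> >> >>>>             // optional: convert the JMS payload to Iceberg records,
> >> >>>>             // depending on the exchange message body
> >> >>>>             .process(exchange -> {
> >> >>>>                 // conversion logic goes here
> >> >>>>             })
> >> >>>>             // proposed camel-iceberg producer endpoint (WIP)
> >> >>>>             .to("iceberg:my_table?catalog=#ref");
> >> >>>>     }
> >> >>>> }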
> >> >>>>
> >> >>>> You can also consume (scan) data from an Iceberg table and send the
> >> >>>> generated exchange to any endpoint/route:
> >> >>>>
> >> >>>> <from uri="iceberg:my_table?catalog=#ref"/>
> >> >>>> <process ref="#convertFromIcebergRecords"/> <!-- optional
> depending on
> >> >>>> the next steps in the route -->
> >> >>>> <wireTap uri="direct:tap"/>
> >> >>>> <to
> uri="mongodb:myDB?database=mydb&collection=foo&operation=insert"/>
> >> >>>>
> >> >>>> This route generates exchanges from the my_table Iceberg table,
> >> >>>> uses the Wire Tap EIP, and stores the data into a MongoDB
> >> >>>> database/collection.
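> >> >>>>
> >> >>>> Again purely as a sketch in the Java DSL (same caveats: the class
> >> >>>> name is made up, the iceberg: consumer endpoint is work in progress,
> >> >>>> and the MongoDB options are only an example), the scan route and the
> >> >>>> tapped branch could look like this:
> >> >>>>
> >> >>>> import org.apache.camel.builder.RouteBuilder;
> >> >>>>
> >> >>>> public class IcebergScanRoute extends RouteBuilder {
> >> >>>>     @Override
> >> >>>>     public void configure() {
> >> >>>>         // proposed camel-iceberg consumer endpoint: scans my_table
> >> >>>>         // and generates exchanges from the scan results
> >> >>>>         from("iceberg:my_table?catalog=#ref")
> >> >>>>             // fire-and-forget copy of the exchange
> >> >>>>             .wireTap("direct:tap")
> >> >>>>             .to("mongodb:myDB?database=mydb&collection=foo&operation=insert");
> >> >>>>
> >> >>>>         // the tapped copy can be processed independently,
> >> >>>>         // e.g. for auditing or monitoring
> >> >>>>         from("direct:tap")
> >> >>>>             .log("tapped: ${body}");
> >> >>>>     }
> >> >>>> }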
> >> >>>>
> >> >>>> Even though I started the component in the Iceberg repo, I think it
> >> >>>> would make more sense to have it in Camel (just as Apache Beam hosts
> >> >>>> the Iceberg IO in the Beam repo).
> >> >>>> Thoughts?
> >> >>>>
> >> >>>> Comments are welcome!
> >> >>>>
> >> >>>> NB: on a related topic, I created
> >> >>>> https://github.com/apache/iceberg/pull/10365
> >> >>>>
> >> >>>> Regards
> >> >>>> JB
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Ryan Blue
> >> >> Tabular
>
