> To achieve this, I iterated through the route X times, each time executing
> a query with a different offset. I utilized Camel headers to store the
> offset and other flags, as mentioned in my initial email.

This is a perfectly reasonable approach IMO.

> Does Camel have any built-in functionality that
> accomplishes the same task? Additionally, since I was "improvising," I'm
> curious if my code adheres to best practices. I sensed that it might not,
> given that I implemented business logic at the route level.

The EIPs are the building blocks that allow you to accomplish this type of
use case. Apart from EIPs, Camel doesn't have specific functionality to
query and process paged resources. The Loop EIP (
https://camel.apache.org/components/4.0.x/eips/loop-eip.html) might be a
little more idiomatic than a route calling itself recursively.
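
To make the difference concrete, here is a minimal plain-Java sketch of the
control flow a do-while style loop would express: keep fetching with
offset = offset + limit until a page comes back empty, which is the same stop
condition as PAGINATION_END_FLAG in your route. The names PageSource and
forEachPage are made up for illustration and are not Camel APIs; in a real
route the same shape would presumably map onto the Loop EIP in its do-while
mode, with the fetch done by your extractDataInChunks bean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedQuerySketch {

    /** Hypothetical stand-in for the database query: at most `limit` rows starting at `offset`. */
    public interface PageSource<T> {
        List<T> fetch(int offset, int limit);
    }

    /**
     * Fetches pages until an empty page is returned and hands each
     * non-empty page to the sink. Returns the number of pages processed.
     */
    public static <T> int forEachPage(PageSource<T> source, int limit, Consumer<List<T>> sink) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        int offset = 0;
        int pages = 0;
        while (true) {
            List<T> page = source.fetch(offset, limit);
            if (page.isEmpty()) {
                break; // empty result: the equivalent of setting the end flag to true
            }
            sink.accept(page); // only one chunk is held in memory at a time
            pages++;
            offset += limit;
        }
        return pages;
    }

    public static void main(String[] args) {
        // In-memory "table" of 10 rows standing in for the real query result.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            table.add(i);
        }

        PageSource<Integer> source = (offset, limit) -> table.subList(
                Math.min(offset, table.size()),
                Math.min(offset + limit, table.size()));

        int pages = forEachPage(source, 4, page -> System.out.println("chunk: " + page));
        System.out.println("pages: " + pages);
    }
}
```

With 10 rows and a limit of 4 this processes three chunks (4, 4 and 2 rows)
and stops on the fourth, empty fetch. The offset lives in a local variable
here; in a route it would stay in a header or exchange property as you
already do.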


On Fri, Jan 26, 2024 at 3:07 AM Ghassen Kahri <ghassen.ka...@codeonce.fr>
wrote:

> Hey Raymond, I appreciate your response.
>
> We are both on board with the idea of dividing the query response into
> chunks. Let's discuss the "how" in Camel.
>
> To achieve this, I iterated through the route X times, each time executing
> a query with a different offset. I utilized Camel headers to store the
> offset and other flags, as mentioned in my initial email.
>
> My primary question is: Does Camel have any built-in functionality that
> accomplishes the same task? Additionally, since I was "improvising," I'm
> curious if my code adheres to best practices. I sensed that it might not,
> given that I implemented business logic at the route level.
>
> On Thu, Jan 25, 2024 at 3:46 PM ski n <raymondmees...@gmail.com> wrote:
>
> > Yes, dividing it into chunks is a good practice. This applies to
> > message-based systems in general, not specifically to Camel.
> > Let's discuss both ways of processing messages:
> >
> > 1. One big message
> >
> > Say the message is 100 GB+ and it is processed by some integration
> > software on a server. You then need to scale the server for that
> > amount, which means both memory and CPU must be capable of processing
> > that amount of data. When you want to apply EIPs (like filters or
> > transformations), this will be difficult, because the available
> > resources need to match that size.
> >
> > Say this big message comes once a week; then you have a very big server
> > basically running for nothing.
> >
> > 2. Many small messages
> >
> > Because of 1, it's generally best practice to have smaller, fixed-size
> > messages, produced directly at the source when possible.
> > If this is somehow not possible, you can split them and move them back to a
> > Kafka topic, then stream the messages
> > and apply the actual EIPs to the small messages. Some advantages are:
> >
> > 1. Predictable: Every message is the same size, so you can load test this
> > and match resources.
> > 2. Resources: A small message needs fewer resources (CPU/memory) to
> > process.
> > 3. Load: The load is spread over time (you can use a smaller server).
> > 4. Realtime: You don't need to wait until all data is gathered and then
> > send it in a batch; you can process it as it arrives.
> > 5. Scaling: When the load is high, you can add multiple threads or even
> > multiple pods/containers to scale; when you don't need them anymore,
> > you can scale back.
> >
> > Raymond
> >
> > On Thu, Jan 25, 2024 at 2:32 PM Ghassen Kahri <ghassen.ka...@codeonce.fr>
> > wrote:
> >
> > > Hello community,
> > >
> > > I am currently working on a feature within the Camel project that involves
> > > processing Kafka messages (String) and performing a query based on that
> > > message. Initially, I implemented a classic route that called a service
> > > method responsible for executing the query. However, I encountered an issue
> > > with the size of the query result, as the memory couldn't handle such a
> > > massive amount of data.
> > >
> > > In response to this challenge, I devised an alternative solution that might
> > > be considered unconventional. The approach involves querying the database
> > > multiple times and retrieving the results in manageable chunks.
> > > Consequently, the route needs to be executed multiple times. The current
> > > structure of my route is as follows:
> > >
> > >
> > > from(getInput())
> > >     .routeId(getRouteId())
> > >     .bean(Service.class, "extractDataInChunks")
> > >     .choice()
> > >         .when(header(PAGINATION_END_FLAG).isEqualTo(true))
> > >             .to(getOutput())
> > >         .when(header(PAGINATION_END_FLAG).isEqualTo(false))
> > >             // re-execute the route with offset = offset + limit
> > >             .to(getOutput(), directUri(getRouteId()));
> > >
> > >
> > > The extractDataInChunks method queries the database with a parameterized
> > > limit (chunk size) and an offset that ranges from 0 to X * limit. The
> > > PAGINATION_END_FLAG is a Camel header, initially set to false, and is
> > > switched to true by the extractDataInChunks method if the size of the
> > > query result is 0.
> > >
> > > I would appreciate feedback on whether this solution adheres to good Camel
> > > practices, specifically the consideration of implementing business logic
> > > at the route level. Additionally, I am curious if there are any built-in
> > > Enterprise Integration Patterns (EIPs) in Camel that might be more suitable
> > > for my business requirements.
> > >
> > > Thank you for your insights.
> > >
> >
>
