Re: Camel use case

Anthony Wu Wed, 31 Jan 2024 10:30:59 -0800

Hi folks - I had thought that the loop EIP was meant only for testing
purposes? In the 3.14.x LTS docs the doc page reads, my emphasis:


The Loop EIP allows for processing a message a number of times, possibly in
a different way for each iteration. _Useful mostly during testing._

See
https://stackoverflow.com/questions/51257248/camel-stackoverflow-error-when-route-is-called-recursively
as well.

In the past I've used a SEDA queue like the following in Java DSL:

from("seda:foo").process(processorThatTerminatesWhenBodyIsExhausted).to("seda:foo")

Any insight on whether the loop EIP is safe to use (no longer suffers from
memory overrun) here is greatly appreciated.
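
A rough plain-Java analogy of the queue-based variant above (this is not Camel code; RequeueSketch, Work and drain are names invented for this illustration). The assumption it sketches is that handing the message back to a queue lets every pass start from the consumer loop, so the call stack stays flat however many passes run, whereas a route that calls itself synchronously would nest each pass inside the previous one:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Plain-Java analogy only: no Camel classes are involved.
class RequeueSketch {

    // Hypothetical message: a body plus the index of the next item to handle.
    static final class Work {
        final List<Integer> body;
        final int next;

        Work(List<Integer> body, int next) {
            this.body = body;
            this.next = next;
        }
    }

    // Handles one item per pass and puts the message back on the queue,
    // loosely mirroring from("seda:foo")...to("seda:foo").
    static List<Integer> drain(List<Integer> body) {
        Queue<Work> queue = new ArrayDeque<>();
        List<Integer> processed = new ArrayList<>();
        queue.add(new Work(body, 0));

        Work work;
        while ((work = queue.poll()) != null) {
            if (work.next >= work.body.size()) {
                continue; // body exhausted: nothing is re-enqueued, so the loop ends
            }
            processed.add(work.body.get(work.next));
            queue.add(new Work(work.body, work.next + 1)); // next pass, same stack depth
        }
        return processed;
    }
}
```

Calling RequeueSketch.drain on a list with 100,000 entries runs 100,000 passes through the same loop frame, which is the property the SEDA hand-off is meant to provide here.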

On Wed, Jan 31, 2024 at 8:45 AM Jeremy Ross <[email protected]> wrote:

> If you keep copy=false (default), loop sends the same exchange for each
> iteration. This allows you to manipulate headers inside the loop and
> subsequent iterations would see those header changes.
>
> On Wed, Jan 31, 2024 at 2:18 AM Ghassen Kahri <[email protected]>
> wrote:
>
> > Hi Jeremy,
> >
> > The idea of using the loop EIP crossed my mind as well, but I'm uncertain
> > about the feasibility of manipulating headers for each iteration.
> >
> > I appreciate your concern.
> >
> > Thank you.
> >
> > On Mon, Jan 29, 2024 at 6:35 PM Jeremy Ross <[email protected]>
> > wrote:
> >
> > > > To achieve this, I iterated through the route X times, each time
> > > > executing a query with a different offset. I utilized Camel headers to
> > > > store the offset and other flags, as mentioned in my initial email.
> > >
> > > This is a perfectly reasonable approach IMO.
> > >
> > > > Does Camel have any built-in functionality that accomplishes the same
> > > > task? Additionally, since I was "improvising," I'm curious if my code
> > > > adheres to best practices. I sensed that it might not, given that I
> > > > implemented business logic at the route level.
> > >
> > > The EIPs are the building blocks that allow you to accomplish this type
> > > of use case. Apart from EIPs, Camel doesn't have specific functionality
> > > to query and process paged resources. The Loop EIP
> > > (https://camel.apache.org/components/4.0.x/eips/loop-eip.html) might be
> > > a little more idiomatic than a route calling itself recursively.
> > >
> > >
> > > On Fri, Jan 26, 2024 at 3:07 AM Ghassen Kahri <[email protected]>
> > > wrote:
> > >
> > > > Hey Raymond, I appreciate your response.
> > > >
> > > > We are both on board with the idea of dividing the query response into
> > > > chunks. Let's discuss the "how" in Camel.
> > > >
> > > > To achieve this, I iterated through the route X times, each time
> > > > executing a query with a different offset. I utilized Camel headers to
> > > > store the offset and other flags, as mentioned in my initial email.
> > > >
> > > > My primary question is: Does Camel have any built-in functionality that
> > > > accomplishes the same task? Additionally, since I was "improvising,"
> > > > I'm curious if my code adheres to best practices. I sensed that it
> > > > might not, given that I implemented business logic at the route level.
> > > >
> > > > On Thu, Jan 25, 2024 at 3:46 PM ski n <[email protected]> wrote:
> > > >
> > > > > Yes, dividing it into chunks is a good practice. This adheres to
> > > > > message-based systems in general, not specific to Camel.
> > > > > Let's discuss both ways of processing messages:
> > > > >
> > > > > 1. One big message
> > > > >
> > > > > Say the message is 100 GB+ and it is processed by some integration
> > > > > software on a server: you need to scale the server for that amount.
> > > > > This means both memory and CPU must be capable of processing that
> > > > > amount of data. Performing EIPs (like filters or transformations) on
> > > > > it will be difficult, because the available resources need to match
> > > > > that size.
> > > > >
> > > > > Say this big message comes once a week: then you have a very big
> > > > > server basically running for nothing.
> > > > >
> > > > > 2. Many small messages
> > > > >
> > > > > Because of 1, it's generally best practice to have fixed-size, smaller
> > > > > messages, when possible directly at the source. If that is somehow
> > > > > not possible, you can split them and move them back to a Kafka topic,
> > > > > then stream the messages and apply the actual EIPs to the small
> > > > > messages. Some advantages are:
> > > > >
> > > > > 1. Predictable: Every message is the same size, so you can load test
> > > > >    this and match resources.
> > > > > 2. Resources: A small message needs fewer resources (CPU/memory) to
> > > > >    process.
> > > > > 3. Load: The load is spread over time (you can use a smaller server).
> > > > > 4. Realtime: You don't need to wait until all data is gathered and
> > > > >    then send it in a batch; you can process it as it happens.
> > > > > 5. Scaling: When the load is high, you can add multiple threads or
> > > > >    even multiple pods/containers to scale out; when you don't need
> > > > >    them anymore, you can scale back.
> > > > >
> > > > > Raymond
> > > > >
> > > > >
> > > > > On Thu, Jan 25, 2024 at 2:32 PM Ghassen Kahri <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hello community,
> > > > > >
> > > > > > I am currently working on a feature within the Camel project that
> > > > > > involves processing Kafka messages (String) and performing a query
> > > > > > based on that message. Initially, I implemented a classic route that
> > > > > > called a service method responsible for executing the query. However,
> > > > > > I encountered an issue with the size of the query result, as the
> > > > > > memory couldn't handle such a massive amount of data.
> > > > > >
> > > > > > In response to this challenge, I devised an alternative solution that
> > > > > > might be considered unconventional. The approach involves querying the
> > > > > > database multiple times and retrieving the results in manageable
> > > > > > chunks. Consequently, the route needs to be executed multiple times.
> > > > > > The current structure of my route is as follows:
> > > > > >
> > > > > >
> > > > > > from(getInput())
> > > > > >     .routeId(getRouteId())
> > > > > >     .bean(Service.class, "extractDataInChunks")
> > > > > >     .choice()
> > > > > >         .when(header(PAGINATION_END_FLAG).isEqualTo(true))
> > > > > >             .to(getOutput())
> > > > > >         .when(header(PAGINATION_END_FLAG).isEqualTo(false))
> > > > > >             // re-execute the route with offset = offset + limit
> > > > > >             .to(getOutput(), directUri(getRouteId()));
> > > > > >
> > > > > >
> > > > > > The extractDataInChunks method queries the database with a
> > > > > > parameterized limit (chunk size) and an offset that ranges from 0 to
> > > > > > X * limit. The PAGINATION_END_FLAG is a Camel header, initially set to
> > > > > > false, and is switched to true by the extractDataInChunks method if
> > > > > > the size of the query result is 0.
> > > > > >
> > > > > > I would appreciate feedback on whether this solution adheres to good
> > > > > > Camel practices, specifically the consideration of implementing
> > > > > > business logic at the route level. Additionally, I am curious if there
> > > > > > are any built-in Enterprise Integration Patterns (EIPs) in Camel that
> > > > > > might be more suitable for my business requirements.
> > > > > >
> > > > > > Thank you for your insights.
> > > > > >
> > > > >
> > > >
> > >
> >
>
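
To make the paging scheme from the original post concrete, here is a minimal plain-Java sketch. It is not the poster's code and uses no Camel API: the in-memory list standing in for the database, the "offset" and "paginationEnd" header names, and the PagedQuerySketch class are all assumptions made for illustration. A single headers map is reused on every pass, similar in spirit to the shared exchange described for copy=false, so each pass sees the offset written by the previous one. A positive limit is assumed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration; header names and the in-memory "table" are made up.
class PagedQuerySketch {

    static final String OFFSET = "offset";
    static final String PAGINATION_END_FLAG = "paginationEnd";

    // Stand-in for extractDataInChunks: returns one page and updates the headers.
    static List<Integer> extractDataInChunks(List<Integer> table, int limit,
                                             Map<String, Object> headers) {
        int offset = (Integer) headers.get(OFFSET);
        int from = Math.min(offset, table.size());
        int to = Math.min(offset + limit, table.size());
        List<Integer> page = new ArrayList<>(table.subList(from, to));
        if (page.isEmpty()) {
            headers.put(PAGINATION_END_FLAG, true); // empty result: stop paging
        } else {
            headers.put(OFFSET, offset + limit);    // next pass reads the following chunk
        }
        return page;
    }

    // Runs the paging loop and returns the non-empty chunks in order.
    // (Unlike the route in the post, the final empty page is not forwarded.)
    static List<List<Integer>> fetchAll(List<Integer> table, int limit) {
        Map<String, Object> headers = new HashMap<>();
        headers.put(OFFSET, 0);
        headers.put(PAGINATION_END_FLAG, false);

        List<List<Integer>> chunks = new ArrayList<>();
        while (Boolean.FALSE.equals(headers.get(PAGINATION_END_FLAG))) {
            List<Integer> page = extractDataInChunks(table, limit, headers);
            if (!page.isEmpty()) {
                chunks.add(page);
            }
        }
        return chunks;
    }
}
```

With a five-row table and a limit of 2, fetchAll produces three chunks and stops on the fourth query, which returns no rows. Only one chunk is held by the loop body at a time; the sketch collects them into a list purely so the result can be inspected.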
