How to clear the cache of MemoryIdempotentRepository

2023-04-19 Thread Reto Peter
Hi all

I have a route, which checks for uniqueness of a certain value (find 
duplicates, and in case of duplicates, the route will throw an exception).
That works fine so far.

At the end of the route, I want to clear the cache, so next time the route gets 
executed, it shall start with an empty cache.

My question: How can I clear the cache at the end of the route?

My route (simplified):

from("direct:processArticles")
  .split(xpath("//Article")) // for each Article
    .setHeader("ArtEAN", xpath("Article/ArtEAN/text()"))
    .idempotentConsumer(header("ArtEAN"),
        MemoryIdempotentRepository.memoryIdempotentRepository(2000)).skipDuplicate(false)
    .choice()
      .when(header(Exchange.DUPLICATE_MESSAGE).isEqualTo(true))
        .throwException(ImportException.class,
            "Line/Unit: ${exchangeProperty.CamelSplitIndex}: Found Duplicates")
    .end()
    // do some processing
  .end()

Camel 3.20.3, Spring Boot 2.7.10


Re: How to clear the cache of MemoryIdempotentRepository

2023-04-19 Thread Claus Ibsen
Hi

If you use JMX, there are JMX operations to clear the repository.

Otherwise, create an instance of the memory repository yourself, use that
instance in the route, and invoke its clear method from Java code to clear
the cache.
You can also have that code executed at the end of the route by adding it
as an onCompletion in the route, or via a Processor on the exchange UoW
(unit of work).
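
The second suggestion can be sketched roughly as follows. This is a minimal
sketch, not code from the thread: the class name ProcessArticlesRoute and the
field name articleRepo are made up for illustration, IllegalStateException
stands in for the poster's ImportException, and the xpath calls assume the
camel-xpath dependency is on the classpath. The repository is held in a field
so that a route-scoped onCompletion block can call clear() on it when the
incoming exchange is done, whether it succeeded or failed.

```java
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.spi.IdempotentRepository;
import org.apache.camel.support.processor.idempotent.MemoryIdempotentRepository;

// Hypothetical route class; names are illustrative only.
public class ProcessArticlesRoute extends RouteBuilder {

    // Keep a reference to the repository instead of creating it inline,
    // so that other code in the route can clear it.
    private final IdempotentRepository articleRepo =
            MemoryIdempotentRepository.memoryIdempotentRepository(2000);

    @Override
    public void configure() {
        from("direct:processArticles")
            // Runs when the incoming exchange completes (success or failure).
            .onCompletion()
                .process(exchange -> articleRepo.clear())
            .end()
            .split(xpath("//Article")) // for each Article
                .setHeader("ArtEAN", xpath("Article/ArtEAN/text()"))
                .idempotentConsumer(header("ArtEAN"), articleRepo).skipDuplicate(false)
                .choice()
                    .when(header(Exchange.DUPLICATE_MESSAGE).isEqualTo(true))
                        // IllegalStateException is a stand-in for ImportException.
                        .throwException(IllegalStateException.class,
                            "Line/Unit: ${exchangeProperty.CamelSplitIndex}: Found Duplicates")
                .end()
                // do some processing
            .end();
    }
}
```

It is worth checking in your Camel version that the onCompletion block fires
once per incoming message and not once per split item, since clearing per
item would defeat the duplicate check.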





-- 
Claus Ibsen
-
@davsclaus
Camel in Action 2: https://www.manning.com/ibsen2


Re: camel-pulsar performance

2023-04-19 Thread Claus Ibsen
Hi

I think camel-pulsar acknowledges each message individually at the end of
routing, which can slow things down.

Kafka, by contrast, commits asynchronously in the background, every 5
seconds by default.

With camel-pulsar you can also use manual acknowledgement and then find a
way to batch the acks.
There may also be a way in Pulsar to ack via a background task, as Kafka
does, and maybe we can improve camel-pulsar to make this easier to do out
of the box.

Of course, if you ack asynchronously you can get duplicates: if the app
crashes and is restarted, Pulsar resumes from the last known "ack" position.
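
The "batch the acks" idea can be sketched independently of any broker API.
The class below is a hypothetical helper, not part of camel-pulsar or the
Pulsar client: it counts processed messages and invokes a caller-supplied
callback once per batch with the id of the last processed message. It
assumes the callback performs a cumulative acknowledgement (everything up to
and including that id), which in turn assumes in-order processing on a
subscription type that allows cumulative acks. How the callback is wired to
camel-pulsar's manual acknowledgement is left open here.

```java
import java.util.function.Consumer;

/** Hypothetical helper: triggers one acknowledgement per batch of processed messages. */
final class AckBatcher<T> {
    private final int batchSize;
    // Assumed to acknowledge everything up to and including the given id.
    private final Consumer<T> cumulativeAck;
    private T lastProcessed;
    private int pending;

    AckBatcher(int batchSize, Consumer<T> cumulativeAck) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.cumulativeAck = cumulativeAck;
    }

    /** Call after a message has been fully processed. */
    synchronized void onProcessed(T messageId) {
        lastProcessed = messageId;
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Acknowledge whatever is pending, for example from a timer or at shutdown. */
    synchronized void flush() {
        if (pending > 0) {
            cumulativeAck.accept(lastProcessed);
            pending = 0;
        }
    }
}
```

With a batch size of N, a crash can cause up to N - 1 already processed
messages to be redelivered, which is the duplicate risk described above.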



On Mon, Apr 17, 2023 at 5:04 PM Steve973  wrote:

> Hello.  I have been experimenting with the Camel Pulsar component as a more
> performant alternative to traditional JMS brokers.  I have seen performance
> comparisons that set Pulsar even above Kafka in most cases.  It is reported
> that Pulsar can handle (~3.5) millions of messages per second.  In my use
> case, I am sending very simple messages, where I have a couple of headers,
> and a payload that is a simple POJO with a string field and a map with
> between one and five entries, depending on the message.  I am using
> protobuf to de/serialize the message body.  I am seeing approximately one
> thousand messages per second.  I can only assume that it is "user error" on
> my part, but I was wondering if any of you have an example that
> demonstrates performance that is more on-par with the advertised message
> rate.  If not, how can I determine what is slowing down Pulsar's
> performance in my use case?
>
> Thanks,
> Steve
>




RE: Camel in Action: Microsoft Azure Files over public Internet

2023-04-19 Thread Petr Kuzel
Thank you Andrea for the hint,

I have sketched https://issues.apache.org/jira/browse/CAMEL-19279

  Hope it helps
  Cc.

-----Original Message-----
From: Andrea Cosentino 
Sent: Tuesday, April 18, 2023 09:21
To: users@camel.apache.org
Subject: Re: Camel in Action: Microsoft Azure Files over public Internet


Hello,

To me it makes sense to have a component supporting the SDK.

Open an issue for that, if you have time you could contribute it, otherwise
someone could take a look.

On Tue, Apr 18, 2023 at 09:18, Petr Kuzel wrote:

> Hi Chirag,
>
> well, this is not the first time I have heard that; apparently
> it is a common confusion in the community.
>
> Azure Blob Storage
>   Massively scalable and secure object storage for cloud-native workloads,
>   archives, data lakes, high-performance computing, and machine learning.
>   
>
> Azure Files
>   Simple, secure, and serverless enterprise-grade cloud file shares.
>   
>
> My question is about Azure Files.
>
>   Best regards
>   Cc.
>
> -----Original Message-----
> From: Chirag 
> Sent: Monday, April 17, 2023 17:30
> To: users@camel.apache.org
> Subject: Re: Camel in Action: Microsoft Azure Files over public Internet
>
>
> Shouldn't you be looking at
>
> https://camel.apache.org/components/3.20.x/azure-storage-blob-component.html
> ?
>
> Or a variation of it? The APIs are similar.
>
>
> ચિરાગ/चिराग/Chirag
> --
> Sent from My Gmail Account
>
> On Mon, Apr 17, 2023 at 11:18 AM Petr Kuzel
>  wrote:
> >
> > I have a new RFE which includes integrating
> > Microsoft Azure Files over public Internet.
> >
> > Initial findings and constraints:
> >
> >   - Azure Files do not implement the FTP standard.
> >   - Azure Files could expose SMB protocol but SMB over
> > public Internet is blacklisted by the security policy.
> >   - Azure Files could expose NFS but its pricing is prohibitive.
> >   - Azure Files have a REST API
> > <https://github.com/Azure/azure-rest-api-specs> and a Java SDK.
> >   - My team is used to Camel 3.x components.
> >
> > Given that, I see two options:
> >
> >   A: use Camel REST component.
> >   B: use Azure Files remote file component.
> >
> > Neither seems easy. For the Camel REST component,
> > I'd need to implement a polling consumer via REST and
> > match the FTPS component-like capabilities. For Azure Files,
> > I have not found a developed Camel remote file component
> > so its development would be required, i.e. likely a continuation
> > at the Camel dev list...
> >
> > First, have I overlooked any recommendable option that
> > could address the problem, please?
> >
> > Second, if left only with above two options, which approach
> > would look more promising from a Camel veteran perspective
> > and why, please?
> >
> >   Best regards
> >   Cc.
> >
> > --
> >   Mr. Petr Kužel, Software Engineer
> >   Eurofins International Support Services s.à r.l.
> >   Val Fleuri 23
> >   L-1526 LUXEMBOURG
> >
>


Re: camel-pulsar performance

2023-04-19 Thread Steve973
Thanks, Claus.  I have also tried Kafka in my app for comparison purposes,
and it seems to be a little bit more performant.

I saw something called Starlight for JMS that does JMS over Pulsar, and
gets about a million messages per second.  I'm considering writing a camel
component for that.  I'm surprised, though, that I'm not getting anywhere
near a million messages per second (or even a million messages in ten
seconds) with either Pulsar or Kafka.  With a small payload, what
throughput should I be seeing?



Re: camel-pulsar performance

2023-04-19 Thread Neal Feierabend
Hello Steve,

I'm a relative novice with both Pulsar and Camel so I could be completely
wrong, but I wonder if this may have more to do with Pulsar than Camel.
What kind of hardware are you testing on? Most of the benchmarks I've seen
that talk about 1 million or more messages/second are usually clusters with
really fast hardware. This one from Streamnative (
https://streamnative.io/blog/apache-pulsar-vs-apache-kafka-2022-benchmark)
for example, is using 3 servers with 24 cores and 192G of memory for the
Pulsar brokers plus separate machines for the clients. Have you tried just
using the native Java client to see what kind of performance you get with
the same Pulsar cluster? You might also look at using Pulsar Perf (
https://pulsar.apache.org/docs/2.11.x/performance-pulsar-perf/) to do some
performance testing to understand what you can expect from Pulsar on the
hardware you're using. Hope that helps some!

Neal Feierabend  (he/him/his)
*IT Developer Team Lead*
*Virginia Tech Transportation Institute*




Re: camel-pulsar performance

2023-04-19 Thread Steve973
Thanks, Neal.  This is my first foray into Pulsar, so you are probably
right.
