[ 
https://issues.apache.org/jira/browse/CAMEL-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18072450#comment-18072450
 ] 

Thomas Gantenbein commented on CAMEL-23267:
-------------------------------------------

Thanks, [~nfilotto], I've tested with the main branch after your PR got merged 
and confirm that it works with my reproducer outlined in the Zulip thread. The 
time it takes to compress the lastChanges queue is negligible even in my test 
scenario where the files are not even read, just moved around. Some more 
comments in 
https://github.com/apache/camel/pull/22311#issuecomment-4217703840.

Thanks again for taking a look at this!

> Lightweight inProgressRepository for file component
> ---------------------------------------------------
>
>                 Key: CAMEL-23267
>                 URL: https://issues.apache.org/jira/browse/CAMEL-23267
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-core, camel-file
>            Reporter: Thomas Gantenbein
>            Assignee: Nicolas Filotto
>            Priority: Minor
>             Fix For: 4.14.6, 4.18.2, 4.19.0
>
>
> See also [#camel > file consumer does not release items in 
> SimpleLRUCache|https://camel.zulipchat.com/#narrow/channel/257298-camel/topic/file.20consumer.20does.20not.20release.20items.20in.20SimpleLRUCache/with/581658577]
> *Observation*
> By default, the file component uses a MemoryIdempotentRepository as its 
> inProgressRepository. This, in turn, uses an instance of SimpleLRUCache by 
> default. The SimpleLRUCache maintains a Map with the values themselves 
> ("delegate") as well as a queue of recent changes ("lastChanges").
> The GenericFileOnCompletion _does_ remove the absolute path of the processed 
> file from the "delegate" in the SimpleLRUCache, but it does not remove it 
> from the lastChanges queue. So once the queue is full (2 times the cache 
> capacity = 100'000 entries), the lastChanges queue gets iterated over and 
> copied whenever a file is added to the inProgressRepository, and its size 
> never drops below 100'000 again.
> This leads to high CPU load whenever an endpoint processes many files after 
> it has already processed 100'000 files.
> *Proposed solution*
> Replace the SimpleLRUCache with a LinkedHashMap as the backing store of the 
> MemoryIdempotentRepository used as the default inProgressRepository of the 
> file component.
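
As a rough illustration of the proposed solution quoted above, the sketch below shows how a LinkedHashMap can serve as a bounded backing store in which removing a key leaves nothing behind. The class name BoundedInProgressStore and its methods are hypothetical and chosen only for this example; this is not the code merged in the PR, and the wrapping in Collections.synchronizedMap is an assumption about how thread safety might be handled.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not a class from Apache Camel.
final class BoundedInProgressStore {

    private final Map<String, Boolean> entries;

    BoundedInProgressStore(final int capacity) {
        // accessOrder=false keeps insertion order, so the eldest entry is the
        // one that was added first. removeEldestEntry is consulted after every
        // insertion and evicts that entry once the capacity is exceeded.
        this.entries = Collections.synchronizedMap(
                new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                        return size() > capacity;
                    }
                });
    }

    // Returns true if the key was not yet present (the file is now "in progress").
    boolean add(String key) {
        return entries.putIfAbsent(key, Boolean.TRUE) == null;
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }

    // Removes the key entirely; there is no separate change log to clean up.
    boolean remove(String key) {
        return entries.remove(key) != null;
    }

    int size() {
        return entries.size();
    }
}
```

With a single map there is only one structure to keep in sync, so a remove call after a file has been processed frees the entry immediately and the store cannot accumulate stale bookkeeping.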



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
