Hi

At the risk of being seen as a n00b (again)…

We are processing large XML files (0.5 GB, ~500,000 records).
To process them we use stream caching, split, parallel processing, XPath and a 
bean.

We get a lot of OutOfMemoryErrors, and after analysing them we see that the call 
to the bean method is the villain.

The process is to split() using tokenizeXML() on a tag that makes up one record 
in the XML.

For each of these records we call a bean whose method uses @XPath on 
the method parameters.

We see in the heap dump that these calls are never GC'd, we have 90% leftovers
[image.png]

The question is: is this related to the bean/method not being thread safe, or 
what could be the reason?
The documentation states that the default behaviour is a singleton and that, 
when used in concurrent processing, it must be thread safe…
https://camel.apache.org/components/3.11.x/bean-component.html#_options

Running as a war under Tomcat 9 on Windows using Camel 3.11.3 and Spring Boot 
2.5.6.
Server has 32GB of RAM…

Route:
from(file("Full"))
    .streamCaching()
    .unmarshal()
        .zipFile()
    .split()
        .tokenizeXML("RefData")
        .streaming()
        .parallelProcessing(false)
        .bean(XmlToSqlBean.class)
        .to(jdbc("default"))
    .end();

Bean:
public class XmlToSqlBean {
    public String toSql(@XPath("//FinInstrmGnlAttrbts/Id") final String isin,
                        @XPath("//NtnlCcy") final String currency,
                        @XPath("//FullNm") final String fullName,
                        @XPath("//TradgVnRltdAttrbts/Id") final String venue,
                        @XPath("//ClssfctnTp") final String classification,
                        @XPath("//TradgVnRltdAttrbts/TermntnDt") final String terminationDate,
                        @XPath("//Issr") final String issuer,
                        @XPath("//MtrtyDt") String maturityDate,
                        @XPath("//TermntdRcrd") final String termnRecord,
                        @XPath("//NewRcrd") final String newRecord) {
        …
    }
}
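
For anyone who wants to reproduce the extraction outside Camel, here is a
minimal sketch of what I assume each @XPath parameter boils down to for a
single record, using only the JDK's javax.xml.xpath API. It is not how Camel
implements @XPath internally; the class name, the helper method and the sample
record are made up for illustration, and the sample has no namespaces, which
may not match the real files.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Camel or of our application.
public class RecordXPathSketch {

    // Made-up sample record; element names are taken from the XPath expressions in the bean.
    static final String SAMPLE =
        "<RefData>"
      + "<FinInstrmGnlAttrbts><Id>XX0000000001</Id><FullNm>Sample</FullNm>"
      + "<NtnlCcy>EUR</NtnlCcy></FinInstrmGnlAttrbts>"
      + "<TradgVnRltdAttrbts><Id>XABC</Id></TradgVnRltdAttrbts>"
      + "</RefData>";

    // Evaluates one XPath expression against the XML text of one record and
    // returns the string value of the first match ("" if nothing matches).
    // A fresh XPath object is created on every call because javax.xml.xpath
    // objects are not thread safe; nothing is cached or shared between calls.
    static String extract(String recordXml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(recordXml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath or unparsable record: " + expression, e);
        }
    }
}
```

Calling RecordXPathSketch.extract(RecordXPathSketch.SAMPLE, "//NtnlCcy") on
the sample record gives "EUR". Creating the factory per call is slow, but it
keeps the sketch free of shared state, so any memory retained in a test like
this could not come from the XPath objects themselves.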

Thanks

/M
