On Fri, Nov 12, 2021 at 8:03 AM Mikael Andersson Wigander <mikael.andersson.wigan...@pm.me.invalid> wrote:
> Ok thanks.
>
> But the outcome is very obvious when I analyse the heap dump.
> The pool.clear() is never executed and the memory is filled with the
> orphaned pool.
> If the pool is to be used as a placeholder for the expressions, then why
> are there not only 10 items in the pool? My pool after importing records is
> huge, 10 times the number of records in the split, I would say.

Yeah, the pool should only be as large as the number of concurrent threads at
peak. How many items do you have? And can you put together a reproducer
example and put it on GitHub or attach a .zip to the JIRA?

> /M
>
> On Fri, Nov 12, 2021 at 07:52, Claus Ibsen <claus.ib...@gmail.com> wrote:
>
> Hi
>
> Keep in mind that evaluating XPath in Java is not thread safe, so you end up
> having to create an instance of XPathExpression per XPath you want to
> execute.
> And your bean has 10 XPaths, so that is 10 per message, so you end up
> with 10 XPathExpression instances in memory.
> And all legacy XML from the JDK/JVM is memory hungry (DOM, JAXB etc).
>
> And if you turn on parallel processing, then you multiply this by another
> 10 or more, depending on the number of concurrent threads etc.
> In this case you can make the argument that the pool should be able to
> shrink in case there was a spike of concurrent processing which later is no
> longer needed, because then the pool has too many free elements.
>
> Also, as Alex mentions, a StAX-based parser may be better, which you
> use with tokenizeXML.
>
> You talk about a leak, but is that really a leak? The XPath instance is
> pooled so it can be re-used for the next message. So what you see in
> memory is those 10 XPathExpression instances.
> If they are cleared after processing a message, then you end up having to
> re-create the XPathExpression for the next message, and then you have more
> CPU usage and also more pressure on the GC
> to de-allocate those 10 XPathExpression instances per message.
>
> On Mon, Nov 8, 2021 at 8:12 PM Mikael Andersson Wigander
> <mikael.andersson.wigan...@pm.me.invalid> wrote:
>
>> Hi
>>
>> With the risk of being seen as a n00b (again)…
>>
>> We are processing large XML files (0.5 GB / ~500,000 records).
>> To process them we use stream caching, split, parallel processing, xpath
>> and a bean.
>>
>> We get a lot of OutOfMemoryErrors, and after analysing we see that the
>> call to the bean method is the villain.
>>
>> The process is to split() using tokenizeXML() on a tag that makes up one
>> record in the XML.
>>
>> For each of these records we call a bean where the method utilises
>> @XPath() on the method parameters.
>>
>> We see in the heap dump that these calls are never GC'd; we have 90%
>> leftovers.
>> [image: image.png]
>>
>> The question is: is this related to a bean/method that is not thread safe,
>> or what could be the reason?
>> The documentation states the default behaviour is a singleton, and when
>> used in concurrent processing it must be thread safe…
>> https://camel.apache.org/components/3.11.x/bean-component.html#_options
>>
>> Running as a war under Tomcat 9 on Windows using Camel 3.11.3 and Spring
>> Boot 2.5.6.
>> Server has 32GB of RAM…
>>
>> Route:
>>     from(file("Full"))
>>         .streamCaching()
>>         .unmarshal()
>>         .zipFile()
>>         .split()
>>         .tokenizeXML("RefData")
>>         .streaming()
>>         .parallelProcessing(false)
>>         .bean(XmlToSqlBean.class)
>>         .to(jdbc("default"))
>>         .end();
>>
>> Bean:
>>     public class XmlToSqlBean {
>>         public String toSql(@XPath("//FinInstrmGnlAttrbts/Id") final String isin,
>>                             @XPath("//NtnlCcy") final String currency,
>>                             @XPath("//FullNm") final String fullName,
>>                             @XPath("//TradgVnRltdAttrbts/Id") final String venue,
>>                             @XPath("//ClssfctnTp") final String classification,
>>                             @XPath("//TradgVnRltdAttrbts/TermntnDt") final String terminationDate,
>>                             @XPath("//Issr") final String issuer,
>>                             @XPath("//MtrtyDt") String maturityDate,
>>                             @XPath("//TermntdRcrd") final String termnRecord,
>>                             @XPath("//NewRcrd") final String newRecord) {
>>             …
>>         }
>>     }
>>
>>
>> Thanks
>>
>> /M
>
> --
> Claus Ibsen
> -----------------
> http://davsclaus.com @davsclaus
> Camel in Action 2: https://www.manning.com/ibsen2

--
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2
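
For readers following the pool discussion in this thread, here is a minimal sketch of why a pool of XPathExpression instances should stay about as large as the peak number of concurrent threads. It assumes a simple queue-backed pool and is not Camel's actual implementation: XPathPoolSketch and its methods are hypothetical names, and the sample record in main is made up for illustration.

```java
import java.io.StringReader;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical illustration only; this is not Camel's pooling code. */
final class XPathPoolSketch {
    private final String expression;
    // Idle, already compiled expressions waiting to be re-used.
    private final Queue<XPathExpression> pool = new ConcurrentLinkedQueue<>();
    // Counts how many expressions were ever compiled by this instance.
    private final AtomicInteger created = new AtomicInteger();

    XPathPoolSketch(String expression) {
        this.expression = expression;
    }

    String evaluate(String xml) {
        // XPathExpression is not thread safe, so each caller borrows its own
        // instance and compiles a new one only when the pool is empty.
        XPathExpression expr = pool.poll();
        try {
            if (expr == null) {
                expr = XPathFactory.newInstance().newXPath().compile(expression);
                created.incrementAndGet();
            }
            return expr.evaluate(new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        } finally {
            // Return the instance so the next message can re-use it.
            if (expr != null) {
                pool.add(expr);
            }
        }
    }

    int createdCount() {
        return created.get();
    }

    int idleCount() {
        return pool.size();
    }

    public static void main(String[] args) {
        XPathPoolSketch isin = new XPathPoolSketch("//FinInstrmGnlAttrbts/Id");
        // Made-up sample record, shaped like the RefData tokens in the route.
        String record = "<RefData><FinInstrmGnlAttrbts><Id>X1</Id></FinInstrmGnlAttrbts></RefData>";
        for (int i = 0; i < 3; i++) {
            System.out.println(isin.evaluate(record));
        }
        System.out.println("created=" + isin.createdCount() + ", idle=" + isin.idleCount());
    }
}
```

With sequential processing, each borrow is followed by a return, so only one instance per XPath is ever compiled. If a heap dump shows the number of pooled instances growing with the number of records instead, borrowed instances are probably not being matched on return, which is what a reproducer would help confirm.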