Cris,

On 6/1/21 09:17, Berneburg, Cris J. - US wrote:
Hi Chris

[lots of snippage]

cb> One of our web apps is using a "lot" of memory, specifically a big
cb> user query.  We'd like to find out why.
cb> 1. Is there a way to analyze uncollected garbage?
cb> * AWS EC2 instance.
cb> * There are other TC instances on the same server.
cb> * Each TC instance has multiple apps.

cs> What's the goal? Do you just Want To Know, or are you trying
cs> to solve an actual problem?

a. Barely enough memory to distribute among the multiple TC instances
and the apps they support.  A big enough user query (no throttling)
causes OOME's.  Attempting to determine if the code is being wasteful
in some way and therefore could be made more efficient.

What's the database? And the driver?

MySQL Connector/J used to (still does?) read 100% of the results into the heap before Statement.executeQuery() returns unless you specifically tell it not to. So if your query returns 1M rows, you might bust your heap.

It's entirely possible that other drivers do similar things.
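
If it is Connector/J, the documented way to avoid that buffering is a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE. A minimal sketch, assuming the usual java.sql imports, an already-open Connection conn, and a made-up query/handler:

  // Ask Connector/J to stream rows instead of buffering the whole result set.
  Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                        ResultSet.CONCUR_READ_ONLY);
  stmt.setFetchSize(Integer.MIN_VALUE);
  try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM big_table")) {
      while (rs.next()) {
          process(rs.getLong("id"), rs.getString("name")); // hypothetical handler
      }
  }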

b. It's a dev app server (EC2) which hosts diff stages in the dev
process - dev, test, and prototype streams.  Multiple TC instances
(3) because multiple copies of the apps don't play nice with each
other.  That is, we can't just rename the WAR files and expect the
deployed apps to stay inside that context name (I think).

You might want to look into that, eventually. If they aren't playing together nicely, they are not "good" servlet citizens. Solving those issues may improve other things. *shrug*
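
If renaming the WAR itself is the sticking point, one sketch of an alternative (file and path names below are made up) is a per-context deployment descriptor, where the descriptor's file name sets the context path and docBase points at the unmodified WAR:

  <!-- conf/Catalina/localhost/myapp-test.xml  => deployed at /myapp-test -->
  <Context docBase="/opt/wars/myapp.war" />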

c. I don't want to debug the code. I'm relatively new to the project, unfamiliar with some of the code, and anticipate getting
lost in the weeds.  See point #1 below.  ;-)

cs> If you have a bunch of garbage that's not being cleaned up,
cs> usually it's because there is simply no need to do so. The GC
cs> is behaving according to the 3 laws of rob..., er, 3 virtues of
cs> computing[1]:
cs>
cs>    1. Laziness: nothing needs that memory so... meh
cs>    2. Impatience: gotta clean that Eden space quick
cs>    3. Hubris: if I ever need more memory, I know where to find it

cs> [1] http://threevirtues.com/

Ha ha ha!  :-)

cs> How long does the query take to run?

Dunno about the time on the DB query itself.  From the user's point of view, a 
full minute plus.

cs> What kind of query is it? Are we talking about something like SQL

Yup.  Classic RDBMS back-end.

cs> or some in-memory database or something which really does
cs> take a lot of memory for the application to fulfill the request?

Nah, nothing that fancy.  The only fancy part is using node.js for the 
front-end.

I followed Amit's and John's suggestion of using Eclipse Memory
Analyzer Tool's "keep unreachable objects" option when running a query
from the app client.  Digging deeper into the Leak Suspects Report, I saw
a StringBuilder - 264MB for the supporting byte array and 264MB for
the returned String, about 790MB total for that piece of the pie.
Contents were simply the JSON query results returned to the client.
No mystery there.

Yep: runaway string concatenation. This is a devolution of the "Connector/J reads the whole result set into memory before returning" thing I mentioned above. Most JSON endpoints return arbitrarily large JSON responses and most client applications just go "duh, read the JSON, then process it".

If your JSON is big, well, then you need a lot of memory to store it all if that's how you do things.

If you want to deal with JSON at scale, you need to process it in a streaming fashion. The only library I know that can do streaming JSON is Noggit, which was developed for use with Solr (I think, maybe it came from elsewhere before that). Anyway, it's ... not for the faint of heart. But if you can figure it out, you can handle petabytes of JSON with a tiny heap.
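
To illustrate the streaming idea without getting into Noggit specifics, here is a minimal sketch using Jackson's streaming JsonGenerator instead (my substitution, not something from this thread; the OutputStream and column names are made up). The point is that rows are written straight to the response stream as they are read, so the full JSON never exists as one giant String:

  // Write each row directly to the output stream; no giant StringBuilder.
  JsonGenerator gen = new JsonFactory().createGenerator(out); // com.fasterxml.jackson.core
  gen.writeStartArray();
  while (rs.next()) {
      gen.writeStartObject();
      gen.writeNumberField("id", rs.getLong("id"));        // made-up columns
      gen.writeStringField("name", rs.getString("name"));
      gen.writeEndObject();
  }
  gen.writeEndArray();
  gen.close();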

I suspect that repeating the process with multiple queries will
reveal multiple StringBuilder's each containing big honking JSON
results.

Very likely. You might want to throttle/serialize queries you expect to have big responses so that only e.g. 2 of them can be running at a time. Maybe all is well when they come one-at-a-time, but if you try to handle 5 concurrent "big responses" you bust your heap.

The short-term solution is to simply not allow that much concurrency.
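
A minimal sketch of that kind of throttle, using a plain java.util.concurrent.Semaphore (the gate, the permit count, and the executeAndSerialize() method are all made up for illustration):

  // Allow at most 2 of the known-huge queries to run concurrently.
  private static final Semaphore BIG_QUERY_GATE = new Semaphore(2);

  String runBigQuery() throws InterruptedException {
      BIG_QUERY_GATE.acquire();          // blocks while 2 are already in flight
      try {
          return executeAndSerialize();  // hypothetical existing query/serialization code
      } finally {
          BIG_QUERY_GATE.release();
      }
  }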

So the issue may not be a problem with efficiency so much as one of
simple memory hogging.

This can happen with almost any application. We serve ... many clients simultaneously and we run with a maximum of 10 connections in our db connection pool on each app server. It sounds "too small" but it handles the user-traffic without any user complaints.
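
For reference, capping the pool that way is a single attribute on the JNDI <Resource> definition. A sketch assuming Tomcat's default DBCP2-based pool and a MySQL backend (all names, credentials, and the URL are invented):

  <Resource name="jdbc/AppDB" auth="Container" type="javax.sql.DataSource"
            driverClassName="com.mysql.cj.jdbc.Driver"
            url="jdbc:mysql://db.example.com:3306/app"
            username="app" password="secret"
            maxTotal="10" maxWaitMillis="30000"/>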

(At least StringBuilder is being
used instead of plus-sign String concatenation.)

In Java "..." + "..." uses a StringBuilder (unless the compiler can perform the concatenation at compile-time, in which case there is no concatenation at all). In some code, "..." + "..." is just fine and I hate it when someone replaces it with:

  String foo = new StringBuilder("bar").append("baz").toString();

because the compiler does the _exact same thing_ and you've just made the code more difficult to read.

But if that were in a *loop*, then replacing it with a StringBuilder is pretty important for performance, otherwise the compiler will do something stupid like this:

String foo = "";
for(String s : [ "a", "b", "c", "d" ... ] ) {
  StringBuilder temp = new StringBuilder(foo);
  temp.append(s);
  foo = temp.toString();
}
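
For contrast, the hand-written StringBuilder version keeps one builder alive across the whole loop:

  StringBuilder sb = new StringBuilder();
  for (String s : new String[] { "a", "b", "c", "d" /* ... */ }) {
      sb.append(s);   // no per-iteration StringBuilder/String churn
  }
  String foo = sb.toString();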

Anyhow, the use of StringBuilder is essentially a requirement so don't read too much into it being in your heap. You might actually have to start reading some code (shiver!).

-chris
