Re: Performance with many small requests
All,

On 5/12/2009 9:38 AM, Caldarale, Charles R wrote:
> Might be interesting to modify it to run with more cores, if you have
> a system available.

Here are the results I got on two different systems. Note that I compiled the test code using the 1.5 JVM, though it shouldn't matter at all. I ran all these tests with little to no load on the server (I stopped all TC instances to keep people from hitting them and wasting server time :)

SYSTEM 1: 32-bit GNU/Linux kernel 2.6.14
model name : AMD Athlon(tm) XP 1700+
cpu MHz    : 1470.260
bogomips   : 2945.26

*** Java 1.5/client

$ java -version
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05)
Java HotSpot(TM) Client VM (build 1.5.0_13-b05, mixed mode)

$ java TestSynch 1
secondary atomic time: 2010; ticks: 48157094
primary atomic time: 1981; ticks: 51842907
primary synchronized time: 40940; ticks: 49988735
secondary synchronized time: 40850; ticks: 50011266

$ java TestSynch 1
secondary atomic time: 2032; ticks: 49652307
primary atomic time: 1997; ticks: 50347694
primary synchronized time: 41086; ticks: 55617866
secondary synchronized time: 40998; ticks: 44382135

*** Java 1.5/server

$ java -version -server
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05)
Java HotSpot(TM) Server VM (build 1.5.0_13-b05, mixed mode)

$ java -server TestSynch 1
secondary atomic time: 897; ticks: 47771660
primary atomic time: 860; ticks: 52228341
primary synchronized time: 37749; ticks: 49503874
secondary synchronized time: 37644; ticks: 50496127

$ java -server TestSynch 1
primary atomic time: 882; ticks: 55689446
secondary atomic time: 955; ticks: 44310555
primary synchronized time: 39245; ticks: 45526991
secondary synchronized time: 39350; ticks: 54473010

*** Java 1.6/client

$ java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode, sharing)

$ java TestSynch 1
secondary atomic time: 959; ticks: 47824199
primary atomic time: 980; ticks: 52175802
primary synchronized time: 26029; ticks: 56339037
secondary synchronized time: 24232; ticks: 43660964

$ java TestSynch 1
secondary atomic time: 1050; ticks: 47887651
primary atomic time: 1020; ticks: 52112350
secondary synchronized time: 25947; ticks: 42345253
primary synchronized time: 26042; ticks: 57654748

*** Java 1.6/server

$ java -server -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode, sharing)

$ java -server TestSynch 1
secondary atomic time: 973; ticks: 46198801
primary atomic time: 942; ticks: 53801200
secondary synchronized time: 449; ticks: 3906780
primary synchronized time: 2256; ticks: 96093221

$ java -server TestSynch 1
secondary atomic time: 928; ticks: 55025620
primary atomic time: 924; ticks: 44974381
primary synchronized time: 2672; ticks: 44065122
secondary synchronized time: 2568; ticks: 55934879

SYSTEM 2: 32-bit Windows Vista SP1
Processor: Core 2 Duo "Merom" T7500 (2.2GHz)

C:\Users\chris\Desktop>java -client -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode, sharing)

C:\Users\chris\Desktop>java -client TestSynch 1
primary atomic time: 4034; ticks: 52861711
secondary atomic time: 4034; ticks: 47138290
secondary synchronized time: 21446; ticks: 50758159
primary synchronized time: 21446; ticks: 49241842

C:\Users\chris\Desktop>java -client TestSynch 1
primary atomic time: 4351; ticks: 45396375
secondary atomic time: 4351; ticks: 54603626
secondary synchronized time: 18824; ticks: 50273205
primary synchronized time: 18824; ticks: 49726796

Oddly enough, I don't have the "server" VM installed, so I can't check that performance right now.

It seems that on my two systems, atomics are faster than language-level synchronization, regardless of client versus server, or 1.5 versus 1.6 (though using -server and using 1.6 each give a significant performance boost to the language-level synchronization). I didn't find this code to exhibit high lock contention: there are only two threads at work, though they are doing nothing but acquiring locks (and incrementing integers, which should be trivial).

-chris

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
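Chuck's TestSynch attachment is not reproduced in this archive. For readers who want to try a comparison of the same general kind, here is a minimal sketch, not the original program: two threads increment one shared counter, once through an AtomicInteger and once inside a synchronized block. The class name, method name, loop count and output format are all assumptions made for illustration, and a quick loop like this is only a rough indicator, as the run-to-run variation in the numbers above suggests.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the TestSynch attachment; not the original code.
public class SynchVsAtomicSketch {

    /**
     * Two threads each perform perThread increments on one shared counter.
     * Returns { elapsed milliseconds, final counter value }.
     */
    static long[] timeIncrements(final boolean useAtomic, final int perThread) {
        final AtomicInteger atomicCounter = new AtomicInteger();
        final int[] lockedCounter = { 0 };
        final Object lock = new Object();

        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    if (useAtomic) {
                        atomicCounter.incrementAndGet();
                    } else {
                        synchronized (lock) {
                            lockedCounter[0]++;
                        }
                    }
                }
            }
        };

        Thread primary = new Thread(work, "primary");
        Thread secondary = new Thread(work, "secondary");
        long start = System.nanoTime();
        primary.start();
        secondary.start();
        try {
            primary.join();
            secondary.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while timing", e);
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        // join() guarantees the workers' writes are visible here.
        long total = useAtomic ? atomicCounter.get() : lockedCounter[0];
        return new long[] { elapsedMs, total };
    }

    public static void main(String[] args) {
        int perThread = 5000000; // arbitrary loop count chosen for this sketch
        // Warm-up passes so the JIT has compiled the loop before timing.
        timeIncrements(true, perThread);
        timeIncrements(false, perThread);
        long[] atomic = timeIncrements(true, perThread);
        long[] locked = timeIncrements(false, perThread);
        System.out.println("atomic time: " + atomic[0] + "; total: " + atomic[1]);
        System.out.println("synchronized time: " + locked[0] + "; total: " + locked[1]);
    }
}
```

Both variants must end with the same total (twice the per-thread count); only the elapsed time should differ, and it is worth repeating the run several times before drawing conclusions.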
Re: [OT] Performance with many small requests
On 13-May-2009, at 09:16, David kerber wrote:
> Christopher Schultz wrote:
>> ...
>>> Since a String object is immutable, one should always use a StringBuffer (preferably a StringBuilder, these days) when you are constructing strings in a piecemeal fashion, then convert to String when complete.
>>
>> This advice is good when constructing a long string in a loop or across a long series of statements. If you are just concatenating a bunch of strings together to make a big one in a single statement, go ahead and use the + operator: the compiler is smart enough to convert the entire thing into a StringBuilder (a non-synchronized replacement for StringBuffer) expression that gets good performance without making your code look like crap.
>
> I was wondering about that. It certainly seemed like a good place for a smart compiler to fix things up, but didn't know if it actually did or not. I don't do a lot of that, but enough of it that it becomes a style issue.

If in doubt, write a small test case for each variant, run each one repeatedly for a fixed period of time, and see which one had the most completions. The one with the most completions is likely to be the faster.

André-John
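The "count completions in a fixed period" approach suggested above can be sketched roughly as follows. This is an illustration only: the class name, the two sample tasks and the durations are hypothetical, and a naive loop like this is easily distorted by JIT warm-up and garbage collection, so treat the counts as a rough guide.

```java
// Hypothetical helper for the "most completions in a fixed time" comparison.
public class CompletionCounter {

    /** Runs task repeatedly for roughly the given duration; returns how many runs finished. */
    static long completionsIn(long millis, Runnable task) {
        long deadline = System.nanoTime() + millis * 1000000L;
        long completions = 0;
        // Subtracting before comparing keeps the check correct if nanoTime wraps.
        while (deadline - System.nanoTime() > 0) {
            task.run();
            completions++;
        }
        return completions;
    }

    public static void main(String[] args) {
        final String[] parts = { "alpha", "beta", "gamma", "delta" };

        // Variant 1: += inside a loop, creating a new String on every pass.
        Runnable plusInLoop = new Runnable() {
            public void run() {
                String s = "";
                for (int i = 0; i < parts.length; i++) {
                    s += parts[i];
                }
                if (s.length() == 0) throw new IllegalStateException();
            }
        };

        // Variant 2: one StringBuilder, converted to a String at the end.
        Runnable builderInLoop = new Runnable() {
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < parts.length; i++) {
                    sb.append(parts[i]);
                }
                if (sb.toString().length() == 0) throw new IllegalStateException();
            }
        };

        // Warm-up passes, then the measured passes.
        completionsIn(200, plusInLoop);
        completionsIn(200, builderInLoop);
        System.out.println("+= in loop:    " + completionsIn(1000, plusInLoop));
        System.out.println("StringBuilder: " + completionsIn(1000, builderInLoop));
    }
}
```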
Re: [OT] Performance with many small requests
Christopher Schultz wrote:
> ...
>> Since a String object is immutable, one should always use a StringBuffer (preferably a StringBuilder, these days) when you are constructing strings in a piecemeal fashion, then convert to String when complete.
>
> This advice is good when constructing a long string in a loop or across a long series of statements. If you are just concatenating a bunch of strings together to make a big one in a single statement, go ahead and use the + operator: the compiler is smart enough to convert the entire thing into a StringBuilder (a non-synchronized replacement for StringBuffer) expression that gets good performance without making your code look like crap.

I was wondering about that. It certainly seemed like a good place for a smart compiler to fix things up, but didn't know if it actually did or not. I don't do a lot of that, but enough of it that it becomes a style issue.

D
Re: [OT] Performance with many small requests
Chuck,

On 5/12/2009 5:03 PM, Caldarale, Charles R wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> When (what java version) did those string operation optimizations
>> happen? Sun's web page that talks about this (and explicitly says
>> that string buffers are usually faster than direct string
>> operations) doesn't mention a specific java version.
>
> Don't confuse a StringBuffer (the recommended type) with a byte array
> (what Chris was talking about).

Right. People used to write code like this:

String s = ...;
char[] c = s.toCharArray();
for(int i=0; i

> Since a String object is immutable, one should always use a
> StringBuffer (preferably a StringBuilder, these days) when you are
> constructing strings in a piecemeal fashion, then convert to String
> when complete.

This advice is good when constructing a long string in a loop or across a long series of statements. If you are just concatenating a bunch of strings together to make a big one in a single statement, go ahead and use the + operator: the compiler is smart enough to convert the entire thing into a StringBuilder (a non-synchronized replacement for StringBuffer) expression that gets good performance without making your code look like crap.

-chris
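To make the distinction in the message above concrete, here is a small sketch of the two cases: a single-statement concatenation, where + is fine, and piecemeal construction in a loop, where one StringBuilder is reused. The class name, method names and sample values are hypothetical and exist only for illustration.

```java
// Illustration of "+ in one statement" versus "StringBuilder in a loop".
public class ConcatSketch {

    /** One expression: the compiler handles the whole concatenation, so + reads best. */
    static String describe(String host, int port, String path) {
        return "http://" + host + ":" + port + path;
    }

    /** Piecemeal construction: append to one StringBuilder, convert to String once. */
    static String commaSeparated(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "example.invalid" is a placeholder host name.
        System.out.println(describe("example.invalid", 8080, "/log"));
        System.out.println(commaSeparated(new int[] { 1, 2, 3 }));
    }
}
```

Using += inside the loop instead would build a fresh intermediate String on every pass, which is the situation the StringBuffer/StringBuilder advice is aimed at.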
Re: Performance with many small requests
Caldarale, Charles R wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> When (what java version) did those string operation optimizations happen? Sun's web page that talks about this (and explicitly says that string buffers are usually faster than direct string operations) doesn't mention a specific java version.
>
> Don't confuse a StringBuffer (the recommended type) with a byte array (what Chris was talking about).
>
> Since a String object is immutable, one should always use a StringBuffer (preferably a StringBuilder, these days) when you are constructing strings in a piecemeal fashion, then convert to String when complete.

Thanks for clarifying; I thought he was referring to a StringBuilder.

D
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> When (what java version) did those string operation optimizations
> happen? Sun's web page that talks about this (and explicitly says
> that string buffers are usually faster than direct string operations)
> doesn't mention a specific java version.

Don't confuse a StringBuffer (the recommended type) with a byte array (what Chris was talking about).

Since a String object is immutable, one should always use a StringBuffer (preferably a StringBuilder, these days) when you are constructing strings in a piecemeal fashion, then convert to String when complete.

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
Re: Performance with many small requests
Christopher Schultz wrote:
> On May 12, 2009, at 13:09, "Caldarale, Charles R" wrote:
>>> From: David kerber [mailto:dcker...@verizon.net]
>>> Subject: Re: Performance with many small requests
>>>
>>> From these tests, it looks like, under Windows XP and Java 1.5 anyway, that atomics are always faster
>>
>> Try it under 1.6; Sun made major improvements to synchronization handling between 1.5 and 1.6. When I reran my tests on 1.5 (which I don't use these days), I got numbers similar to yours. 1.6 is much, much faster.
>
> This reminds me of performance "optimizations" that people used to make in their Java code, such as converting String objects to byte arrays to do operations on them because "everyone knew" that it was faster. Then, Sun came along and optimized the String API implementation, causing all those "optimizations" to then be slower than the straightforward implementations of string ops. That "optimized" code also has the added advantage of being confusing to read.

When (what java version) did those string operation optimizations happen? Sun's web page that talks about this (and explicitly says that string buffers are usually faster than direct string operations) doesn't mention a specific java version.

> I agree with Chuck's assertion that understandability ought to be a more important goal than maximum possible performance.

That's going to depend on the application's intended use.

Dave
Re: Performance with many small requests
On May 12, 2009, at 13:09, "Caldarale, Charles R" wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> From these tests, it looks like, under Windows XP and Java 1.5 anyway, that atomics are always faster
>
> Try it under 1.6; Sun made major improvements to synchronization handling between 1.5 and 1.6. When I reran my tests on 1.5 (which I don't use these days), I got numbers similar to yours. 1.6 is much, much faster.

This reminds me of performance "optimizations" that people used to make in their Java code, such as converting String objects to byte arrays to do operations on them because "everyone knew" that it was faster. Then, Sun came along and optimized the String API implementation, causing all those "optimizations" to then be slower than the straightforward implementations of string ops. That "optimized" code also has the added advantage of being confusing to read.

I agree with Chuck's assertion that understandability ought to be a more important goal than maximum possible performance.

-chris
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> How difficult are keepalives to implement?

That would depend on your client. Looks like the Apache http client supports it, but I haven't used it.

- Chuck
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> That's good to know; that would be an incentive for me to migrate this
> app to 1.6 and Tomcat 6.

You don't need to move to Tomcat 6 to use a 1.6 JVM; you can use the Tomcat you already have.

- Chuck
Re: Performance with many small requests
Caldarale, Charles R wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> From these tests, it looks like, under Windows XP and Java 1.5 anyway, that atomics are always faster
>
> Try it under 1.6; Sun made major improvements to synchronization handling between 1.5 and 1.6. When I reran my tests on 1.5 (which I don't use these days), I got numbers similar to yours. 1.6 is much, much faster.

That's good to know; that would be an incentive for me to migrate this app to 1.6 and Tomcat 6.

> Also, what is your CPU type? Intel and AMD may have significant differences, as may 32- vs 64-bit.

AMD 64 x2, running 32-bit Windows XP

> - Chuck
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> From these tests, it looks like, under Windows XP and Java 1.5
> anyway, that atomics are always faster

Try it under 1.6; Sun made major improvements to synchronization handling between 1.5 and 1.6. When I reran my tests on 1.5 (which I don't use these days), I got numbers similar to yours. 1.6 is much, much faster.

Also, what is your CPU type? Intel and AMD may have significant differences, as may 32- vs 64-bit.

- Chuck
Re: Performance with many small requests
Chuck,

On 5/12/2009 12:27 AM, Caldarale, Charles R wrote:
>> From: David Kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> Incrementing a counter can't be much of a synchronization bottleneck,
>> and if I switch to an AtomicInteger, it should be even less of one.
>
> Actually, it won't. There's a slight performance difference between
> the two mechanisms, but it's usually in favor of the synchronized
> increment, not the AtomicInteger, at least on my dual-core AMD 64 system
> running JDK 6u12 in 64-bit server mode on Vista. The difference is only
> a few percent, so you should just code it whichever way you find more
> maintainable. (Test program available on request; it would be
> interesting to see if the same relationship exists on a modern Intel chip.)

High monitor contention or low?

I can run your test code on a Core 2 Duo if you want to publish it.

-chris
Re: Performance with many small requests
Caldarale, Charles R wrote:
>> From: Leon Rosenberg [mailto:rosenberg.l...@googlemail.com]
>> Subject: Re: Performance with many small requests
>>
>> If you would share your test code, I would love to test it on some *nixes and darwins I have here;
>
> Here's the code I used to do the synch vs atomic testing. The command line parameter is the number of loops to perform; you'll want to set it to at least 10, and even then run repeated tests - the timings can vary considerably, at least under Vista.
>
> (Also being sent directly to the two requesters, in case the list strips the attachment.)
>
> Might be interesting to modify it to run with more cores, if you have a system available.
>
> - Chuck

My dev machine: WinXP SP3, dual-core 2.8GHz processor, java 1.5.0_12.

First, I ran it in Eclipse as supplied, with looplimit = 1000, and got:

secondary atomic time: 6890; ticks: 51773402
primary atomic time: 6890; ticks: 48226599
secondary synchronized time: 21281; ticks: 50282172
primary synchronized time: 21281; ticks: 49717829

Then I reversed the order of the tests (just to be sure it didn't matter) and got similar results:

secondary synchronized time: 21219; ticks: 49601191
primary synchronized time: 21234; ticks: 50398810
secondary atomic time: 6734; ticks: 52111089
primary atomic time: 6734; ticks: 47888912

Running at a command line (java -cp . TestSynch) gave me qualitatively similar but quantitatively rather different results:

primary synchronized time: 42998; ticks: 59125831
secondary synchronized time: 42998; ticks: 40874170
secondary atomic time: 4953; ticks: 49025722
primary atomic time: 4953; ticks: 50974279

After several tests, the ratio between the synchronized and atomic times varied between about 5 and 9, but atomic was always the lower time.

Running two instances simultaneously didn't change the numbers much (as expected from a dual-core machine), but the command window with the focus always ran significantly faster than the one without it, no matter which one was started first.

One very surprising result (to me, anyway) was that 4 instances only extended the time numbers slightly (<10%) for the synchronized run, and even less for the atomic run. Going to 8 instances made a dramatic increase in the synchronized time, but again only a slight increase in the atomic version. 16 instances was too much for my system; it took a long time to start the last 8 or so, and both the atomic and the synchronized versions took a lot longer.

From these tests, it looks like, under Windows XP and Java 1.5 anyway, that atomics are always faster, and also handle increasing concurrency much better than synchronized blocks do.

Now to test on my server!!

Dave
RE: Performance with many small requests
> From: Leon Rosenberg [mailto:rosenberg.l...@googlemail.com]
> Subject: Re: Performance with many small requests
>
> If you would share your test code, I would love to test it on
> some *nixes and darwins I have here;

Here's the code I used to do the synch vs atomic testing. The command line parameter is the number of loops to perform; you'll want to set it to at least 10, and even then run repeated tests - the timings can vary considerably, at least under Vista.

(Also being sent directly to the two requesters, in case the list strips the attachment.)

Might be interesting to modify it to run with more cores, if you have a system available.

- Chuck
[OT] RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> A typical client will have 2 to 5 items to send per transaction (they're
> actually lines from a data logger's data file), and each line is done in
> a separate POST request. The frequency of transactions varies widely,
> but typically won't exceed one every 10 or 15 seconds from any given
> site. As I mentioned earlier, each data line is small, 20 to 50 bytes.

OK, so your top end is about 1 line every 2 seconds. You'll need at least 2 round-trip times (RTT) per line (SYN out, SYN-ACK back, ACK-DATA out, ACK-DATA back, plus the FIN-ACK out), but that's not a high rate.

> We had looked at batching up the transmissions before, and it's still an
> option. However that adds a bit of complexity to the software on both
> ends, though the gain would be far fewer individual requests to
> process. For now, we prefer the simplicity of line-by-line
> transmission, but if we start running into network limitations we'll
> probably start batching them up.

I'm interested - and this is now a long way from Tomcat, hence the [OT] mark above. If a set of lines represents one transaction, why would you ever not send it and try to process it atomically? Or is it acceptable to have part-transactions within your system?

- Peter
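The round-trip estimate above can be turned into a back-of-the-envelope calculation. The sketch below is a simplified model, not a measurement: it takes the "2 RTT per line" figure from the message for one connection per line, and assumes (these are assumptions, not statements from the thread) that a kept-alive connection pays the handshake once and that a batched transaction needs one handshake plus one request. It ignores server processing time, connection teardown and any packet loss. The class and method names and the 100 ms RTT are hypothetical.

```java
// Rough latency model for sending the lines of one transaction; all figures are estimates.
public class RttEstimate {

    /** One TCP connection per line: handshake RTT plus request/response RTT for every line. */
    static long perLineConnectionsMs(int lines, long rttMs) {
        return 2L * lines * rttMs;
    }

    /** Assumed keep-alive model: one handshake RTT, then one RTT per line on the same connection. */
    static long keepAliveMs(int lines, long rttMs) {
        return (1L + lines) * rttMs;
    }

    /** Assumed batching model: one handshake RTT plus one RTT for the single combined request. */
    static long batchedMs(long rttMs) {
        return 2L * rttMs;
    }

    public static void main(String[] args) {
        int lines = 5;     // upper end of the "2 to 5 items" quoted above
        long rttMs = 100;  // hypothetical round-trip time
        System.out.println("separate connections: " + perLineConnectionsMs(lines, rttMs) + " ms");
        System.out.println("keep-alive:           " + keepAliveMs(lines, rttMs) + " ms");
        System.out.println("batched:              " + batchedMs(rttMs) + " ms");
    }
}
```

With those example inputs the model gives 1000 ms, 600 ms and 200 ms respectively, which is small next to a 10 to 15 second transaction interval and fits the remark that this is not a high rate.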
Re: Performance with many small requests
Caldarale, Charles R wrote:
>> From: Peter Crowther [mailto:peter.crowt...@melandra.com]
>> Subject: RE: Performance with many small requests
>>
>> That said, if a client has multiple data items to send in rapid succession, does it accumulate those and batch them, or does it send each one as a different request? Or does the situation never arise?
>
> Continuing with that thought, are the requests from a single client frequent enough to warrant using keepalives? Building and tearing down the TCP session on each request might be adding noticeable delay, although your analysis of the heap dumps hasn't shown that yet.

See the message I just sent. How difficult are keepalives to implement? Our app design is such that we are never supposed to go longer than 5 minutes without at least a status update transmission.

Dave
Re: Performance with many small requests
Peter Crowther wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> Just over 1000 total, 810 to the port that this application is using.
>
> "Should" be fine on Windows.

That was my gut feeling too, but I'm glad to have it confirmed.

>> The vast majority are showing a status of TIME_WAIT, a dozen or so in ESTABLISHED and one (I think) in FIN_WAIT_1.
>
> Sounds fair enough. The ESTABLISHED ones are active both ways and able to transfer data; the one in FIN_WAIT_1 has been closed at one end but the other end's still open; and the ones in TIME_WAIT are closed but tombstoned so the TCP stack knows to throw away any data that arrives for them. None of those are a surprise.
>
>> That's our corporate connection, so it's shared across all users. I can easily run it up to 100% by doing a large d/l from somewhere (I need to plan my patch Tuesday updates to avoid trouble), so my router and firewall have no trouble handling the full bandwidth.
>
> Ah, OK.
>
>> However, those are low numbers of high-throughput connections. This app produces large numbers of connections, each with small amounts of data, so it may scale differently.
>
> It may, but I'd be a little surprised - IP is IP, and you have enough concurrency that latency shouldn't be a problem.

I was wondering about that. I knew total data throughput wasn't a major issue here, but wasn't sure how latency would affect it.

> That said, if a client has multiple data items to send in rapid succession, does it accumulate those and batch them, or does it send each one as a different request? Or does the situation never arise?

A typical client will have 2 to 5 items to send per transaction (they're actually lines from a data logger's data file), and each line is done in a separate POST request. The frequency of transactions varies widely, but typically won't exceed one every 10 or 15 seconds from any given site. As I mentioned earlier, each data line is small, 20 to 50 bytes.

We had looked at batching up the transmissions before, and it's still an option. However, that adds a bit of complexity to the software on both ends, though the gain would be far fewer individual requests to process. For now, we prefer the simplicity of line-by-line transmission, but if we start running into network limitations we'll probably start batching them up.

D
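For a sense of how much complexity batching would add on each end, here is a minimal sketch under stated assumptions: the lines of one transaction are joined into a single newline-terminated request body by the client and split again by the servlet. The framing (one data line per text line), the class name and the method names are hypothetical; the thread does not describe the actual wire format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical newline-delimited framing for sending several data-logger lines in one POST.
public class BatchSketch {

    /** Client side: joins the lines of one transaction into a single request body. */
    static String encodeBatch(List<String> lines) {
        StringBuilder body = new StringBuilder();
        for (String line : lines) {
            if (line.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("a data line must not contain a newline");
            }
            body.append(line).append('\n');
        }
        return body.toString();
    }

    /** Server side: splits a body back into lines; text after the last newline is ignored as incomplete. */
    static List<String> decodeBatch(String body) {
        List<String> lines = new ArrayList<String>();
        int start = 0;
        int end;
        while ((end = body.indexOf('\n', start)) >= 0) {
            lines.add(body.substring(start, end));
            start = end + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> transaction = new ArrayList<String>();
        transaction.add("site42,2009-05-12T09:00:00,17.3"); // made-up sample lines
        transaction.add("site42,2009-05-12T09:00:10,17.4");
        String body = encodeBatch(transaction);
        System.out.println(decodeBatch(body).size() + " lines recovered");
    }
}
```

Because a truncated final line is dropped rather than half-processed, the receiver can treat the decoded list as the unit of work, which also bears on the question of processing a transaction atomically.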
RE: Performance with many small requests
> From: Peter Crowther [mailto:peter.crowt...@melandra.com]
> Subject: RE: Performance with many small requests
>
> That said, if a client has multiple data items to send in rapid
> succession, does it accumulate those and batch them, or does it send
> each one as a different request? Or does the situation never arise?

Continuing with that thought, are the requests from a single client frequent enough to warrant using keepalives? Building and tearing down the TCP session on each request might be adding noticeable delay, although your analysis of the heap dumps hasn't shown that yet.

- Chuck
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Just over 1000 total, 810 to the port that this application is using.

"Should" be fine on Windows.

> The vast majority are showing a status of TIME_WAIT, a dozen or so in
> ESTABLISHED and one (I think) in FIN_WAIT_1.

Sounds fair enough. The ESTABLISHED ones are active both ways and able to transfer data; the one in FIN_WAIT_1 has been closed at one end but the other end's still open; and the ones in TIME_WAIT are closed but tombstoned so the TCP stack knows to throw away any data that arrives for them. None of those are a surprise.

> That's our corporate connection, so it's shared across all users. I can
> easily run it up to 100% by doing a large d/l from somewhere (I need
> to plan my patch Tuesday updates to avoid trouble), so my router and
> firewall have no trouble handling the full bandwidth.

Ah, OK.

> However, those are low numbers of high-throughput connections. This app
> produces large numbers of connections, each with small amounts of data, so
> it may scale differently.

It may, but I'd be a little surprised - IP is IP, and you have enough concurrency that latency shouldn't be a problem.

That said, if a client has multiple data items to send in rapid succession, does it accumulate those and batch them, or does it send each one as a different request? Or does the situation never arise?

- Peter
Re: Performance with many small requests
Peter Crowther wrote:
>> From: David kerber [mailto:dcker...@verizon.net]
>> In my original post, I posted a bunch of numbers about network and other possible bottlenecks, and what it boiled down to was that neither my firewall load, nor total internet connection bandwidth were close to their limits.
>
> Thanks. Apologies for not referring back!

No problem; that was many posts ago...

>> I do have questions about the number of connections that the OS networking stack can handle, but have not figured out how to check on that.
>
> As a first step:
>
> netstat -an > somefile.txt
>
> How many TCP sockets are there in the result?

Just over 1000 total, 810 to the port that this application is using. The vast majority are showing a status of TIME_WAIT, a dozen or so in ESTABLISHED and one (I think) in FIN_WAIT_1.

>> The outside world connection is a full T-1, running about 40% - 50% capacity on average.
>
> Dedicated or contended bandwidth? Can you get the other 50-60% out of it if you try hard from another machine on the same network, or do you never get it in reality?

That's our corporate connection, so it's shared across all users. I can easily run it up to 100% by doing a large d/l from somewhere (I need to plan my patch Tuesday updates to avoid trouble), so my router and firewall have no trouble handling the full bandwidth. However, those are low numbers of high-throughput connections. This app produces large numbers of connections, each with small amounts of data, so it may scale differently.

D
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> In my original post, I posted a bunch of numbers about network and other
> possible bottlenecks, and what it boiled down to was that neither my
> firewall load, nor total internet connection bandwidth were close to
> their limits.

Thanks. Apologies for not referring back!

> I do have questions about the number of connections that
> the OS networking stack can handle, but have not figured out how to
> check on that.

As a first step:

netstat -an > somefile.txt

How many TCP sockets are there in the result?

> The outside world connection is a full T-1, running about 40% - 50%
> capacity on average.

Dedicated or contended bandwidth? Can you get the other 50-60% out of it if you try hard from another machine on the same network, or do you never get it in reality?

- Peter
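Counting the sockets in the saved netstat output by hand gets tedious at around a thousand entries. The sketch below tallies TCP entries by connection state; it assumes the usual `netstat -an` layout in which TCP rows begin with "TCP" (or "tcp") and the state is the last column. The class name, method name and the sample rows in main are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Tallies TCP connection states from text produced by "netstat -an".
public class NetstatStates {

    /** Returns a map from state name (the last column of each TCP row) to its count. */
    static Map<String, Integer> countTcpStates(String netstatOutput) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : netstatOutput.split("\\r?\\n")) {
            String trimmed = line.trim();
            // Skip headers, UDP rows and blank lines.
            if (!trimmed.toLowerCase().startsWith("tcp")) {
                continue;
            }
            String[] columns = trimmed.split("\\s+");
            String state = columns[columns.length - 1];
            Integer previous = counts.get(state);
            counts.put(state, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; real input would be the contents of somefile.txt.
        String sample =
              "  Proto  Local Address          Foreign Address        State\n"
            + "  TCP    10.0.0.5:8080          192.0.2.10:51000       TIME_WAIT\n"
            + "  TCP    10.0.0.5:8080          192.0.2.11:51001       ESTABLISHED\n"
            + "  TCP    10.0.0.5:8080          192.0.2.12:51002       TIME_WAIT\n"
            + "  UDP    0.0.0.0:123            *:*\n";
        System.out.println(countTcpStates(sample));
    }
}
```

The resulting per-state counts are the figures being asked for here, such as how many sockets are in TIME_WAIT versus ESTABLISHED.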
Re: Performance with many small requests
Peter Crowther wrote: From: David Kerber [mailto:dcker...@verizon.net] I definitely should hook a profiler to the app so I can be sure of what's taking the time, though. Yes. If you don't measure it, you don't know whether you're fixing the right problem! It was apparent early on that the synchronization was the most limiting bottleneck, and that has been mostly corrected thanks to you guys. Now I'm looking at various possibilities for the secondary bottlenecks. Also consider connector, then if necessary process and OS limits on the number of concurrent connections. Do you usually have connector threads sat idle, or are they all reading and processing requests most/all of the time? A thread dump will tell you - the last one you posted had at least one thread in the pool waiting for a Yes, I usually have several waiting on the socket, either at my InputStream.read() line, or in some tomcat code that Chuck said was waiting for http headers. However, I still have more completely idle (sleeping) threads than I do busy or locked ones at any given time, so the servlet seems to be keeping up pretty well overall. See below, though... connection, and you can simply spot which others look similar. The other way to check would be to monitor the depth of your connector's socket's accept queue, but I'm not aware of any way to do this in Windows. At this point, I'm guessing on any remaining bottlenecks. I recall your network is gigabit from the router (I think I've recalled correctly), but also check: - Is the firewall or router overloaded? Highly unlikely if they're properly specced, but I have been in one data centre where the bottleneck turned out to be the routers.* In my original post, I posted a bunch of numbers about network and other possible bottlenecks, and what it boiled down to was that neither my firewall load, nor total internet connection bandwidth were close to their limits. 
I do have questions about the number of connections that the OS networking stack can handle, but have not figured out how to check on that. I also need to investigate some possible latency (as opposed to throughput) issues in my network, given the small request size. - What's your external connectivity like? Gigabit from the router is irrelevant if you're trying to fit 20 Mbit/s of data down a 10 Mbit/s pipe :-). The outside world connection is a full T-1, running about 40% - 50% capacity on average. D - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Leon Rosenberg wrote:
> On Tue, May 12, 2009 at 6:27 AM, Caldarale, Charles R wrote:
>>> From: David Kerber [mailto:dcker...@verizon.net]
>>> Subject: Re: Performance with many small requests
>>>
>>> Incrementing a counter can't be much of a synchronization bottleneck,
>>> and if I switch to an AtomicInteger, it should be even less of one.
>>
>> Actually, it won't. There's a slight performance difference between the two
>> mechanisms, but it's usually in favor of the synchronized increment, not the
>> AtomicInteger, at least on my dual-core AMD 64 system running JDK 6u12 in
>> 64-bit server mode on Vista. The difference is only a few percent, so you
>> should just code it whichever way you find more maintainable. (Test program
>> available on request; it would be interesting to see if the same relationship
>> exists on a modern Intel chip.)
>
> Hello,
>
> last time I checked (which is a while ago - 2006 and on 1.5) it was not only
> processor, but also OS dependent and clearly in favor of atomics (but it
> probably depends on the number of concurrent writers too). If you would share
> your test code, I would love to test it on some *nixes and darwins I have
> here; I'd also volunteer to gather and publish results from everyone else :-)

I'll second that request!

Dave
RE: Performance with many small requests
> From: David Kerber [mailto:dcker...@verizon.net] > I definitely should hook a profiler to the app so I can be sure of > what's taking the time, though. Yes. If you don't measure it, you don't know whether you're fixing the right problem! Also consider connector, then if necessary process and OS limits on the number of concurrent connections. Do you usually have connector threads sat idle, or are they all reading and processing requests most/all of the time? A thread dump will tell you - the last one you posted had at least one thread in the pool waiting for a connection, and you can simply spot which others look similar. The other way to check would be to monitor the depth of your connector's socket's accept queue, but I'm not aware of any way to do this in Windows. At this point, I'm guessing on any remaining bottlenecks. I recall your network is gigabit from the router (I think I've recalled correctly), but also check: - Is the firewall or router overloaded? Highly unlikely if they're properly specced, but I have been in one data centre where the bottleneck turned out to be the routers.* - What's your external connectivity like? Gigabit from the router is irrelevant if you're trying to fit 20 Mbit/s of data down a 10 Mbit/s pipe :-). - Peter * Names elided to protect the innocent, but a manufacturer's claim that a particular spec of router could handle two ISDN primaries turned out to be correct in the USA (23 B-channels per PRI) and wrong in Europe (30 B-channels per PRI). - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
On Tue, May 12, 2009 at 6:27 AM, Caldarale, Charles R wrote:
>> From: David Kerber [mailto:dcker...@verizon.net]
>> Subject: Re: Performance with many small requests
>>
>> Incrementing a counter can't be much of a synchronization bottleneck,
>> and if I switch to an AtomicInteger, it should be even less of one.
>
> Actually, it won't. There's a slight performance difference between the two
> mechanisms, but it's usually in favor of the synchronized increment, not the
> AtomicInteger, at least on my dual-core AMD 64 system running JDK 6u12 in
> 64-bit server mode on Vista. The difference is only a few percent, so you
> should just code it whichever way you find more maintainable. (Test program
> available on request; it would be interesting to see if the same relationship
> exists on a modern Intel chip.)

Hello,

last time I checked (which is a while ago - 2006 and on 1.5) it was not only processor, but also OS dependent and clearly in favor of atomics (but it probably depends on the number of concurrent writers too). If you would share your test code, I would love to test it on some *nixes and darwins I have here; I'd also volunteer to gather and publish results from everyone else :-)

regards
Leon
RE: Performance with many small requests
> From: David Kerber [mailto:dcker...@verizon.net] > Subject: Re: Performance with many small requests > > Incrementing a counter can't be much of a synchronization bottleneck, > and if I switch to an AtomicInteger, it should be even less of one. Actually, it won't. There's a slight performance difference between the two mechanisms, but it's usually in favor of the synchronized increment, not the AtomicInteger, at least on my dual-core AMD 64 system running JDK 6u12 in 64-bit server mode on Vista. The difference is only a few percent, so you should just code it whichever way you find more maintainable. (Test program available on request; it would be interesting to see if the same relationship exists on a modern Intel chip.) - Chuck THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
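The comparison described above can be tried with something like the following. This is not Chuck's test program (which was only offered on request); it is a rough sketch with made-up names (CounterCompare, runThreads), it needs Java 8 or later for the lambdas, and as a naive timing loop with no warm-up its numbers should be read as a rough indication at best.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical comparison harness: the same counter incremented under a
// synchronized block and via AtomicInteger, by several threads at once.
class CounterCompare {
    private final Object lock = new Object();
    private int syncCount;                                   // guarded by lock
    private final AtomicInteger atomicCount = new AtomicInteger();

    void incrementSynchronized() { synchronized (lock) { syncCount++; } }
    void incrementAtomic()       { atomicCount.incrementAndGet(); }
    int syncValue()              { synchronized (lock) { return syncCount; } }
    int atomicValue()            { return atomicCount.get(); }

    // Starts the given number of threads, each calling increment perThread
    // times, and waits for all of them to finish.
    static void runThreads(int threads, int perThread, Runnable increment) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    increment.run();
                }
            });
        }
        for (Thread t : workers) {
            t.start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 2;
        int perThread = 5_000_000;
        CounterCompare c = new CounterCompare();

        long t0 = System.nanoTime();
        runThreads(threads, perThread, c::incrementSynchronized);
        long t1 = System.nanoTime();
        runThreads(threads, perThread, c::incrementAtomic);
        long t2 = System.nanoTime();

        System.out.println("synchronized: " + (t1 - t0) / 1_000_000 + " ms, count " + c.syncValue());
        System.out.println("atomic:       " + (t2 - t1) / 1_000_000 + " ms, count " + c.atomicValue());
    }
}
```

Whichever way the timings come out on a given JVM, OS and core count, both variants end with the same count, which is the point about maintainability: either is correct, so the choice can be made on readability.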
Re: Performance with many small requests
Christopher Schultz wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Peter,
>
> On 5/8/2009 7:26 AM, Peter Crowther wrote:
>> Decrypt: parallel.
>> Send ack: parallel.
>> Increment counters: synced.
>> Write to log file: synced (or you'll have some very odd stuff happening).
>
> I'd go further and suggest that you re-factor your design so that your
> servlet is very simple. Something like this:
>
>     public void doPost(HttpServletRequest request, HttpServletResponse response)
>             throws IOException, ServletException {
>         RequestCounter counter = ...; // get from app scope? Class-level?
>         RequestLogger logger = ...; // same here
>         RequestProcessor processor = ...; // same here
>
>         counter.count();
>         processor.processRequest(request, response, false);
>         logger.log(request, response);
>     }
>
> Then it's up to the RequestCounter to maintain its own synchronization (if
> necessary) instead of your servlet having to know the semantics of
> thread-safety, etc. Same with the logger. As someone mentioned, most logging
> frameworks handle synchronization for you, and most of them can buffer the
> output to their log files so that you are getting the best performance you can.
>
> I highly recommend using a logging framework, or developing something that
> meets your needs that is self-contained, can accept log entries from multiple
> concurrent clients (your servlets), and buffers output to the log file to keep
> performance up.

I've been meaning to look into some more sophisticated logging techniques, and this exercise has given me some good incentive to do so sooner rather than later. However, it doesn't look at the moment like disk writes are a limiting factor in this app's performance. My latest thread dump indicates that the socket read is where most of the waits are. Because the requests are so small, I imagine that network latency is a far bigger factor than gross throughput is. I definitely should hook a profiler to the app so I can be sure of what's taking the time, though.

> What is it that processRequest actually does? Decryption? Hmm... is it
> possible for you to save the decryption for later? You could have a service
> that simply logs the notifications and then have a batch job that later does
> the decryption and throws out all the incorrectly-encrypted data. Just another
> option.

Basically the entire job of this application (servlet) is to accept the POSTs from the clients in the field, decrypt them, do a few sanity checks on the raw data, and dump them into a file on disk (we call it a "cache" file). There are separate apps that then continuously read the data from the cache file and do all kinds of processing on it, stuff it into a database, and check various values and trends for near-realtime alerting purposes.

Moving the decryption to a later step in the process would be possible, but would require rewriting another application, for probably very little net gain. Early on in the design, we considered doing it all in one application, but felt that this method gave us a little more overall reliability, because one piece could go down without affecting the others, and then it could catch up when it came back up. It also allowed us to profile each section separately, making it a little easier to find the bottlenecks.

> Finally... if you are logging all requests, is it necessary to keep a daily
> and total request count? You can avoid the synchronization of those counters
> entirely by ... not bothering to count them. Again, retrospective counting is
> a possibility.

The counting isn't a core requirement of the application; I just put it in as a way to help me monitor its progress during the day, to be sure it hasn't locked up or lost a network connection somewhere along the way. Incrementing a counter can't be much of a synchronization bottleneck, and if I switch to an AtomicInteger, it should be even less of one.

Thanks for the comments!

D
Re: Performance with many small requests
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Peter,

On 5/8/2009 7:26 AM, Peter Crowther wrote:
> Decrypt: parallel.
> Send ack: parallel.
> Increment counters: synced.
> Write to log file: synced (or you'll have some very odd stuff happening).

I'd go further and suggest that you re-factor your design so that your servlet is very simple. Something like this:

    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        RequestCounter counter = ...; // get from app scope? Class-level?
        RequestLogger logger = ...; // same here
        RequestProcessor processor = ...; // same here

        counter.count();
        processor.processRequest(request, response, false);
        logger.log(request, response);
    }

Then it's up to the RequestCounter to maintain its own synchronization (if necessary) instead of your servlet having to know the semantics of thread-safety, etc. Same with the logger. As someone mentioned, most logging frameworks handle synchronization for you, and most of them can buffer the output to their log files so that you are getting the best performance you can.

I highly recommend using a logging framework, or developing something that meets your needs that is self-contained, can accept log entries from multiple concurrent clients (your servlets), and buffers output to the log file to keep performance up.

What is it that processRequest actually does? Decryption? Hmm... is it possible for you to save the decryption for later? You could have a service that simply logs the notifications and then have a batch job that later does the decryption and throws out all the incorrectly-encrypted data. Just another option.

Finally... if you are logging all requests, is it necessary to keep a daily and total request count? You can avoid the synchronization of those counters entirely by ... not bothering to count them. Again, retrospective counting is a possibility.
- -chris -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkoIkLoACgkQ9CaO5/Lv0PAingCbBNb5ESoaIlDwoROOFrjmYySZ X94AniMh23cbmU2rodDw5fFISpRwDyhS =fB6Z -END PGP SIGNATURE- - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
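To make the RequestCounter idea in the message above concrete, here is one possible shape for such a class. Only the name RequestCounter and its count() method come from the message; the getters, resetDaily() and the choice of synchronized methods are assumptions made for illustration, and the real class could just as well use atomics.

```java
// Sketch of a self-contained counter: callers just call count(), and the
// class itself decides how to stay thread-safe. Everything except the class
// name and count() is hypothetical.
class RequestCounter {
    private long total;   // guarded by this
    private long daily;   // guarded by this

    // Called once per request; both counters move together under one lock.
    public synchronized void count() {
        total++;
        daily++;
    }

    public synchronized long getTotal() { return total; }

    public synchronized long getDaily() { return daily; }

    // Hypothetical hook for whatever rolls the daily figure over at midnight.
    public synchronized void resetDaily() { daily = 0; }

    public static void main(String[] args) {
        RequestCounter counter = new RequestCounter();
        counter.count();
        counter.count();
        counter.resetDaily();
        counter.count();
        System.out.println("total=" + counter.getTotal() + " daily=" + counter.getDaily());
    }
}
```

The servlet then holds one shared instance and never needs a synchronized block of its own for counting, which is the separation of concerns being suggested.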
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net] > Subject: Re: Performance with many small requests > > From what I can tell now, it looks like most of my wait time is on > socket reads. In the thread dump I took about 20 minutes ago, I didn't > see any waiting on disk writes: > > The line listed in this one is my inputStream.read(): Waiting for the body of the request to show up. > This one seems to be waiting on something in tomcat itself: Waiting for the request header to show up. If that's all you're seeing in the thread dump, then it does look like the network is sluggish, as I think you mentioned before. You might try running Wireshark or equivalent to monitor the traffic and see just how long it takes for each segment of the message to be delivered to the server. - Chuck THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Peter Crowther wrote: From: David kerber [dcker...@verizon.net] My cpu usage for tomcat has gone from bouncing between 0 and 1 in task manager, to a steady 2 since more threads are now actually doing work instead of waiting around for their turn at the code, my disk writes per sec in perfmon have also more than doubled, and the destination log file is growing much faster as well. All excellent news. The fact that you've seen the performance double means that there was, in fact, a bottleneck there. Have you taken a new thread dump to see whether the locks (almost certainly on the log write) are still a problem? If so, you might have to go to a more complex scheme such as multiple log files managed by a pool manager. Don't even try to write the pool manager yourself; they're horribly messy things to get right and shake the race conditions out*. I half-remember Jakarta Commons has one that can be adapted if you get to that stage. From what I can tell now, it looks like most of my wait time is on socket reads. 
In the thread dump I took about 20 minutes ago, I didn't see any waiting on disk writes: The line listed in this one is my inputStream.read(): [2009-05-11 08:20:09] [info] "http-1024-Processor8" [2009-05-11 08:20:09] [info] daemon [2009-05-11 08:20:09] [info] prio=6 tid=0x270e83c8 [2009-05-11 08:20:09] [info] nid=0xcd4 [2009-05-11 08:20:09] [info] runnable [2009-05-11 08:20:09] [info] [0x2755f000..0x2755f9e4] [2009-05-11 08:20:09] [info] at java.net.SocketInputStream.socketRead0(Native Method) [2009-05-11 08:20:10] [info] at java.net.SocketInputStream.read(Unknown Source) [2009-05-11 08:20:10] [info] at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:747) [2009-05-11 08:20:10] [info] at org.apache.coyote.http11.InternalInputBuffer$InputStreamInputBuffer.doRead(InternalInputBuffer.java:777) [2009-05-11 08:20:10] [info] at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:115) [2009-05-11 08:20:10] [info] at org.apache.coyote.http11.InternalInputBuffer.doRead(InternalInputBuffer.java:712) [2009-05-11 08:20:10] [info] at org.apache.coyote.Request.doRead(Request.java:423) [2009-05-11 08:20:10] [info] at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:283) [2009-05-11 08:20:10] [info] at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:404) [2009-05-11 08:20:10] [info] at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:298) [2009-05-11 08:20:10] [info] at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:192) [2009-05-11 08:20:10] [info] at eddsrv.EddRcvr.processRequest(EddRcvr.java:199) [2009-05-11 08:20:10] [info] at eddsrv.EddRcvr.doPost(EddRcvr.java:94) [2009-05-11 08:20:10] [info] at javax.servlet.http.HttpServlet.service(HttpServlet.java:709) [2009-05-11 08:20:10] [info] at javax.servlet.http.HttpServlet.service(HttpServlet.java:802) [2009-05-11 08:20:10] [info] at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252) [2009-05-11 08:20:10] [info] at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173) [2009-05-11 08:20:10] [info] at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) [2009-05-11 08:20:10] [info] at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178) [2009-05-11 08:20:10] [info] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126) [2009-05-11 08:20:10] [info] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105) [2009-05-11 08:20:10] [info] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107) [2009-05-11 08:20:11] [info] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148) [2009-05-11 08:20:11] [info] at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869) [2009-05-11 08:20:11] [info] at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:667) [2009-05-11 08:20:11] [info] at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527) [2009-05-11 08:20:11] [info] at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80) [2009-05-11 08:20:11] [info] at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684) [2009-05-11 08:20:11] [info] at java.lang.Thread.run(Unknown Source) [2009-05-11 08:20:11] [info] This one seems to be waiting on something in tomcat itself: [2009-05-11 08:19:49] [info] "http-1024-Processor45" [2009-05-11 08:19:49] [info] daemon [2009-05-11 08:19:49] [info] prio=6 tid=0x26fa6f38 [2009-05-11 08:19:49] [info] nid=0x340 [2009-05-11 08:19
RE: Performance with many small requests
> From: David kerber [dcker...@verizon.net] > My cpu usage for tomcat > has gone from bouncing between 0 and 1 in task manager, to a steady 2 > since more threads are now actually doing work instead of waiting around > for their turn at the code, my disk writes per sec in perfmon have also > more than doubled, and the destination log file is growing much faster > as well. All excellent news. The fact that you've seen the performance double means that there was, in fact, a bottleneck there. Have you taken a new thread dump to see whether the locks (almost certainly on the log write) are still a problem? If so, you might have to go to a more complex scheme such as multiple log files managed by a pool manager. Don't even try to write the pool manager yourself; they're horribly messy things to get right and shake the race conditions out*. I half-remember Jakarta Commons has one that can be adapted if you get to that stage. > Thanks a ton!!! No problem. - Peter * Yes, I did implement one. I still have the scars. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Peter Crowther wrote: From: Caldarale, Charles R [mailto:chuck.caldar...@unisys.com] Strictly speaking, that's one thread per *servlet* object; if using the SingleThreadModel (let's hope not), the container is allowed to create multiple instances. Good point in the general case, but I rather suspect David would have seen very different performance characteristics and some *very* confused output if that were the case here. They're not further places for contention to occur. Depending on what else uses the criticalProcess object, that may or may not be true. Another good point. I was assuming something that isn't necessarily true, namely that criticalProcess was created for just that sync block. Which, of course, is a correct assumption. Meh, why don't I bow out and leave Chuck to give all the good answers? ;-) You've done great so far: I implemented your suggestions, and it looks like it has more than doubled my throughput! My cpu usage for tomcat has gone from bouncing between 0 and 1 in task manager, to a steady 2 since more threads are now actually doing work instead of waiting around for their turn at the code, my disk writes per sec in perfmon have also more than doubled, and the destination log file is growing much faster as well. Thanks a ton!!! Dave - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Performance with many small requests
> From: Peter Crowther [mailto:peter.crowt...@melandra.com] > Subject: RE: Performance with many small requests > > Meh, why don't I bow out and leave Chuck to give all the good answers? A) I don't have them all. B) What I do have is meetings, bloody meetings and won't be answering promptly. :-( - Chuck THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Performance with many small requests
> From: Caldarale, Charles R [mailto:chuck.caldar...@unisys.com] > Strictly speaking, that's one thread per *servlet* object; if > using the SingleThreadModel (let's hope not), the container > is allowed to create multiple instances. Good point in the general case, but I rather suspect David would have seen very different performance characteristics and some *very* confused output if that were the case here. > > They're not further places for contention to occur. > Depending on what else uses the criticalProcess object, that > may or may not be true. Another good point. I was assuming something that isn't necessarily true, namely that criticalProcess was created for just that sync block. Meh, why don't I bow out and leave Chuck to give all the good answers? ;-) - Peter - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Caldarale, Charles R wrote: From: Peter Crowther [mailto:peter.crowt...@melandra.com] Subject: RE: Performance with many small requests They look like spares in the pool, but my knowledge of Tomcat's internals is limited. Yes, they are just waiting for requests to show up. Only one thread can get into the method Strictly speaking, that's one thread per *servlet* object; if using the SingleThreadModel (let's hope not), the container is allowed to create multiple instances. They're not further places for contention to occur. Depending on what else uses the criticalProcess object, that may or may not be true. Regardless, synchronizing on the method is very likely a complete waste of time. Thanks for confirming that. Dave - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Performance with many small requests
> From: Peter Crowther [mailto:peter.crowt...@melandra.com] > Subject: RE: Performance with many small requests > > They look like spares in the pool, but my knowledge of Tomcat's > internals is limited. Yes, they are just waiting for requests to show up. > Only one thread can get into the method Strictly speaking, that's one thread per *servlet* object; if using the SingleThreadModel (let's hope not), the container is allowed to create multiple instances. > They're not further places for contention to occur. Depending on what else uses the criticalProcess object, that may or may not be true. Regardless, synchronizing on the method is very likely a complete waste of time. - Chuck THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net] > I also have quite a few blocks like this: [...] > [2009-05-08 10:43:23] [info] - locked <0x0510e6e0> (a > org.apache.tomcat.util.threads.ThreadPool$ControlRunnable) [...] > I assume these are just threads waiting for something to do > (waiting for a request)? They look like spares in the pool, but my knowledge of Tomcat's internals is limited. > Until you said that, I didn't even notice that I had what appear to be > "double" synchronizations, making the method synchronized, and also > having synchronized{} blocks inside it. I assume I've been > double-screwing myself all this time?? Yeah, I did raise an eyebrow when I saw it. It'll take a few CPU cycles per request, but no more than that. Only one thread can get into the method, so any internal syncs just add overhead. They're not further places for contention to occur. - Peter - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Peter Crowther wrote: From: David kerber [mailto:dcker...@verizon.net] Now that I've got a thread dump, what am I looking for? You found it first time :-). Now the hard part - fixing it. Yeah, that's what I figured! I've got a bunch of sections like this, pretty much all of which are waiting to lock <0x057c73e0>. Is there any way to figure out what that object is? I imagine it's the disk write, but can't figure out how to tell for sure. It's the sync at the start of your method. [2009-05-08 10:43:24] [info] waiting for monitor entry [2009-05-08 10:43:24] [info] [0x2739f000..0x2739fb64] [2009-05-08 10:43:24] [info] at eddsrv.EddRcvr.doPost(EddRcvr.java:70) [2009-05-08 10:43:24] [info] - waiting to lock <0x057c73e0> (a eddsrv.EddRcvr) I also have quite a few blocks like this: [2009-05-08 10:43:23] [info] "http-1024-Processor10" [2009-05-08 10:43:23] [info] daemon [2009-05-08 10:43:23] [info] prio=6 tid=0x271f1418 [2009-05-08 10:43:23] [info] nid=0xa74 [2009-05-08 10:43:23] [info] in Object.wait() [2009-05-08 10:43:23] [info] [0x275df000..0x275dfae4] [2009-05-08 10:43:23] [info] at java.lang.Object.wait(Native Method) [2009-05-08 10:43:23] [info] at java.lang.Object.wait(Unknown Source) [2009-05-08 10:43:23] [info] at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:656) [2009-05-08 10:43:23] [info] - locked <0x0510e6e0> (a org.apache.tomcat.util.threads.ThreadPool$ControlRunnable) [2009-05-08 10:43:23] [info] at java.lang.Thread.run(Unknown Source) [2009-05-08 10:43:23] [info] I assume these are just threads waiting for something to do (waiting for a request)? ... so they're all waiting to get the monitor on a eddsrv.EddRcvr, which is what the "synchronized" on your doPost method will lock on. Until you said that, I didn't even notice that I had what appear to be "double" synchronizations, making the method synchronized, and also having synchronized{} blocks inside it. I assume I've been double-screwing myself all this time?? 
    protected synchronized void doPost(HttpServletRequest request, HttpServletResponse response )
            throws ServletException, IOException {
        synchronized ( criticalProcess ) {
            totalReqCount++;
            dailyReqCount++;
            processRequest( request, response, false );
        }
    }

> If you say pretty much all are stuck there, then you have massive contention
> on that monitor. Time to move to some finer-grained locking! As a first step,
> I'd remove the synchronized from the method; I'd replace it with one lock
> around the counter updates (locked on one object) and another lock in your
> decrypt/log/respond code that's purely around the logging section (locked on a
> different object). Then I'd re-evaluate - run, take another thread dump and
> see where the bottlenecks are now. If they're anywhere, I'll bet they're
> around the logging code.

Thanks!

D
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Now that I've got a thread dump, what am I looking for?

You found it first time :-). Now the hard part - fixing it.

> I've got a
> bunch of sections like this, pretty much all of which are waiting to
> lock <0x057c73e0>. Is there any way to figure out what that
> object is?
> I imagine it's the disk write, but can't figure out how to
> tell for sure.

It's the sync at the start of your method.

> [2009-05-08 10:43:24] [info] waiting for monitor entry
> [2009-05-08 10:43:24] [info] [0x2739f000..0x2739fb64]
> [2009-05-08 10:43:24] [info] at
> eddsrv.EddRcvr.doPost(EddRcvr.java:70)
> [2009-05-08 10:43:24] [info] - waiting to lock <0x057c73e0> (a
> eddsrv.EddRcvr)

... so they're all waiting to get the monitor on a eddsrv.EddRcvr, which is what the "synchronized" on your doPost method will lock on.

If you say pretty much all are stuck there, then you have massive contention on that monitor. Time to move to some finer-grained locking! As a first step, I'd remove the synchronized from the method; I'd replace it with one lock around the counter updates (locked on one object) and another lock in your decrypt/log/respond code that's purely around the logging section (locked on a different object). Then I'd re-evaluate - run, take another thread dump and see where the bottlenecks are now. If they're anywhere, I'll bet they're around the logging code.

- Peter
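The finer-grained locking described above might look roughly like this. It is a sketch only: it leaves out the servlet API so that it runs on its own, the names FineGrainedReceiver, handle and decode are invented, decode() is a trivial stand-in for the real decrypt and sanity-check step, and a StringBuilder stands in for the log file.

```java
// Hypothetical stand-in for the servlet: no method-level synchronized, one
// lock for the counters and a different lock for the log write.
class FineGrainedReceiver {
    private final Object counterLock = new Object();
    private final Object logLock = new Object();

    private long totalReqCount;                            // guarded by counterLock
    private long dailyReqCount;                            // guarded by counterLock
    private final StringBuilder log = new StringBuilder(); // guarded by logLock

    // Not synchronized as a whole: threads only queue up in the two short
    // blocks, and the decode step in between runs in parallel.
    void handle(String payload) {
        synchronized (counterLock) {
            totalReqCount++;
            dailyReqCount++;
        }

        String record = decode(payload); // placeholder for decrypt + checks

        synchronized (logLock) {
            log.append(record).append('\n');
        }
    }

    // Placeholder only; the real work would happen here without any lock held.
    static String decode(String payload) {
        return payload.trim();
    }

    long total() {
        synchronized (counterLock) { return totalReqCount; }
    }

    String logContents() {
        synchronized (logLock) { return log.toString(); }
    }

    public static void main(String[] args) {
        FineGrainedReceiver r = new FineGrainedReceiver();
        r.handle(" first ");
        r.handle("second");
        System.out.print(r.logContents());
        System.out.println("requests: " + r.total());
    }
}
```

Using two separate lock objects means a thread writing to the log never blocks a thread that only needs to bump the counters; a fresh thread dump would then show whether the remaining waits are on logLock, as predicted, or somewhere else.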
Re: Performance with many small requests
David kerber wrote:
Caldarale, Charles R wrote:
-Original Message-
From: David kerber [mailto:dcker...@verizon.net]
Subject: Re: Performance with many small requests

That said, any idea where that might leave the thread dump?

After some experimentation, I found it in jakarta_service_MMDD.log in Tomcat's logs directory.

Apparently it doesn't spit out the thread dump if the logging level is set to error, because I had looked there, and looked again just now (in case it took longer than I expected). When I get a chance to restart the service, I'll change the logging level and try to get a dump.

D

Now that I've got a thread dump, what am I looking for? I've got a bunch of sections like this, pretty much all of which are waiting to lock <0x057c73e0>. Is there any way to figure out what that object is? I imagine it's the disk write, but can't figure out how to tell for sure.

[2009-05-08 10:43:24] [info] "http-1024-Processor1"
[2009-05-08 10:43:24] [info] daemon
[2009-05-08 10:43:24] [info] prio=6 tid=0x26d0fe70
[2009-05-08 10:43:24] [info] nid=0x115c
[2009-05-08 10:43:24] [info] waiting for monitor entry
[2009-05-08 10:43:24] [info] [0x2739f000..0x2739fb64]
[2009-05-08 10:43:24] [info] at eddsrv.EddRcvr.doPost(EddRcvr.java:70)
[2009-05-08 10:43:24] [info] - waiting to lock <0x057c73e0> (a eddsrv.EddRcvr)
[2009-05-08 10:43:24] [info] at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
[2009-05-08 10:43:24] [info] at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
[2009-05-08 10:43:24] [info] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
[2009-05-08 10:43:24] [info] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
[2009-05-08 10:43:24] [info] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
[2009-05-08 10:43:25] [info] at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
[2009-05-08 10:43:25] [info] at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:667)
[2009-05-08 10:43:25] [info] at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
[2009-05-08 10:43:25] [info] at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
[2009-05-08 10:43:25] [info] at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
[2009-05-08 10:43:25] [info] at java.lang.Thread.run(Unknown Source)
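In the dump above, the "(a eddsrv.EddRcvr)" after the lock address already names the class of the contended object. For readers who would rather ask the JVM directly, the sketch below uses the standard `java.lang.management` API to list every thread that is blocked on a monitor, together with the lock it wants and the thread holding it. The class and method names (`BlockedThreadReport`, `blockedThreads`) are invented for this example; this is not a Tomcat feature and not what the service wrapper does when it writes its dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; not part of Tomcat or of the application in this thread.
class BlockedThreadReport {
    // Returns one line per thread currently BLOCKED waiting to enter a monitor.
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<String>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // An entry is null if the thread ended between the two calls.
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " waiting to lock " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }
}
```

The lock name is the class name followed by an identity hash, so many lines naming the same lock point at a single hot monitor, much as the repeated <0x057c73e0> does in the dump.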
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> I'll look into that to be sure, but I don't think the HD is limiting.

I think I agree with you, but it's a classic area that people miss - Intel have done entirely too good a job of branding the CPU as the only place where speed matters!

- Peter
Re: Performance with many small requests
Peter Crowther wrote: From: David kerber [mailto:dcker...@verizon.net] Also, right now I'm doing a .flush() after the .write() to the log file. Is that usually necessary, other than to avoid losing data lines in case of a system failure? No, other than that. What disk subsystem are you running on? Start Performance Monitor and, from Physical Disks, monitor your disk writes per second. If it's over 150(ish, depending on the disk) per spindle in your disk array, you're saturating your disks. I don't recall the exact disk configuration, but it's pretty robust and on par with the rest of the system, because this server was originally spec'd as a combination file and application server. How would a .flush() affect the speed of returning from a synchronized .write()? It can be significant, as the data has to get to the file. I'd check the above. Also, do you have any battery-backed write cache (BBWC) on the disk subsystem and how's it configured? On systems where disk has proved to be the bottleneck, and there are many small pieces of data being written, I've seen better than a factor of 10 improvement by adding write cache in this way. I'll look into that to be sure, but I don't think the HD is limiting. D - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> Apparently it doesn't spit out the thread dump if the logging level is
> set to error, because I had looked there, and looked again just now (in
> case it took longer than I expected).

Mine is set to error and the thread dump appears as expected; any setting of info or better should display it.

- Chuck
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Also, right now I'm doing a .flush() after the .write() to the log
> file. Is that usually necessary, other than to avoid losing
> data lines in case of a system failure?

No, other than that. What disk subsystem are you running on? Start Performance Monitor and, from Physical Disks, monitor your disk writes per second. If it's over 150(ish, depending on the disk) per spindle in your disk array, you're saturating your disks.

> How would a .flush() affect the speed of returning from a
> synchronized .write()?

It can be significant, as the data has to get to the file. I'd check the above. Also, do you have any battery-backed write cache (BBWC) on the disk subsystem and how's it configured? On systems where disk has proved to be the bottleneck, and there are many small pieces of data being written, I've seen better than a factor of 10 improvement by adding write cache in this way.

- Peter
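If the per-line .flush() does turn out to matter, one way to keep the write synchronized while flushing less often is sketched below. This is only an illustration under assumptions: the class name `BatchedLog` and the `flushEvery` parameter are made up, and the writer is passed in so that the example does not depend on any particular file. Up to `flushEvery - 1` lines can be lost if the process dies between flushes.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical log wrapper: one lock, one flush per batch instead of per line.
class BatchedLog {
    private final BufferedWriter out;
    private final int flushEvery;
    private int pending; // lines written since the last flush

    BatchedLog(Writer target, int flushEvery) {
        this.out = new BufferedWriter(target);
        this.flushEvery = flushEvery;
    }

    // Synchronized so that lines from different threads never interleave.
    synchronized void writeLine(String line) throws IOException {
        out.write(line);
        out.newLine();
        if (++pending >= flushEvery) {
            out.flush();
            pending = 0;
        }
    }

    // Closing the BufferedWriter flushes whatever is still buffered.
    synchronized void close() throws IOException {
        out.close();
    }
}
```

A caller would construct it once, for example around a `FileWriter`, and call `close()` when the application shuts down so that the last partial batch is written.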
Re: Performance with many small requests
Caldarale, Charles R wrote:
-Original Message-
From: David kerber [mailto:dcker...@verizon.net]
Subject: Re: Performance with many small requests

That said, any idea where that might leave the thread dump?

After some experimentation, I found it in jakarta_service_MMDD.log in Tomcat's logs directory.

Apparently it doesn't spit out the thread dump if the logging level is set to error, because I had looked there, and looked again just now (in case it took longer than I expected). When I get a chance to restart the service, I'll change the logging level and try to get a dump.

D
RE: Performance with many small requests
> From: Pid [mailto:p...@pidster.com]
> Subject: Re: Performance with many small requests
>
> Would a single thread executor service alongside an atomic counter be
> useful here? (my concurrency knowledge isn't so hot).

Sounds like overkill just for ordering. Synchronization with the single thread doing the logging work would still be necessary, so nothing's really gained.

> You could be dumping runnables into it during the post which would
> return quickly for the next request. You'd have to consider
> exec.shutdown() & exec.shutdownNow() in your servlet destroy to ensure
> you didn't drop data during a shutdown or app restart

Way, way too much complexity for the problem at hand.

- Chuck
RE: Performance with many small requests
> -Original Message-
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> That said, any idea where that might leave the thread dump?

After some experimentation, I found it in jakarta_service_MMDD.log in Tomcat's logs directory.

- Chuck
Re: Performance with many small requests
Pid wrote: Peter Crowther wrote: From: David Kerber [mailto:dcker...@verizon.net] The synchronized section doesn't do a whole lot, so it doesn't take long to process. Indeed. So take a thread dump and see what's happening before making *any* changes to this key part. My question is, what kinds of operations need to be synchronized? All I do is decrypt the data from the POST, send a small acknowledgement response back to the site, and write the line to the log file. Does that sound like something that would need to be synchronized? If not, pulling that out would be a really easy test to see if it helps my performance issue. Decrypt: parallel. Send ack: parallel. Increment counters: synced. Write to log file: synced (or you'll have some very odd stuff happening). Would a single thread executor service alongside an atomic counter be useful here? (my concurrency knowledge isn't so hot). I'm not sure if a) this is suitable or b) if it would solve the problem, as you may still end up with a delayed write to the log during peaky periods - at least they'd be in the right order though. The order is the only thing that's important; a short delay (up to a few tens of seconds) is no problem. Also, right now I'm doing a .flush() after the .write() to the log file. Is that usually necessary, other than to avoid losing data lines in case of a system failure? A few lost lines, while not desirable, isn't too big of a problem in this particular application. How would a .flush() affect the speed of returning from a synchronized .write()? You could be dumping runnables into it during the post which would return quickly for the next request. You'd have to consider You lost me here... exec.shutdown() & exec.shutdownNow() in your servlet destroy to ensure you didn't drop data during a shutdown or app restart Dave - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Peter Crowther wrote:
>> From: David Kerber [mailto:dcker...@verizon.net]
>> The synchronized section doesn't do a whole lot, so it
>> doesn't take long to process.
>
> Indeed. So take a thread dump and see what's happening before making *any*
> changes to this key part.
>
>> My question is, what kinds of operations need to be
>> synchronized? All I do is decrypt the data from the POST,
>> send a small acknowledgement response back to the site, and write
>> the line to the log file. Does that sound like something that would
>> need to be synchronized? If not, pulling that out would be a really
>> easy test to see if it helps my performance issue.
>
> Decrypt: parallel.
> Send ack: parallel.
> Increment counters: synced.
> Write to log file: synced (or you'll have some very odd stuff happening).

Would a single thread executor service alongside an atomic counter be useful here? (my concurrency knowledge isn't so hot).

I'm not sure if a) this is suitable or b) if it would solve the problem, as you may still end up with a delayed write to the log during peaky periods - at least they'd be in the right order though.

You could be dumping runnables into it during the post which would return quickly for the next request. You'd have to consider exec.shutdown() & exec.shutdownNow() in your servlet destroy to ensure you didn't drop data during a shutdown or app restart

p
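For anyone unsure what Pid is proposing, a minimal sketch follows. It assumes a hypothetical `AsyncLineLogger` class, with an in-memory list standing in for the log file; in a servlet, `shutdown` would be called from `destroy()`. Others in the thread considered this more machinery than the problem needs, so treat it as an illustration of the idea rather than a recommendation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: request threads queue lines, one worker thread writes them.
class AsyncLineLogger {
    private final ExecutorService exec = Executors.newSingleThreadExecutor();
    private final AtomicLong totalReqCount = new AtomicLong();
    // Touched only by the single worker thread; stands in for the log file.
    private final List<String> sink = new ArrayList<String>();

    // Called from request threads; returns as soon as the task is queued.
    void log(final String line) {
        totalReqCount.incrementAndGet();
        exec.execute(new Runnable() {
            public void run() {
                sink.add(line);
            }
        });
    }

    // Lets queued lines drain, then gives up after the timeout.
    // Returns true if everything queued was written.
    boolean shutdown(long timeoutSeconds) throws InterruptedException {
        exec.shutdown(); // accept no new tasks, keep running queued ones
        if (exec.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
            return true;
        }
        exec.shutdownNow(); // abandon whatever is still queued
        return false;
    }

    long total() {
        return totalReqCount.get();
    }

    // Only safe to read after shutdown() has returned true.
    List<String> lines() {
        return sink;
    }
}
```

Lines are written in the order they were queued, which matches the requirement that order matters more than latency. The cost is the extra shutdown handling and the fact that queued lines are lost if the JVM dies before they are written.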
Re: Performance with many small requests
Caldarale, Charles R wrote:
From: David kerber [mailto:dcker...@verizon.net]
Subject: Re: Performance with many small requests

If you right-click on the icon in the system tray, one of the items says "Thread dump".

Right - sorry for forgetting that. I never install from the .exe download (too restrictive), so I never have the icon present. I was thinking about the tomcat5w.exe GUI.

I've gotten in the habit of doing a double install: install from the .exe, and then extract the .zip distro on top of it.

That said, any idea where that might leave the thread dump?

Dave
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> If you right-click on the icon in the system tray, one of the items says
> "Thread dump".

Right - sorry for forgetting that. I never install from the .exe download (too restrictive), so I never have the icon present. I was thinking about the tomcat5w.exe GUI.

- Chuck
Re: Performance with many small requests
Caldarale, Charles R wrote:
From: David kerber [mailto:dcker...@verizon.net]
Subject: Re: Performance with many small requests

if I use tomcat5w.exe to take a thread dump, where does it leave the file?

If you can take a thread dump with tomcat5w.exe, please let us know how, because I'm certainly not aware of it having such a capability.

If you right-click on the icon in the system tray, one of the items says "Thread dump".

The platform-independent method is to use jps to find the process id of the Tomcat instance, then jstack to create the thread dump. These tools are part of the JDK from 1.5 onwards; documentation is on the java.sun.com web site, but you probably won't need the doc.

Ok, I'll have to install the jdk then; I've only got the jre installed on the server.

D
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> Subject: Re: Performance with many small requests
>
> if I use tomcat5w.exe to take a thread dump, where does it
> leave the file?

If you can take a thread dump with tomcat5w.exe, please let us know how, because I'm certainly not aware of it having such a capability.

The platform-independent method is to use jps to find the process id of the Tomcat instance, then jstack to create the thread dump. These tools are part of the JDK from 1.5 onwards; documentation is on the java.sun.com web site, but you probably won't need the doc.

- Chuck
Re: Performance with many small requests
Peter Crowther wrote: From: David Kerber [mailto:dcker...@verizon.net] The synchronized section doesn't do a whole lot, so it doesn't take long to process. Indeed. So take a thread dump and see what's happening before making *any* changes to this key part. I'm trying; if I use tomcat5w.exe to take a thread dump, where does it leave the file? I can't find it, and it doesn't seem to put it on the clipboard either. D - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Performance with many small requests
Mark Thomas wrote:
Xie Xiaodong wrote:

Hello, IMHO, it would be better to use java concurrency package now than to use the old synchronize mechanism. The old mechanism is too low level and error prone. I think you could have a thread pool and some handler pattern to handle the request from your customer.

That is a massive over complication for this use case.

That was my thought as well, but I don't know enough about the subject of concurrency and synchronization to be sure.

On 7-May-2009, at 19:05, David Kerber wrote:

The synchronized section doesn't do a whole lot, so it doesn't take long to process. My question is, what kinds of operations need to be synchronized? All I do is decrypt the data from the POST,

You should be able to easily write the decryption (if it isn't already) in a multi-threaded manner.

To that end, am I correct in understanding that any variables and objects that are declared locally to the method that does the work (such as the decryption routine) are going to be inherently thread safe? And that variables and objects declared at the class level (such as my counters) may not be? That's what the reading I did last night seemed to indicate, without explicitly stating so.

send a small acknowledgement response back to the site,

Unlikely to have sync issues (but check to be sure)

and write the line to the log file.

If you are using a logging framework (like log4j) this will handle the necessary sync for you. Otherwise you may have to write it yourself.

I'm doing it with the standard text file methods, but all the objects and variables are local to the method that processes the request.

Incrementing the counters you are using needs to be synchronized. The simplest solution would be to use atomics.

I had never heard of them before I was reading about this yesterday, but it looks like a good possibility.

Does that sound like something that would need to be synchronized?

So, some bits do but most don't. Getting rid of unnecessary syncs is a good thing but you really should find out where the bottleneck is before you start changing code.

That's my goal, but as far as I can tell right now, the bottleneck is narrowed down to either my code, or the customer's network, and testing some fixes to my code is pretty easy.

Thanks for the comments!

Dave
RE: Performance with many small requests
> From: David Kerber [mailto:dcker...@verizon.net]
> The synchronized section doesn't do a whole lot, so it
> doesn't take long to process.

Indeed. So take a thread dump and see what's happening before making *any* changes to this key part.

> My question is, what kinds of operations need to be
> synchronized? All I do is decrypt the data from the POST,
> send a small acknowledgement response back to the site, and write
> the line to the log file. Does that sound like something that would
> need to be synchronized? If not, pulling that out would be a really
> easy test to see if it helps my performance issue.

Decrypt: parallel.
Send ack: parallel.
Increment counters: synced.
Write to log file: synced (or you'll have some very odd stuff happening).

- Peter
Re: Performance with many small requests
Xie Xiaodong wrote:
> Hello,
>
> IMHO, it would be better to use java concurrency package now than to use
> the old synchronize mechanism. The old mechanism is too low level and error
> prone. I think you could have a thread pool and some handler pattern to
> handle the request from your customer.

That is a massive over complication for this use case.

>> On 7-May-2009, at 19:05, David Kerber wrote:
>>> The synchronized section doesn't do a whole lot, so it doesn't take long
>>> to process. My question is, what kinds of operations need to be
>>> synchronized? All I do is decrypt the data from the POST,

You should be able to easily write the decryption (if it isn't already) in a multi-threaded manner.

>>> send a small acknowledgement response back to the site,

Unlikely to have sync issues (but check to be sure)

>>> and write the line to the log file.

If you are using a logging framework (like log4j) this will handle the necessary sync for you. Otherwise you may have to write it yourself.

Incrementing the counters you are using needs to be synchronized. The simplest solution would be to use atomics.

>>> Does that sound like something that would need to be
>>> synchronized?

So, some bits do but most don't. Getting rid of unnecessary syncs is a good thing but you really should find out where the bottleneck is before you start changing code.

Mark
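The atomics Mark mentions live in `java.util.concurrent.atomic` (available since Java 5). A minimal sketch of the two request counters follows; the field names are borrowed from the code posted in the thread, while the wrapper class `RequestCounters` and its methods are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for the two counters; no synchronized block is needed.
class RequestCounters {
    private final AtomicLong totalReqCount = new AtomicLong();
    private final AtomicLong dailyReqCount = new AtomicLong();

    // Each increment is atomic on its own. The pair is not updated as a unit,
    // which is normally acceptable for simple statistics.
    void recordRequest() {
        totalReqCount.incrementAndGet();
        dailyReqCount.incrementAndGet();
    }

    void resetDaily() {
        dailyReqCount.set(0);
    }

    long total() {
        return totalReqCount.get();
    }

    long daily() {
        return dailyReqCount.get();
    }
}
```

In the servlet this would replace the `totalReqCount++; dailyReqCount++;` pair, so that the counters no longer need any lock of their own.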
Re: Performance with many small requests
Hello, IMHO, it would be better to use java concurrency package now than to use the old synchronize mechanism. The old mechanism is to low level and error prone. I think you could have a thread pool and some handler pattern to handle the request from your customer. 2009/5/8 Andre-John Mas > > On 7-May-2009, at 19:05, David Kerber wrote: > > Andre-John Mas wrote: >> >>> >>> That would be my impression too. It is best to avoid making the >>> synchronized scope so large, unless there is a very good reason. >>> >>> David, do you have any reason for this? Beyond the counter, what other >>> stuff do you synchronise? Also, it has generally been recommended to me to >>> avoid hitting the disk in every request, since you may result with an I/O >>> bottle neck, so if you can write the logs in batches you will have better >>> performance. If you know that you are only going to have very few users at a >>> time (say, less than 10), it may not be worth the time optimising this, but >>> if you know that you are going to get at least several hundred, then this is >>> something to watch out for. >>> >> >> Thanks for the comments, Andre-John and Peter. When I wrote that app, I >> didn't know as much as I do now, but I'm still not very knowledgeable >> about synchronized operations. >> >> The synchronized section doesn't do a whole lot, so it doesn't take long >> to process. My question is, what kinds of operations need to be >> synchronized? All I do is decrypt the data from the POST, send a small >> acknowledgement response back to the site, and write the line to the log >> file. Does that sound like something that would need to be >> synchronized? If not, pulling that out would be a really easy test to >> see if it helps my performance issue. >> >> > I am no expert in this myself, but I know enough to help me out in most day > to day scenarios. What you should be reading up on is concurrency in Java. 
A > few useful resources: > > site: http://java.sun.com/docs/books/tutorial/essential/concurrency/ > book: > http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601 > > I actually bought the book myself and find it a handy reference. > > What I can say is that any time two threads are likely to access the same > object, which has the potential to be modified by one of them, then you will > need to synchronize access to the object. If the object is only going to be > read during the life of the "unit of work", then you will need not > synchronize it. You shouldn't simply use the synchronize keyword as a > magical "solve all" for threading issues and instead need to understand what > the nature of the interactions are between the threads, if any. In certain > cases it is actually better to duplicate the necessary resources, have each > thread work on its copy and then synchronize the value at the end. > > In the case of your code, you should ask what are the shared objects that > are going to modified by the threads. You should also look if it is even > necessary for the objects to be shared. Also consider whether for the call > cycle the objects you are going to modify are only available on the stack, > as opposed to a class or instance member. > > To give you a real world analogy: consider a home that is being built and > you have an electrician and a plumber: > - is it better to have one wait until the other is finished (serial > execution)? > - is it possible for them to be working on different stuff and not be > stepping on each other's feet? (parallel execution) > - if you need them to work at the same time, what is the cost of > coordinating each other so that >they do not interfere with the other? (synchronization issues) > In many ways multi-threading is not much different, and you should be > asking yourself the same type of questions. 
> > André-John > > > > - > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > For additional commands, e-mail: users-h...@tomcat.apache.org > > -- Sincerely yours and Best Regards, Xie Xiaodong
Re: Performance with many small requests
On 7-May-2009, at 19:05, David Kerber wrote: Andre-John Mas wrote: That would be my impression too. It is best to avoid making the synchronized scope so large, unless there is a very good reason. David, do you have any reason for this? Beyond the counter, what other stuff do you synchronise? Also, it has generally been recommended to me to avoid hitting the disk in every request, since you may result with an I/O bottle neck, so if you can write the logs in batches you will have better performance. If you know that you are only going to have very few users at a time (say, less than 10), it may not be worth the time optimising this, but if you know that you are going to get at least several hundred, then this is something to watch out for. Thanks for the comments, Andre-John and Peter. When I wrote that app, I didn't know as much as I do now, but I'm still not very knowledgeable about synchronized operations. The synchronized section doesn't do a whole lot, so it doesn't take long to process. My question is, what kinds of operations need to be synchronized? All I do is decrypt the data from the POST, send a small acknowledgement response back to the site, and write the line to the log file. Does that sound like something that would need to be synchronized? If not, pulling that out would be a really easy test to see if it helps my performance issue. I am no expert in this myself, but I know enough to help me out in most day to day scenarios. What you should be reading up on is concurrency in Java. A few useful resources: site: http://java.sun.com/docs/books/tutorial/essential/concurrency/ book: http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601 I actually bought the book myself and find it a handy reference. What I can say is that any time two threads are likely to access the same object, which has the potential to be modified by one of them, then you will need to synchronize access to the object. 
If the object is only going to be read during the life of the "unit of work", then you need not synchronize it. You shouldn't simply use the synchronized keyword as a magical "solve all" for threading issues and instead need to understand what the nature of the interactions is between the threads, if any. In certain cases it is actually better to duplicate the necessary resources, have each thread work on its copy and then synchronize the value at the end.

In the case of your code, you should ask what are the shared objects that are going to be modified by the threads. You should also look if it is even necessary for the objects to be shared. Also consider whether for the call cycle the objects you are going to modify are only available on the stack, as opposed to a class or instance member.

To give you a real world analogy: consider a home that is being built and you have an electrician and a plumber:
- is it better to have one wait until the other is finished (serial execution)?
- is it possible for them to be working on different stuff and not be stepping on each other's feet? (parallel execution)
- if you need them to work at the same time, what is the cost of coordinating each other so that they do not interfere with the other? (synchronization issues)

In many ways multi-threading is not much different, and you should be asking yourself the same type of questions.

André-John
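The distinction between stack-only data and shared members, and the "work on your own copy, synchronize at the end" idea, can be sketched as follows. The class `ChecksumWorker`, the byte-sum "checksum", and the lock name are invented for illustration and have nothing to do with the real application's checksum.

```java
// Hypothetical worker: per-call state lives in locals, shared state in one guarded field.
class ChecksumWorker {
    private final Object mergeLock = new Object();
    private long grandTotal; // instance field: shared by all threads, guarded by mergeLock

    // 'sum' and 'i' are local variables, so every calling thread has its own copies
    // and the loop needs no lock. This assumes the caller does not hand the same
    // array to another thread that modifies it.
    void process(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i] & 0xFF;
        }
        // Only the brief merge into the shared field is synchronized.
        synchronized (mergeLock) {
            grandTotal += sum;
        }
    }

    long grandTotal() {
        synchronized (mergeLock) {
            return grandTotal;
        }
    }
}
```

One caveat: a local variable that holds a reference to a shared object protects nothing by itself. It is the object being reachable from only one thread that makes it safe.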
Re: Performance with many small requests
Andre-John Mas wrote:
> On 7-May-2009, at 17:28, Peter Crowther wrote:
> > > From: David kerber [mailto:dcker...@verizon.net]
> > > The tomcat application simply takes the post request, does a
> > > checksum verification of it, decrypts the lightly-encrypted data,
> > > and writes it to a log file with the timestamps and site
> > > identifiers I mentioned above. Pretty simple processing, and it
> > > is all inside a synchronized{} construct:
> > >
> > >     protected synchronized void doPost(HttpServletRequest request,
> > >             HttpServletResponse response )
> > >             throws ServletException, IOException {
> > >         synchronized ( criticalProcess ) {
> > >             totalReqCount++;
> > >             dailyReqCount++;
> > >             processRequest( request, response, false );
> > >         }
> > >     }
> >
> > Doesn't the "synchronized" in the above mean that you're essentially
> > single-threading Tomcat? So you have all this infrastructure... and
> > that sync may well be the bottleneck.
>
> That would be my impression too. It is best to avoid making the
> synchronized scope so large, unless there is a very good reason.
> David, do you have any reason for this? Beyond the counter, what
> other stuff do you synchronise?
>
> Also, it has generally been recommended to me to avoid hitting the
> disk in every request, since you may end up with an I/O bottleneck,
> so if you can write the logs in batches you will have better
> performance. If you know that you are only going to have very few
> users at a time (say, fewer than 10), it may not be worth the time
> optimising this, but if you know that you are going to get at least
> several hundred, then this is something to watch out for.

Thanks for the comments, Andre-John and Peter. When I wrote that app,
I didn't know as much as I do now, but I'm still not very knowledgeable
about synchronized operations. The synchronized section doesn't do a
whole lot, so it doesn't take long to process.

My question is, what kinds of operations need to be synchronized? All I
do is decrypt the data from the POST, send a small acknowledgement
response back to the site, and write the line to the log file. Does
that sound like something that would need to be synchronized? If not,
pulling that out would be a really easy test to see if it helps my
performance issue.

Thanks!
D

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
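On the question of what needs to be synchronized: as a general rule, only state shared between request threads does. In the code quoted in this thread that looks like the two counters and the shared log file, while checksum verification and decryption work on per-request data. The following is a minimal sketch of that split, not the poster's actual code. The class `NarrowLockHandler`, the `decode` stand-in and the `logLock` object are all hypothetical, and the sketch assumes `processRequest` touches no other shared state, which the thread does not confirm. `AtomicLong` is available from Java 5, which matches the Java 1.5 runtime mentioned in the original post.

```java
import java.io.PrintWriter;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the servlet; all names here are illustrative.
class NarrowLockHandler {
    // Atomic counters can be incremented from many threads without a lock.
    private final AtomicLong totalReqCount = new AtomicLong();
    private final AtomicLong dailyReqCount = new AtomicLong();

    // Guards only the shared log writer.
    private final Object logLock = new Object();
    private final PrintWriter log;

    NarrowLockHandler(PrintWriter log) {
        this.log = log;
    }

    // Placeholder for checksum verification and decryption. It uses only
    // its argument and local variables, so it needs no synchronization.
    static String decode(String payload) {
        return new StringBuilder(payload).reverse().toString();
    }

    void handle(String payload) {
        totalReqCount.incrementAndGet();
        dailyReqCount.incrementAndGet();

        String line = decode(payload); // runs concurrently across threads

        // Only the write is serialized, so the timestamp and the data
        // of one request cannot interleave with another request's output.
        synchronized (logLock) {
            log.print(System.currentTimeMillis());
            log.print(' ');
            log.println(line);
        }
    }

    long total() { return totalReqCount.get(); }
    long daily() { return dailyReqCount.get(); }
}
```

With this shape, several requests can be decoded at the same time and threads queue only for the short log write.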
Re: Performance with many small requests
On 7-May-2009, at 17:28, Peter Crowther wrote:
> > From: David kerber [mailto:dcker...@verizon.net]
> > The tomcat application simply takes the post request, does a
> > checksum verification of it, decrypts the lightly-encrypted data,
> > and writes it to a log file with the timestamps and site
> > identifiers I mentioned above. Pretty simple processing, and it is
> > all inside a synchronized{} construct:
> >
> >     protected synchronized void doPost(HttpServletRequest request,
> >             HttpServletResponse response )
> >             throws ServletException, IOException {
> >         synchronized ( criticalProcess ) {
> >             totalReqCount++;
> >             dailyReqCount++;
> >             processRequest( request, response, false );
> >         }
> >     }
>
> Doesn't the "synchronized" in the above mean that you're essentially
> single-threading Tomcat? So you have all this infrastructure... and
> that sync may well be the bottleneck.

That would be my impression too. It is best to avoid making the
synchronized scope so large, unless there is a very good reason. David,
do you have any reason for this? Beyond the counter, what other stuff
do you synchronise?

Also, it has generally been recommended to me to avoid hitting the disk
in every request, since you may end up with an I/O bottleneck, so if
you can write the logs in batches you will have better performance. If
you know that you are only going to have very few users at a time (say,
fewer than 10), it may not be worth the time optimising this, but if
you know that you are going to get at least several hundred, then this
is something to watch out for.

André-John
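The suggestion to write the logs in batches could be sketched roughly as follows. This is one possible approach under stated assumptions, not something described in the thread: request threads only enqueue a line, and a single writer thread (not shown) would call `flushTo` periodically, for example once a second. The class `BatchingLog` and its method names are hypothetical. `LinkedBlockingQueue` and `drainTo` are part of `java.util.concurrent` from Java 5.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical batching log; names are illustrative only.
class BatchingLog {
    // Thread-safe queue of lines waiting to be written.
    private final BlockingQueue<String> pending =
            new LinkedBlockingQueue<String>();

    // Called by request threads: no disk access, just an enqueue.
    void append(String line) {
        pending.add(line);
    }

    // Called by one writer thread. Moves everything queued so far into a
    // local list, writes it in one go, and returns the number of lines.
    int flushTo(PrintWriter out) {
        List<String> batch = new ArrayList<String>();
        pending.drainTo(batch);
        for (String line : batch) {
            out.println(line);
        }
        out.flush();
        return batch.size();
    }
}
```

The trade-off is that lines still in the queue are lost if the JVM stops before the next flush, and the queue in this sketch is unbounded, so a stalled writer would let memory grow.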
RE: Performance with many small requests
> From: David kerber [mailto:dcker...@verizon.net]
> The tomcat application simply takes the post request,
> does a checksum verification of it, decrypts the
> lightly-encrypted data,
> and writes it to a log file with the timestamps and site identifiers I
> mentioned above. Pretty simple processing, and it is all inside a
> synchronized{} construct:
>
>     protected synchronized void doPost(HttpServletRequest request,
>             HttpServletResponse response )
>             throws ServletException, IOException {
>         synchronized ( criticalProcess ) {
>             totalReqCount++;
>             dailyReqCount++;
>             processRequest( request, response, false );
>         }
>     }

Doesn't the "synchronized" in the above mean that you're essentially
single-threading Tomcat? So you have all this infrastructure... and
that sync may well be the bottleneck.

You could detect this by taking a thread dump in the middle of the day,
and seeing whether a significant number of threads were waiting on
either of your sync objects. If there are a significant number,
consider re-engineering this critical piece of your application to be
multi-threaded :-).

- Peter
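A thread dump is the direct way to run the check Peter describes. As a complement, the same information can be sampled from inside the JVM with the `java.lang.management` API (available from Java 5). The sketch below is an assumption about how one might do that, not something from the thread: the class `BlockedThreadCounter` is hypothetical, and how it would be invoked (a diagnostic page, a timer) is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; the class and method names are illustrative.
class BlockedThreadCounter {
    // Counts threads currently in the BLOCKED state, i.e. waiting to
    // enter a synchronized block or method, and prints what each one
    // is waiting for.
    static int countBlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds());
        int blocked = 0;
        for (ThreadInfo info : infos) {
            // An entry is null if the thread ended before it was sampled.
            if (info != null
                    && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
                System.out.println(info.getThreadName()
                        + " is waiting for " + info.getLockName());
            }
        }
        return blocked;
    }
}
```

If most request threads repeatedly show up as blocked on the servlet instance or on `criticalProcess`, that would support the single-threading theory.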
Performance with many small requests
I'm having performance issues with my installation of TC 5.5.15, Java
1.5.0_12, on Windows 2003 server 32 bit, dual-cpu dual-core (4 cores
total), 4GB physical RAM.

Tomcat startup params:
JvmMs = 256
JvmMx = 512
JvmSs = 0

This was the original entry in my server.xml, which has been running
for the last year:

Just today, I changed it to this, to see if it helps:

The performance issue (see description below) has been there all along
to a greater or lesser extent, but it just recently became enough of an
aggravation for me to try to do something about it, which is why I made
the changes to the connector settings.

Our application is a data collection server. There are approx 350 sites
around the US that transmit a small data packet to us every time a
piece of equipment cycles on and off. The transmission is an HTTP POST
request, with a data payload of about 60 bytes on average (always less
than 100 bytes). All the transmissions go through the customer's
corporate network, and out their single internet gateway several states
away from us. The total number of data transmissions runs approx 2
million per day, totaling around 200MB in the data log files (including
some time stamps and a couple of identifiers added to the raw data).
The vast majority of sites are 24 hour operations, so the data never
stops flowing.

The tomcat application simply takes the post request, does a checksum
verification of it, decrypts the lightly-encrypted data, and writes it
to a log file with the timestamps and site identifiers I mentioned
above. Pretty simple processing, and it is all inside a synchronized{}
construct:

    protected synchronized void doPost(HttpServletRequest request,
            HttpServletResponse response )
            throws ServletException, IOException {
        synchronized ( criticalProcess ) {
            totalReqCount++;
            dailyReqCount++;
            processRequest( request, response, false );
        }
    }

What is happening is that the data transmissions gradually fall behind
during the course of the day, to the point that some are 3 or 4 hours
behind by the end of the work day, while others are up to the minute,
with a full range in between. Then they all gradually catch up
overnight.

I can't find the bottleneck with any tools at my disposal, though I
suspect it's the customer's gateway that is the limiting factor.
However, I can't go back to them until I rule out all the stuff under
my control. So, here's what I've checked so far:

Even during the day, our internet connection bandwidth usage rarely
goes over 60%, and when it does, it never stays there for any length of
time.

The Cisco router/firewall handling the internet connections averages
about 12% cpu usage, and < 30% memory usage.

The internal network is all 1Gb from the first switch inside the
router, all the way to the TC server.

The tomcat instance (tomcat5.exe) on the server never goes over 2% CPU
usage, and the memory usage in task manager runs around 300MB
(significantly less than the 512 MB I've allowed the JVM). The total
memory usage (commit charge) listed in task manager runs right at 1GB.

Any and all suggestions for things to check or settings to modify
gratefully welcomed!

D
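The traffic figure in the post above allows a rough check of how much time a fully serialized `doPost` can afford per request. The sketch below is plain arithmetic on the stated 2 million requests per day; the class `SerializedBudget` is hypothetical, and the result is a daily average only, so the real budget at daytime peaks would be tighter.

```java
// Hypothetical helper for a back-of-the-envelope estimate.
class SerializedBudget {
    private static final double SECONDS_PER_DAY = 86400.0;

    // Average arrival rate in requests per second.
    static double requestsPerSecond(long requestsPerDay) {
        return requestsPerDay / SECONDS_PER_DAY;
    }

    // Longest average time, in milliseconds, that one request may hold
    // the lock before a handler that processes one request at a time
    // starts to fall behind the average arrival rate.
    static double maxMillisPerRequest(long requestsPerDay) {
        return 1000.0 / requestsPerSecond(requestsPerDay);
    }
}
```

For 2,000,000 requests per day this gives about 23 requests per second on average, or roughly 43 ms per request. Any time `processRequest` spends waiting on the network or the disk while the lock is held counts against that budget, even though the CPU stays almost idle.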