Hi Jess,

I ran some simple tests and was not able to reproduce your performance
observations. I did observe a couple of strange things, but I doubt
they are relevant to most use cases.

First my setup:

Apache 2.0.59 worker with mod_jk 1.2.20 and Tomcat 5.5.17 with normal
(non-APR) connectors, using Java 1.5.0_06 on an early release of Solaris
10. The hardware is a Sun T-2000 (Niagara), which means relatively slow
CPUs but good scalability.

I did not have exclusive use of the system, but it was mostly idle
during the test.

Client: ab from Apache 2.0.59. All ab measurements have been verified
with "%D" in the Apache access log. No restarts between measurements, so
the file was most likely coming from the file system cache.

Client running either on the same machine, or on a SLES 9 SP2, 64Bit AMD
Opteron connected by 100MBit Ethernet.

Apache and mod_jk were compiled with "-mcpu=v9 -O2 -g -Wall". Apache,
mod_jk and Tomcat use their default configuration (apart from ports and
log format); the JVM for Tomcat runs with a couple of non-default values:

-server \
-Xms64m -Xmx64m \
-XX:NewSize=8m -XX:MaxNewSize=8m \
-XX:SurvivorRatio=6 -XX:MaxTenuringThreshold=31 \
-XX:+UseConcMarkSweepGC -XX:-UseAdaptiveSizePolicy

File to test throughput had size 316702480 bytes (some .tar.gz I found
lying around).

1) local client, i.e. client running on the same machine as Apache and
Tomcat

A single request took 15.71 sec via mod_jk (= 153.8 MBit/sec) and 15.61
sec via TC HTTP direct (= 154.8 MBit/sec). Ten consecutive, non-parallel
requests took 157.1 sec and 156.8 sec respectively, so this result seems
to be stable.
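
As a cross-check of the figures above, here is a minimal Python sketch
of the conversion from transfer time to throughput. It assumes that
"MBit" here means 2^20 bits, which is the convention that reproduces the
quoted numbers; the function name is only illustrative.

```python
FILE_SIZE_BYTES = 316702480  # size of the test file used in these runs


def throughput_mbit(size_bytes, seconds):
    """Throughput in MBit/sec, assuming 1 MBit = 2**20 bits."""
    return size_bytes * 8 / (2 ** 20) / seconds


print(round(throughput_mbit(FILE_SIZE_BYTES, 15.71), 1))  # mod_jk
print(round(throughput_mbit(FILE_SIZE_BYTES, 15.61), 1))  # TC HTTP direct
```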

Now parallel requests: I used a concurrency ("-c" with ab) of 2, 4, 8,
16 and 32, each with twice as many requests (4, 8, ...).
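
For anyone who wants to repeat this series, the ab invocations can be
generated with a small Python sketch. The URL is a made-up placeholder,
and the use of "-n" for the request count is my reading of the
description, not a copy of the exact commands that were run.

```python
def ab_command(concurrency, url):
    """Build an ab command line with twice as many requests as clients."""
    return ["ab", "-c", str(concurrency), "-n", str(2 * concurrency), url]


# Hypothetical URL; replace it with the real location of the test file.
URL = "http://localhost:8080/test/bigfile.tar.gz"

for conc in (2, 4, 8, 16, 32):
    print(" ".join(ab_command(conc, URL)))
```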

Throughput results in MBit/sec, depending on concurrency:

conc.  mod_jk    http
    1   153.8   154.8
    2   306.3   303.6
    4   605.5   627.7
    8  1090.0  1185.5
   16  1137.7  1161.8
   32  1210.7  1114.3

mod_jk and HTTP direct behave almost the same for the huge file. We
saturate the system at about 1100 MBit/second (going via loopback). CPU
was busy at most 60% during these tests.
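
To make the saturation point easier to see, the following Python sketch
computes the speedup relative to a single request and the per-client
efficiency from the table above. The variable and function names are
only illustrative.

```python
# Throughput in MBit/sec from the table above: concurrency -> (mod_jk, http)
RESULTS = {
    1: (153.8, 154.8),
    2: (306.3, 303.6),
    4: (605.5, 627.7),
    8: (1090.0, 1185.5),
    16: (1137.7, 1161.8),
    32: (1210.7, 1114.3),
}


def speedup(column):
    """Throughput at each concurrency relative to a single request."""
    base = RESULTS[1][column]
    return {conc: row[column] / base for conc, row in RESULTS.items()}


for name, column in (("mod_jk", 0), ("http", 1)):
    for conc, factor in speedup(column).items():
        print(f"{name:6s} conc={conc:2d} speedup={factor:5.2f} "
              f"efficiency={factor / conc:4.2f}")
```

Throughput scales almost linearly up to 8 parallel requests; at 16 and
beyond the efficiency falls below one half, which matches the
saturation described above.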

This also shows that mod_jk and HTTP throughput is enough to saturate a
lot of bandwidth, as long as your IP stack doesn't add too much overhead.

2) remote client, i.e. ab running on the SLES 9 SP2 x86_64 machine,
connected via 100MBit to Apache and Tomcat.

Throughput results in MBit/sec, depending on concurrency:

conc.  mod_jk    http
    1    88.6    89.1
    2    88.9    89.1
So even a single request at a time saturates the network, and it does
not make sense to measure more than two parallel requests.
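
How close this is to the nominal link rate can be estimated with a short
Python sketch. It assumes that the measured figures use 2^20 bits per
MBit (the convention that matches the first test) while the 100MBit
Ethernet rating uses 10^6 bits; both the assumption and the names are
mine.

```python
LINK_MBIT_DECIMAL = 100.0  # nominal Ethernet rate, in 10**6 bits per second


def to_decimal_mbit(mbit_binary):
    """Convert MBit/sec based on 2**20 bits to MBit/sec based on 10**6 bits."""
    return mbit_binary * 2 ** 20 / 10 ** 6


for name, measured in (("mod_jk", 88.6), ("http", 89.1)):
    decimal = to_decimal_mbit(measured)
    print(f"{name}: {decimal:.1f} of {LINK_MBIT_DECIMAL:.0f} MBit/sec nominal")
```

Under that assumption both paths use roughly 93% of the nominal rate,
with the remainder presumably going to protocol overhead.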

3) Dependency on file size:

Throughput in MBit/sec, measured with a local client without
concurrency, for file sizes from 50 MB to 2000 MB:

  MB   mod_jk  http
  50   167.5   234.9 (5 consecutive requests)
 100   168.8   170.1 (5 consecutive requests)
 200   168.6   169.8 (2 consecutive requests)
 300   169.1   169.7 (2 consecutive requests)
 400   168.9   169.7 (2 consecutive requests)
 500   168.8   169.4 (2 consecutive requests)
 600   167.9   168.0 (2 consecutive requests)
 700   167.8   168.9 (2 consecutive requests)
 800   168.1   168.6 (2 consecutive requests)
 900   168.0   168.0 (2 consecutive requests)
1000   156.2   214.9 (2 consecutive requests)
2000   156.9   214.7 (1 request)

Interestingly, the results for 1000 MB and 2000 MB are reproducible. But
as soon as I switch the client from "ab" to wget or curl (writing output
to /dev/null), the mod_jk numbers stay the same, while the HTTP numbers
drop to the same level as mod_jk!

The numbers are slightly better than in the first test. I guess this is
because this test used a file in the webapps file system, whereas the
first test used a file in another file system, symlinked from within
webapps (but still a local fs). Another possibility is that a file
generated by mkfile has a better block layout in the fs than an ordinary
file that grew over time.

All in all I think that throughput for huge files is very good in both
cases. I would expect that in most cases it is much more interesting to
inspect scalability and system load (CPU/memory) under massive
concurrency. When serving large files, downloads run for a long time
because the client side of the connection is usually not a fat line. As
a result, concurrent users accumulate, so one might need to serve a few
thousand users at once.

Regards,

Rainer

Jess Holle wrote:
> Mladen Turk wrote:
>> Jess Holle wrote:
>>> We're seeing a *serious* performance issue with mod_jk and large
>>> (e.g. 500MB+) file transfers.  [This is with Apache 2.0.55, Tomcat
>>> 5.0.30, and various recent mod_jk including 1.2.20.]
>>
>> SunOS dev12.qa.atl.jboss.com 5.9 Generic_118558-25 sun4u sparc
>> SUNW,Sun-Fire-V210
>>
>> Tomcat:8080
>> Total transferred:      1782932700 bytes
>> HTML transferred:       1782908800 bytes
>> Requests per second:    5.60 [#/sec] (mean)
>>
>> Apache-mod_jk-Tomcat:8009
>> Total transferred:      1782935400 bytes
>> HTML transferred:       1782908800 bytes
>> Requests per second:    3.68 [#/sec] (mean)
> I'm re-reading this once again and:
> 
>   1. This seems like a fairly substantial degradation for an optimized
>      proxy hop, which is what AJP is.
>   2. I'm interested in MB/sec for 500+ MB (generally binary) download
>      transfers, not small HTML pages.
>          * This is significant in that the performance of the transfer
>            seems to degrade the larger the transfer is.
> 
>> Anyhow, why would you like to serve the 500+ MB
>> files through mod_jk? The entire point is that you
>> have the option to separate the static and dynamic
>> content.
> We have a large, complex content store behind this with dynamic
> Java-based access control logic, etc.  Also contents change over time
> with new check-ins, etc, ala normal version control concepts.  While we
> have more complex things going on in our actual system, the behavior is
> quite reproducible by just dropping a >500MB file in an expanded web app
> doc base and requesting it (with JkMount settings appropriate to ensure
> this is served by mod_jk rather than directly by Apache).
> 
> The files can be of any type, but our market involves a lot of CAD data,
> which can be well over 1GB in size.
> 
> -- 
> Jess Holle
> 
> 
