Hi David

Thanks for the explanation.

I am currently using JMeter for performance testing.

When I say performance is degraded, I mean the 90th percentile for a request 
increased from 0.08 seconds (for 50 users) to 18 seconds (for 400 users).

When I increased the number of threads from 32 (the default) to 100, the 90th 
percentile for the same request stayed consistent at around 0.08 seconds (for 
400 users). I ran the test many times.

I am currently using MarkLogic 6, so I can't see the Monitoring History, which 
is only available in MarkLogic 7.

My current server has 16 cores.

My major concern here is: as soon as I start the performance test using 
JMeter, all 32 threads become active even though the request count is only 1 
to 5. When I change the thread count to 100, the active threads reach 58 
throughout the test and come down once the test is completed.

My reasoning is that once all 32 threads are active, the server is not able to 
open any new connections, which is why latency increases with a thread count 
of 32. When we increased the thread count to 100, the active threads peaked at 
58, so the server was able to open new connections and serve the requests.

I am running this test directly against the ELB, i.e. hitting the 
MarkLogic app-server endpoint through the ELB. No other application is 
involved.

Regards,
Gnana(GP)

Message: 2
Date: Tue, 21 Oct 2014 15:10:37 +0000
From: David Lee <david....@marklogic.com>
Subject: Re: [MarkLogic Dev General] Performance issue based on no. of
        app-server threads
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID:
        <6ad72d76c2d6f04d8be471b70d4b991e04cd4...@exchg10-be02.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Hi Gnanaprakash
You haven't provided enough details on exactly how you are measuring 
"performance", e.g. "and performance is degraded" and "report looks good".
What tools are you using?  In addition to whatever you're using, I highly 
recommend taking a good look at ML's Monitoring History and, since you are on 
AWS, CloudWatch.  They can really help you get a comprehensive picture of 
what's going on.

But from what you have provided I can add some information that may be helpful.
The "maximum threads" setting on each app server is the maximum number of 
threads that will respond to requests at any one time.  Depending on your 
application (whether it is CPU or IO bound, for example), more threads may or 
may not help overall performance, and too many can hurt it.
Suppose your app is mostly CPU bound (ML apps generally are not, but take it 
as an example).
Then the maximum total throughput of your system will depend on the number of 
processor cores.
If it's an 8-core system, the absolute maximum work the computer could 
theoretically do is 8x the work of a 1-core system.  Adding 10,000 threads 
doesn't change that; it makes things worse, because now the system is doing 
application work AND having to swap between threads.
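As a back-of-the-envelope sketch of that ceiling (the numbers below are 
illustrative assumptions, not measurements from any real server):

```python
# Throughput ceiling for a purely CPU-bound workload: each core can only
# deliver one second of CPU work per second, so threads beyond the core
# count add scheduling overhead without raising the ceiling.

def max_throughput(cores, cpu_seconds_per_request):
    """Theoretical best-case requests/sec for a CPU-bound workload."""
    return cores / cpu_seconds_per_request

# 8 cores, each request needing 100 ms of pure CPU time:
print(max_throughput(8, 0.100))  # ~80 req/s, regardless of thread count
```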

The other extreme: if your app is entirely IO bound and each request just sits 
waiting most of the time, then even on a 1-core system you can effectively use 
many threads.  The latency won't change much, however, until you overload the 
system.  E.g. if it takes 100ms to do a search, adding 100 threads is not 
going to make any one thread go faster.  But you MAY be able to push total 
throughput (total aggregated requests/sec) higher with more threads.
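That is Little's law in action; here is a minimal sketch with assumed numbers 
(the 100 ms per search is illustrative, not measured from MarkLogic):

```python
# Little's law (L = lambda * W): for an IO-bound workload, adding
# concurrent threads raises aggregate throughput while per-request
# latency stays flat, up until the point of saturation.

def aggregate_throughput(concurrent_threads, latency_seconds):
    """lambda = L / W, in requests per second."""
    return concurrent_threads / latency_seconds

latency = 0.100  # each search spends ~100 ms waiting on IO
print(aggregate_throughput(1, latency))    # 1 thread: ~10 req/s
print(aggregate_throughput(100, latency))  # 100 threads: ~1000 req/s, same latency
```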

Real-world applications are in between: some CPU, some memory, some IO.
To find a good range for the number of threads, it helps to use the monitoring 
tools I mentioned.
There you can see where the time is spent, then run tests with an increasing 
number of threads and compare.
At some point you will see that you get no more work out of the system, and 
then you will see latency rise, because each thread is waiting on other 
threads instead of the disk or the CPU.
Eventually the latency will rise to the point that requests time out.  At that 
point you have pushed the system so far that it can't keep up.  You don't want 
to do that :)
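The sweep described above can be caricatured in a toy model (all numbers are 
assumptions for illustration; real tuning should use JMeter plus the 
monitoring tools):

```python
# Toy model of a thread-count sweep: throughput climbs, plateaus at the
# CPU ceiling, and once the plateau is hit, extra threads only add latency.

def model(threads, cores=16, cpu_s=0.02, io_s=0.08):
    service = cpu_s + io_s          # ~100 ms per request when unloaded
    ceiling = cores / cpu_s         # CPU-bound throughput limit
    offered = threads / service     # load the threads try to push
    tput = min(offered, ceiling)
    latency = threads / tput        # Little's law: W = L / lambda
    return tput, latency

for n in (8, 16, 32, 64, 128):
    tput, lat = model(n)
    print(f"{n:4d} threads: {tput:6.0f} req/s, {lat * 1000:5.0f} ms")
```

With these made-up numbers, throughput stops growing around 80 threads, and 
latency climbs from there, which is exactly the shape to look for in the 
monitoring graphs.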

A good number of threads will depend on your app.  An important consideration 
is that it takes an available thread to "accept" an inbound TCP request.  If 
no threads are available, the TCP request will time out on connect (a much 
shorter timeout).  This is important: it saves the server from accepting more 
work than it can handle.  It's a safety valve so that you don't overload the 
server, and so that clients get a strong indication of whether the request is 
going to work or not.  If you are pushing your system too hard, it's generally 
better for a request to not even start (so you get a clean error condition and 
don't further block the system) than it is to start a request that gets 
halfway and can't complete in time (which then slows down everything, possibly 
making other requests fail too).
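From the client side, that fast-fail on connect looks roughly like this (a 
standard-library sketch; the host and port are hypothetical placeholders, not 
MarkLogic defaults):

```python
import socket

def try_request(host="app-server.example.com", port=8000, connect_timeout=3.0):
    """Attempt a TCP connect and fail fast if no server thread accepts it."""
    try:
        # If the server has no free app-server thread to accept the
        # connection, this fails quickly instead of queueing a doomed request.
        with socket.create_connection((host, port), timeout=connect_timeout):
            return "connected"
    except OSError:  # covers refused connections and socket.timeout
        return "refused-or-timed-out"
```

The short connect timeout is what turns an overloaded server into a clean, 
early client-side error rather than a half-finished request.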

Finally, there is a short "keep alive" setting you can change.  Once a thread 
has accepted a request, it keeps the socket open for a short period of time 
(HTTP 1.1 protocol).  This makes requests more efficient than having to reopen 
a new socket for each request.  The side effect is a short window where the 
system holds a socket open, optimizing for a follow-up request; if the client 
doesn't send one but keeps the socket open too, then that socket is tied up 
for a while.
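The keep-alive behaviour can be seen with nothing but the Python standard 
library; the tiny server below is a stand-in for an app server, not MarkLogic 
itself:

```python
# Demonstrates HTTP/1.1 keep-alive: one client socket serving two requests.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # enables keep-alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for _ in range(2):                  # both requests reuse the same socket
    conn.request("GET", "/")
    data = conn.getresponse().read()
conn.close()
server.shutdown()
print(data)
```

Between the two requests, the server-side thread is holding that socket open, 
which is exactly the window the keep-alive setting controls.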


Put that together and you should be able to find a good number of threads for 
your application by balancing total throughput against tolerable latency.  
Where that balance should sit depends on your use case.  It's also critical to 
think about whether you are tuning for sustained continuous load or for peak 
load.
The system can handle short periods of high traffic at rates it may not be 
able to sustain continuously.  Also consider what worst-case failure mode you 
want.  If you drive the system at, say, 10x more than it can handle, what do 
you want to happen?  Generally people want new requests to be rejected rather 
than the entire system becoming overloaded, but your app may vary; maybe it's 
more important to handle an occasional peak load with a higher thread count 
than to make sure you never over-stress the system.  I don't recommend that 
without thorough testing.  If you set things too high (threads, memory use, 
etc.) you may exceed the capacity of the OS itself, which could cause 
memory-allocation, thread-creation, or file-handle errors, and that can make 
the system stop completely.  (You won't lose any committed transactions, but 
if ML can't allocate more memory, threads, or files, current transactions can 
be aborted and the OS itself might kill off the server or become 
unresponsive.)  That is generally not desirable; it is much better to throttle 
things before they get to that point, so the system keeps functioning, albeit 
slower, when given too much work to do.







-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
d...@marklogic.com
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of 
gnanaprakash.bodire...@cognizant.com
Sent: Tuesday, October 21, 2014 9:40 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Performance issue based on no. of app-server 
threads

Hi

I am facing a performance issue when we have the default 32 threads for our 
application server.

We are running performance tests with 50/100/400 concurrent users, and 
performance degrades a lot with 400 concurrent users.

After increasing the thread count from 32 to 100, performance is consistent 
for 50/100/400 concurrent users and the report looks good.

I have observed the following as soon as the performance test kicks off.

When the thread count is 32:
The active thread count rises to 32 per node, but requests/updates are not 
more than 5.

When the thread count is 100:
The active thread count rises to 50-58 per node, but requests/updates are 
still not more than 5.

The only difference is that with 100 we still have threads to spare, and 
performance looks good.

We are seeing more latency at the ELB when the thread count is 32, as it takes 
more time to get connections.  When we increase it to 100, the latency 
disappears.

Can someone help me understand why threads are busy even though there are so 
few requests?  Are there any configuration changes I need to make to reduce 
latency with 32 threads?

Server Details:
We are using a cluster of 3 nodes with 16 cores each.

Regards,
Gnanaprakash Bodireddy
Sr. Associate - Projects, Dublin Ireland
Mob: +353 87 964 9549
This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful. Where permitted by applicable law, this e-mail 
and other e-mail communications sent to and from Cognizant e-mail addresses may 
be monitored.
------------------------------

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


End of General Digest, Vol 124, Issue 60
****************************************
