Hi David,

Thanks for the explanation.
I am currently using JMeter for performance testing. When I say performance is degraded, I mean the 90th percentile for a request increased from 0.08 seconds (for 50 users) to 18 seconds (for 400 users). When I increased the number of threads from 32 (the default) to 100, the 90th percentile for the same request stayed consistent at around 0.08 seconds (for 400 users). I ran the test many times. I am currently using MarkLogic 6, so I can't see Monitoring History, which is only available in MarkLogic 7. My current server has 16 cores.

My major concern here is: as soon as I start the performance test with JMeter, all 32 threads are active even though the request count is only 1 to 5. When I change the thread count to 100, the active threads reach 58 throughout the test and come down once the test is completed. Logically, when all 32 threads are active the server is not able to open any new connections, hence latency increases when the thread count is 32. When we increased the thread count to 100, active threads reached only 58, so the server was able to open new connections and serve the requests.

I am running this test directly against the ELB, i.e. hitting the MarkLogic app-server endpoint directly through the ELB. No other application is involved.

Regards,
Gnana (GP)

Message: 2
Date: Tue, 21 Oct 2014 15:10:37 +0000
From: David Lee <david....@marklogic.com>
Subject: Re: [MarkLogic Dev General] Performance issue based on no. of app-server threads
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <6ad72d76c2d6f04d8be471b70d4b991e04cd4...@exchg10-be02.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Hi Gnanaprakash,

You haven't provided enough detail on exactly how you are measuring "performance", e.g. "performance is degraded" and "report looks good". What tools are you using? In addition to whatever you're using, I highly recommend taking a good look at ML's Monitoring History and, since you are on AWS, CloudWatch.
They can really help you get a comprehensive picture of what's going on. But from what you have provided, I can add some information that may be helpful.

The "maximum threads" setting on each app server is the maximum number of threads that will respond to requests at any one time. Depending on your application - whether it is CPU- or IO-bound, for example - more threads may or may not help overall performance, and too many can hurt it.

Suppose your app is mostly CPU-bound (generally ML apps are not, but as an example). Then the maximum total throughput of your system depends on the number of processor cores. If it's an 8-core system, the absolute maximum work the computer could theoretically do is 8x the work of a 1-core system. Adding 10,000 threads doesn't change that - it makes things worse, because now the system is doing application work AND having to swap between threads.

At the other extreme, if your app is entirely IO-bound and each request just sits waiting most of the time, then even on a 1-core system you can effectively use many threads. The latency won't change much, however, until you overload the system. E.g. if it takes 100 ms to do a search, adding 100 threads is not going to make any one thread go faster. But you MAY be able to achieve higher total throughput (total aggregated requests/sec) with more threads. Real-world applications are in between - some CPU, some memory, some IO.

To find a good range for the number of threads, it helps to use the monitoring tools I mentioned. There you can see where the time is spent, then run tests increasing the number of threads and comparing. At some point you will see that you get no more work out of the system, and then you will see latency rise, because each thread is waiting on other threads instead of on the disk or the CPU. At some point the latency will rise to where requests time out. At that point you have pushed the system so far that it can't keep up.
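The IO-bound extreme above can be sketched directly: with purely waiting work, adding threads leaves each request's latency unchanged but raises aggregate throughput. A toy illustration with made-up timings, not a MarkLogic benchmark:

```python
# Each "request" just waits, like a thread blocked on network or disk.
# Per-request latency stays ~50 ms either way; what changes with more
# threads is how many requests complete per second in aggregate.
import time
from concurrent.futures import ThreadPoolExecutor

def io_bound_request():
    time.sleep(0.05)          # stand-in for a 50 ms search/fetch
    return 0.05

def run(n_requests, n_threads):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(lambda _: io_bound_request(), range(n_requests)))
    return time.perf_counter() - start

serial = run(8, 1)    # ~0.40 s: requests queue behind one thread
parallel = run(8, 8)  # ~0.05 s: all 8 wait concurrently
print(f"1 thread: {serial:.2f}s, 8 threads: {parallel:.2f}s")
```

As a rough rule of thumb (Little's Law), average concurrency is arrival rate times average latency: sustaining, say, 500 requests/sec at 80 ms each keeps about 40 threads busy on average. That kind of estimate is only a starting point before measuring.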
You don't want to do that :)

A good number of threads will depend on your app. It's important to consider that it takes an available thread to "accept" an inbound TCP request. If no threads are available, the TCP request will time out on connect (a much shorter timeout). This is important: it saves the server from accepting more work than it can do anything with. It's a safety valve, so that you don't overload the server and so that the client connections get a strong indication of whether the request is going to work or not. If you are pushing your system too hard, it's generally better for a request to not even start (so you get a clean error condition and don't further block the system) than it is to get a request started that goes halfway and can't complete in time (which then slows down everything, possibly making other requests fail too).

Finally, there is a short "keepalive" setting you can change. Once a thread has accepted a request, it keeps the socket open for a short period of time (HTTP 1.1 protocol). This makes requests more efficient than having to open a new socket for each request. But the side effect is a short window where the system holds a socket open hoping to optimize for a new request; if the client doesn't send one but keeps the socket open too, that socket is blocked for a while.

Put that together, and you should be able to find a good number of threads for your application by balancing total throughput against tolerable latency. Where that balance should sit depends on your use case. It's also critical to think about whether you are tuning for sustained continuous load or for peak load. The system can handle short periods of high traffic at rates it may not be able to sustain continuously.

Also consider what worst-case failure mode you want. If you drive the system at, say, 10x more than it can handle, what do you want to happen?
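One generic answer is the fail-fast pattern the "safety valve" above implements. A sketch of the idea, not MarkLogic internals: hand out a fixed number of permits and reject immediately when none are free, so the client gets a fast, clear error instead of a half-finished request:

```python
# Fail-fast admission control: a fixed pool of permits; when all are
# busy, new work is rejected at once instead of queuing indefinitely.
import threading

class Throttle:
    def __init__(self, max_concurrent):
        self._permits = threading.Semaphore(max_concurrent)

    def try_run(self, fn):
        """Run fn if a permit is free; otherwise reject immediately."""
        if not self._permits.acquire(blocking=False):
            return None, "rejected"      # fast, unambiguous error for the caller
        try:
            return fn(), "ok"
        finally:
            self._permits.release()

gate = Throttle(2)
result, status = gate.try_run(lambda: 42)
print(result, status)   # 42 ok
```

The design choice mirrors the connect-timeout behavior described above: rejected callers can back off and retry, while admitted work finishes within its latency budget.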
Generally people want new requests to be rejected rather than the entire system becoming overloaded, but your app may vary; maybe it's more important to handle an occasional peak load with higher threads than to make sure you never over-stress the system. I don't recommend that without thorough testing. If you set things too high (threads, memory use, etc.), you may run over the capacity of the OS itself, which could cause memory allocation, thread creation, or file handle errors; this can make the system stop completely. (You won't lose any committed transactions, but if ML can't allocate more memory, threads, or files, current transactions can be aborted, and the OS itself might kill off the server or become unresponsive.) Generally not desirable. It is much better to throttle things before it gets to that point, so the system keeps functioning, albeit more slowly, when given too much work to do.

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
d...@marklogic.com
Phone: +1 812-482-5224
Cell: +1 812-630-7622
www.marklogic.com<http://www.marklogic.com/>

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of gnanaprakash.bodire...@cognizant.com
Sent: Tuesday, October 21, 2014 9:40 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Performance issue based on no. of app-server threads

Hi,

I am facing a performance issue when we have the default 32 threads for our application server. We are running performance tests with 50/100/400 concurrent users, and performance degrades a lot when we run with 400 concurrent users. By increasing the thread count from 32 to 100, performance is consistent for 50/100/400 concurrent users and the report looks good.
I have observed the following as soon as the performance test kicks off.

When the thread count is 32: the active thread count increases to 32 per node, but requests/updates are not more than 5.
When the thread count is 100: the active thread count increases to 50-58 per node, but requests/updates are not more than 5.

The only difference is that because we still have threads to spare when we increase the count to 100, performance looks good. We are seeing more latency at the ELB when the thread count is 32, as it takes more time to get connections. When we increase the count to 100, the latency is gone.

Can someone help me understand why threads are busy even though requests are few? Are there any configuration changes I need to make to reduce latency with 32 threads?

Server details: we are using a cluster of 3 nodes with 16 cores each.

Regards,
Gnanaprakash Bodireddy
Sr. Associate - Projects, Dublin, Ireland
Mob: +353 87 964 9549

This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient(s), please reply to the sender and destroy all copies of the original message. Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this email, and/or any action taken in reliance on the contents of this e-mail is strictly prohibited and may be unlawful. Where permitted by applicable law, this e-mail and other e-mail communications sent to and from Cognizant e-mail addresses may be monitored.
------------------------------

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general

End of General Digest, Vol 124, Issue 60
****************************************