Trustin,
I will try to gather more info and will let you know what I find. Maybe I can get a trial copy of YourKit. I have included our previous correspondence in this response, since I was not previously posting to the dedicated Mina list, and so that others can see my problem.
Thanks!
Colin
-----Original Message-----
From: Trustin Lee <[EMAIL PROTECTED]>
Sent: Apr 19, 2006 12:20 PM
To: [email protected], Colin Cullen <[EMAIL PROTECTED]>
Subject: Re: ByteBuffer usage, performance, leak
Hi Colin,
On 4/20/06, Colin Cullen <[EMAIL PROTECTED]> wrote:
The problem has been isolated to the point where out.write is called to send out the deal ticket message, and to the point where the decoder is called for the inbound deal ticket ack message. I am running with 5 filters but have tried disabling 3 of them (the other 2 are essential) without any change in performance or in the rate of degradation. I did note, during the course of the development/debug cycle, that the call stack for Mina's filter mechanism can get really deep, but I assumed that this is normal behavior. Is that correct?
Yes, that's correct. Each filter adds about three or four calls.
I have debugged through some of Mina's code trying to understand the flow and possibly find where things are getting slow. It looks like the session's out.write is decoupled from the actual socket write by a queue (or queues), which naturally requires synchronization. Any pointers as to where I should look regarding my problem?
There's a class called 'SocketSessionManagerFilterChain' in 0.8.2. It calls SocketIoProcessor.flushSession(IoSession) to trigger the selector loop to wake up and flush the buffer. You could take a look there to find out what's going on under the hood.
I can understand if performance is just not where I need it to be, but what I can't understand is that the performance starts out good but degrades as time goes on, which indicates a bottleneck somewhere. I also wonder why latency differs so widely in my test run. Is the outbound data sent in a bursty fashion? Do you know if the SUN NIO ByteBuffer leakage problems have been addressed in Java 1.5? In Mina 0.8.2?
Actually, I have never experienced this problem so far.
I'm not an expert in JProbe, but would a snapshot be of use?
We use YourKit Profiler and don't have a license for JProbe, unfortunately. Is there any free viewer that shows JProbe snapshots? Otherwise, perhaps JProbe can export the data into an HTML page (YourKit does). A heap dump would be very helpful to us. If you need quick assistance, you could send me the code privately so I can test it myself.
Sorry that I cannot help you enough this time, but I believe we'll be able to fix it! :)
Trustin
--
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP key fingerprints:
* E167 E6AF E73A CBCE EE41 4A29 544D DE48 FE95 4E7E
* B693 628E 6047 4F8F CFA4 455E 1C62 A7DC 0255 ECA6
Hi Trustin,
(Note: the following discussion is related to Mina 0.8.2.)
Thanks for the quick response! I am using JProbe to profile the Mina-based server. JProbe indicates a consistent memory leak which seems to coincide with whenever a deal ticket message is sent. It also indicates a memory leak if I just let the heartbeat message run (using Mina's idle feature for this). The heartbeat message is under 100 bytes, the client app responds to it with a message of about 50 bytes, and it runs only when there is no other message exchange activity. If I let the heartbeat scenario run, JProbe will again report a consistent leak. GC occurs but never seems to recover completely, and memory consumption continues to rise throughout the run.
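In case it helps, the heartbeat is wired up roughly like this (a simplified sketch from memory, not my actual code; HeartbeatMessage is a stand-in name, and I am assuming the usual 0.8 idle API of setIdleTime/sessionIdle):
// Simplified sketch of the heartbeat wiring (from memory, not the actual code).
// Mark the session idle after 30 seconds without traffic:
session.getConfig().setIdleTime(IdleStatus.BOTH_IDLE, 30);

// In the protocol handler, fire the sub-100-byte heartbeat on idle;
// the client replies with a ~50-byte message:
public void sessionIdle(ProtocolSession session, IdleStatus status)
{
    session.write(new HeartbeatMessage()); // HeartbeatMessage is a stand-in name
}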
As I stated earlier, the server is also having performance problems. It starts out fine, able to send a deal ticket and receive a deal ticket ack with a latency of about 30 ms, but this time deteriorates as time goes on. For example, I ran a test for about 6 hours last night where latency started out at about 30 ms, but by the end of the run latency was about 3000 ms. It is interesting to note that although the average latency increases with time, when I inspect the individual round trips toward the later parts of the test run, I see that the latency times vary greatly, anywhere between 30 ms and 3000 ms, but with the majority of the latency times being toward the high end.
I am thinking that the leak and performance problems are related since they seem to occur under the same run conditions, although I am still digging for clues.
I have searched the Mina board and I noticed others discussing ByteBuffers with respect to leakage. I believe I read that these problems were fixed and have adjusted my code accordingly, as per my coding example below. I have also discovered that SUN had leakage problems regarding NIO and ByteBuffers in 1.4.x. I am using Java 1.5 -- I am still researching in this area as I just uncovered this fact last night.
I performed some timings on Mina's demuxing decoder model, thinking that either the model and/or my NIO buffer parsing code was slow, but on the contrary it runs very fast and usually completes the decoding into a message all within the same millisecond.
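The timing itself was nothing fancy, just wall-clock stamps around the decode, roughly like this (a simplified illustration, not my actual code; doDecode stands in for my real parsing logic):
// Simplified illustration of how I timed the decode path (not the real code).
public MessageDecoderResult decode(ProtocolSession session, ByteBuffer in, ProtocolDecoderOutput out) throws ProtocolViolationException
{
    long start = System.currentTimeMillis();
    MessageDecoderResult result = doDecode(session, in, out); // doDecode = my real parsing
    long elapsed = System.currentTimeMillis() - start;
    System.out.println("decode took " + elapsed + " ms"); // almost always prints 0
    return result;
}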
The problem has been isolated to the point where out.write is called to send out the deal ticket message, and to the point where the decoder is called for the inbound deal ticket ack message. I am running with 5 filters but have tried disabling 3 of them (the other 2 are essential) without any change in performance or in the rate of degradation. I did note, during the course of the development/debug cycle, that the call stack for Mina's filter mechanism can get really deep, but I assumed that this is normal behavior. Is that correct?
I have debugged through some of Mina's code trying to understand the flow and possibly find where things are getting slow. It looks like the session's out.write is decoupled from the actual socket write by a queue (or queues), which naturally requires synchronization. Any pointers as to where I should look regarding my problem?
I can understand if performance is just not where I need it to be, but what I can't understand is that the performance starts out good but degrades as time goes on, which indicates a bottleneck somewhere. I also wonder why latency differs so widely in my test run. Is the outbound data sent in a bursty fashion?
Do you know if the SUN NIO ByteBuffer leakage problems have been addressed in Java 1.5? In Mina 0.8.2?
Can you point me to the places where I should look to instrument Mina when the ticket message is written and where the ticket ack message is received?
I always suspect my own code first but things don't appear to be pointing in that direction, unless there is some error in my usage of ByteBuffer objects.
I am thinking of using an aspect to check the timings on the methods between the out.write and up to the decoding of the message on the inbound side.
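Something along these lines is what I have in mind (an untested AspectJ sketch; the pointcut is only illustrative and would need to be aimed at the actual MINA methods):
// Untested AspectJ sketch for timing the write-to-decode path.
public aspect LatencyTimer
{
    // Illustrative pointcut; to be narrowed to the real MINA calls.
    pointcut timed() : call(* org.apache.mina..*.write(..))
                    || call(* *..*MessageDecoder.decode(..));

    Object around() : timed()
    {
        long start = System.currentTimeMillis();
        try
        {
            return proceed();
        }
        finally
        {
            System.out.println(thisJoinPoint.getSignature() + " took "
                + (System.currentTimeMillis() - start) + " ms");
        }
    }
}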
I'm not an expert in JProbe, but would a snapshot be of use?
I appreciate any help you or the board members can give me in this area.
Thanks.
Colin
-----Original Message-----
From: Trustin Lee
Sent: Apr 18, 2006 1:03 AM
To: Apache Directory Developers List, [EMAIL PROTECTED]
Subject: Re: ByteBuffer usage, performance, leak
Hi Colin,
I am seeking some help regarding Mina. I have completed development of a server using Mina 0.8.2. This server will be providing financial deal ticket information to multiple clients. Functionally everything works great, but I do seem to have a memory leak and slowly degrading performance. I am testing using a single production-level client (not Mina based).
The server protocol sends a deal ticket down to the client and then awaits a high-level acknowledgement from the client indicating whether the ticket has passed or failed client-side validation. The performance degradation occurs between the writing of the message object (deal ticket) at the protocol level and the reception of the response message in its message handler. BTW I am using Mina's DemuxingProtocolCodecFactory and DemuxingProtocolHandler to model the server and provide the codec/handler glue.
I am concerned about whether I am using Mina's ByteBuffer properly. I have also read on the board that Mina will automatically place an allocated ByteBuffer in its ByteBuffer pool.
MINA 0.8 doesn't release pooled buffers once allocated, so the pool size will never decrease. This problem has been addressed in 0.9.3, which was released recently. Other than that, there's no known memory leak in 0.8. To prevent MINA from pooling the allocated buffers, just acquire them once more, for example:
ByteBuffer buf = ByteBuffer.allocate(....);
buf.acquire(); // acquire once more.
My output buffer requirements are pretty minimal. The deal ticket message is around 1K bytes. I allocate a single ByteBuffer for each session and keep it in the session object, which is stored in Mina's session context map. When a deal ticket needs to be written to the client, the pre-allocated ByteBuffer is retrieved from the session context object and used for building the outbound data set. Here are the actual code fragments:
This factory creates a session context object for each connection:
public class SessionContextFactory
{
    static private SessionContextFactory instance = new SessionContextFactory();

    static public SessionContextFactory getInstance()
    {
        return SessionContextFactory.instance;
    }

    private SessionContextFactory()
    {
    }

    public SessionContext createSessionContext(ProtocolSession session)
    {
        return new SessionContextImpl(session);
    }

    // Renamed from SessionContext so the class doesn't collide with the interface name.
    public final class SessionContextImpl implements SessionContext, LifeSpanControllable
    {
        ByteBuffer dataBuffer;

        private SessionContextImpl(ProtocolSession session)
        {
            // One pre-allocated 1K buffer per session, reused for every outbound ticket.
            dataBuffer = ByteBuffer.allocate(1024);
        }

        public ByteBuffer getDataBuffer()
        {
            dataBuffer.acquire(); // acquire so out.write()'s release doesn't pool it
            return dataBuffer;
        }
    }
}
This class performs the message encoding using the single ByteBuffer retrieved from the session context object:
public abstract class EasiAbstractMessageEncoder implements MessageEncoder
{
    private final String type;

    protected EasiAbstractMessageEncoder(String type)
    {
        this.type = type;
    }

    protected String getType()
    {
        return type;
    }

    public void encode(ProtocolSession session, Object message, ProtocolEncoderOutput out) throws ProtocolViolationException
    {
        SessionContext sessCtx = (SessionContext) session.getAttribute(ServerConstants.SESSION_CONTEXT);
        EasiAbstractMessage m = (EasiAbstractMessage) message;

        // Retrieve the pre-allocated, per-session buffer (acquired inside getDataBuffer()).
        ByteBuffer buf = sessCtx.getDataBuffer();
        buf.clear();

        // Encode the header
        encodeHeader(session, m, buf); // will perform several buf.put() calls

        // Encode the body
        encodeBody(session, m, buf); // will perform many buf.put() calls

        buf.flip();
        out.write(buf);
    }

    protected abstract void encodeMetadata(ProtocolSession session, EasiAbstractMessage message, ByteBuffer buf);

    protected abstract void encodeHeader(ProtocolSession session, EasiAbstractMessage message, ByteBuffer buf);

    protected abstract void encodeBody(ProtocolSession session, EasiAbstractMessage message, ByteBuffer buf);
}
As you can see, the buffer is acquired every time it is retrieved from the session context object. Apparently Mina is releasing the buffer in the out.write(buf) call. Is that true?
Yes, it's true. This also means your dataBuffer will never return to the MINA buffer pool because you acquired the buffer once more.
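To make the reference counting explicit, here is a simplified illustration of the buffer life cycle (assuming the usual acquire/release pairing):
// Simplified illustration of the buffer life cycle in your encoder.
ByteBuffer buf = ByteBuffer.allocate(1024); // reference count = 1
buf.acquire();                              // count = 2 (your getDataBuffer() call)
out.write(buf);                             // MINA releases once -> count = 1; NOT pooled
// Only a matching buf.release() would drop the count to 0 and return the
// buffer to the pool (and in 0.8, a pooled buffer is never freed again).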
At a high level, what this server needs to do is send 40 to 50 1K deal tickets per second down to a single client (with the Nagle algorithm disabled). BTW the client is very fast and almost always responds to the deal ticket message within the same millisecond in which it was received.
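(To be clear, "Nagle disabled" just means TCP_NODELAY is set on the underlying socket; in plain java.net terms it boils down to the following, though I have omitted the MINA-side config call since I am writing from memory:)
// What disabling Nagle amounts to on the underlying connection.
static void disableNagle(java.net.Socket socket) throws java.net.SocketException
{
    socket.setTcpNoDelay(true); // TCP_NODELAY: push small writes out immediately
}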
Is it reasonable to expect this kind of performance from Mina?
Actually, I haven't pushed MINA that hard myself. Someone on the list might be able to give you some information.
Also worth noting is that the decoding, handling, retrieval of the deal ticket from the database, encoding, and all of the related parsing etc. is very fast and usually only takes around 10 to 20 milliseconds to complete.
Your code looks OK to me. It would be nice if you can give us heap profiler data so we can find out which objects are dominant.
Any help would be greatly appreciated! (Sorry for being so long-winded!)
No problem. Thank you for detailed information! :)
Thanks for all who work on Mina, a very elegant design which renders NIO much more approachable.
Your feedback will make MINA more robust and beautiful. Please don't hesitate! ;)
HTH,
Trustin
--
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP key fingerprints:
* E167 E6AF E73A CBCE EE41 4A29 544D DE48 FE95 4E7E
* B693 628E 6047 4F8F CFA4 455E 1C62 A7DC 0255 ECA6
