Hi All,

I am new to MINA and am trying to implement the communication layer of a
distributed system in which all my machines (I have 9 of them) send to each
other similar amounts of data. By that I mean they send each other the same
type of messages in similar volumes. In my current code, for example, each
machine sends roughly 8MB of data to every other machine. That is, each
machine sends a total of 64MB of data to the other machines and at the
same time receives 64MB of data from them.

I have been trying to do this efficiently with java io over the last few
weeks without success, and now I want to switch to MINA. Before this decision,
here are the numbers I was getting. All of my machines are connected to each
other directly by 1Gbit ethernet cable. Theoretically, if there are only 2
machines, reading/writing 64MB of data should take around 500 millis (I
wrote a simple program that just sends and receives this much data between 2
machines, and in java I get 600 millis, which is very close to the theoretical
number). However, with the overhead of parsing the data on the receiving
side, the number goes up to roughly 1.5 seconds: there is some lock
acquiring, some object allocation, etc. When I go from 2 machines
to 9 machines, the number goes from 1.5 seconds to 8 seconds. I have been
trying to nail down exactly where the bottleneck is, and although I can't say
I did a very good job, so far I have noticed that a) receiving
messages is slow in general and b) when 8 machines are concurrently writing
to a single machine, their write performance goes down significantly (with
blocking io, sending 8MB of data should take <100 millis, but it takes close
to 1 second; this time is measured on the sender side).
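
For reference, the theoretical numbers above can be reproduced with a few
lines of arithmetic. This is only a sketch: the class and method names are
made up for illustration, it assumes decimal megabytes and an ideal 1Gbit
link with no protocol overhead, and the "shared link" case assumes that 8
senders split one receiver's link evenly, which may not match the real
topology.

```java
final class LinkEstimate {
    /** Ideal transfer time in milliseconds, ignoring all protocol overhead. */
    static double idealMillis(long bytes, long linkBitsPerSecond) {
        return bytes * 8.0 * 1000.0 / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long mb = 1_000_000L;        // assumption: decimal megabytes
        long gbit = 1_000_000_000L;  // assumption: ideal 1Gbit link

        // 64MB between two machines over the full link.
        System.out.println("64MB, full link:  " + idealMillis(64 * mb, gbit) + " ms");

        // 8MB from one sender that has the link to itself.
        System.out.println("8MB, full link:   " + idealMillis(8 * mb, gbit) + " ms");

        // 8MB from one sender when 8 senders share the receiver's link evenly
        // (assumption about the topology, not a measurement).
        System.out.println("8MB, 1/8 of link: " + idealMillis(8 * mb, gbit / 8) + " ms");
    }
}
```

Under these assumptions the first two cases come out at 512 and 64 millis,
which matches the "around 500 millis" and "<100 millis" figures. The third
case also comes out at 512 millis, so if the receiver's link really is shared
this way, part of the slowdown I see on the sender side may simply be
bandwidth sharing.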

Back to MINA: I wrote some simple code to replicate this scenario, and here's
what I did:

   - In the beginning of my code, all my machines establish connections with
   each other (if machine i wants to connect to machine j, and i < j, then i is
   the client and j is the server but once the session is established both
   machines read/write on the same IoSession object). I store the IoSession
   object for each machine in a map. I record the time at which all machines
   have established connections.
   - I then wrote a simple computation thread on each machine that
   generates dummy messages for every other machine and puts them into
   outgoing IoBuffers. When an outgoing IoBuffer is full, the computation
   thread wraps it in a simple object called SimpleMessage and writes it to
   the IoSession object for that machine. The IoBuffers are 1MB each, and the
   thread loops 8 times, so a total of 8MB is sent to each machine.
   - I wrote a MessageEncoder object which simply writes the wrapped
   IoBuffer to the ProtocolEncoderOutput.
   - For receiving data there is a MessageDecoder that extends
   CumulativeProtocolDecoder. This decoder keeps skipping over the available
   data in the IoBuffer keeping track of how much of the 8MB it has skipped.
   When it has skipped the entire 8MB of data, it reports that it is done, and I
   record System.currentTimeMillis() at that point.
   - I have a dummy MessageHandler that doesn't do anything. It just logs
   the events, and I see that each time my computation thread writes to an
   IoSession object, I get a MessageSent event as expected.
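
To make the receiving side concrete, the skipping logic of the decoder can
be sketched roughly as follows. This is not my actual MessageDecoder and it
does not use MINA's API: it is a plain java.nio stand-in, the class name
SkippingCounter is hypothetical, and the 64KB chunk size in main is only an
assumption used to drive the example.

```java
import java.nio.ByteBuffer;

final class SkippingCounter {
    private final long expectedBytes;
    private long skipped;
    private long finishedAtMillis = -1;

    SkippingCounter(long expectedBytes) {
        this.expectedBytes = expectedBytes;
    }

    /** Skips everything available in the buffer; returns true once all expected bytes were seen. */
    boolean skip(ByteBuffer in) {
        int available = in.remaining();
        in.position(in.limit());  // consume the data without parsing it
        skipped += available;
        if (skipped >= expectedBytes && finishedAtMillis < 0) {
            finishedAtMillis = System.currentTimeMillis();  // completion timestamp
        }
        return skipped >= expectedBytes;
    }

    long skipped() {
        return skipped;
    }

    long finishedAtMillis() {
        return finishedAtMillis;
    }

    public static void main(String[] args) {
        SkippingCounter counter = new SkippingCounter(8L * 1024 * 1024);
        int calls = 0;
        boolean done = false;
        while (!done) {
            // Assumption: data arrives in 64KB chunks.
            done = counter.skip(ByteBuffer.allocate(64 * 1024));
            calls++;
        }
        System.out.println("calls = " + calls + ", skipped = " + counter.skipped());
    }
}
```

With 64KB chunks, an 8MB transfer (taken here as 8 * 1024 * 1024 bytes)
needs 128 calls before the counter reports that it is done.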

I haven't yet optimized MINA, as I am not that familiar with its parameters
yet. But the numbers I have been getting are similar to the numbers
with regular io (again for 64MB of data: about 1.5 seconds between 2
machines and 7-8 seconds between 9 machines, but without any parsing). I am
using MINA 2.0.3. My questions are:

   1. Is there a major flaw or inefficiency in this kind of design given my
   communication pattern?
   2. I have noticed that my decoder is almost always called once per 64KB
   of data. Is there a hidden 64KB buffer somewhere that I can make bigger
   to see whether it affects performance?

I also realize that 64MB is very small, but this is just for testing. Still, I
think it should take much less time to send this much data across machines,
and so far I have failed to find the right implementation.

Sorry for the length of the email. Any help or suggestions are much
appreciated. I can paste code snippets if anybody is interested. And finally,
thank you very much in advance :).

semih
