Thomas makes some good points. I'll try to add a couple more...

Scalability depends on the performance of both the networking components
and the serialization components. So even though the tests Wycklk mentioned
were using a single box, that doesn't make them invalid - merely incomplete.
As Thomas pointed out, that may be deliberate. In the end, the performance
of a solution on a single box is a huge part of the scalability overall.
Sure, you can always add more boxes, but wouldn't you prefer to add 1 box
for every 100 concurrent requests, instead of 1 box for every request?

To Wycklk - some observations:

Of course raw sockets are going to be the fastest method of pushing an
arbitrary 1024 bytes of data. However, without knowing more about your
application, I'd have to say that this may not be a meaningful number. Where
does the data come from? Remoting is doing serialization of the data for
you, even in the case where you are just sending a string. If your data is
coming out of objects, you should include the serialization and
deserialization steps in your timings to make the comparisons apples to
apples.
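
To make that concrete, here is a rough sketch of timing the serialization
and deserialization steps on their own. It is written in Python purely for
illustration, with JSON standing in for whatever formatter your application
really uses. The Order class, its fields, and the iteration count are all
made up for the example.

```python
import json
import time


class Order:
    """Hypothetical application object; stands in for wherever your data lives."""

    def __init__(self, order_id, items):
        self.order_id = order_id
        self.items = items

    def to_bytes(self):
        # Serialization step: object -> bytes ready to go on the wire.
        return json.dumps({"order_id": self.order_id, "items": self.items}).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload):
        # Deserialization step: bytes off the wire -> object.
        data = json.loads(payload.decode("utf-8"))
        return cls(data["order_id"], data["items"])


def time_round_trip(obj, iterations=1000):
    """Return (serialize_seconds, deserialize_seconds, payload_size_in_bytes)."""
    start = time.perf_counter()
    for _ in range(iterations):
        payload = obj.to_bytes()
    serialize_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(iterations):
        restored = Order.from_bytes(payload)
    deserialize_seconds = time.perf_counter() - start

    # Sanity check: the object survived the round trip.
    assert restored.order_id == obj.order_id and restored.items == obj.items
    return serialize_seconds, deserialize_seconds, len(payload)


if __name__ == "__main__":
    ser, de, size = time_round_trip(Order(42, ["widget"] * 50))
    print(f"serialize: {ser:.4f}s  deserialize: {de:.4f}s  payload: {size} bytes")
```

Whatever these two numbers turn out to be, they belong in the raw-socket
column of the comparison, since Remoting is already paying that cost.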

Using raw sockets, you are going to be creating your own protocol. This
may be fine for your application, but if you ever need to interoperate at
this level with another system, a custom protocol means a more difficult
integration. There is also the maintenance consideration - you have more
code to take care of.
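
As a small illustration of what "your own protocol" means in practice, even
the simplest scheme needs message framing, because a stream socket does not
preserve message boundaries. The sketch below (Python, for illustration
only) assumes a 4-byte big-endian length prefix. That format and the
function names are my own invention, not anything from your tests.

```python
import struct

# Assumed wire format: 4-byte big-endian unsigned length, then the payload.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split received bytes into complete messages.

    Returns (messages, leftover), where leftover is a partial frame that
    must be kept and prepended to the next chunk read from the socket.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # only part of this message has arrived so far
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]


if __name__ == "__main__":
    # Simulate one and a half messages arriving in a single read.
    chunk = encode_frame(b"hello") + encode_frame(b"world")[:6]
    print(decode_frames(chunk))
```

And this is before versioning, error reporting, or reconnect logic, all of
which the other system would also have to implement to talk to you.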

Also consider the amount of code you have to write. Remoting already handles
the cases of lost sockets, etc. You will have to write that exception
handling yourself. Certainly, if performance is the most important
consideration for you, then you may be willing to invest more development
time.

You may be making a premature optimization. In my experience, the best way
to approach issues like this is to build the system using the simplest
design that solves the problem. Then carefully profile the system to
determine where your bottlenecks are and optimize where the profiling tells
you to. In this case, I would put a simple abstraction around the
communications piece and use Remoting with an interface just because it is
easy. Then profile the system and see how much of your time is spent on the
remoting calls. You may find that it is not a significant contributor and
that your optimization time is best spent elsewhere.
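
The kind of abstraction I have in mind could look roughly like the sketch
below. It is in Python only to keep it short, and every name in it
(Transport, LoopbackTransport, TimedTransport) is hypothetical. The loopback
class merely stands in for the easy Remoting-based implementation. It does
not talk to anything.

```python
import time
from abc import ABC, abstractmethod


class Transport(ABC):
    """The interface the rest of the system codes against."""

    @abstractmethod
    def send(self, request: bytes) -> bytes:
        """Send a request and return the response."""


class LoopbackTransport(Transport):
    """Placeholder for the simple first implementation; just reverses the bytes."""

    def send(self, request: bytes) -> bytes:
        return request[::-1]


class TimedTransport(Transport):
    """Wraps any Transport and records how much time is spent inside it."""

    def __init__(self, inner: Transport):
        self.inner = inner
        self.calls = 0
        self.total_seconds = 0.0

    def send(self, request: bytes) -> bytes:
        start = time.perf_counter()
        try:
            return self.inner.send(request)
        finally:
            # Count the call even if the inner transport raised.
            self.calls += 1
            self.total_seconds += time.perf_counter() - start


if __name__ == "__main__":
    transport = TimedTransport(LoopbackTransport())
    for _ in range(1000):
        transport.send(b"ping")
    print(f"{transport.calls} calls, {transport.total_seconds:.6f}s in transport")
```

If the measured time turns out to matter, a raw-socket implementation of the
same interface can be swapped in without touching the callers.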

Finally, I have to ask: If your final solution is really going to be on the
same box, is it necessary to leave the process? If you can keep the
processing within the same process - even if it is on another thread - you
are going to see much better performance than invoking a method within
another process.

- Mark Vatsaas
Galileo International
