Hi,
Okay, here's a non-trivial patch.
I really hope one of the committers can have a look at it, because
I spent some time writing it, debugging it, and thoroughly testing
it, including under high network load. I documented it, took
special care to follow doc/CodingStyle, and will now spend some
time explaining what it does and why the important modifications
were necessary.
1. Motivation
We use PostgreSQL databases, and current Kannel performance when
DLRs are enabled, using sdb/postgres, drops terribly. In order to
decide on the best approach (make sdb use persistent connections
in the postgres driver? write a native postgres driver in Kannel?
write dbpool support for sdb in Kannel? a combination of these?),
I decided I needed the fake SMSC to support more than just
always-successful DLRs (e.g. failures and buffered messages), so
that we can simulate a more realistic situation.
2. Fake SMSC protocol background
The fake SMSC and Kannel communicate via a (local) TCP
connection. Messages consist of one line ("\n"-terminated)
containing space-separated fields that carry the parameters of
the message. They are similar in both directions (MO
fakesmsc->kannel and MT kannel->fakesmsc). No answer is needed or
sent; in particular, for MT messages Kannel will always report
DLR_SUCCESS when a DLR was requested.
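To make the wire format concrete, here is a minimal Python sketch of
parsing such a line; the field layout in the example ("sender
receiver type body") is made up for illustration and is not the
exact fakesmsc format:

```python
def parse_fake_line(line):
    """Split one "\n"-terminated fake-SMSC message into its
    space-separated fields (field meaning is illustrative only)."""
    return line.rstrip("\n").split(" ")

# Hypothetical line: sender, receiver, type, body
fields = parse_fake_line("123 456 text Hello\n")
```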
3. Addition of more DLR statuses and impact on the protocol
Given that, adding failed messages, and even SMSC failures, would
be easy.
However, adding buffered messages is more complicated: when we
eventually report the success (or failure) of the message, we
need something to uniquely identify that same message. Hence the
need to extend the protocol by sending a reply back to Kannel for
each MT message, containing this ID (routinely called a timestamp
in Kannel).
This also requires that the timestamp used for the DLR inserted
when calling `dlr_add' really is an SMSC-provided timestamp,
which explains the need to completely change the location of
`dlr_add' in `gw/smsc/smsc_fake.c'.
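The resulting ordering can be sketched as follows; `handle_mt',
`wait_reply' and the `dlr_store' dict are hypothetical stand-ins
for the real code (`dlr_add' and the smsc_fake internals), shown
only to illustrate why the DLR can only be registered once the
reply has delivered the SMSC timestamp:

```python
def handle_mt(send, wait_reply, dlr_store, msg):
    """Send an MT message, then register its DLR only after the
    fake SMSC's reply has supplied the timestamp/ID that keys it."""
    send(msg)
    ts = wait_reply()      # SMSC-provided timestamp from the reply
    dlr_store[ts] = msg    # stand-in for Kannel's dlr_add
    return ts

# usage sketch with dummy send/reply functions
sent = []
store = {}
ts = handle_mt(sent.append, lambda: "ts-0001", store, "mt hello")
```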
Since the fake SMSC might have decided to send an MO message at
the same time Kannel sent an MT message requesting a DLR, we need
a synchronization mechanism that recognizes whether a message
from the fake SMSC is a reply or an MO message; and in case we
are waiting for a reply and actually receive an MO message, we
need to store that MO message in a local buffer and loop again
until the reply finally arrives. On the next loop iteration,
before examining whether MO messages were received, this buffer
is flushed. This is another important point of the proposed
implementation, explaining the need for the `List *in_buffer'
added in `main_connection_loop', and why the algorithm inside
this function has become a bit more complicated.
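The waiting loop can be sketched like this; treating a line that
starts with "ack" as the reply is an assumption made purely for
illustration (the real protocol distinguishes replies
differently):

```python
from collections import deque

def wait_for_reply(recv, in_buffer):
    """Block until the reply arrives; MO messages that cross with
    our MT in the meantime are stashed in in_buffer and handled on
    the next loop iteration ("ack" prefix is an assumed marker)."""
    while True:
        line = recv()
        if line.startswith("ack"):
            return line            # the reply carrying the timestamp/ID
        in_buffer.append(line)     # an MO message arrived instead; keep it

# usage sketch: a deque stands in for lines read from the TCP socket
incoming = deque(["mo 111 222 hi", "ack ts-0001"])
buf = []
reply = wait_for_reply(incoming.popleft, buf)
```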
4. Changes in fake SMSC
Changes in the fake SMSC are, of course, somewhat parallel to the
changes in Kannel described above. I added the possibility of
specifying the "dlr" keyword before "type" in the message, so
that the fake SMSC knows whether it should reply with a DLR
(otherwise it still replies, so that the protocol and
synchronization stay the same, but always with the success
value). Three new command-line parameters control the ratio of
messages that should be replied to with delivery failure, SMSC
failure, or buffered status. Each buffered message is kept in a
list, and a random value from 0 to 10 decides in how many seconds
the final success delivery status will be delivered. Both of
these (0..10 seconds; final delivery status == success) are
hardcoded, because I didn't think it would be useful to change
them, but they can of course be made configurable on the command
line in the future if needed.
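The buffered-message bookkeeping can be sketched as below; the
heap-ordered list and the function names are my illustration, not
the fakesmsc code itself, but the hardcoded choices (0..10 second
delay, final status always success) match what the patch does:

```python
import heapq
import random

def schedule_buffered(now, pending, msg_id):
    """Queue the final DLR for a buffered message: due after a
    random 0..10 s, final status hardcoded to success."""
    due = now + random.randint(0, 10)
    heapq.heappush(pending, (due, msg_id, "success"))
    return due

def pop_due(now, pending):
    """Return the final DLRs whose delivery time has come."""
    out = []
    while pending and pending[0][0] <= now:
        out.append(heapq.heappop(pending))
    return out

pending = []
due = schedule_buffered(100, pending, "ts-0001")  # due is in [100, 110]
```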
5. Performance
Ok, that should be it for a basic overview of the changes. Now,
here are the results I could measure on my machine, a fairly old
Dell with a P3-500 and IDE disks. I run bearerbox, smsbox,
fakesmsc, the DLR receiver, the get-url receiver, and the
database all on the same machine. The test basically consists of
sending as many MO requests as possible with fakesmsc on the
command line, and measuring the maximum number of answers (MT
messages) I can get. Please note that the sms-service uses a
get-url.
Without any DLR, with this new protocol (which adds one network
transmission per MT message, the answer), the maximum is 35
messages per second.
Requesting a DLR, with 0.1 (that is, 10%) failed messages and
0.3 (30%) buffered messages (each one adding more network
transfers and database operations), using sdb/postgres,
throughput drops down to 7.5 messages per second. I can hear the
hard disk going a bit crazy :).
I added support for persistent connections in the postgres SDB
driver (this has already been accepted by the SDB author in his
0.5.3 release; the Kannel documentation is incorrect about SDB
and persistent connections, and I will submit a fix), and my
figures went up to 17 messages per second, which I think is
already really nice, if you consider the mandatory overhead of
DLRs in term