Hi Ben,
Ben Suffolk wrote:
I have been running Kannel for a month or so and it's been great.
I looked at the outstanding DLRs earlier today and saw a few, and
identified some as phones that I know people are not using any more,
hence no delivery. That's fine, but then I noticed my number was in the
outstanding DLRs, and after a bit of investigation I knew it was a
message that I had received.
Looking at the debug from the smsc logs (I'm using SMPP, with PostgreSQL
as the DLR storage BTW) I see that what has been happening is that the
DLR is actually coming in a fraction faster than the submit_sm_resp
with the message ID in it. (Or at least the receiver thread is before
the transmitter thread).
This means the DLR is being ignored as it's not in the table, then it
gets created and put in the table immediately after. So it's then
outstanding, and of course the DLR callback is never run.
OK, an interesting thing indeed... We need to discuss here whether this is a
logical PDU flow "problem" of the SMPP SMSC, or whether we (Kannel) misbehave in
terms of how threads are processing... But (!) the receiver thread inside the
smsc_smpp.c module handles all PDUs from the SMSC. So, if the DLR (deliver_sm or
data_sm) arrives before the submit_sm_resp, then I assume this is a logical
misbehaviour of the SMSC.
I wonder if a) anybody else has come across this, or b) you can think
of any good ways to make sure the DLRs are not lost. e.g. maybe we
store them, and then when we create them we can see the status has
already been updated and trigger the callback?
Hmmm... good point. I also face a similar issue when connecting 2 independent
SMPP client systems with the same SMPP upstream account. Kannel receives DLRs
for which it has no temp data in DLR storage and hence "discards" the DLRs
without any meaningful processing.
We could put any received DLRs that we can't match in the "DLR MT" storage table
into a "DLR MO" storage table, and hence run 2 tables. When we insert into the
"DLR MT" table at the point we receive the submit_sm_resp, we first check whether
there is an existing entry in the "DLR MO" table. If there is, then we have
already received a DLR for this MT message.
This solves 2 issues:
a) The DLR MO table holds any DLRs that can't be resolved... that means external
applications can "fetch" the DLRs from the DLR MO table for further processing.
b) The race condition between the submit_sm_resp with the message ID and the DLR
itself can be resolved, so we get the usual HTTP callback even when the SMSC
sends the DLR first.
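To make the idea concrete, here is a rough sketch of the two-table matching in
Python. It is not Kannel code: the class and method names (DlrStore,
on_submit_sm_resp, on_dlr), the in-memory dicts standing in for the "DLR MT" and
"DLR MO" tables, and the trigger callback are all hypothetical. It also
simplifies by treating every DLR as final and removing the entry on the first
match.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DlrStore:
    """In-memory stand-in for the two storage tables (hypothetical names)."""
    # Called as trigger(callback_url, msg_id, status); stands in for the HTTP callback.
    trigger: Callable[[str, str, str], None]
    dlr_mt: Dict[str, str] = field(default_factory=dict)  # msg_id -> callback URL
    dlr_mo: Dict[str, str] = field(default_factory=dict)  # msg_id -> early DLR status
    lock: threading.Lock = field(default_factory=threading.Lock)

    def on_submit_sm_resp(self, msg_id: str, callback_url: str) -> None:
        """Register an MT message once the SMSC has told us its message ID."""
        with self.lock:
            # Check the "DLR MO" table first: the DLR may have arrived already.
            early_status = self.dlr_mo.pop(msg_id, None)
            if early_status is None:
                self.dlr_mt[msg_id] = callback_url
                return
        # The DLR beat the submit_sm_resp: fire the callback now.
        self.trigger(callback_url, msg_id, early_status)

    def on_dlr(self, msg_id: str, status: str) -> None:
        """Handle an incoming DLR (deliver_sm or data_sm)."""
        with self.lock:
            callback_url = self.dlr_mt.pop(msg_id, None)
            if callback_url is None:
                # No matching MT entry yet: park the DLR instead of discarding it.
                self.dlr_mo[msg_id] = status
                return
        self.trigger(callback_url, msg_id, status)
```

The lock matters because the check of one table and the insert into the other
have to happen as one step. Otherwise the same race just moves into the storage
layer. With a real database this would need a transaction or equivalent.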
Opinions by the others for this approach?
I suspect it's because I am connected directly to an operator as opposed
to an aggregator that I am having this occasional (about 30 messages in
600 over 7 days approx) issue.
I should also say that I set-up and did the operator integration
testing with 1.4.0 as 1.4.1 was not out at the time (came out a couple
of weeks after), so my live service is currently running 1.4.0. I will
upgrade, but first need to be sure of the effects of the upgrade, as
obviously having been through the integration testing I need to be
careful about using a different version that does something unexpected
to the connection (in which case I would be in danger of losing the
operator connection).
1.4.1 has limited COMPATIBILITY BREAKERS. Please check the NEWS file section for
the 1.4.1 release, which will indicate any serious changes.
http://www.kannel.org/download/1.4.1/NEWS-1.4.1
In any case, 1.4.1 is way BETTER and more RELIABLE than 1.4.0.
So if you think this issue is only with 1.4.0 then no problem, but I
could not see anything in any of the release notes that suggest this
has been identified before.
I don't think this DLR handling issue is specific to 1.4.0.
It will definitely also be an issue for 1.4.1 and CVS HEAD.
Stipe
-------------------------------------------------------------------
Kölner Landstrasse 419
40589 Düsseldorf, NRW, Germany
tolj.org system architecture Kannel Software Foundation (KSF)
http://www.tolj.org/ http://www.kannel.org/
mailto:st_{at}_tolj.org mailto:stolj_{at}_kannel.org
-------------------------------------------------------------------