Hi All,

I'm finding what I think is incorrect use of gwlist_extract_first in the postgres DLR implementation (it may also exist in the others - I've not checked yet). The DLR methods log an ERROR when a query fails to return results, but the subsequent calls to gwlist_extract_first on the resulting NULL list cause a PANIC.
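To illustrate the pattern outside of Kannel, here is a minimal standalone sketch. It does not use gwlib: the List type, list_extract_first, fake_select and dlr_count are all hypothetical stand-ins, and the only behaviour borrowed from the real code is that extracting from a NULL list trips an assertion while a failed select yields a NULL list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a gwlib List; NOT the real Kannel type. */
typedef struct List {
    void *items[4];
    long len;
} List;

/* Like the call in the panic above, a NULL list is treated as a
 * programming error here, so callers must check before extracting. */
static void *list_extract_first(List *list)
{
    void *item;
    long i;

    assert(list != NULL);
    if (list->len == 0)
        return NULL;
    item = list->items[0];
    for (i = 1; i < list->len; i++)
        list->items[i - 1] = list->items[i];
    list->len--;
    return item;
}

/* Stand-in for the select: returns NULL when the "database" is down,
 * otherwise a one-element list holding the count. */
static List *fake_select(int db_up, long value)
{
    List *list;
    long *cell;

    if (!db_up)
        return NULL;

    list = malloc(sizeof(*list));
    cell = malloc(sizeof(*cell));
    if (list == NULL || cell == NULL) {
        free(list);
        free(cell);
        return NULL;
    }
    *cell = value;
    list->items[0] = cell;
    list->len = 1;
    return list;
}

/* Guarded caller: only touch the list if the select produced one,
 * and report -1 otherwise instead of aborting the process. */
long dlr_count(int db_up, long value)
{
    List *res = fake_select(db_up, value);
    long ret = -1;

    if (res != NULL && res->len > 0) {
        long *cell = list_extract_first(res);
        ret = *cell;
        free(cell);
    }
    free(res); /* free(NULL) is a no-op, so no extra check is needed */
    return ret;
}
```

With the guard in place, dlr_count(0, 7) returns -1 rather than hitting the assertion, and dlr_count(1, 7) returns 7. Dropping the res != NULL check reproduces the abort in the "database down" case.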
What I'm testing is the situation where the DLR DB is available on start-up (we panic if it is not) but is then shut down or becomes temporarily unavailable (network issue, etc.) during normal operation. The failed select is logged as an error, but then results in a panic:

2011-08-10 16:37:43 [18552] [3] ERROR: PGSQL: SELECT count(*) FROM "dlr";
2011-08-10 16:37:43 [18552] [3] ERROR: PGSQL: FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2011-08-10 16:37:43 [18552] [3] ERROR: PGSQL: Select failed!
2011-08-10 16:37:43 [18552] [3] ERROR: PGSQL: Could not get count of DLR table
2011-08-10 16:37:43 [18552] [3] PANIC: gwlib/list.c:309: gwlist_extract_first: Assertion `list != NULL' failed.
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox(gw_panic+0x14b) [0x48b55b]
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox(gwlist_extract_first+0x94) [0x489874]
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox [0x41e3d3]
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox(bb_print_status+0x11d) [0x40edfd]
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox [0x415075]
2011-08-10 16:37:43 [18552] [3] PANIC: /usr/sbin/bearerbox [0x4823cf]
2011-08-10 16:37:43 [18552] [3] PANIC: /lib/libpthread.so.0 [0x2b0e670a9fc7]
2011-08-10 16:37:43 [18552] [3] PANIC: /lib/libc.so.6(clone+0x6d) [0x2b0e67a8664d]

The attached patch addresses this (for the postgres implementation only - I can check the others if required).

Once applied, the result on the status page is:

DLR: -1 queued, using pgsql storage

And when a DLR is received:

2011-08-10 16:44:53 [18889] [11] ERROR: PGSQL: FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2011-08-10 16:44:53 [18889] [11] ERROR: PGSQL: Select failed!
2011-08-10 16:44:53 [18889] [11] DEBUG: no rows found
2011-08-10 16:44:53 [18889] [11] WARNING: DLR[pgsql]: DLR from SMSC<FOO> for DST<02xxxxxxxxx> not found.
2011-08-10 16:44:53 [18889] [11] ERROR: SMPP[FOO]: got DLR but could not find message or was not interested in it id<534001841355> dst<02xxxxxxxxx>, type<1>

Cheers,
Alan
Index: gw/dlr_pgsql.c
===================================================================
--- gw/dlr_pgsql.c (revision 4916)
+++ gw/dlr_pgsql.c (working copy)
@@ -185,7 +185,7 @@
     if (result == NULL || gwlist_len(result) < 1) {
         debug("dlr.pgsql", 0, "no rows found");
-        while((row = gwlist_extract_first(result)))
+        while(result && (row = gwlist_extract_first(result)) != NULL)
             gwlist_destroy(row, octstr_destroy_item);
         gwlist_destroy(result, NULL);
         return NULL;
@@ -282,7 +282,8 @@
         ret = atol(octstr_get_cstr(gwlist_get(gwlist_get(res, 0), 0)));
     }
-    gwlist_destroy(gwlist_extract_first(res), octstr_destroy_item);
+    if ( res != NULL )
+        gwlist_destroy(gwlist_extract_first(res), octstr_destroy_item);
     gwlist_destroy(res, NULL);
     return ret;