RE: [BUG] list implementation too slow.
> Lately we've been doing some very high capacity testing on Kannel, and
> found out some interesting stuff. Mainly, as queues (managed by lists)
> fill up to more than a few hundred messages, the boxes start thrashing.
> I think this is directly related to the List implementation - it's just
> too slow. When we have more than a few hundred messages in the List,
> extracting one item can sometimes take anywhere from 2 to 4 seconds (!!!).
>
> Does anyone have any information or experience regarding that?

Do you have any insight into which particular list_() functions are causing this? Some of the comments in list.c suggest possible inefficiencies, notably where the list has to grow to accommodate insert/append operations. Also, all of the list search functions are linear.

How are you measuring time spent in functions? Quantify?
Re: [BUG] list implementation too slow.
Hi Oded & List

Oded Arbel wrote:
>
> Hi list.
>
> Lately we've been doing some very high capacity testing on Kannel, and
> found out some interesting stuff. Mainly, as queues (managed by lists)
> fill up to more than a few hundred messages, the boxes start thrashing.
> I think this is directly related to the List implementation - it's just
> too slow. When we have more than a few hundred messages in the List,
> extracting one item can sometimes take anywhere from 2 to 4 seconds (!!!).
>
> Does anyone have any information or experience regarding that?

I sent 50 msg through Kannel and had a similar experience. Kannel does have a peak performance of 350 msg/s (with test_cimd2 sending true service request SMs and a hello world server answering them; requests were http ones over our intranet). But long-term performance lags, with bearerbox grabbing most of the CPU time. And yes, this happens when queues grow long.

(As an aside, Kannel *without* https does 1300 msg/s. And "long term" means about an hour.)

Aarno
RE: [BUG] list implementation too slow.
Hi Aarno.

I think this problem occurs because the bearerbox can't send messages as fast as they are delivered (probably due to a large burst followed by sustained high load). I see messages queued on the module's queue, which slows the module down further, so it can handle fewer messages (while the high load continues), and so the queue grows longer.

A possible solution would be to delay message delivery to the module if its queue is too long, using the module's add_msg_cb(), but I think that would only cause the bearerbox's MT queue to grow too long and thrash bearerbox's routing thread. I've seen the same behaviour in smsbox: when MOs were delivered too fast, smsbox started thrashing. We solved that problem by running a few more smsboxes on the same machine. That way the smsbox doesn't thrash, but since MOs are now delivered as fast as they come, so are MTs, and those cause bearerbox to thrash.

BTW, we do not use HTTPS, only HTTP calls (to the application). The module in use is custom made.

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

"There is always a holiday somewhere."
-- Avi Savag
RE: [BUG] list implementation too slow.
I use list_extract_first to grab messages from the queue and list_append to add messages to it. Both are very slow if the list gets too long (a couple of hundred messages can cause noticeable slowing). Except for calling list_len on every list_append, I don't use any other list_ functions.

I have debug output on the loops that do the append and extract. Normally the tight extract loop should extract MAX_MESSAGES in under a second, but as the list grows, the loop takes longer to complete.

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

Did you know that "if" is the middle of the word "life"?
-- from "Apocalypse Now"
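To answer the "how are you measuring" question with numbers rather than impressions, one option is to time a fixed batch of operations at several queue lengths and compare the per-call cost. Below is a minimal sketch in C; `time_per_call` and `count_call` are hypothetical helper names, not part of Kannel, and `clock()` reports processor time, so a wall-clock timer would be needed to capture time spent waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: run op(ctx) n times and return the average
 * processor time per call, in seconds. */
double time_per_call(void (*op)(void *), void *ctx, size_t n)
{
    assert(op != NULL && n > 0);

    clock_t begin = clock();
    for (size_t i = 0; i < n; i++)
        op(ctx);
    clock_t end = clock();

    return (double)(end - begin) / CLOCKS_PER_SEC / (double)n;
}

/* Stand-in for one queue operation. A real measurement would call
 * list_extract_first() or list_append() on the queue under test here. */
void count_call(void *ctx)
{
    (*(size_t *)ctx)++;
}
```

If the per-call figure stays flat as the queue grows from a few dozen to a few thousand entries, the list itself is probably not the bottleneck.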
RE: [BUG] list implementation too slow.
On Thu, 21 Mar 2002, Oded Arbel wrote:

> I think this problem occurs as the bearerbox can't send messages as fast
> as they are delivered (probably due to a large burst and then sustained
> high load). I see messages queued on the module's queue, and then it
> slows down the module more, so it can handle fewer messages (while high
> load continues) and so the queue grows longer.

The solution to this is to fix all code so that outgoing messages have higher priority than incoming ones, i.e. do not read anything from buffers before outgoing messages have been sent. NOTE: You need to set the conn() buffer size, otherwise this does not help, as things get added to its internal buffer. I did this for my modified version and got rid of the growing size (in memory) and the slowdown.

(However, I do not know how the SMSC would react if the socket fills up because the other end is not reading it fast enough.)

--
&kalle marjola
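The send-before-read rule can be illustrated with a toy scheduling step. This is only a sketch under stated assumptions: the `Link` struct and `link_step` are hypothetical names, counters stand in for real sockets and queues, and every incoming message is assumed to produce exactly one outgoing reply. It is not the patch mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one connection: counters replace real I/O. */
typedef struct {
    int outgoing_pending;  /* messages queued for sending */
    int incoming_waiting;  /* messages sitting unread in the socket */
    int sent;              /* total messages sent so far */
    int received;          /* total messages read so far */
} Link;

typedef enum { DID_SEND, DID_READ, DID_IDLE } Step;

/* One scheduling step: outgoing work always wins. Input is read only
 * when nothing is waiting to be sent, so the outgoing queue cannot
 * build up behind a burst of reads. */
Step link_step(Link *l)
{
    assert(l != NULL);

    if (l->outgoing_pending > 0) {
        l->outgoing_pending--;
        l->sent++;
        return DID_SEND;
    }
    if (l->incoming_waiting > 0) {
        l->incoming_waiting--;
        l->received++;
        l->outgoing_pending++;  /* assumption: each read yields one reply */
        return DID_READ;
    }
    return DID_IDLE;
}
```

With this ordering the outgoing queue never grows beyond its starting size; the backlog stays in the socket instead, which is exactly the open question about how the SMSC reacts.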
Re: [BUG] list implementation too slow.
Hi,

Kalle Marjola wrote:
>
> On Thu, 21 Mar 2002, Oded Arbel wrote:
>
> > I think this problem occurs as the bearerbox can't send messages as fast
> > as they are delivered (probably due to a large burst and then sustained
> > high load). I see messages queued on the module's queue, and then it
> > slows down the module more, so it can handle fewer messages (while high
> > load continues) and so the queue grows longer.
>
> The solution to this is to fix all code so that outgoing messages
> have higher priority than incoming ones, i.e. do not read anything
> from buffers before outgoing messages have been sent. NOTE: You need
> to set conn() buffer size, otherwise this does not help as things
> get added to its internal buffer... I did this for my modified
> version and got rid of growing size (in memory) and slowdown.
>
> (however, I do not know how SMSC would react if the socket gets filled up
> because the other end is not reading it fast enough..)

If one Kannel cannot handle the traffic, we must use many of them (many bearerboxes) in an array. But the problem here is the slowdown: Kannel starts at a very good 350 msg/s but after an hour does only 50 msg/s. And bearerbox grabs about half of the memory :(

Aarno
RE: [BUG] list implementation too slow.
The slowdown is thrashing. I'm not sure exactly where it is happening (can anyone recommend a decent profiler for Linux?) but, like I said, my guess is it's the list implementation. The memory hogging means leaks. I'm fairly sure that there are no leaks left (or at least no serious ones) in the boxes themselves, so you have to make sure that the SMSC module you are using doesn't leak memory.

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

Next Friday will not be your lucky day. As a matter of fact, you don't have a lucky day this year.
RE: [BUG] list implementation too slow.
Currently I'm using simple rate limiting, which is made up of two aspects:

a. I send up to 20 messages, then listen for up to 20 replies, and so on. Come to think of it, that's not so good, as I'm supposed to get at least 2 replies for each message: one ACK and one delivery report. But we'll see.

b. If the module's add_msg_cb is called while the MT queue is more than a certain amount full, the callback sleeps a bit, to slow down the filling of the queue so that the module has time to work on it.

I've set the conn buffer to 4K, but I used to flush it after each write. I changed that, because I think it's one of the things that slows me down, so now I only flush after a batch of sends (which averages about 3K of data).

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

"One question: How come the .44 magnum is the worlds only usable point and click interface?"
-- Alan Cox
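The concern in point (a) can be checked with a little arithmetic: if every message produces two replies but each round reads at most as many replies as it sent messages, the unread backlog grows by one batch per round. A minimal sketch follows; `unread_after_rounds` is a hypothetical helper, and it assumes all replies to a batch arrive before the read phase of the same round, which is a simplification.

```c
#include <assert.h>

/* Hypothetical model of the send-batch / read-batch cycle.
 * Returns how many replies are still unread after `rounds` rounds. */
long unread_after_rounds(int rounds, int send_batch, int read_batch,
                         int replies_per_msg)
{
    assert(rounds >= 0 && send_batch >= 0);
    assert(read_batch >= 0 && replies_per_msg >= 0);

    long unread = 0;
    for (int i = 0; i < rounds; i++) {
        /* Replies generated by the messages sent in this round. */
        unread += (long)send_batch * replies_per_msg;

        /* Read at most read_batch of them before sending again. */
        long taken = unread < read_batch ? unread : read_batch;
        unread -= taken;
    }
    return unread;
}
```

Under these assumptions, 10 rounds of "send 20, read 20" with 2 replies per message leave 200 replies unread, while reading up to 40 per round keeps the backlog at zero.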
Re: [BUG] list implementation too slow.
> Hi list.
>
> Lately we've been doing some very high capacity testing on Kannel, and
> found out some interesting stuff. Mainly, as queues (managed by lists)
> fill up to more than a few hundred messages, the boxes start thrashing.
> I think this is directly related to the List implementation - it's just
> too slow. When we have more than a few hundred messages in the List,
> extracting one item can sometimes take anywhere from 2 to 4 seconds (!!!).
>
> Does anyone have any information or experience regarding that?

I strongly disagree with this. My gateway at one point had over 100'000 messages in the list and it dequeued as fast as it could, sending out about 40 msg/sec (and that was the limit of the SMSC, not Kannel).

Maybe you run it in non-native-malloc mode?

--
Andreas Fink
Fink-Consulting

--
Tel: +41-61-6932730 Fax: +41-61-6932729 Mobile: +41-79-2457333
Address: A. Fink, Schwarzwaldallee 16, 4058 Basel, Switzerland
E-Mail: [EMAIL PROTECTED] Homepage: http://www.finkconsulting.com
--
Something urgent? Try http://www.smsrelay.com/ Nickname afink
RE: [BUG] list implementation too slow.
Hmm. Yes, I do compile using checking malloc. I recompiled using native malloc and it looks better, though I haven't done all the capacity testing on it yet. Is it just that: checking malloc is slow enough to cause thrashing in code that does de-allocations?

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

Knebel's Law:
It is now proved beyond doubt that smoking is one of the leading causes of statistics.
Re: [BUG] list implementation too slow.
Hi Oded & Andreas,

Oded Arbel wrote:
>
> Hmm. yes - I do compile using checking malloc. I recompiled using native
> malloc and it looks better - haven't had all the capacity testing done
> on it yet though. Is it just that - checking malloc is so slow to cause
> thrashing in code that does de-allocations ?
>
> > I strongly disagree with this. My gateway at some point in time had
> > over 100'000 messages in the list and it dequeued as fast as it can,
> > sending out about 40msg/sec (and that was the limit of the SMSC, not
> > kannel).
> >
> > Maybe you run it in non native-malloc mode?

I did use native malloc. And yes, checking malloc is much slower. And yes, I would agree that Kannel's long-term performance is about 40 msg/s or higher. It is the difference between peak and long-term performance that worries me.

Aarno
Re: [BUG] list implementation too slow.
On Thu, Mar 21, 2002 at 01:06:39PM +0200, Oded Arbel wrote:
> I use list_extract_first to grab messages from the queue, and
> list_append to add messages to the queue - both are very slow if the
> list gets too long (a couple of hundreds can cause noticeable slowing).
> except for calling list_len on every list_append, I don't use any other
> list_ functions.

The weird thing is that neither of these operations should be affected by the length of the list. Lists use a circular buffer, so list_extract_first just moves the start pointer up by one, and list_append uses the next free entry in the buffer.

You should only see a slowdown when the list is actually growing to a size bigger than it has ever been before, because then a new buffer is allocated. And even that should only happen when the buffer size has doubled, if you have a smart realloc() implementation. (The checking realloc is smart in that way.)

Richard Braakman
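The circular-buffer behaviour described above can be sketched in a few lines of C. This is an illustration under assumptions, not the code in list.c: the `Ring` type and `ring_*` functions are hypothetical names, there is no locking, and growth is done by explicit doubling rather than by relying on realloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ring-buffer queue; names are illustrative only. */
typedef struct {
    void **tab;       /* backing array */
    size_t tab_size;  /* allocated slots */
    size_t start;     /* index of the first item */
    size_t len;       /* items currently stored */
} Ring;

Ring *ring_create(void)
{
    Ring *r = malloc(sizeof *r);
    assert(r != NULL);
    r->tab = NULL;
    r->tab_size = 0;
    r->start = 0;
    r->len = 0;
    return r;
}

/* Called only when the buffer is full. Doubling the size keeps the
 * amortized cost of an append constant. */
static void ring_grow(Ring *r)
{
    size_t new_size = r->tab_size ? r->tab_size * 2 : 8;
    void **new_tab = malloc(new_size * sizeof *new_tab);
    assert(new_tab != NULL);

    /* Unwrap the items so they start at index 0 in the new array. */
    for (size_t i = 0; i < r->len; i++)
        new_tab[i] = r->tab[(r->start + i) % r->tab_size];

    free(r->tab);
    r->tab = new_tab;
    r->tab_size = new_size;
    r->start = 0;
}

/* O(1) except when the buffer has to grow. */
void ring_append(Ring *r, void *item)
{
    if (r->len == r->tab_size)
        ring_grow(r);
    r->tab[(r->start + r->len) % r->tab_size] = item;
    r->len++;
}

/* O(1): only the start index moves; no items are copied. */
void *ring_extract_first(Ring *r)
{
    if (r->len == 0)
        return NULL;
    void *item = r->tab[r->start];
    r->start = (r->start + 1) % r->tab_size;
    r->len--;
    return item;
}

size_t ring_len(const Ring *r)
{
    return r->len;
}

void ring_destroy(Ring *r)
{
    free(r->tab);
    free(r);
}
```

Neither `ring_append` nor `ring_extract_first` loops over the stored items, so in a structure like this the cost per call does not depend on queue length except during the occasional doubling.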
Re: [BUG] list implementation too slow.
On Thu, Mar 21, 2002 at 09:02:38PM +0200, Oded Arbel wrote:
> Hmm. yes - I do compile using checking malloc. I recompiled using native
> malloc and it looks better - haven't had all the capacity testing done
> on it yet though. Is it just that - checking malloc is so slow to cause
> thrashing in code that does de-allocations ?

Hmm, this might be a bug in the checking malloc. It's not supposed to slow down when the number of allocations goes up. It only loops over all the allocated entries when it finds that a start marker has been damaged (which will be logged), or when gw_check_leaks() is called.

Hmm, or if it encounters a pointer it considers "suspicious"; there may be a bug there.

Can we see the code you're using for measuring the List speed?

Richard Braakman
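To show why a marker check by itself should not get slower as allocations pile up, here is a minimal sketch of a guarded allocator in C. It assumes a simple header-with-marker layout; the `Header` type, `START_MARK` value and `chk_*` functions are hypothetical and are not Kannel's actual checking malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define START_MARK 0x5a5a5a5aUL  /* hypothetical marker value */

/* Header placed in front of every user area. The extra union members
 * only force an alignment suitable for any object that follows. */
typedef union {
    struct {
        unsigned long mark;  /* start marker, checked on free */
        size_t size;         /* size requested by the caller */
    } info;
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} Header;

void *chk_malloc(size_t size)
{
    Header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.mark = START_MARK;
    h->info.size = size;
    return h + 1;  /* user area starts right after the header */
}

/* Constant-time check: looks only at this block's own header. */
int chk_is_intact(const void *p)
{
    const Header *h = (const Header *)p - 1;
    return h->info.mark == START_MARK;
}

void chk_free(void *p)
{
    if (p == NULL)
        return;
    Header *h = (Header *)p - 1;
    /* A real checker would log the damage and scan its tables here;
     * this sketch simply aborts. */
    assert(h->info.mark == START_MARK);
    h->info.mark = 0;  /* makes a double free fail the check */
    free(h);
}
```

In this layout the common path touches one header per call. Any cost that grows with the number of live allocations would have to come from the slow path (damaged marker, leak check, or "suspicious" pointer handling).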
Re: [BUG] list implementation too slow.
> Kalle Marjola wrote:
> >
> > On Thu, 21 Mar 2002, Oded Arbel wrote:
> >
> > > I think this problem occurs as the bearerbox can't send messages as fast
> > > as they are delivered (probably due to a large burst and then sustained
> > > high load). I see messages queued on the module's queue, and then it
> > > slows down the module more, so it can handle fewer messages (while high
> > > load continues) and so the queue grows longer.
> >
> > The solution to this is to fix all code so that outgoing messages
> > have higher priority than incoming ones, i.e. do not read anything
> > from buffers before outgoing messages have been sent. NOTE: You need
> > to set conn() buffer size, otherwise this does not help as things
> > get added to its internal buffer... I did this for my modified
> > version and got rid of growing size (in memory) and slowdown.
> >
> > (however, I do not know how SMSC would react if the socket gets filled up
> > because the other end is not reading it fast enough..)

I think that we should at least test this approach. There is no reason to accept a new request when there are old requests in the queue. Kalle, can you send the priority patch?

Aarno