Does this list have a no-attachments policy? I sent the two emails below, both with attachments, and it looks like they never went out to the list (according to the archive). I got a "rejected post" for the second attempt today (3 days after mailing). The weird thing is that, as far as my inbox is concerned, the message was sent to the list, so I thought I hadn't given enough details in the first email and sent another one (the rejected one). It's strange that the "rejected post" came so long after I sent it; is it a human-driven process? A quick rejection would be better, as I've spent the last 2 weeks wondering whether anyone would respond :).
Anyways, my original issue is below. I have removed all attachments hoping this will get it past the filter. I can pastebin them or send them individually if they would be helpful.

Thanks
Sam

---------- Forwarded message ----------
From: Sam Hendley <[email protected]>
Date: Fri, Jan 7, 2011 at 1:04 PM
Subject: Re: Qpid 0.8 c++ broker memory usage
To: [email protected]

I'm sorry to repost this, but since I originally sent it in the midst of the holidays I imagine a lot of people may not have noticed it.

I have continued to look at the qpid memory issue and I still cannot come up with a decent explanation of what I am doing wrong to cause this increase in memory usage. I have tried using every qpid-* tool at my disposal and can't find anything unusual. I am attaching 2 files collected 1 hour apart on one of our running systems. The last line of each shows the "ps aux" output, and you can see a 20 MB increase in 'dirty' memory over that one hour. This bloat keeps up at that rate pretty steadily (way past the starting VM size) and eventually chokes the system, requiring a restart of qpidd.

I am at a loss on where to proceed from here; can anyone recommend steps I might take to diagnose what I am doing wrong? It appears to be due to having a single listener bound with the routing key '#': if I disable that listener the memory growth doesn't happen. Clearly this can't be a widespread problem or everyone would be restarting qpid every few days.

Thanks again,
Sam

BTW: Script to generate each file:

(qpid-stat -q -S msgIn && qpid-stat -e -S msgIn && qpid-stat -c -S msgIn && qpid-stat -u -S delivered && sudo ps aux | grep qpid | grep -v grep) > later.log

On Wed, Dec 29, 2010 at 3:58 PM, Sam Hendley <[email protected]> wrote:
> We have been doing soak testing on our systems for the last few weeks and
> keep having issues where qpidd is continuously increasing its memory usage,
> eventually paging out the OS and crippling the system.
> The growth is very slow, somewhere between 4 and 20 bytes a second. The
> broker is not heavily loaded, perhaps 100 msgs a second, none of them larger
> than a hundred bytes. As far as I can tell (using the qpid-* tools) the
> messages are being consumed as fast as possible; the queued messages counter
> is always 0.
>
> The main message flow is many messages published to a topic exchange with
> different routing keys. I have a single listener bound with the catch-all
> key "#":
>
> s...@reef-deploy:~/qpidc-0.8$ qpid-stat -q -S msgIn -L 4 && qpid-stat -e -S msgIn -L 4
> Queues
>   queue                                   dur  autoDel  excl  msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
>   ====================================================================================================================
>   "0fd2fc12-5d12-4722-b352-74e557b5af88"       Y        Y     0    3.12k  3.12k   0      137k     137k      1     2
>   "f0ca48a8-13cc-49fe-9f94-8b2f592bef70"       Y        Y     0    3.12k  3.12k   0      335k     335k      1     2
>   "a4491db9-a8c6-4acb-91d4-5c792ac961fe"       Y        Y     0    3.12k  3.12k   0      182k     182k      1     2
>   reply-reef-deploy.3606.1                     Y        Y     0    71     71      0      64.9k    64.9k     1     2
> Exchanges
>   exchange           type    dur  bind  msgIn  msgOut  msgDrop  byteIn  byteOut  byteDrop
>   =========================================================================================
>   amq.direct         direct  Y    22    3.50k  3.50k   0        500k    500k     0
>   measurement_batch  topic        2     3.20k  3.20k   2        344k    344k     167
>   measurement        topic        1     3.20k  3.20k   0        140k    140k     0
>   qpid.management    topic        2     401    0       401      351k    0        351k
>
> When I watch the memory usage it seems to go up at around 4 bytes a second;
> after a few days of running it will have pushed the rest of the system into
> swap. The publishing step is something I can disable, and I have found that
> the "leak" goes away if I stop the publishing or disable the subscriber, so
> I am pretty convinced it's this part of the system that is causing the
> problem.
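[For anyone wanting to reproduce the topology described above, a rough sketch using the qpid-config tool that ships with the C++ broker; the exchange name comes from the stats output, while "catchall" is an invented queue name standing in for the real listener's queue:]

```shell
# Hypothetical reproduction of the setup: a topic exchange with one queue
# bound to everything via the '#' routing key. Names other than
# "measurement" are made up for illustration.
qpid-config add exchange topic measurement
qpid-config add queue catchall
qpid-config bind measurement catchall '#'   # the single catch-all listener
```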
> I am using the same bind-and-listen code for both the "measurement" listener
> and the "measurement_batch" handlers, so if the binding code were bad (like
> not acknowledging) it should misbehave regardless of whether I am publishing
> the "measurements" or not.
>
> I am at a bit of a loss as to what to look for next; as far as I can tell
> everything is working as it should be, but the memory keeps going up. I have
> included a snippet of the logs from running the broker with '-t' logging
> turned on, but there's nothing untoward that I can see in this file. Clearly
> a memory leak of this magnitude would have been noticed by now, so I am
> assuming I am doing something wrong, but I am out of ideas on what it could
> be. The only difference I can find between the "measurement_batch" service
> and the "measurement" service is the routing key being "#"; why that should
> cause a slow gain in memory in the broker is where I get stuck.
>
> Thanks for any advice you can give, happy holidays.
> Sam
>
> Details:
>
> The broker is qpid 0.8 compiled against boost 1.41 on Ubuntu Lucid. The
> client is the Java 0.6 client.
> Command line is:
> sudo /usr/local/sbin/qpidd --port 5672 --auth no --data-dir /var/db/qpidd.5672 --pid-dir /var/db/qpidd.5672 --log-to-file /var/log/qpidd.5672.log
>
> I have also tried using the pmap utility to see whether I can determine
> where the memory is being allocated. I am including two "pmap -x" outputs,
> separated by about 100 seconds.
> Doing a diff on the 2 files produces:
>
> 6,10c6,10
> < 0000000001b78000    3888   -   -   -  rw---  [ anon ]
> < 00007f7934000000    1640   -   -   -  rw---  [ anon ]
> < 00007f793419a000   63896   -   -   -  -----  [ anon ]
> < 00007f793c000000    2472   -   -   -  rw---  [ anon ]
> < 00007f793c26a000   63064   -   -   -  -----  [ anon ]
> ---
> > 0000000001b78000    4148   -   -   -  rw---  [ anon ]
> > 00007f7934000000    1908   -   -   -  rw---  [ anon ]
> > 00007f79341dd000   63628   -   -   -  -----  [ anon ]
> > 00007f793c000000    2800   -   -   -  rw---  [ anon ]
> > 00007f793c2bc000   62736   -   -   -  -----  [ anon ]
> 109c109
> < total kB          208852   -   -   -
> ---
> > total kB          209112   -   -   -
>
> I don't really know what I am looking at here, but it looks to me like two of
> those memory chunks are getting "used". The other two large chunks (~63 MB)
> appear to have moved (probably meaning they were grown or shrunk). I have not
> collected these files with the listeners removed (when the memory usage is
> stable); if someone thinks that would be valuable, I can do that.
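[To make comparisons like the diff above less manual, here is a small helper sketch (my own, not part of qpid) that pairs regions by start address between two `pmap -x` captures and prints the ones whose size changed. Note that it will miss regions whose base address moved between captures, as two of the large chunks above did.]

```shell
# pmap_growth: compare two `pmap -x` captures of the same process and print
# each region, keyed by its start address, whose size in kB changed.
# Assumes the first two columns are "address kB", as in procps pmap -x;
# header and "total kB" lines compare equal to themselves and drop out.
pmap_growth() {
    awk 'NR==FNR { before[$1] = $2; next }
         ($1 in before) && $2 != before[$1] {
             printf "%s %+d kB\n", $1, $2 - before[$1]
         }' "$1" "$2"
}

# Usage sketch (process-capture commands are illustrative):
#   pmap -x "$(pgrep -o qpidd)" > before.txt
#   sleep 100
#   pmap -x "$(pgrep -o qpidd)" > after.txt
#   pmap_growth before.txt after.txt
```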
