Re: System stalling
> Thanks, Jimmy. I have been looking into this issue a little more. I
> couldn't exactly duplicate your numbers as my test machine did not have
> sufficient memory, but I believe I have identified the key symptom (JIRA
> updated accordingly), though as yet not the root cause.
>
> As noted in the JIRA, it may be possible to tune your receivers to
> mitigate the issue. How feasible that is probably depends on how closely
> your real system follows the test scenario in the JIRA. For large
> messages, reducing the capacity seems to be the most effective
> improvement. As message size decreases, acknowledging in larger batches
> becomes more effective.

I tried lowering the capacity in the (admittedly extreme) testcase, and the issue was only resolved when I put it down to 1, which I can't do on my live systems as the performance would be too poor. However, as noted below, I've rerouted most of my large messages, which is okay for the time being.

> One other question was just to confirm that the case as reported does
> match your real system. Initially there was a suspicion that the ingest
> process was blocked on send, which I think would be a different issue.

I'm pretty sure this is causing at least some of my problems, as I've rerouted the main culprit of the large messages to a second broker, and now the main broker is much happier. The second broker exhibits the performance problems above, despite not having very many messages to process. However, I think I'm still seeing some sends taking >30s on the main broker, though a lot less frequently than before, and not causing ring queue overflows. That is possibly a separate issue, although it could well be caused by the poor IO performance of the VM it's running on, which should hopefully be resolved soon, so I will test further then.

> I'll do some more digging on what the root cause of the drop in
> throughput for large messages on a full ring queue might be and update
> the JIRA with any progress.

Thanks!

Jimmy
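For illustration, the receiver tuning suggested above (a lower capacity, acknowledging in larger batches) corresponds roughly to the following qpid::messaging C++ sketch. The broker URL, queue name, capacity and batch size are illustrative assumptions rather than values from the thread.

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Message.h>
#include <qpid/messaging/Duration.h>

using namespace qpid::messaging;

int main() {
    Connection connection("localhost:5672");  // illustrative broker URL
    connection.open();
    Session session = connection.createSession();

    // "ingest" is an illustrative queue name.
    Receiver receiver = session.createReceiver("ingest");

    // Lower the prefetch capacity; the thread suggests smaller values help
    // with large messages (1 removed the stall in the extreme testcase).
    receiver.setCapacity(10);

    const int batchSize = 100;  // acknowledge in larger batches for small messages
    int received = 0;
    Message msg;
    while (receiver.fetch(msg, Duration::SECOND * 5)) {
        // ... process msg ...
        if (++received % batchSize == 0) {
            session.acknowledge();  // acknowledge everything fetched so far
        }
    }
    session.acknowledge();
    connection.close();
    return 0;
}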
Re: Removing the bindings from qpid-cpp source tarball...
[X] Yes, remove the language bindings from qpid-cpp-${VER}

Haven't been able to compile the bindings from this package since 0.22 anyway!

Jimmy
Re: System stalling
> > Hi,
> >
> > I've finally managed to isolate the issue and can reproduce it with the
> > attached scripts. Running rx-test.pl followed by tx-test.pl results in a
> > system where the receiver can keep up with the producer (gets a message
> > every <1s) (tx-test 118% CPU, qpidd 97% CPU, rx-test 60% CPU). However, if
> > you stop rx-test and restart it (even after only a second or so), it starts
> > to take 2s+ to receive messages, going up to about 6s on my system, so the
> > ring quickly fills and overflows. Even if the producer is then stopped,
> > messages are still only received every 3s - with qpidd on 100% CPU and the
> > receiver on 5%. Also the resident size of qpidd reaches 5GB, yet the queue
> > is only 2GB.
> >
> > Hopefully I can now regain my sanity :)
>
> Well done! Unfortunately your scripts seem to have been stripped off at
> some stage. Could you attach them to a JIRA perhaps? This was with 0.22,
> right?

Created QPID-5135. Also wanted to thank everyone for their awesome help and support!

Jimmy
Re: System stalling
Hi,

I've finally managed to isolate the issue and can reproduce it with the attached scripts. Running rx-test.pl followed by tx-test.pl results in a system where the receiver can keep up with the producer (gets a message every <1s) (tx-test 118% CPU, qpidd 97% CPU, rx-test 60% CPU). However, if you stop rx-test and restart it (even after only a second or so), it starts to take 2s+ to receive messages, going up to about 6s on my system, so the ring quickly fills and overflows. Even if the producer is then stopped, messages are still only received every 3s - with qpidd on 100% CPU and the receiver on 5%. Also the resident size of qpidd reaches 5GB, yet the queue is only 2GB.

Hopefully I can now regain my sanity :)

Cheers,

Jimmy
Re: System stalling
> > Hi Ken,
> >
> > Had a play with oprofile... but seems to have lumped everything into glibc,
> > any ideas? The queue is setup as ring, max size 2GB.
>
> You probably need to install libstdc++-debuginfo & glibc-debuginfo to
> see more detail.

Right, installed debuginfo packages and now makes a little more sense:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name              symbol name
6569     33.6337  libstdc++.so.6.0.8      __gnu_cxx::__atomic_add(int volatile*, int)
5117     26.1994  libstdc++.so.6.0.8      __gnu_cxx::__exchange_and_add(int volatile*, int)
2004     10.2606  libc-2.5.so             memcpy
1695      8.6785  libqpidbroker.so.2.0.0  void deque::_M_range_insert_aux<_Deque_iterator>(_Deque_iterator, _Deque_iterator, _Deque_iterator, forward_iterator_tag)
823       4.2138  libc-2.5.so             _int_malloc
Re: System stalling
> > Hi Ken,
> >
> > Had a play with oprofile... but seems to have lumped everything into glibc,
> > any ideas? The queue is setup as ring, max size 2GB.
>
> You probably need to install libstdc++-debuginfo & glibc-debuginfo to
> see more detail.

Will go and hunt them down for RHEL5. Also, on reflection, I should have said I had to use the timer interrupt as I'm on VMware. I'm wondering if the wall clock time attributed to glibc is actually time waiting on pthreads locks and epoll, which gets counted, whereas if I were using the hardware performance counters I'd get CPU time...

Jimmy
Re: System stalling
Hi Ted,

I don't have any flow control that I'm aware of. Will send the logs separately.

Cheers,

Jimmy

- Original Message -
From: Ted Ross
Sent: 09/06/13 02:02 PM
To: users@qpid.apache.org
Subject: Re: System stalling

Jimmy,

Do your ring queues have any flow-control configuration set up? This would be --flow-* thresholds in qpid-config. Also, it would be helpful to see the output of a pstack on the qpidd process when the condition occurs. I think almost everything happens under DispatchHandle::processEvent :)

-Ted

On 09/06/2013 09:50 AM, Jimmy Jones wrote:
> I've done some further digging, and managed to simplify the system a little
> to reproduce the problem. The system is now an external process that posts
> messages to the default headers exchange on my machine, which has a ring
> queue to receive effectively all messages from the default headers exchange,
> process them, and post to another headers exchange. There is now nothing
> listening on the subsequent headers exchange, and all exchanges are
> non-durable. I've also tried Fraser's suggestion of marking the link as
> unreliable on the queue which seems to have no effect (is there any way in
> the qpid utilities to confirm the link has been set to unreliable?)
>
> So essentially what happens is the system happily processes away, normally
> with an empty ring queue, sometimes it spikes up a bit and goes back down
> again, with my ingest process using ~70% CPU and qpidd ~50% CPU, on a machine
> with 8 CPU cores. However sometimes the queue spikes up to 2GB (the max),
> starts throwing messages away, and qpidd hits 100%+ CPU and the ingest process
> goes to about 3% CPU. I can see messages are being very slowly processed.
>
> I've tried attaching to qpidd with gdb a few times, and all threads apart
> from one seem to be idle in epoll_wait or pthread_cond_wait. The running
> thread always seems to be somewhere under DispatchHandle::processEvent.
>
> I'm at a bit of a loss for what I can do to fix this!
>
> Jimmy
>
> - Original Message -
> From: Fraser Adams
> Sent: 08/23/13 09:09 AM
> To: users@qpid.apache.org
> Subject: Re: System stalling
>
> Hi Jimmy, hope you are well!
>
> As an experiment one thing that you could try is messing with the link
> "reliability". As you know in the normal mode of operation it's
> necessary to periodically send acknowledgements from the consumer client
> application which then get passed back ultimately to the broker.
>
> I'm no expert on this but from my recollection if you are in a position
> particularly where circular queues are overflowing and you are
> continually trying to produce and consume and you have some fair level
> of prefetch/capacity on the consumer the mechanism for handling the
> acknowledgements on the broker is "sub-optimal" - I think it's a linear
> search or some such and there are conditions where catching up with
> acknowledgements becomes a bit "N squared".
>
> Gordon would be able to explain this way better than me - that's
> assuming this hypothesis is even relevant :-)
>
> Anyway if you try having a link: {reliability: unreliable} stanza in
> your consumer address string (as an example one of mine looks like the
> following - the address string syntax isn't exactly trivial :-)).
>
> string address = "test_consumer; {create: receiver, node: {x-declare:
> {auto-delete: True, exclusive: True, arguments: {'qpid.policy_type':
> ring, 'qpid.max_size': 1}}, x-bindings: [{exchange: 'amq.match',
> queue: 'test_consumer', key: 'test1', arguments: {x-match: all,
> data-format: test}}]}, link: {reliability: unreliable}}";
>
> Clearly your arguments would be different but hopefully it'll give you a
> kick start.
>
> The main down side of disabling link reliability is that if you have
> enabled prefetch and the consumer unexpectedly dies then all of the
> messages on the prefetch queue will be lost, whereas with reliable
> messaging the broker maintains references to all unacknowledged messages
> so would resend them (I *think* that's how it works.)
>
> At the very least it's a fairly simple tweak to your consumer addresses
> that might rule out (or point to) acknowledgement shenanigans as being
> the root of your problem. From my own experience I always end up blaming
> this first if I hit performance weirdness with ring queues :-)
>
> HTH,
> Frase
>
> On 21/08/13 17:08, Jimmy Jones wrote:
>>>>>> I've got a simple processing system using the 0.22 C++ broker, all
>>>>>> on one box, where an external system po
Re: System stalling
Hi Ken,

Had a play with oprofile... but seems to have lumped everything into glibc, any ideas? The queue is setup as ring, max size 2GB.

# qpid-stat -q
Queues
  queue  dur  autoDel  excl  msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
  ===================================================================================
  9d587baf-edc4-694b-bd43-f0accdf77a44:0.0  Y  Y  0  0  0  0  0  0  1  2
  ingest  Y  16.4k  49.9k  33.5k  2.13g  7.17g  5.04g  1  2

# opreport --long-filenames --session-dir=/root/oprof
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
          TIMER:0|
          samples|      %|
------------------
    78847 100.000 /usr/sbin/qpidd
          TIMER:0|
          samples|      %|
        ------------------
            53196 67.4674 /usr/lib64/libstdc++.so.6.0.8
            14220 18.0349 /usr/lib/libqpidbroker.so.2.0.0
             7368  9.3447 /lib64/libc-2.5.so
             3833  4.8613 /usr/lib/libqpidcommon.so.2.0.0
               93  0.1179 /usr/lib/libqpidtypes.so.1.0.0
               70  0.0888 /lib64/libpthread-2.5.so
               43  0.0545 /lib64/ld-2.5.so
               19  0.0241 /lib64/librt-2.5.so
                4  0.0051 /lib64/libuuid.so.1.2
                1  0.0013 /usr/sbin/qpidd

# opreport --demangle=smart --session-dir=/root/oprof --symbols `which qpidd`
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name              symbol name
53196    67.4674  libstdc++.so.6.0.8      /usr/lib64/libstdc++.so.6.0.8
10597    13.4400  libqpidbroker.so.2.0.0  void deque::_M_range_insert_aux<_Deque_iterator>(_Deque_iterator, _Deque_iterator, _Deque_iterator, forward_iterator_tag)
2922      3.7059  libqpidcommon.so.2.0.0  qpid::framing::AMQFrame::~AMQFrame()
2833      3.5930  libqpidbroker.so.2.0.0  deque::clear()
2486      3.1529  libc-2.5.so             _int_malloc
1882      2.3869  libc-2.5.so             _int_free
1757      2.2284  libc-2.5.so             malloc
589       0.7470  libc-2.5.so             memcpy
384       0.4870  libc-2.5.so             free
...

Cheers,

Jimmy

- Original Message -
From: Ken Giusti
Sent: 09/06/13 06:27 PM
To: users@qpid.apache.org
Subject: Re: System stalling

Hi Jimmy,

Have you ever used the oprofile tool before? http://oprofile.sourceforge.net/about/

I've found this tool useful when I need to get a sense of where the broker is spending its time, especially when it is compute-bound. You'll need to be able to install oprofile on the system that is running the broker, and you'd need root permission to run it.

The approach I take is to configure oprofile to analyze the broker, then perform whatever actions I need to get the broker into the compute bound state. Once the broker is acting up, I then trigger oprofile to start a capture. That results in a capture that best represents what the broker is doing when it is in that compute bound state.

It's been awhile since I used oprofile, but here's a summary of the commands I used last. First, the path to the broker executable for this example is /home/kgiusti/mrg/qpid/cpp/src/.libs/lt-qpidd. Be sure you're referencing the actual executable image and not the shell wrapper that autotools generates!

After starting the broker daemon, I delete any old oprofile configuration and captures. I then configure and start the oprofile daemon using the following commands (done as root):

$ rm -rf /root/oprof
$ rm -rf ~/.oprofile
$ opcontrol --shutdown
$ opcontrol --init
$ opcontrol --reset
$ opcontrol --setup --no-vmlinux --session-dir=/root/oprof --image=/home/kgiusti/mrg/qpid/cpp/src/.libs/lt-qpidd --separate=library --event=INST_RETIRED_ANY_P:6000:0:0:1 --cpu-buffer-size=100
$ opcontrol --start-daemon

Once that is done, you should try to reproduce the problem. Once the broker is in that weird state, start the capture:

$ opcontrol --start

Capture for a while, then stop the capture and dump the results:

$ opcontrol --stop
$ opreport --long-filenames --session-dir=/root/oprof

opreport will dump the methods where the broker is spending most of its compute time. You might need to also provide the paths to the link libraries, e.g.:

$ opreport --long-filenames --session-dir=/root/oprof -l /home/kgiusti/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2.0.0

These notes are a bit old, and opcontrol/opreport's options may have changed a bit, but this should give you a general idea of how to use it.

-K

----- Original Message -
> From: "Jimmy Jones"
> To: users@qpid.apache.org
> Sent: Friday, September 6, 2013 9:50:17 AM
> Subject: Re: System stalling
>
> I've done some further digging, a
Re: System stalling
I've done some further digging, and managed to simplify the system a little to reproduce the problem. The system is now an external process that posts messages to the default headers exchange on my machine, which has a ring queue to receive effectively all messages from the default headers exchange, process them, and post to another headers exchange. There is now nothing listening on the subsequent headers exchange, and all exchanges are non-durable. I've also tried Fraser's suggestion of marking the link as unreliable on the queue, which seems to have no effect (is there any way in the qpid utilities to confirm the link has been set to unreliable?)

So essentially what happens is the system happily processes away, normally with an empty ring queue; sometimes it spikes up a bit and goes back down again, with my ingest process using ~70% CPU and qpidd ~50% CPU, on a machine with 8 CPU cores. However sometimes the queue spikes up to 2GB (the max), starts throwing messages away, and qpidd hits 100%+ CPU and the ingest process goes to about 3% CPU. I can see messages are being very slowly processed.

I've tried attaching to qpidd with gdb a few times, and all threads apart from one seem to be idle in epoll_wait or pthread_cond_wait. The running thread always seems to be somewhere under DispatchHandle::processEvent.

I'm at a bit of a loss for what I can do to fix this!

Jimmy

- Original Message -
From: Fraser Adams
Sent: 08/23/13 09:09 AM
To: users@qpid.apache.org
Subject: Re: System stalling

Hi Jimmy, hope you are well!

As an experiment one thing that you could try is messing with the link "reliability". As you know, in the normal mode of operation it's necessary to periodically send acknowledgements from the consumer client application, which then get passed back ultimately to the broker.

I'm no expert on this but from my recollection, if you are in a position particularly where circular queues are overflowing and you are continually trying to produce and consume, and you have some fair level of prefetch/capacity on the consumer, the mechanism for handling the acknowledgements on the broker is "sub-optimal" - I think it's a linear search or some such, and there are conditions where catching up with acknowledgements becomes a bit "N squared".

Gordon would be able to explain this way better than me - that's assuming this hypothesis is even relevant :-)

Anyway if you try having a link: {reliability: unreliable} stanza in your consumer address string (as an example one of mine looks like the following - the address string syntax isn't exactly trivial :-)).

string address = "test_consumer; {create: receiver, node: {x-declare: {auto-delete: True, exclusive: True, arguments: {'qpid.policy_type': ring, 'qpid.max_size': 1}}, x-bindings: [{exchange: 'amq.match', queue: 'test_consumer', key: 'test1', arguments: {x-match: all, data-format: test}}]}, link: {reliability: unreliable}}";

Clearly your arguments would be different but hopefully it'll give you a kick start.

The main down side of disabling link reliability is that if you have enabled prefetch and the consumer unexpectedly dies, then all of the messages on the prefetch queue will be lost, whereas with reliable messaging the broker maintains references to all unacknowledged messages so would resend them (I *think* that's how it works.)

At the very least it's a fairly simple tweak to your consumer addresses that might rule out (or point to) acknowledgement shenanigans as being the root of your problem. From my own experience I always end up blaming this first if I hit performance weirdness with ring queues :-)

HTH,
Frase

On 21/08/13 17:08, Jimmy Jones wrote:
>>>>> I've got a simple processing system using the 0.22 C++ broker, all
>>>>> on one box, where an external system posts messages to the default
>>>>> headers exchange, and an ingest process receives them using a ring
>>>>> queue, transforms them and outputs to a different headers exchange.
>>>>> Various other processes pick messages of interest off that exchange
>>>>> using ring queues. Recently however the system has been stalling -
>>>>> I'm still receiving lots of data from the other system, but the
>>>>> ingest process suddenly goes to <5% CPU usage and its queue fills up
>>>>> and messages start getting discarded from the ring, the follow on
>>>>> processes go to practically 0% CPU and qpidd hovers around 95-120%
>>>>> CPU (normally its ~75%) and the rest of the system pretty much goes
>>>>> idle (no swapping, there is free memory)
>>>>>
>>>>> I attached to the ing
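Fraser's suggestion boils down to adding a link: {reliability: unreliable} stanza to the consumer's address string. A minimal C++ sketch of a receiver using an address along those lines follows; the queue name, binding and qpid.max_size value are illustrative (adapted from Fraser's example), not the poster's real configuration.

#include <iostream>
#include <string>

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Message.h>
#include <qpid/messaging/Duration.h>

using namespace qpid::messaging;

int main() {
    // Address based on Fraser's example: a ring queue bound to amq.match,
    // consumed over an unreliable link (max_size here is illustrative).
    const std::string address =
        "test_consumer; {create: receiver, node: {x-declare: "
        "{auto-delete: True, exclusive: True, arguments: {'qpid.policy_type': ring, "
        "'qpid.max_size': 1000000}}, x-bindings: [{exchange: 'amq.match', "
        "queue: 'test_consumer', key: 'test1', arguments: {x-match: all, "
        "data-format: test}}]}, link: {reliability: unreliable}}";

    Connection connection("localhost:5672");  // illustrative broker URL
    connection.open();
    Session session = connection.createSession();
    Receiver receiver = session.createReceiver(address);

    Message msg;
    while (receiver.fetch(msg, Duration::SECOND * 10)) {
        std::cout << msg.getContent() << std::endl;
        // With an unreliable link the broker is not expected to hold messages
        // pending acknowledgement, which is the behaviour Fraser describes.
    }
    connection.close();
    return 0;
}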
Re: UTF8 / binary strings in dynamic languages
> 3. If the language string is an overloaded text/bytes type, as is
> regrettably quite common, what do we do then?
>
> The current answer to this question is "send it as vbin". That's very
> safe, insofar as it won't throw any sort of encoding exception. It
> does not, however, always honor what I think is the user's more
> typical intention: produce an ascii string at the other end.

I guess the problem is between dynamically and statically typed languages; if you stay with the same language you don't notice anything, but this slightly defeats the object of AMQP!

> So for 3, I'd like to consider the possibility of, by default, sending
> ambiguous language strings as ascii rendered to amqp str16. This
> requires an encoding step that may produce errors. And maybe that's
> just too obnoxious! That's what I'd like to know.

I'm not convinced, but I'm prepared to be convinced. If I put a binary value in a map and encoded it, some of the time it might be valid utf8, other times not. Could this lead to a class of subtle bugs where a receiver written in a statically typed language will work most of the time, when the value appears as a vbin, but not other times when it "accidentally" appears as a str16?

> In summary, if we have a way to determine what the user wanted (text
> or bytes), we should try to carry that through on the wire. At the
> following URL I've tried to map out what type information we can get
> for each language. Please update it as you please.
>
> https://cwiki.apache.org/confluence/display/qpid/Language+support+for+unambiguous+text+string+and+byte+array+types

I've just signed up, but don't seem to be able to edit the page? I'll add the stuff about utf8::upgrade when I can edit.

> On Wed, Aug 21, 2013 at 8:44 AM, Jimmy Jones wrote:
>>> > AFAIK in perl, if you include unicode characters in a string it'll
>>> > set the utf8 flag. If you don't include any unicode characters (eg. 7
>>> > bit ascii, or raw bytes) the flag won't be set. So given a perl
>>> > scalar that doesn't contain any utf8 characters, you don't know if
>>> > its a textual string (str16) or a binary string (vbin). There is a
>>> > is_utf8_string function, but that'll only tell you if the string
>>> > would be valid utf8, but it could be a binary string that happens to
>>> > be valid utf8, so that's not really safe.
>>>
>>> You can explicitly mark it as utf8 using utf8::upgrade() though, right?
>>> Certainly I tried that in a simple test and the property in question was
>>> then sent as str16.
>>
>> Yes, if I as a user had a string that was textual, I could call
>> utf8::upgrade() to ensure it got sent as str16. I guess this is similar in
>> concept to calling setEncoding in C++, although maybe less natural in a
>> dynamically typed language.
>
> It would be more reasonable to treat perl scalars as textual for our
> API if perl offered a good way to explicitly handle byte arrays. My
> (certainly insufficient) web browsing suggested that wasn't really
> available, or not in a form recommended for use. Any candidates for a
> serviceable explicitly-arbitrary-bytes-and-not-text-at-all "type" in
> perl?

Sorry, I don't know of any, although I'm no perl guru! I'll have another look though.
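For comparison with the Perl discussion above, the C++ client mentioned here ("calling setEncoding in C++") lets the sender state its intent on a qpid::types::Variant property. A minimal sketch, assuming Variant::setEncoding("utf8") is what marks the value as text for the codec; the property names and target address are illustrative, and the exact wire types chosen are not confirmed by this thread.

#include <string>

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Sender.h>
#include <qpid/messaging/Message.h>
#include <qpid/types/Variant.h>

using namespace qpid::messaging;
using qpid::types::Variant;

int main() {
    Connection connection("localhost:5672");  // illustrative broker URL
    connection.open();
    Session session = connection.createSession();
    Sender sender = session.createSender("amq.match");  // illustrative target

    Message message("payload");

    // A value the sender knows is text: mark the encoding so it can be
    // carried as a string type rather than raw bytes.
    Variant text("hello world");
    text.setEncoding("utf8");
    message.getProperties()["data-format"] = text;

    // A value the sender knows is opaque bytes: leave the encoding unset
    // so it is treated as binary.
    Variant raw(std::string("\x01\x02\x03", 3));
    message.getProperties()["blob"] = raw;

    sender.send(message);
    connection.close();
    return 0;
}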
Re: System stalling
>>>> I've got a simple processing system using the 0.22 C++ broker, all
>>>> on one box, where an external system posts messages to the default
>>>> headers exchange, and an ingest process receives them using a ring
>>>> queue, transforms them and outputs to a different headers exchange.
>>>> Various other processes pick messages of interest off that exchange
>>>> using ring queues. Recently however the system has been stalling -
>>>> I'm still receiving lots of data from the other system, but the
>>>> ingest process suddenly goes to <5% CPU usage and its queue fills up
>>>> and messages start getting discarded from the ring, the follow on
>>>> processes go to practically 0% CPU and qpidd hovers around 95-120%
>>>> CPU (normally its ~75%) and the rest of the system pretty much goes
>>>> idle (no swapping, there is free memory)
>>>>
>>>> I attached to the ingest process with gdb and it was stuck in send
>>>> (waitForCapacity/waitForCompletionImpl) - I notice this can block.
>>>
>>> Is there any queue bound to the second headers exchange, i.e. to the one
>>> this ingest process is sending to, that is not a ring queue? (If you run
>>> qpid-config queue -r, you get a quick listing of the queues and their
>>> bindings).
>>
>> I've run qpid-config queue, and all my queues have --limit-policy=ring, apart
>> from a UUID one which I presume is qpid-config itself. Are there any other
>> useful debugging things I can do?
>
> What does qpid-stat -q show? Is it possible to test whether the broker
> is still responsive, e.g. by sending and receiving messages through a
> test queue/exchange? Are there any errors in the logs? Are any of the
> queues durable (and messages persistent)?

qpid-stat -q is all zeros in the msg & bytes columns, apart from the ingest queue, and another overflowing ring queue I have. I did run qpid-tool when the system was broken to dump some stats. msgTotalDequeues was slowly incrementing on the ingest queue, so I presume messages were still being delivered and the broker was responsive?

The only logging I've got is syslog, and I just see a warning about unsent data, presumably when the ingest process receives a SIGALRM. I'm happy to switch on more logging, what would you recommend?

None of my queues are durable, but I think incoming messages from the other system are marked as durable. The exchange that the ingest process sends to is durable, but I'm not setting any durable flags on outgoing messages (I presume the default is off).

> Another thing might be a ptrace of the broker process. Maybe two or
> three with a short delay between them.

I'll try this next time it goes haywire.

> For some reason it seems like the broker is not sending back
> confirmation to the sender in the ingest process, causing that to block.
> Ring queues shouldn't be subject to producer flow control so we need to
> figure out what other reason there could be for that.
Re: System stalling
> > I've got a simple processing system using the 0.22 C++ broker, all
> > on one box, where an external system posts messages to the default
> > headers exchange, and an ingest process receives them using a ring
> > queue, transforms them and outputs to a different headers exchange.
> > Various other processes pick messages of interest off that exchange
> > using ring queues. Recently however the system has been stalling -
> > I'm still receiving lots of data from the other system, but the
> > ingest process suddenly goes to <5% CPU usage and its queue fills up
> > and messages start getting discarded from the ring, the follow on
> > processes go to practically 0% CPU and qpidd hovers around 95-120%
> > CPU (normally its ~75%) and the rest of the system pretty much goes
> > idle (no swapping, there is free memory)
> >
> > I attached to the ingest process with gdb and it was stuck in send
> > (waitForCapacity/waitForCompletionImpl) - I notice this can block.
>
> Is there any queue bound to the second headers exchange, i.e. to the one
> this ingest process is sending to, that is not a ring queue? (If you run
> qpid-config queue -r, you get a quick listing of the queues and their
> bindings).

I've run qpid-config queue, and all my queues have --limit-policy=ring, apart from a UUID one which I presume is qpid-config itself. Are there any other useful debugging things I can do?

> If there was a queue to which messages were enqueued that started to
> apply producer flow control, then that would block your ingest process
> (and since the messages are still coming in, the broker would spend all
> its time just removing old ones to make space).

I'd expect the broker to use less CPU when discarding messages rather than shipping them to consumers? But I'm saying that without much knowledge of the code!

> > However given the rest of the system is idle when this problem occurs
> > I can't understand why this would happen. I added a SIGALRM handler
> > around send with a timeout of 30s and the process did sometimes get
> > killed. Looking at qpid-tool it does seem to still be processing
> > messages, just extremely slowly. My other observation is from
> > netstat, the Send-Q of qpidd to the ingest process is 16363, and the
> > Recv-Q and Send-Q of the ingest process are both 0.
> >
> > Any ideas on what might be happening are very welcome!
System stalling
Hi,

I've got a simple processing system using the 0.22 C++ broker, all on one box, where an external system posts messages to the default headers exchange, and an ingest process receives them using a ring queue, transforms them and outputs to a different headers exchange. Various other processes pick messages of interest off that exchange using ring queues. Recently however the system has been stalling - I'm still receiving lots of data from the other system, but the ingest process suddenly goes to <5% CPU usage and its queue fills up and messages start getting discarded from the ring, the follow on processes go to practically 0% CPU and qpidd hovers around 95-120% CPU (normally its ~75%) and the rest of the system pretty much goes idle (no swapping, there is free memory).

I attached to the ingest process with gdb and it was stuck in send (waitForCapacity/waitForCompletionImpl) - I notice this can block. However given the rest of the system is idle when this problem occurs I can't understand why this would happen. I added a SIGALRM handler around send with a timeout of 30s and the process did sometimes get killed. Looking at qpid-tool it does seem to still be processing messages, just extremely slowly. My other observation is from netstat: the Send-Q of qpidd to the ingest process is 16363, and the Recv-Q and Send-Q of the ingest process are both 0.

Any ideas on what might be happening are very welcome!

Cheers,

Jimmy
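A minimal sketch of the watchdog described above (a SIGALRM handler armed around the blocking send), using the qpid::messaging C++ API. The target exchange, property name and 30s window are illustrative assumptions; the poster's actual code is not included in the thread.

#include <csignal>
#include <cstdlib>
#include <unistd.h>

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Sender.h>
#include <qpid/messaging/Message.h>

using namespace qpid::messaging;

// Runs if send() has not returned within the alarm window.
static void onAlarm(int) {
    const char msg[] = "send() exceeded 30s\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    std::abort();
}

int main() {
    std::signal(SIGALRM, onAlarm);

    Connection connection("localhost:5672");  // illustrative broker URL
    connection.open();
    Session session = connection.createSession();
    Sender sender = session.createSender("amq.match");  // illustrative target exchange

    Message message("transformed payload");
    message.getProperties()["data-format"] = "xyz";  // property from the thread's examples

    alarm(30);             // arm a 30 second watchdog around the blocking send
    sender.send(message);  // may block in waitForCapacity/waitForCompletionImpl
    alarm(0);              // disarm once the send returns

    connection.close();
    return 0;
}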
Re: Handling queue overflows
I'll second that - thanks Gordon!

- Original Message -
From: Fraser Adams
Sent: 07/17/13 07:05 PM
To: users@qpid.apache.org
Subject: Re: Handling queue overflows

Nice one Gordon, thanks!!

Once again you display your awesomeness :-)

Frase

On 17/07/13 10:02, Gordon Sim wrote:
> On 07/16/2013 09:54 PM, Jimmy Jones wrote:
>>> On 07/15/2013 07:05 PM, Fraser Adams wrote:
>>>> I'd have quite liked the option to be able to trigger message
>>>> delivery to the alternate exchange when being automatically removed
>>>> from a circular queue
>>>
>>> That would be a fairly easy change (see attached patch if interested).
>>>
>>> On 07/15/2013 08:21 PM, Jimmy Jones wrote:
>>>> I don't think i'll now need it, but is there interest in a limit
>>>> policy of sending to an alternate exchange?
>>>
>>> This is similar in some ways with the functionality above, except that
>>> (if I understand it correctly) you would want the newly arriving
>>> messages to be rerouted, rather than the oldest (or lowest priority)
>>> messages(?). Not too difficult to implement either, I don't think.
>>
>> I was actually after the same thing as Fraser - your attached patch does
>> the trick! Is it possible for something like this to make its way into
>> a release?
>
> I've committed it to trunk:
> https://issues.apache.org/jira/browse/QPID-4993,
> https://svn.apache.org/r1504058. I don't believe we have branched yet
> for 0.24 (though that is imminent), so this may just have squeezed in
> for that.
Re: Handling queue overflows
> On 07/15/2013 07:05 PM, Fraser Adams wrote:
> > I'd have quite liked the option to be able to trigger message
> > delivery to the alternate exchange when being automatically removed
> > from a circular queue
>
> That would be a fairly easy change (see attached patch if interested).
>
> On 07/15/2013 08:21 PM, Jimmy Jones wrote:
> > I don't think i'll now need it, but is there interest in a limit
> > policy of sending to an alternate exchange?
>
> This is similar in some ways with the functionality above, except that
> (if I understand it correctly) you would want the newly arriving
> messages to be rerouted, rather than the oldest (or lowest priority)
> messages(?). Not too difficult to implement either, I don't think.

I was actually after the same thing as Fraser - your attached patch does the trick! Is it possible for something like this to make its way into a release?

Thanks,

Jimmy
Re: Handling queue overflows
Hi Gordon,

Thanks for your swift and helpful reply! I had wrongly assumed that the pagesize was fixed at the platform page size. I think paging should suit my needs, I'll give it a go once 0.24 is released.

I don't think I'll now need it, but is there interest in a limit policy of sending to an alternate exchange?

Cheers,

Jimmy

- Original Message -
From: Gordon Sim
Sent: 07/15/13 02:59 PM
To: users@qpid.apache.org
Subject: Re: Handling queue overflows

On 07/15/2013 02:01 PM, Jimmy Jones wrote:
> Hi,
>
> I've got a system which can sometimes be a bit bursty, which would exhaust
> system memory if the queues were left unchecked. Therefore I've been using
> ring queues, which solve the problem quite nicely, apart from what happens to
> the "excess" messages. Ideally I'd like to buffer them to disk and process
> them at a later, quieter time. I've been digging around and can see a few
> options:
>
> 1) 0.24 will have flow to disk, which would be perfect but sometimes my
> messages are quite big (eg. 10MB) and this requires messages to be smaller
> than a page. Is this limitation likely to be removed soon?

The old mechanism (removed in 0.20) was called 'flow to disk'. I prefer to call the newer feature (to be released with 0.24) 'paging' or paged queue.

Though it is true that the queue's page size must be as large as the largest message, you can configure that page size. So you could have just a few pages allowed in memory per queue, but have each page be 10MB (the page size is configured as a multiple of the platform page size).

As to whether it is likely that the implementation gets updated to allow a message to span multiple pages... I'd say probably not. To be able to dispatch the message in parts without having the entire thing in memory would require a fair bit of work. And without that I don't see a great advantage over just having bigger pages. (Unless I'm missing something?)

> 2) 0.24 allows a "backup engine" to take over a loaded queue (QPID-4650), but
> this looks like it'd require a fair bit of legwork to implement said engine.
>
> 3) alternate-exchanges. These look pretty good for my needs, but I can't seem
> to get them to work! From reading some documentation, I thought they'd be good
> with a limit policy of reject - MRG 2 Installation & Configuration guide,
> 4.8.2 says for an alternate exchange specified for a queue: "Messages that
> are acquired and then rejected by a message consumer". However if I run the
> test below, messages only get routed to the alternate exchange when the queue
> is destroyed while containing messages, and not when messages are rejected
> because the queue is full. Presumably calling Session::reject would cause it
> to go to the alternate exchange, but should a limit policy of reject be the
> same?

The 'reject' policy is probably a little misleading given the other use of reject. What a 'reject' policy actually does is raise an AMQP 0-10 exception when the limit is reached, which effectively ends the session. Such messages are never routed to the alternate-exchange of the exchange or the queue. Having a client reject rather than accept a message is in fact entirely different, despite the (confusing) similarity in name.

I have also just added a new policy that causes a queue to self destruct when it reaches the preconfigured limit. That could possibly be of interest in conjunction with an alternate-exchange. What would happen would be that at the point the limit is reached, the queue will delete itself, re-routing any orphaned messages to the alternate-exchange if set. The deletion of the queue will result in any subscribing session being terminated, but won't result in the publisher's session hitting an exception. The issue there however is that messages published while the queue doesn't exist (i.e. before the subscriber re-establishes the session and recreates it) would be dropped (unless of course there were then no matching bindings, in which case it would be rerouted to the exchange's alternate-exchange). I suspect having spelled that all out it won't be a terribly appealing path...

> --8<--
>
> qpid-config add exchange headers test1
> qpid-config add exchange headers test1-overflow
>
> # drain for messages in normal case
> ./drain -f "normal; { create: receiver, node: {type: queue, x-declare:
> {exclusive: True, alternate-exchange: 'test1-overflow', arguments:
> {'qpid.max_size': 1024, 'qpid.policy_type': 'reject'}}, x-bindings:
> [{exchange: test1, arguments:{x-match:any, data-format: xyz}}]}}"
>
> # drain for messages in overflow case
> ./drain -f "overflow; { create: rec
Handling queue overflows
Hi,

I've got a system which can sometimes be a bit bursty, which would exhaust system memory if the queues were left unchecked. Therefore I've been using ring queues, which solve the problem quite nicely, apart from what happens to the "excess" messages. Ideally I'd like to buffer them to disk and process them at a later, quieter time. I've been digging around and can see a few options:

1) 0.24 will have flow to disk, which would be perfect but sometimes my messages are quite big (eg. 10MB) and this requires messages to be smaller than a page. Is this limitation likely to be removed soon?

2) 0.24 allows a "backup engine" to take over a loaded queue (QPID-4650), but this looks like it'd require a fair bit of legwork to implement said engine.

3) alternate-exchanges. These look pretty good for my needs, but I can't seem to get them to work! From reading some documentation, I thought they'd be good with a limit policy of reject - MRG 2 Installation & Configuration guide, 4.8.2 says for an alternate exchange specified for a queue: "Messages that are acquired and then rejected by a message consumer". However if I run the test below, messages only get routed to the alternate exchange when the queue is destroyed while containing messages, and not when messages are rejected because the queue is full. Presumably calling Session::reject would cause it to go to the alternate exchange, but should a limit policy of reject be the same?

Any ideas very welcome!

Jimmy

--8<--

qpid-config add exchange headers test1
qpid-config add exchange headers test1-overflow

# drain for messages in normal case
./drain -f "normal; { create: receiver, node: {type: queue, x-declare: {exclusive: True, alternate-exchange: 'test1-overflow', arguments: {'qpid.max_size': 1024, 'qpid.policy_type': 'reject'}}, x-bindings: [{exchange: test1, arguments:{x-match:any, data-format: xyz}}]}}"

# drain for messages in overflow case
./drain -f "overflow; { create: receiver, node: {type: queue, x-declare: {exclusive: True, arguments: {'qpid.max_size': 1024000, 'qpid.policy_type': 'ring'}}, x-bindings: [{exchange: test1-overflow, arguments:{x-match:any, data-format: xyz}}]}}"

./spout --content test -c 5 --property data-format=xyz test1
# works as expected, messages received by normal drain

./spout --content test -c 5 --property data-format=xyz test1-overflow
# works as expected, messages received by overflow drain

# Now ctrl-c normal drain, and queue will remain
# Send loads of messages, fills up q1
./spout --content test -c 500 --property data-format=xyz test1
# Blocks... and no messages sent to overflow drain

qpid-config del queue normal --force
# now messages appear in overflow drain
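For the consumer-side reject mentioned in option 3, a receiver that explicitly rejects a fetched message with Session::reject would look roughly like the sketch below. The queue name matches the "normal" queue from the test above, but the rejection condition is illustrative; per the MRG documentation quoted here, an acquired-then-rejected message should be routed to the queue's alternate exchange, which is distinct from the 'reject' limit policy behaviour discussed elsewhere in the thread.

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Message.h>
#include <qpid/messaging/Duration.h>

using namespace qpid::messaging;

int main() {
    Connection connection("localhost:5672");  // illustrative broker URL
    connection.open();
    Session session = connection.createSession();

    // Same "normal" queue as in the test above (alternate-exchange: test1-overflow).
    Receiver receiver = session.createReceiver("normal");

    Message msg;
    while (receiver.fetch(msg, Duration::SECOND * 10)) {
        if (msg.getContent().empty()) {
            // Example application-level check: explicitly reject the message.
            // Per the quoted MRG documentation, an acquired-then-rejected
            // message should be routed to the queue's alternate exchange.
            session.reject(msg);
        } else {
            // ... process msg ...
            session.acknowledge(msg);
        }
    }
    connection.close();
    return 0;
}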
Munin monitoring plugin
Hi,

For those who might be interested, I've just had my munin plugin to monitor qpid queues (msg depth, byte depth, ring discards, msg rate, byte rate) integrated into the munin-contrib repo. It's nowhere near as advanced as Cumin or Fraser Adams' GUI, but is good if you already use munin or want some simple graphs for monitoring.

Cheers,

Jimmy