RE: Java Broker (7.0.6) stops delivering queue/consumer messages after 4 GB data transfer

2018-08-06 Thread Dyslin, Mike
Rob,

I'm running with your fixed jar file now, and passed the 4GB boundary, 
currently at 5.11 GB, so your fix seems to work!

Thanks so much for your quick response!  We really appreciate it.

We will watch for the next release that has this fix in it.  Do you think it 
will be in 7.0.7?

Thanks,
Mike


-Original Message-
From: Rob Godfrey [mailto:rob.j.godf...@gmail.com] 
Sent: Monday, August 6, 2018 10:25 AM
To: users@qpid.apache.org
Cc: Mears, David B ; Herren, Elaine 
; Rao, Shobha (NonStop) 
Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages after 
4 GB data transfer

Hi Mike,
On Mon, 6 Aug 2018 at 18:23, Dyslin, Mike  wrote:

> Rob,
>
> I created JIRA QPID-8225 with Title "Java Broker (7.0.6) stops 
> delivering queue/consumer messages after 4 GB data transfer".
>
> The log you requested is attached to the JIRA.
>
> The log contains the connection startup, 10 messages flowing (2 every 
> 10 seconds), then connection close.  It does not contain when message 
> flow stopped (credit exhausted?).  I can get that for you if you like.
>
> We did see this in the log:
>
> 2018-08-06 07:57:34,100 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
> MessageSetFlowMode(destination=dsaqp15, flowMode=CREDIT)
> 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
> MessageFlow(destination=dsaqp15, unit=MESSAGE, value=4294967295)
> 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
> MessageFlow(destination=dsaqp15, unit=BYTE, value=4294967295)
>
> And we never saw anything that looked like credit was being reset by 
> the consumer.
>
>
OK - this actually  matches up with what I suspected looking more deeply at the 
code this morning.  The value 4294967295 is supposed to be a magic number 
meaning "infinite" credit, however the Java Broker is mishandling this, 
expecting it to be passed as -1, whereas in fact a lower layer in the code is 
instead passing the value as 4294967295 in a signed 64bit value.  I just 
committed a fix to master, but I'm not sure when that might be released as a 
new version.


> So, perhaps the 4 GB credit simply is all used up and messages stop 
> flowing.
>
> How do we change FlowMode to OFF, or if that is not possible, how do 
> we reset our credit back to 4 GB in the QPID C++ API?
>
>
So, I'm not familiar with the C++ API, if there were an obvious way to switch 
from "credit" mode to "window" mode (which is the protocol default) then that 
would likely work around the problem.


> Since we don't see any of our code setting any kind of flow control 
> option, is the C++ API sending flow control default setting to the 
> broker, or is the broker setting flow control default because the 
> producer has not specified any flow control?  If the latter, then 
> perhaps the C++ and Java brokers have different flow control defaults.  
> If the former, then C++ broker and the java broker are not 
> implementing specified the flow control in the same way.
>
> Again, our original code was only tested with C++ brokers.
>
>
Yeah - sadly this is a Java Broker bug you've discovered.  I've uploaded a jar 
file with the proposed fix into the JIRA (just replace the existing jar of the 
same name in lib/broker-plugins).  Alternatively you can build this yourself by 
doing the following:

git clone https://git-wip-us.apache.org/repos/asf/qpid-broker-j.git

cd qpid-broker-j/

git checkout 7.0.6

git cherry-pick cf40fdea39d9633702ee286d94e950a19ec7be74
mvn package

 Apologies you've run into this :-(  I think most clients are using window, or 
non-infinite credit - that's the only reason I can think that this has not been 
spotted before.

If you give the jar a try and it works (or doesn't work) for you, please let us 
know.

-- Rob

Thanks,
> Mike
>
>
> -Original Message-
> From: Rob Godfrey [mailto:rob.j.godf...@gmail.com]
> Sent: Friday, August 3, 2018 3:34 PM
> To: users@qpid.apache.org
> Cc: Mears, David B ; Herren, Elaine < 
> elaine.her...@hpe.com>; Rao, Shobha (NonStop) 
>  >
> Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer 
> messages after 4 GB data transfer
>
> On Sat, 4 Aug 2018 at 00:11, Dyslin, Mike  wrote:
>
> > Rob,
> >
> > We do not know how our consumer is managing credit, or anything 
> > about credit flow, but have started looking for information on this.  
> > It sounds like the right place to look.  If we could figure out how 
> > to turn off the flow control limits, that may do it.  We inherited 
> > most of this code, and the original authors of the code are long 
> &

RE: Java Broker (7.0.6) stops delivering queue/consumer messages after 4 GB data transfer

2018-08-06 Thread Dyslin, Mike
Rob,

I created JIRA QPID-8225 with Title "Java Broker (7.0.6) stops delivering 
queue/consumer messages after 4 GB data transfer".

The log you requested is attached to the JIRA.

The log contains the connection startup, 10 messages flowing (2 every 10 
seconds), then connection close.  It does not contain when message flow stopped 
(credit exhausted?).  I can get that for you if you like.

We did see this in the log:

2018-08-06 07:57:34,100 DEBUG [IO-/15.252.32.148:43266] 
(o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
MessageSetFlowMode(destination=dsaqp15, flowMode=CREDIT)
2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266] 
(o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
MessageFlow(destination=dsaqp15, unit=MESSAGE, value=4294967295)
2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266] 
(o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 
MessageFlow(destination=dsaqp15, unit=BYTE, value=4294967295)

And we never saw anything that looked like credit was being reset by the 
consumer.

So, perhaps the 4 GB credit simply is all used up and messages stop flowing.

How do we change FlowMode to OFF, or if that is not possible, how do we reset 
our credit back to 4 GB in the QPID C++ API?

Since we don't see any of our code setting any kind of flow control option, is 
the C++ API sending flow control default setting to the broker, or is the 
broker setting flow control default because the producer has not specified any 
flow control?  If the latter, then perhaps the C++ and Java brokers have 
different flow control defaults.  If the former, then C++ broker and the java 
broker are not implementing specified the flow control in the same way.

Again, our original code was only tested with C++ brokers.

Thanks,
Mike
 

-Original Message-
From: Rob Godfrey [mailto:rob.j.godf...@gmail.com] 
Sent: Friday, August 3, 2018 3:34 PM
To: users@qpid.apache.org
Cc: Mears, David B ; Herren, Elaine 
; Rao, Shobha (NonStop) 
Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages after 
4 GB data transfer

On Sat, 4 Aug 2018 at 00:11, Dyslin, Mike  wrote:

> Rob,
>
> We do not know how our consumer is managing credit, or anything about 
> credit flow, but have started looking for information on this.  It 
> sounds like the right place to look.  If we could figure out how to 
> turn off the flow control limits, that may do it.  We inherited most 
> of this code, and the original authors of the code are long gone, so 
> we don't have the experience of putting it all together.  Perhaps the 
> defaults for this credit flow is different between the C++ and Java brokers.
>
> OK, we'll try to figure out how to get protocol logging turned on and 
> get a log file to look at.
>

Thanks - I had a quick look at the Java Broker code for managing credit and 
didn't immediately see any obvious errors that would hit a properly functioning 
client, *however* it does look like that if the client is improperly managing 
credit there would be an overflow error where a value that should be an 
unsigned integral value (the amount of remaining credit) will turn negative.  
The broker is storing the amount of outstanding credit as a (signed) Java long 
value.  AMQP 0-10 defines the message.flow command as "This command controls 
the flow of message data to a given destination.
It is used by the recipient of messages to dynamically match the incoming rate 
of message flow to its processing or forwarding capacity. Upon receipt of this 
command, the sender must add "value" number of the specified unit to the 
available credit balance for the specified destination."  So if the client is 
repeatedly adding more credit than it needs the stored "limit" in the broker 
might overflow and turn negative.  At this point the broker would stop sending 
messages.

-- Rob



>
> Thanks,
> Mike
>
> -Original Message-
> From: Rob Godfrey [mailto:rob.j.godf...@gmail.com]
> Sent: Friday, August 3, 2018 11:11 AM
> To: users@qpid.apache.org
> Cc: Mears, David B ; Herren, Elaine < 
> elaine.her...@hpe.com>; Rao, Shobha (NonStop) 
>  >
> Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer 
> messages after 4 GB data transfer
>
> On Fri, 3 Aug 2018 at 19:18, Dyslin, Mike  wrote:
>
> > Rob,
> >
> > Our consumer client version is 1.37:
> >
> >bash 0 3: rpm -qa | grep -i qpid
> >qpid-cpp-client-devel-1.37.0-1.el7.x86_64
> >python-qpid-1.37.0-1.el7.noarch
> >qpid-qmf-1.37.0-1.el7.x86_64
> >qpid-cpp-client-1.37.0-1.el7.x86_64
> >python-qpid-qmf-1.37.0-1.el7.x86_64
> >qpid-cpp-server-1.37.0-1.el7.x86_64
> >qpid-proton-c-0.18.1-1.el7.x86_64
> >qpid-tools-1.37.0-1.el7.noarch
> >
> > Our producer client version is 6.1.5:
> &

RE: Java Broker (7.0.6) stops delivering queue/consumer messages after 4 GB data transfer

2018-08-03 Thread Dyslin, Mike
Rob,

We do not know how our consumer is managing credit, or anything about credit 
flow, but have started looking for information on this.  It sounds like the 
right place to look.  If we could figure out how to turn off the flow control 
limits, that may do it.  We inherited most of this code, and the original 
authors of the code are long gone, so we don't have the experience of putting 
it all together.  Perhaps the defaults for this credit flow is different 
between the C++ and Java brokers.

OK, we'll try to figure out how to get protocol logging turned on and get a log 
file to look at.

Thanks,
Mike

-Original Message-
From: Rob Godfrey [mailto:rob.j.godf...@gmail.com] 
Sent: Friday, August 3, 2018 11:11 AM
To: users@qpid.apache.org
Cc: Mears, David B ; Herren, Elaine 
; Rao, Shobha (NonStop) 
Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages after 
4 GB data transfer

On Fri, 3 Aug 2018 at 19:18, Dyslin, Mike  wrote:

> Rob,
>
> Our consumer client version is 1.37:
>
>bash 0 3: rpm -qa | grep -i qpid
>qpid-cpp-client-devel-1.37.0-1.el7.x86_64
>python-qpid-1.37.0-1.el7.noarch
>qpid-qmf-1.37.0-1.el7.x86_64
>qpid-cpp-client-1.37.0-1.el7.x86_64
>python-qpid-qmf-1.37.0-1.el7.x86_64
>qpid-cpp-server-1.37.0-1.el7.x86_64
>qpid-proton-c-0.18.1-1.el7.x86_64
>qpid-tools-1.37.0-1.el7.noarch
>
> Our producer client version is 6.1.5:
> -rw-r--r--1 NSDA.NSDA  NSDA  570873 Jul 26 15:42
> qpid-client-6.1.5.jar
> -rw-r--r--1 NSDA.NSDA  NSDA  864493 Jul 26 15:42
> qpid-common-6.1.5.jar
>
> Both producer and consumer clients use AMQP_0_10 protocol.
>

OK - so my assumption here is that there is some issue in the management of 
credit in the broker (or possibly in the client).  AMQP 0-10 has two distinct 
credit flow modes "credit" and "window" and also allows the consumer to 
separately set limits for both "message" and "byte" credit.
I'm not very familiar with the C++ API, but do you know how your consumer is 
managing credit?

One thing that would be very helpful in trying to diagnose this problem is 
getting protocol logging for the consumer (at least for the start of the 
consumer where it sets up the credit flow mode, and towards the end where it 
would be interesting to see the credit being allocated just before message flow 
stops).

-- Rob


>
> Thanks for your assistance,
> Mike
>
> FYI - There may be a typo on the past releases web page.  I believe "2017"
> should be "2018" in "Qpid JMS AMQP 0-x 6.3.2, July 2017".  URL:
> https://qpid.apache.org/releases/index.html#past-releases
>
> -Original Message-
> From: Rob Godfrey [mailto:rob.j.godf...@gmail.com]
> Sent: Thursday, August 2, 2018 5:57 PM
> To: users@qpid.apache.org
> Cc: Mears, David B ; Herren, Elaine < 
> elaine.her...@hpe.com>; Rao, Shobha (NonStop) 
>  >
> Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer 
> messages after 4 GB data transfer
>
> Hi Mike,
>
> On Fri, 3 Aug 2018 at 01:25, Dyslin, Mike  wrote:
>
> > This is my first submit to this email group.  Hopefully this is the 
> > correct place to post this problem.
> >
> >
> This is exactly the right place to post this problem.
>
>
> > We are running a continuous stream of message (about 5K each) from 
> > producer to consumer over a single java broker queue at a rate of 
> > about 600 messages/second.  Outbound message flow stops after 
> > transferring 4 GB of message data (about 770,000 messages in 25 
> > minutes).  The Web Management Console page for our consumer 
> > connection
> shows the total "Outbound Bytes"
> > growing steadily until it reaches 4.0 GB and stops with "Last I/O time"
> > unchanging thereafter.
> >
> > After outbound messages stop:
> > Inbound messages continue on the producer connection (well past 4.0
> > GB) and are kept in the queue until they expire with a time-to-live 
> > value of 3 minutes.  The queue grows until is stabilizes with a 
> > steady
> > 600 m/s inbound, and 600 m/s expiring and being deleted from the 
> > queue (as expected).  The Web Management Console shows that the 
> > consumer connection remains open and is a consumer on the queue, and 
> > the queue shows the connection as a consumer on the queue.
> >
> > If I run the exact same test replacing the Java Broker with a C++ 
> > broker (1.37.0), message flow continues well past the 4 GB barrier.  
> > I kept it running for about 17 hours reaching about 37 million 
> > messages, about 180 GB data transferred on the queue.
> >
> > Since the only dif

RE: Java Broker (7.0.6) stops delivering queue/consumer messages after 4 GB data transfer

2018-08-03 Thread Dyslin, Mike
Rob,

Our consumer client version is 1.37:

   bash 0 3: rpm -qa | grep -i qpid
   qpid-cpp-client-devel-1.37.0-1.el7.x86_64
   python-qpid-1.37.0-1.el7.noarch
   qpid-qmf-1.37.0-1.el7.x86_64
   qpid-cpp-client-1.37.0-1.el7.x86_64
   python-qpid-qmf-1.37.0-1.el7.x86_64
   qpid-cpp-server-1.37.0-1.el7.x86_64
   qpid-proton-c-0.18.1-1.el7.x86_64
   qpid-tools-1.37.0-1.el7.noarch

Our producer client version is 6.1.5:
-rw-r--r--1 NSDA.NSDA  NSDA  570873 Jul 26 15:42 
qpid-client-6.1.5.jar
-rw-r--r--1 NSDA.NSDA  NSDA  864493 Jul 26 15:42 
qpid-common-6.1.5.jar

Both producer and consumer clients use AMQP_0_10 protocol.

Thanks for your assistance,
Mike

FYI - There may be a typo on the past releases web page.  I believe "2017" 
should be "2018" in "Qpid JMS AMQP 0-x 6.3.2, July 2017".  URL: 
https://qpid.apache.org/releases/index.html#past-releases

-Original Message-
From: Rob Godfrey [mailto:rob.j.godf...@gmail.com] 
Sent: Thursday, August 2, 2018 5:57 PM
To: users@qpid.apache.org
Cc: Mears, David B ; Herren, Elaine 
; Rao, Shobha (NonStop) 
Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages after 
4 GB data transfer

Hi Mike,

On Fri, 3 Aug 2018 at 01:25, Dyslin, Mike  wrote:

> This is my first submit to this email group.  Hopefully this is the 
> correct place to post this problem.
>
>
This is exactly the right place to post this problem.


> We are running a continuous stream of message (about 5K each) from 
> producer to consumer over a single java broker queue at a rate of 
> about 600 messages/second.  Outbound message flow stops after 
> transferring 4 GB of message data (about 770,000 messages in 25 
> minutes).  The Web Management Console page for our consumer connection shows 
> the total "Outbound Bytes"
> growing steadily until it reaches 4.0 GB and stops with "Last I/O time"
> unchanging thereafter.
>
> After outbound messages stop:
> Inbound messages continue on the producer connection (well past 4.0 
> GB) and are kept in the queue until they expire with a time-to-live 
> value of 3 minutes.  The queue grows until is stabilizes with a steady 
> 600 m/s inbound, and 600 m/s expiring and being deleted from the queue 
> (as expected).  The Web Management Console shows that the consumer 
> connection remains open and is a consumer on the queue, and the queue 
> shows the connection as a consumer on the queue.
>
> If I run the exact same test replacing the Java Broker with a C++ 
> broker (1.37.0), message flow continues well past the 4 GB barrier.  I 
> kept it running for about 17 hours reaching about 37 million messages, 
> about 180 GB data transferred on the queue.
>
> Since the only difference seems to be the broker, this seems to point 
> to a problem with the Java Broker, and not issues with our producer, 
> consumer or network issues.  Could there be some problem with our java 
> broker configuration that would explain this behaviour?
>

Unfortunately this sounds like it may be a bug in the Java Broker :-(


>
> Has anyone out there experienced more than 4 GB of outbound data on a 
> single java broker connection or queue?
>
>
Can you confirm which client you are using, and which version of AMQP is in use 
(as you have identified I don't expect this to be a client problem, but knowing 
the client will help us track down the issue in the broker)?


> Any help would be appreciated.
>
> Other comments/observations:
>
> I do not know if the 4 GB barrier is associated with the connection 
> and/or the queue because all our message traffic is over one consumer 
> connection and one queue.  I could determine this by changing our 
> consumer code to spread message traffic over one connection and multiple 
> queues.



>
>
We are using the heartbeat feature with a 5 minute timeout.  Since the
> connection stays open beyond the 5 minute timeout after the messages 
> stop, I assume the heartbeat messages are still being sent between 
> consumer and broker, indicating that the consumer and broker are able 
> to communicate over the socket.  It has been awhile since I have 
> tested that the heartbeat feature is working correctly.
>
> If I close the consumer connection from the Web Management Console, 
> the broker deletes the queue (I believe) and our consumer detects the 
> closed connection, establishes a new connection and new queue, and 
> messages start flowing again until . . . we reach the 4 GB barrier and 
> messages stop being delivered once again.
>
> We have run with the Java Broker on both Linux (RHEL 7.4) and 
> proprietary NonStop POSIX platform with the same results.  
> Unfortunately, the C++ broker is n