[jira] [Created] (PROTON-1170) closed links are never deleted

2016-04-06 Thread michael goulish (JIRA)
michael goulish created PROTON-1170:
---

 Summary: closed links are never deleted
 Key: PROTON-1170
 URL: https://issues.apache.org/jira/browse/PROTON-1170
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
 Environment: miserable
Reporter: michael goulish


I wrote a reactor-based application that makes a single connection, and then 
repeatedly makes-and-closes links (receivers) on that connection.

It makes and closes the links as fast as possible: as soon as it gets the 
on_receiver_close event, it makes a new one.  As soon as it gets the 
on_receiver_open event -- it closes that receiver.

This application talks to a dispatch router.

Problem: Both the router and my application grow their memory (RSS) rapidly, 
and the router's ability to respond to new link creations slows down rapidly.
Looking at the router with Valgrind/Callgrind, after about 15,000 links have 
been created and closed I see that 45% of all CPU time on the router is being 
consumed by pn_find_link(). Instrumenting that code, I see that the list it 
is looking at never decreases in size.

I tried creating my links with the "lifetime_policy" set to DELETE_ON_CLOSE, 
but that had no effect.  Grepping for that symbol, I see that it does not occur 
in the proton C code except in its definition, and in a printing convenience 
function.

Major scalability bug.
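
The cost pattern described above can be sketched with a toy model. This is 
not Proton's code: toy_link, toy_session and every function name below are 
hypothetical, and the only assumption taken from the report is that lookup 
walks a list from which closed links are never removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a link record; not Proton's real struct. */
typedef struct toy_link {
    char name[32];
    int closed;
    struct toy_link *next;
} toy_link;

typedef struct {
    toy_link *head;
    size_t length;              /* nodes currently on the list */
} toy_session;

/* Add a link at the head of the list; returns NULL if allocation fails. */
toy_link *toy_link_open(toy_session *s, const char *name) {
    assert(s != NULL && name != NULL);
    toy_link *l = calloc(1, sizeof *l);
    if (!l) return NULL;
    strncpy(l->name, name, sizeof l->name - 1);  /* calloc keeps it NUL-terminated */
    l->next = s->head;
    s->head = l;
    s->length++;
    return l;
}

/* Linear search for an open link; *steps reports how many nodes were visited. */
toy_link *toy_find_link(const toy_session *s, const char *name, size_t *steps) {
    size_t visited = 0;
    toy_link *found = NULL;
    for (toy_link *l = s->head; l; l = l->next) {
        visited++;
        if (!l->closed && strcmp(l->name, name) == 0) { found = l; break; }
    }
    if (steps) *steps = visited;
    return found;
}

/* Behaviour described in the report: the link is marked closed but stays listed. */
void toy_link_close_leaky(toy_link *l) { l->closed = 1; }

/* Alternative: unlink and free on close, so the list stays short. */
void toy_link_close_and_delete(toy_session *s, toy_link *l) {
    for (toy_link **pp = &s->head; *pp; pp = &(*pp)->next) {
        if (*pp == l) { *pp = l->next; s->length--; free(l); return; }
    }
}
```

With the leaky close, every open/close cycle adds one node that each later 
search has to walk past, so both memory and lookup time grow with the number 
of links ever created rather than the number currently open.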





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PROTON-1009) message.h does not have a set method for annotations

2015-09-28 Thread michael goulish (JIRA)
michael goulish created PROTON-1009:
---

 Summary: message.h does not have a set method for annotations
 Key: PROTON-1009
 URL: https://issues.apache.org/jira/browse/PROTON-1009
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Reporter: michael goulish


Comments above the method  pn_message_annotations() indicate that it can both 
set and get annotations -- but in fact it has no way to set.

And it looks like there is no other way in the C API, either.






[jira] [Resolved] (PROTON-1009) message.h does not have a set method for annotations

2015-09-28 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish resolved PROTON-1009.
-
Resolution: Not A Problem

Oops.
I didn't realize that the function is returning a pointer that can be used to 
change the annotations.  *That's* how you set them.  Sorry for the noise.



[jira] [Closed] (PROTON-992) Proton's use of Cyrus SASL is not thread-safe.

2015-09-22 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish closed PROTON-992.
--
Resolution: Duplicate

this is a duplicate of PROTON-862

> Proton's use of Cyrus SASL is not thread-safe.
> --
>
> Key: PROTON-992
> URL: https://issues.apache.org/jira/browse/PROTON-992
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: 0.10
>Reporter: michael goulish
>Assignee: michael goulish
>Priority: Critical
>
> Documentation for the Cyrus SASL library says that the library is believed to 
> be thread-safe only if the code that uses it meets several requirements.
> The requirements are:
> * you supply mutex functions (see sasl_set_mutex())
> * you make no libsasl calls until sasl_client/server_init() completes
> * no libsasl calls are made after sasl_done() is begun
> * when using GSSAPI, you use a thread-safe GSS / Kerberos 5 library.
> It says explicitly that the sasl_set* calls are not thread-safe, since they 
> set global state.
> The proton library makes calls to sasl_set* functions in:
>   pni_init_client()
>   pni_init_server(), and
>   pni_process_init()
> Since those are internal functions, there is no way for code that uses Proton 
> to lock around those calls.
> I think proton needs a new API call to let applications call 
> sasl_set_mutex().  Or something.
> We probably also need other protections to meet the other requirements 
> specified in the Cyrus documentation (and quoted above).





[jira] [Commented] (PROTON-992) Proton's use of Cyrus SASL is not thread-safe.

2015-09-22 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902731#comment-14902731
 ] 

michael goulish commented on PROTON-992:


oops.  this is a duplicate of PROTON-862



[jira] [Created] (PROTON-992) Proton's use of Cyrus SASL is not thread-safe.

2015-09-10 Thread michael goulish (JIRA)
michael goulish created PROTON-992:
--

 Summary: Proton's use of Cyrus SASL is not thread-safe.
 Key: PROTON-992
 URL: https://issues.apache.org/jira/browse/PROTON-992
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.10
Reporter: michael goulish
Priority: Critical


Documentation for the Cyrus SASL library says that the library is believed to 
be thread-safe only if the code that uses it meets several requirements.

The requirements are:
* you supply mutex functions (see sasl_set_mutex())
* you make no libsasl calls until sasl_client/server_init() completes
* no libsasl calls are made after sasl_done() is begun
* when using GSSAPI, you use a thread-safe GSS / Kerberos 5 library.

It says explicitly that the sasl_set* calls are not thread-safe, since they 
set global state.

The proton library makes calls to sasl_set* functions in:
  pni_init_client()
  pni_init_server(), and
  pni_process_init()

Since those are internal functions, there is no way for code that uses Proton 
to lock around those calls.

I think proton needs a new API call to let applications call sasl_set_mutex().  
Or something.

We probably also need other protections to meet the other requirements 
specified in the Cyrus documentation (and quoted above).
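
For the first requirement, here is a minimal sketch of pthread-backed 
callbacks in the shape that sasl_set_mutex() expects (alloc, lock, unlock, 
free). The app_mutex_* names and the TOY_SASL_* constants are hypothetical; 
the constants mirror SASL_OK (0) and SASL_FAIL (-1) from <sasl/sasl.h> so 
that the sketch compiles without libsasl2. How Proton would expose this to 
applications is exactly the open question above, so the registration call is 
shown only in a comment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Local copies of SASL_OK / SASL_FAIL so the sketch builds without libsasl2. */
enum { TOY_SASL_OK = 0, TOY_SASL_FAIL = -1 };

/* Allocate and initialise one mutex; returns NULL on failure. */
void *app_mutex_alloc(void) {
    pthread_mutex_t *m = malloc(sizeof *m);
    if (m && pthread_mutex_init(m, NULL) != 0) {
        free(m);
        m = NULL;
    }
    return m;
}

int app_mutex_lock(void *m) {
    assert(m != NULL);
    return pthread_mutex_lock(m) == 0 ? TOY_SASL_OK : TOY_SASL_FAIL;
}

int app_mutex_unlock(void *m) {
    assert(m != NULL);
    return pthread_mutex_unlock(m) == 0 ? TOY_SASL_OK : TOY_SASL_FAIL;
}

void app_mutex_free(void *m) {
    if (m) {
        pthread_mutex_destroy(m);
        free(m);
    }
}

/* With libsasl2 available, these would be registered once, before
 * sasl_client_init() or sasl_server_init() is called:
 *
 *   sasl_set_mutex(app_mutex_alloc, app_mutex_lock,
 *                  app_mutex_unlock, app_mutex_free);
 */
```

Supplying mutexes covers only the first bullet; the sasl_set* calls inside 
the pni_* functions would still need their own serialization.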








[jira] [Closed] (PROTON-919) make C impl behave like java wrt channel_max error

2015-07-17 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish closed PROTON-919.
--
   Resolution: Fixed
Fix Version/s: 0.10

commit 4ee726002804d7286a8c76b42e0a0717e0798822

please NOTE that this change also adds  #define PN_OK (0)  to the list of 
errors in error.h

 make C impl behave like java wrt channel_max error
 --

 Key: PROTON-919
 URL: https://issues.apache.org/jira/browse/PROTON-919
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c, python-binding
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor
 Fix For: 0.10


 In the Java impl, I made TransportImpl throw an exception if the application 
 tries to change the local channel_max setting after we have already sent the 
 OPEN frame to the remote peer.  ( Because at that point we communicate our 
 channel_max limit to the peer -- no fair changing it afterwards.)
 One reviewer suggested that it would be nice if the C impl worked the same 
 way.  That would mean that pn_set_channel_max() would have to return a result 
 code, which the Python binding would detect: the Python binding throws an 
 exception and the Python tests detect it, so it would work the same way as Java.





[jira] [Closed] (PROTON-864) don't crash when channel number goes high

2015-07-17 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish closed PROTON-864.
--
   Resolution: Fixed
Fix Version/s: 0.10

This is a duplicate of PROTON-842 

 don't crash when channel number goes high
 -

 Key: PROTON-864
 URL: https://issues.apache.org/jira/browse/PROTON-864
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish
 Fix For: 0.10


 Code in transport.c, and a little in engine.c, looks at the topmost bit in 
 channel numbers to decide if the channels are in use.
 This causes crashes when the number of channels in a single connection goes 
 beyond 32767.
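
 As an illustration only (this is a model of the general hazard, not the code 
 in transport.c or engine.c), here is what happens when the top bit of a 
 16-bit channel field doubles as an "in use" flag. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IN_USE_BIT 0x8000u   /* hypothetical flag kept in the top bit */

/* Pack a channel number and an in-use flag into one 16-bit field. */
uint16_t toy_pack(uint16_t channel, bool in_use) {
    return (uint16_t)((channel & 0x7FFFu) | (in_use ? TOY_IN_USE_BIT : 0u));
}

uint16_t toy_channel(uint16_t packed) { return (uint16_t)(packed & 0x7FFFu); }

bool toy_in_use(uint16_t packed) { return (packed & TOY_IN_USE_BIT) != 0; }

/* A channel number survives the round trip only if it fits in 15 bits. */
bool toy_round_trips(uint16_t channel) {
    return toy_channel(toy_pack(channel, true)) == channel;
}

/* Largest channel number that still leaves the top bit free for the flag. */
uint16_t toy_max_safe_channel(void) {
    uint16_t max = 0x7FFFu;
    assert(toy_round_trips(max));
    return max;
}
```

 In this model channel 32768 is indistinguishable from channel 0, which is 
 consistent with trouble starting once a connection goes past 32767 channels.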





[jira] [Created] (PROTON-949) proton doesn't build with ccache swig

2015-07-14 Thread michael goulish (JIRA)
michael goulish created PROTON-949:
--

 Summary: proton doesn't build with ccache swig
 Key: PROTON-949
 URL: https://issues.apache.org/jira/browse/PROTON-949
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Reporter: michael goulish


Thanks to aconway for finding this and saving me a day of madness and horror.

On freshly-downloaded proton tree, if I use this swig:

   /usr/lib64/ccache/swig

the build fails this way:
  qpid-proton/build/proton-c/bindings/python/cprotonPYTHON_wrap.c:4993:25: 
error: 'PN_HANDLE' undeclared (first use in this function)
PNI_PYTRACER = *((PN_HANDLE *)(argp));

--

but if I delete that swig executable, and use the one in  /bin/swig ,
then everything works.

yikes.

aconway believes the bug is in ccache-swig, not in proton, but I want to put 
this here in case this bites someone else in Proton Land.








[jira] [Updated] (PROTON-946) remove generated data structure definitions from protocol.h

2015-07-13 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-946:
---
Description: 
Currently protocol.h.py reads the AMQP 1.0 spec xml files and generates all of 
its output into protocol.h -- even the data structure definitions.

Those definitions are currently protected by  #ifdef DEFINE_FIELDS , which is 
defined only in codec.c -- so the definitions only show up in that file, while 
other .c files only see the declarations.

If DEFINE_FIELDS is #defined in any other file, compilation will fail with 
multiple definition errors.

The structure declarations should remain in the .h file , but the actual 
definitions should be moved into a generated .c file.


  was:
Currently protocol.h.py reads the AMQP 1.0 spec xml files and generates all of 
its output into protocol.h -- evel the data structure definitions.

Those definitions are currently protected by  #ifdef DEFINE_FIELDS , which is 
defined only in codec.c -- so the definitions only show up in that file, while 
other .c files only see the declarations.

If DEFINE_FIELDS is #defined in any other file, compilation will fail with 
multiple definition errors.

The structure declarations should remain in the .h file , but the actual 
definitions should be moved into a generated .c file.


Summary: remove generated data structure definitions from protocol.h  
(was: remove generated data structure definitions from .protocol.h)

 remove generated data structure definitions from protocol.h
 ---

 Key: PROTON-946
 URL: https://issues.apache.org/jira/browse/PROTON-946
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.10
Reporter: michael goulish
Assignee: michael goulish



[jira] [Created] (PROTON-946) remove generated data structure definitions from .protocol.h

2015-07-13 Thread michael goulish (JIRA)
michael goulish created PROTON-946:
--

 Summary: remove generated data structure definitions from 
.protocol.h
 Key: PROTON-946
 URL: https://issues.apache.org/jira/browse/PROTON-946
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.10
Reporter: michael goulish
Assignee: michael goulish


Currently protocol.h.py reads the AMQP 1.0 spec xml files and generates all of 
its output into protocol.h -- even the data structure definitions.

Those definitions are currently protected by  #ifdef DEFINE_FIELDS , which is 
defined only in codec.c -- so the definitions only show up in that file, while 
other .c files only see the declarations.

If DEFINE_FIELDS is #defined in any other file, compilation will fail with 
multiple definition errors.

The structure declarations should remain in the .h file , but the actual 
definitions should be moved into a generated .c file.
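
The proposed split can be sketched as follows. The two halves are shown in 
one translation unit so that the sketch compiles on its own; toy_field_t, 
TOY_FIELDS and toy_field_name are hypothetical names, not the identifiers 
that protocol.h.py actually generates.

```c
#include <assert.h>
#include <stddef.h>

/* ---- what would stay in the generated header: declarations only ---- */
typedef struct {
    const char *name;
    int code;
} toy_field_t;

extern const toy_field_t TOY_FIELDS[];
extern const unsigned TOY_FIELD_COUNT;

/* ---- what would move to a generated .c file: the single definition ---- */
const toy_field_t TOY_FIELDS[] = {
    {"open", 16},
    {"begin", 17},
    {"attach", 18},
};
const unsigned TOY_FIELD_COUNT = sizeof TOY_FIELDS / sizeof TOY_FIELDS[0];

/* Any other .c file needs only the declarations to use the table. */
const char *toy_field_name(int code) {
    assert(TOY_FIELD_COUNT > 0);
    for (unsigned i = 0; i < TOY_FIELD_COUNT; i++) {
        if (TOY_FIELDS[i].code == code) return TOY_FIELDS[i].name;
    }
    return NULL;
}
```

Because the definition lives in exactly one .c file, no #ifdef guard is 
needed and including the header from several files cannot produce 
multiple-definition errors.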






[jira] [Resolved] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-07-08 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish resolved PROTON-826.

Resolution: Fixed

I recreated my test from February, and cannot reproduce the bug using latest 
dispatch + proton code.



 recent checkin causes frequent double-free or corruption crash
 --

 Key: PROTON-826
 URL: https://issues.apache.org/jira/browse/PROTON-826
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish
Priority: Blocker

 In my dispatch testing I am seeing frequent crashes in the proton library that 
 began with proton checkin 01cb00c on 2015-02-15 ("report read and write 
 errors through the transport").
 The output at crash-time says this:
 ---
 *** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or 
 corruption (fasttop): 0x020ee880 ***
 === Backtrace: =
 /lib64/libc.so.6[0x3e3d875a4f]
 /lib64/libc.so.6[0x3e3d87cd78]
 /lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
 /lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
 /lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
 /lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
 /lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
 /lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
 /home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]
 The backtrace from the core file looks like this:
 
 #0  0x003e3d835877 in raise () from /lib64/libc.so.6
 #1  0x003e3d836f68 in abort () from /lib64/libc.so.6
 #2  0x003e3d875a54 in __libc_message () from /lib64/libc.so.6
 #3  0x003e3d87cd78 in _int_free () from /lib64/libc.so.6
 #4  0x7fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
 at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
 #5  0x7fbf8a59b311 in pn_error_set (error=error@entry=0x1501140,
 code=code@entry=-2,
 text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
 at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
 #6  0x7fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2,
 fmt=<optimized out>,
 ap=ap@entry=0x7fbf801a6de8) at
 /home/mick/rh-qpid-proton/proton-c/src/error.c:81
 #7  0x7fbf8a59b402 in pn_error_format (error=error@entry=0x1501140,
 code=<optimized out>,
 fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s") at
 /home/mick/rh-qpid-proton/proton-c/src/error.c:89
 #8  0x7fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140,
 msg=msg@entry=0x7fbf8a5bbe1a "recv")
 at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
 #9  0x7fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>,
 buf=<optimized out>,
 size=<optimized out>) at
 /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
 #10 0x7fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)
 -
 And I can prevent the crash from happening, apparently forever, by commenting 
 out this line:
   free(error->text);
 in the function  pn_error_clear
 in the file proton-c/src/error.c
 The error text that is being freed which causes the crash looks like this:
   $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable", root 
 = 0x0, code = -2}
 My dispatch test creates a router network and then repeatedly kills and 
 restarts a randomly-selected router.  After this proton checkin it almost 
 never gets through 5 iterations without this crash.  After I commented out 
 that line, it got through more than 500 iterations before I stopped it.





[jira] [Assigned] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-07-07 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish reassigned PROTON-826:
--

Assignee: michael goulish



[jira] [Commented] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-07-07 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616749#comment-14616749
 ] 

michael goulish commented on PROTON-826:


I see why I didn't follow this up earlier.
Current dispatch will not compile against latest proton because of some SASL 
issues.
But I need to test against latest proton.
SO ... now attempting to hack up dispatch so that it doesn't have SASL but will 
still build and run against latest proton



[jira] [Created] (PROTON-930) add explicit AMQP 1.0 constants

2015-07-02 Thread michael goulish (JIRA)
michael goulish created PROTON-930:
--

 Summary: add explicit AMQP 1.0 constants
 Key: PROTON-930
 URL: https://issues.apache.org/jira/browse/PROTON-930
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor
 Fix For: 0.10


Add an include file that has explicit defined constants for every numeric 
default value that is mandated by the AMQP 1.0 spec.
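
A sketch of what such an include file might contain. The macro names are 
hypothetical; the values are the ones I understand the AMQP 1.0 spec to 
define (ports 5672 and 5671, a minimum max-frame-size of 512, and defaults of 
4294967295 for max-frame-size and handle-max and 65535 for channel-max), so 
check them against the spec before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro names; values as read from the AMQP 1.0 spec. */
#define TOY_AMQP_PORT                    5672u
#define TOY_AMQP_SECURE_PORT             5671u
#define TOY_AMQP_MIN_MAX_FRAME_SIZE      512u
#define TOY_AMQP_DEFAULT_MAX_FRAME_SIZE  4294967295u
#define TOY_AMQP_DEFAULT_CHANNEL_MAX     65535u
#define TOY_AMQP_DEFAULT_HANDLE_MAX      4294967295u

/* channel-max is a 16-bit field, so its default must be the 16-bit maximum. */
static_assert(TOY_AMQP_DEFAULT_CHANNEL_MAX == UINT16_MAX,
              "channel-max default should fill a 16-bit field");

/* Resolve a peer's channel-max: fall back to the default if the field was absent. */
uint16_t toy_resolve_channel_max(int was_specified, uint16_t value) {
    return was_specified ? value : (uint16_t)TOY_AMQP_DEFAULT_CHANNEL_MAX;
}
```

Named constants make it harder for an unspecified field to be silently 
treated as 0 instead of its spec default.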






[jira] [Resolved] (PROTON-925) proton-c seems to treat unspecified channel-max as implying 0

2015-07-02 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish resolved PROTON-925.

Resolution: Fixed

commit fc38e86a6f5a1b265552708e674d3c8040c1985b

 proton-c seems to treat unspecified channel-max as implying 0
 -

 Key: PROTON-925
 URL: https://issues.apache.org/jira/browse/PROTON-925
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.10
Reporter: Gordon Sim
Assignee: michael goulish
Priority: Blocker
 Fix For: 0.10


 If max-channels is not specified in the open, it appears the latest proton-c 
 treats that as implying the maximum is 0 though the spec states the default 
 is 65535.
 This breaks compatibility with previous proton releases. E.g. the following 
 is the interaction between a sender using the latest 0.10 and a receiver 
 using proton 0.9.
 {noformat}
 [0x151c710]:  - AMQP
 [0x151c710]:0 - @open(16) 
 [container-id=65A6602D-5D24-4D39-9C6F-7403D98F5E15, hostname=localhost, 
 channel-max=32767]
 [0x151c710]:0 - @begin(17) [next-outgoing-id=0, incoming-window=2147483647, 
 outgoing-window=1]
 [0x151c710]:1 - @begin(17) [next-outgoing-id=0, incoming-window=2147483647, 
 outgoing-window=1]
 [0x151c710]:2 - @begin(17) [next-outgoing-id=0, incoming-window=2147483647, 
 outgoing-window=1]
 [0x151c710]:0 - @attach(18) [name=sender-xxx, handle=0, role=false, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_a, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_a, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:1 - @attach(18) [name=sender-xxx, handle=0, role=false, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_b, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_b, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:2 - @attach(18) [name=sender-xxx, handle=0, role=false, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_c, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_c, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:  - AMQP
 [0x151c710]:0 - @open(16) 
 [container-id=abab56b0-c25e-427b-9f4f-d63da48d1973]
 [0x151c710]:0 - @begin(17) [remote-channel=0, next-outgoing-id=0, 
 incoming-window=2147483647, outgoing-window=0]
 [0x151c710]:1 - @begin(17) [remote-channel=1, next-outgoing-id=0, 
 incoming-window=2147483647, outgoing-window=0]
 [0x151c710]:2 - @begin(17) [remote-channel=2, next-outgoing-id=0, 
 incoming-window=2147483647, outgoing-window=0]
 [0x151c710]:0 - @attach(18) [name=sender-xxx, handle=0, role=true, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_a, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_a, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:1 - @attach(18) [name=sender-xxx, handle=0, role=true, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_b, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_b, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:2 - @attach(18) [name=sender-xxx, handle=0, role=true, 
 snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) [address=queue_c, 
 durable=0, timeout=0, dynamic=false], target=@target(41) [address=queue_c, 
 durable=0, timeout=0, dynamic=false], initial-delivery-count=0]
 [0x151c710]:0 - @flow(19) [next-incoming-id=0, incoming-window=2147483647, 
 next-outgoing-id=0, outgoing-window=0, handle=0, delivery-count=0, 
 link-credit=341, drain=false]
 [0x151c710]:1 - @flow(19) [next-incoming-id=0, incoming-window=2147483647, 
 next-outgoing-id=0, outgoing-window=0, handle=0, delivery-count=0, 
 link-credit=341, drain=false]
 [0x151c710]:2 - @flow(19) [next-incoming-id=0, incoming-window=2147483647, 
 next-outgoing-id=0, outgoing-window=0, handle=0, delivery-count=0, 
 link-credit=341, drain=false]
 [0x151c710]:0 - @close(24) [error=@error(29) 
 [condition=:amqp:connection:framing-error, description=remote channel 1 is 
 above negotiated channel_max 0.]]
 [0x151c710]:  - EOS
 [0x151c710]:0 - @close(24) []
 [0x151c710]:  - EOS
 {noformat}





[jira] [Resolved] (PROTON-842) proton-c should honor channel_max

2015-06-30 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish resolved PROTON-842.

Resolution: Fixed

Last checkin fixed java tests.

 proton-c should honor channel_max
 -

 Key: PROTON-842
 URL: https://issues.apache.org/jira/browse/PROTON-842
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-j
Affects Versions: 0.9, 0.10
Reporter: michael goulish
Assignee: michael goulish

 proton-c code should use transport->channel_max and
 transport->remote_channel_max to enforce a limit on the
 maximum number of simultaneously active sessions on a 
 connection.   
 I guess the limit should be the minimum of those
 two numbers, or, if neither side sets a limit, then 2^16.
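A minimal sketch of the limit described above. It assumes that channel_max is the highest usable channel number (the framing-error trace for "channel_max 0" earlier in this thread is consistent with that), so the number of sessions is one more than the smaller of the two values. The function name is hypothetical and is not part of the Proton API.

```c
#include <stdint.h>

/* Hypothetical helper, not a Proton function.
 * Assumption: channel_max is the highest usable channel number, so the
 * number of usable channels is channel_max + 1. A side that sets no
 * limit is modelled as UINT16_MAX, which yields 2^16 sessions. */
uint32_t toy_session_limit(uint16_t local_channel_max,
                           uint16_t remote_channel_max)
{
  uint16_t negotiated = local_channel_max < remote_channel_max
                          ? local_channel_max
                          : remote_channel_max;
  return (uint32_t) negotiated + 1u;
}
```

With neither side setting a limit this gives 65536, which matches the 2^16 figure in the description.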





[jira] [Created] (PROTON-919) make C impl behave like java wrt channel_max error

2015-06-23 Thread michael goulish (JIRA)
michael goulish created PROTON-919:
--

 Summary: make C impl behave like java wrt channel_max error
 Key: PROTON-919
 URL: https://issues.apache.org/jira/browse/PROTON-919
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c, python-binding
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor


In the Java impl, I made TransportImpl throw an exception if the application 
tries to change the local channel_max setting after we have already sent the 
OPEN frame to the remote peer.  ( Because at that point we communicate our 
channel_max limit to the peer -- no fair changing it afterwards.)

One reviewer suggested that it would be nice if the C impl worked the same way.
That would mean that pn_transport_set_channel_max() would have to return a
result code, which the Python binding would detect: the Python binding throws
an exception and the Python tests detect it, so it would work the same way as
Java.
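The behaviour proposed here could look roughly like the sketch below. The struct, the error constant, and the function are stand-ins invented for illustration; they are not the Proton types, and the real error code is not stated in this issue.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_OK         0
#define TOY_ERR_STATE (-1)   /* hypothetical error code */

/* Hypothetical stand-in for the transport state that matters here. */
typedef struct {
  uint16_t channel_max;
  bool     open_sent;   /* true once the OPEN frame has gone to the peer */
} toy_transport_t;

/* Refuse to change channel_max once it has been communicated in OPEN. */
int toy_transport_set_channel_max(toy_transport_t *t, uint16_t requested)
{
  if (t->open_sent) {
    return TOY_ERR_STATE;   /* a binding could turn this into an exception */
  }
  t->channel_max = requested;
  return TOY_OK;
}
```

Returning a code rather than silently ignoring the call is what would let a language binding raise an exception.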





[jira] [Commented] (PROTON-919) make C impl behave like java wrt channel_max error

2015-06-23 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598095#comment-14598095
 ] 

michael goulish commented on PROTON-919:


~~~ NOTE ~~~

The proposed change alters the public API in that it changes 
pn_transport_set_channel_max() to return an int, rather than void.




 make C impl behave like java wrt channel_max error
 --

 Key: PROTON-919
 URL: https://issues.apache.org/jira/browse/PROTON-919
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c, python-binding
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor

 In the Java impl, I made TransportImpl throw an exception if the application 
 tries to change the local channel_max setting after we have already sent the 
 OPEN frame to the remote peer.  ( Because at that point we communicate our 
 channel_max limit to the peer -- no fair changing it afterwards.)
 One reviewer suggested that it would be nice if the C impl worked the same
 way.  That would mean that pn_transport_set_channel_max() would have to
 return a result code, which the Python binding would detect: the Python
 binding throws an exception and the Python tests detect it, so it would work
 the same way as Java.





[jira] [Resolved] (PROTON-842) proton-c should honor channel_max

2015-06-18 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish resolved PROTON-842.

Resolution: Fixed

commit e38957ae5115ec023993672ca5b7d5e3df414f7e

 proton-c should honor channel_max
 -

 Key: PROTON-842
 URL: https://issues.apache.org/jira/browse/PROTON-842
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish

 proton-c code should use transport->channel_max and
 transport->remote_channel_max to enforce a limit on the
 maximum number of simultaneously active sessions on a 
 connection.   
 I guess the limit should be the minimum of those
 two numbers, or, if neither side sets a limit, then 2^16.





[jira] [Commented] (PROTON-842) proton-c should honor channel_max

2015-06-18 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591877#comment-14591877
 ] 

michael goulish commented on PROTON-842:


-- please note -- 

This fix changes API behavior in one way: pn_session() can now return NULL if
an attempt is made to create more sessions than are allowed by the value of
channel_max.

Previously, the limit on the number of sessions was enforced by SEGV.
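For callers, the practical consequence is that the result of session creation has to be checked. The sketch below models that with invented toy types; it does not use the Proton headers, and the bookkeeping shown is an assumption, not the library's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the Proton structs. */
typedef struct {
  uint32_t session_limit;     /* derived from channel_max */
  uint32_t active_sessions;
} toy_connection_t;

typedef struct {
  uint16_t channel;
} toy_session_t;

/* Returns NULL, instead of crashing, once the limit is reached. */
toy_session_t *toy_session(toy_connection_t *conn)
{
  if (conn->active_sessions >= conn->session_limit) {
    return NULL;
  }
  toy_session_t *ssn = malloc(sizeof *ssn);
  if (ssn == NULL) {
    return NULL;
  }
  ssn->channel = (uint16_t) conn->active_sessions++;
  return ssn;
}
```

Application code written against the older behaviour would need a NULL check added after each session-creation call.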



 proton-c should honor channel_max
 -

 Key: PROTON-842
 URL: https://issues.apache.org/jira/browse/PROTON-842
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish

 proton-c code should use transport->channel_max and
 transport->remote_channel_max to enforce a limit on the
 maximum number of simultaneously active sessions on a 
 connection.   
 I guess the limit should be the minimum of those
 two numbers, or, if neither side sets a limit, then 2^16.





[jira] [Reopened] (PROTON-842) proton-c should honor channel_max

2015-06-18 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish reopened PROTON-842:


My fix for proton-c is making trouble for proton-j

 proton-c should honor channel_max
 -

 Key: PROTON-842
 URL: https://issues.apache.org/jira/browse/PROTON-842
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish

 proton-c code should use transport->channel_max and
 transport->remote_channel_max to enforce a limit on the
 maximum number of simultaneously active sessions on a 
 connection.   
 I guess the limit should be the minimum of those
 two numbers, or, if neither side sets a limit, then 2^16.





[jira] [Assigned] (PROTON-896) change all static function names to begin with pni_

2015-06-08 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish reassigned PROTON-896:
--

Assignee: michael goulish

 change all static function names to begin with pni_
 ---

 Key: PROTON-896
 URL: https://issues.apache.org/jira/browse/PROTON-896
 Project: Qpid Proton
  Issue Type: Improvement
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor

 Change all the static function names to start with pni_ ,
 and declare all functions as static that ought to be.





[jira] [Created] (PROTON-896) change all statis function names to begin with pni_

2015-05-29 Thread michael goulish (JIRA)
michael goulish created PROTON-896:
--

 Summary: change all statis function names to begin with pni_
 Key: PROTON-896
 URL: https://issues.apache.org/jira/browse/PROTON-896
 Project: Qpid Proton
  Issue Type: Improvement
Reporter: michael goulish
Priority: Minor


Change all the static function names to start with pni_ ,
and declare all functions as static that ought to be.





[jira] [Updated] (PROTON-896) change all static function names to begin with pni_

2015-05-29 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-896:
---
Summary: change all static function names to begin with pni_  (was: change 
all statis function names to begin with pni_)

 change all static function names to begin with pni_
 ---

 Key: PROTON-896
 URL: https://issues.apache.org/jira/browse/PROTON-896
 Project: Qpid Proton
  Issue Type: Improvement
Reporter: michael goulish
Priority: Minor

 Change all the static function names to start with pni_ ,
 and declare all functions as static that ought to be.





[jira] [Updated] (PROTON-864) don't crash when channel number goes high

2015-05-19 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-864:
---
Summary: don't crash when channel number goes high  (was: avoid crashes 
when channel number goes high.)

 don't crash when channel number goes high
 -

 Key: PROTON-864
 URL: https://issues.apache.org/jira/browse/PROTON-864
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish

 Code in transport.c, and a little in engine.c, looks at the topmost bit in 
 channel numbers to decide if the channels are in use.
 This causes crashes when the number of channels in a single connection goes 
 beyond 32767.





[jira] [Updated] (PROTON-864) avoid crashes when channel number goes high.

2015-05-19 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-864:
---
Summary: avoid crashes when channel number goes high.  (was: don't overload 
top bit of channel numbers )

 avoid crashes when channel number goes high.
 

 Key: PROTON-864
 URL: https://issues.apache.org/jira/browse/PROTON-864
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish

 Code in transport.c, and a little in engine.c, looks at the topmost bit in 
 channel numbers to decide if the channels are in use.
 This causes crashes when the number of channels in a single connection goes 
 beyond 32767.





[jira] [Created] (PROTON-888) allocate_alias linear search becomes slow at scale

2015-05-18 Thread michael goulish (JIRA)
michael goulish created PROTON-888:
--

 Summary: allocate_alias linear search becomes slow at scale
 Key: PROTON-888
 URL: https://issues.apache.org/jira/browse/PROTON-888
 Project: Qpid Proton
  Issue Type: Improvement
Reporter: michael goulish


Testing that I have done recently goes to large scale on number of sessions per 
connection.  I noticed that the test was slowing down rapidly over time, in 
terms of how many sessions were being established per unit time.

The function allocate_alias in file transport.c uses a linear search through an 
array to find the next available channel number for a session  (or the next 
available handle number for a link).  In a usage scenario like mine in which 
many sessions will be established, this becomes very slow as the array fills up.

At the beginning of my test, this function is too fast to measure.  By the end, 
it is using more than 82 milliseconds per call.  Overall, this function alone 
is contributing more than 20 seconds to my 3-minute test.

This is not an unrealistic scenario -- we already have one potential customer 
who is interested in going to this kind of scale.  (Which is why I was doing 
this test.)

Maybe we can find an implementation that does not slow down the common scale, 
and yet behaves better at the high end.
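One possible direction, sketched under assumptions: keep a "first free" hint so that allocation does not rescan the filled part of the table every time. The table type, its size, and the function names are invented for illustration; this is not the allocate_alias code and not necessarily the fix that was adopted.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ALIASES 1024   /* arbitrary size for the sketch */

typedef struct {
  bool   used[TOY_MAX_ALIASES];
  size_t first_free_hint;      /* invariant: every index below this is in use */
} toy_alias_table_t;

/* Allocate the lowest free alias, starting the scan at the hint
 * rather than at zero. Returns -1 if the table is full. */
long toy_allocate_alias(toy_alias_table_t *t)
{
  for (size_t i = t->first_free_hint; i < TOY_MAX_ALIASES; i++) {
    if (!t->used[i]) {
      t->used[i] = true;
      t->first_free_hint = i + 1;
      return (long) i;
    }
  }
  t->first_free_hint = TOY_MAX_ALIASES;
  return -1;
}

/* Release an alias and pull the hint back if needed. */
void toy_release_alias(toy_alias_table_t *t, size_t alias)
{
  t->used[alias] = false;
  if (alias < t->first_free_hint) {
    t->first_free_hint = alias;
  }
}
```

When aliases are only ever added, as in the test described above, each allocation inspects a single slot; the small-scale case pays only for one extra field.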





[jira] [Created] (PROTON-886) make proton enforce handle-max

2015-05-14 Thread michael goulish (JIRA)
michael goulish created PROTON-886:
--

 Summary: make proton enforce handle-max 
 Key: PROTON-886
 URL: https://issues.apache.org/jira/browse/PROTON-886
 Project: Qpid Proton
  Issue Type: Bug
Reporter: michael goulish


Make the code enforce limits on handles (and links) from section 2.7.2 of the 
AMQP 1.0 spec.

The handle-max value is the highest handle value that can be used on the 
session. A peer MUST NOT attempt to attach a link using a handle value outside 
the range that its partner can handle.  A peer that receives a handle outside 
the supported range MUST close the connection with the framing-error error-code.
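The receive-side rule quoted above amounts to a range check on each incoming attach. A minimal sketch, with a hypothetical function name, that returns the error condition to close with (or NULL when the handle is acceptable):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a Proton function.
 * handle_max is the highest handle value this side said it supports.
 * Returns NULL if the handle is in range, otherwise the AMQP error
 * condition with which the connection should be closed. */
const char *toy_check_attach_handle(uint32_t handle, uint32_t handle_max)
{
  if (handle <= handle_max) {
    return NULL;
  }
  return "amqp:connection:framing-error";
}
```

The send side needs the mirror-image check against the partner's handle-max before attaching a link.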





[jira] [Created] (PROTON-864) don't overload top bit of channel numbers

2015-04-24 Thread michael goulish (JIRA)
michael goulish created PROTON-864:
--

 Summary: don't overload top bit of channel numbers 
 Key: PROTON-864
 URL: https://issues.apache.org/jira/browse/PROTON-864
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Assignee: michael goulish


Code in transport.c, and a little in engine.c, looks at the topmost bit in 
channel numbers to decide if the channels are in use.
This causes crashes when the number of channels in a single connection goes 
beyond 32767.
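To illustrate why 32767 is the breaking point: if the top bit of a 16-bit value doubles as a "not in use" marker, then any channel number of 32768 or above looks unused. The sketch below is an assumption about the general pattern, with invented names; it is not copied from transport.c or engine.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNUSED_FLAG 0x8000u   /* top bit of a 16-bit value (assumed encoding) */

/* Overloaded encoding: the stored value is both the channel number
 * and, via its top bit, the "unused" marker. */
bool toy_in_use_overloaded(uint16_t stored)
{
  return (stored & TOY_UNUSED_FLAG) == 0;
}

/* Alternative: carry the state in a separate field, so all
 * 65536 channel numbers remain representable. */
typedef struct {
  uint16_t channel;
  bool     in_use;
} toy_channel_slot_t;

bool toy_in_use_explicit(const toy_channel_slot_t *slot)
{
  return slot->in_use;
}
```

Channel 32767 is still reported as in use by the overloaded check, while channel 32768 is misread as unused.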






[jira] [Commented] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-02-25 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336824#comment-14336824
 ] 

michael goulish commented on PROTON-826:


It looks like the problem here is just that the error struct used in  
proton-c/src/error.c is not thread safe -- so I am opening a new Jira for 
Dispatch.

I am leaving this one open for now, however, because other applications using 
proton will encounter this.  Either something could be changed in proton to 
make this less thread-hostile, or ... it could be publicized better?

Please feel free to close when appropriate.
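As an illustration of the "less thread-hostile" option, here is a sketch of an error record whose free-and-replace step is serialized by a mutex, so two threads cannot both free the same text pointer. The types and functions are invented for this example; this is not the proton-c error struct, and it is only one of several possible approaches (an application can also simply avoid sharing the object across threads).

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error record guarded by its own lock. */
typedef struct {
  pthread_mutex_t lock;
  char           *text;
  int             code;
} toy_error_t;

void toy_error_init(toy_error_t *error)
{
  pthread_mutex_init(&error->lock, NULL);
  error->text = NULL;
  error->code = 0;
}

/* Replace the stored text. The old text is freed exactly once,
 * because the free and the pointer swap happen under the lock. */
int toy_error_set(toy_error_t *error, int code, const char *text)
{
  char *copy = NULL;
  if (text != NULL) {
    size_t n = strlen(text) + 1;
    copy = malloc(n);
    if (copy != NULL) {
      memcpy(copy, text, n);
    }
  }
  pthread_mutex_lock(&error->lock);
  free(error->text);
  error->text = copy;
  error->code = code;
  pthread_mutex_unlock(&error->lock);
  return code;
}
```

Readers of the text would need to take the same lock (or copy the string out under it), which this sketch leaves out.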



 recent checkin causes frequent double-free or corruption crash
 --

 Key: PROTON-826
 URL: https://issues.apache.org/jira/browse/PROTON-826
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Priority: Blocker

 In my dispatch testing I am seeing frequent crashes in the proton library that
 began with proton checkin 01cb00c on 2015-02-15, "report read and write
 errors through the transport".
 The output at crash-time says this:
 ---
 *** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or 
 corruption (fasttop): 0x020ee880 ***
 === Backtrace: =
 /lib64/libc.so.6[0x3e3d875a4f]
 /lib64/libc.so.6[0x3e3d87cd78]
 /lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
 /lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
 /lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
 /lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
 /lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
 /lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
 /home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]
 The backtrace from the core file looks like this:
 
 #0  0x003e3d835877 in raise () from /lib64/libc.so.6
 #1  0x003e3d836f68 in abort () from /lib64/libc.so.6
 #2  0x003e3d875a54 in __libc_message () from /lib64/libc.so.6
 #3  0x003e3d87cd78 in _int_free () from /lib64/libc.so.6
 #4  0x7fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
 at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
 #5  0x7fbf8a59b311 in pn_error_set (error=error@entry=0x1501140,
     code=code@entry=-2,
     text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
     at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
 #6  0x7fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2,
     fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8)
     at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
 #7  0x7fbf8a59b402 in pn_error_format (error=error@entry=0x1501140,
     code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s")
     at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
 #8  0x7fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140,
     msg=msg@entry=0x7fbf8a5bbe1a "recv")
     at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
 #9  0x7fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>,
     buf=<optimized out>, size=<optimized out>)
     at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
 #10 0x7fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)
 -
 And I can prevent the crash from happening, apparently forever, by commenting 
 out this line:
   free(error->text);
 in the function  pn_error_clear
 in the file proton-c/src/error.c
 The error text that is being freed which causes the crash looks like this:
   $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable",
         root = 0x0, code = -2}
 My dispatch test creates a router network and then repeatedly kills and 
 restarts a randomly-selected router.  After this proton checkin it almost 
 never gets through 5 iterations without this crash.  After I commented out 
 that line, it got through more than 500 iterations before I stopped it.





[jira] [Created] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-02-24 Thread michael goulish (JIRA)
michael goulish created PROTON-826:
--

 Summary: recent checkin causes frequent double-free or corruption 
crash
 Key: PROTON-826
 URL: https://issues.apache.org/jira/browse/PROTON-826
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Priority: Blocker


In my dispatch testing I am seeing frequent crashes in the proton library that
began with proton checkin 01cb00c on 2015-02-15, "report read and write
errors through the transport".



The output at crash-time says this:
---

*** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or 
corruption (fasttop): 0x020ee880 ***
=== Backtrace: =
/lib64/libc.so.6[0x3e3d875a4f]
/lib64/libc.so.6[0x3e3d87cd78]
/lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
/lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
/lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
/lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
/lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
/lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
/home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]




The backtrace from the core file looks like this:


#0  0x003e3d835877 in raise () from /lib64/libc.so.6
#1  0x003e3d836f68 in abort () from /lib64/libc.so.6
#2  0x003e3d875a54 in __libc_message () from /lib64/libc.so.6
#3  0x003e3d87cd78 in _int_free () from /lib64/libc.so.6
#4  0x7fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
#5  0x7fbf8a59b311 in pn_error_set (error=error@entry=0x1501140,
    code=code@entry=-2,
    text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
#6  0x7fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2,
    fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8)
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
#7  0x7fbf8a59b402 in pn_error_format (error=error@entry=0x1501140,
    code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s")
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
#8  0x7fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140,
    msg=msg@entry=0x7fbf8a5bbe1a "recv")
    at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
#9  0x7fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>,
    buf=<optimized out>, size=<optimized out>)
    at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
#10 0x7fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)

-

And I can prevent the crash from happening, apparently forever, by commenting 
out this line:
  free(error->text);
in the function  pn_error_clear
in the file proton-c/src/error.c

The error text that is being freed which causes the crash looks like this:
  $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable",
        root = 0x0, code = -2}


My dispatch test creates a router network and then repeatedly kills and 
restarts a randomly-selected router.  After this proton checkin it almost never 
gets through 5 iterations without this crash.  After I commented out that line, 
it got through more than 500 iterations before I stopped it.






[jira] [Created] (PROTON-703) inlining performance improvements

2014-09-29 Thread michael goulish (JIRA)
michael goulish created PROTON-703:
--

 Summary: inlining performance improvements
 Key: PROTON-703
 URL: https://issues.apache.org/jira/browse/PROTON-703
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor


omnibus jira for any other inlining performance improvements i may find.

notes to self:
  * don't affect public APIs.
  * don't forget to test Debug build.





[jira] [Created] (PROTON-700) small performance improvement from inlining one fn.

2014-09-25 Thread michael goulish (JIRA)
michael goulish created PROTON-700:
--

 Summary: small performance improvement from inlining one fn.
 Key: PROTON-700
 URL: https://issues.apache.org/jira/browse/PROTON-700
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor


Inlining the internal function pn_data_node() improves speed by somewhere
between 2.6% and 6%, depending on architecture.

This is based on testing I did with two C-based clients written at the engine 
interface level.

The higher 6% figure was seen on a more modern machine with recent Intel 
processors, the lower figure was seen on an older box with AMD processors.

But the effect is real: with 50 repetitions before the change and 50 after, a
t-test indicates that the odds of this happening by chance are 2.0e-18.
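For readers unfamiliar with the technique, the change amounts to turning a small, hot lookup function into a static inline one so the compiler can fold it into its callers. The sketch below uses invented toy types and a 1-based id convention as an assumption; it is not the pn_data_node() source.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the data tree and its nodes. */
typedef struct {
  int value;
} toy_node_t;

typedef struct {
  toy_node_t *nodes;
  size_t      size;
} toy_data_t;

/* Small accessor on a hot path. Marking it static inline lets the
 * compiler expand it at each call site instead of emitting a call.
 * Assumption: ids are 1-based and 0 means "no node". */
static inline toy_node_t *toy_data_node(toy_data_t *data, size_t id)
{
  if (id == 0 || id > data->size) {
    return NULL;
  }
  return &data->nodes[id - 1];
}
```

Because such a function is internal, inlining it leaves the public API unchanged.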





[jira] [Commented] (PROTON-625) Biggest Backtrace Ever!

2014-07-03 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051826#comment-14051826
 ] 

michael goulish commented on PROTON-625:


Here's what happens, and a fix.


  1. pni_map_entry() calls pni_map_ensure() to make sure the map
     has enough capacity.

  2. The capacity-increasing loop in pni_map_ensure() has two
     conditions on it: increase the capacity if map->capacity
     is too small, or if the map 'load' is greater than
     map->load_factor.
     ( Map load is ... meaning not obvious to me. )

  3. If pni_map_ensure() returns true, then pni_map_entry() will
     call itself recursively, and keep doing that until
     pni_map_ensure() returns false.
     'False' means 'I made no change.'

  4. But it is possible for pni_map_ensure() to make no change,
     and yet return true.
     Here is how it happened in my most recent test:

         map->capacity       512
         capacity            331
         pni_map_load(map)   0.75
         map->load_factor    0.75

  5. Those values made *both* conditions on the capacity-increasing
     loop in pni_map_ensure() false.
     So it didn't do anything to change the map.
     But it returned true.
     So pni_map_entry() called itself.
     But nothing had changed.
     And away we go.

  FIX

     Make the test on the 'if' at the top of pni_map_ensure()
     say this:

       if (capacity <= map->capacity && load <= map->load_factor) {

     ( Added '=' to the load test. )

     After that, I ran twenty tests with no failure.
     Previously, failure probability on my system
     was 0.3.  So the odds of 20 in a row happening
     by chance are a little less than 1 in 1000.
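The steps above can be reproduced in miniature. The sketch below uses a toy map with invented names and omits the rehashing; the size and addressable values (330 and 440) are assumptions chosen so that the load comes out at exactly 0.75 for a capacity of 512. A flag selects between the original '<' load test and the '<=' fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the map fields involved; not the Proton struct. */
typedef struct {
  size_t capacity;
  size_t addressable;
  size_t size;
  float  load_factor;
} toy_map_t;

static float toy_map_load(const toy_map_t *map)
{
  return (float) map->size / (float) map->addressable;
}

/* Returns true to mean "the map changed, try again".
 * inclusive_load_test == false models the original early-exit test,
 * true models the fix with '<=' on the load comparison. */
bool toy_map_ensure(toy_map_t *map, size_t capacity, bool inclusive_load_test)
{
  float load = toy_map_load(map);
  bool load_ok = inclusive_load_test ? (load <= map->load_factor)
                                     : (load <  map->load_factor);
  if (capacity <= map->capacity && load_ok) {
    return false;
  }

  while (map->capacity < capacity || toy_map_load(map) > map->load_factor) {
    map->capacity *= 2;
    map->addressable = (size_t) (0.86 * map->capacity);
  }

  /* With load == load_factor the loop above never runs,
   * yet we still report a change. */
  return true;
}
```

With the reported values, the original test returns true while leaving the capacity at 512, which is the condition that makes the caller recurse without end; the inclusive test returns false.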


 Biggest Backtrace Ever!
 ---

 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.8
Reporter: michael goulish

 I am saving all my stuff so I can repro on demand.
 It doesn't happen every time, but it's about 50%.
 --
 On one box, I have a dispatch router.
 On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
 qpid-messaging-based senders.
 Each client will handle 100 addresses, of the form mick/0, mick/1, etc.
 100 messages will be sent to each address.
 I start the 5 receivers first.  They start OK.  Dispatch router happy and
 stable.
 Wait a few seconds.
 I start the 5 senders, from a bash script.
 The first sender is already sending when the 2nd, 3rd, 4th start.
 After a few of them start, but before all have finished starting, a few
 seconds into the script, the crash occurs.  ( If they all start up
 successfully, no crash. )
 The crash occurs in the dispatch router.
 Here is the biggest backtrace ever:
 #0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
 malloc.c:4383
 #1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
 #2  0x0039c6c1650a in pni_map_allocate () from 
 /usr/lib64/libqpid-proton.so.2
 #3  0x0039c6c16a3a in pni_map_ensure () from 
 /usr/lib64/libqpid-proton.so.2
 #4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 .
 .
 .
 .
 #93549 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93550 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93551 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93552 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93553 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93554 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93555 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93556 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93557 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93558 

[jira] [Commented] (PROTON-625) Biggest Backtrace Ever!

2014-07-02 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049897#comment-14049897
 ] 

michael goulish commented on PROTON-625:


I had some confusion about what libraries were being picked up.  
Sorry!

This bug is *not* present on 0.7 !

I was able to run 0.7-based dispatch-router 10 times with no failure.
Then, switching to latest proton trunk code as of today -- 2 out of first 3 
tests resulted in this failure.



 Biggest Backtrace Ever!
 ---

 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.8
Reporter: michael goulish

 I am saving all my stuff so I can repro on demand.
 It doesn't happen every time, but it's about 50%.
 --
 On one box, I have a dispatch router.
 On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
 qpid-messaging-based senders.
 Each client will handle 100 addresses, of the form mick/0, mick/1, etc.
 100 messages will be sent to each address.
 I start the 5 receivers first.  They start OK.  Dispatch router happy and
 stable.
 Wait a few seconds.
 I start the 5 senders, from a bash script.
 The first sender is already sending when the 2nd, 3rd, 4th start.
 After a few of them start, but before all have finished starting, a few
 seconds into the script, the crash occurs.  ( If they all start up
 successfully, no crash. )
 The crash occurs in the dispatch router.
 Here is the biggest backtrace ever:
 #0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
 malloc.c:4383
 #1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
 #2  0x0039c6c1650a in pni_map_allocate () from 
 /usr/lib64/libqpid-proton.so.2
 #3  0x0039c6c16a3a in pni_map_ensure () from 
 /usr/lib64/libqpid-proton.so.2
 #4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 .
 .
 .
 .
 #93549 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93550 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93551 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93552 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93553 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93554 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93555 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93556 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93557 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93558 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
 #93560 0x0039c6c17226 in pn_hash_put () from 
 /usr/lib64/libqpid-proton.so.2
 #93561 0x0039c6c2a643 in pn_delivery_map_push () from 
 /usr/lib64/libqpid-proton.so.2
 #93562 0x0039c6c2c44b in pn_do_transfer () from 
 /usr/lib64/libqpid-proton.so.2
 #93563 0x0039c6c24385 in pn_dispatch_frame () from 
 /usr/lib64/libqpid-proton.so.2
 #93564 0x0039c6c2448f in pn_dispatcher_input () from 
 /usr/lib64/libqpid-proton.so.2
 #93565 0x0039c6c2d68b in pn_input_read_amqp () from 
 /usr/lib64/libqpid-proton.so.2
 #93566 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93567 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93568 0x0039c6c2d275 in transport_consume () from 
 /usr/lib64/libqpid-proton.so.2
 #93569 0x0039c6c304cd in pn_transport_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93570 0x0039c6c3e40c in pn_connector_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93571 0x7f1060c60460 in process_connector () from 
 /home/mick/dispatch/build/libqpid-dispatch.so.0
 

[jira] [Commented] (PROTON-625) Biggest Backtrace Ever!

2014-07-02 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051007#comment-14051007
 ] 

michael goulish commented on PROTON-625:


Here is a hack that fixes it.
A little new code in pni_map_ensure().

Tested this on latest protonics, version 1607485.

Without hack:  3 failures out of 10 tests. (similar to what I have been seeing 
on other versions.)

With hack:  0 failures out of 13 tests.  ( probability this happened by chance: 
less than 1% )
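
One way to arrive at a figure like that, as a rough check only: assume the unhacked failure rate is the observed 3 out of 10 (0.3 per run) and that runs are independent. Both are assumptions, and the helper name below is made up for illustration. The chance of 13 clean runs in a row is then 0.7 multiplied by itself 13 times.

```c
/* Probability of seeing zero failures in `runs` independent runs when
   each run fails with probability p_fail.  Independence and a fixed
   per-run failure rate are assumptions, not measured facts. */
static double prob_no_failures(double p_fail, int runs)
{
  double p = 1.0;
  for (int i = 0; i < runs; i++) {
    p *= 1.0 - p_fail;   /* one more run that did not fail */
  }
  return p;
}
```

Calling prob_no_failures(0.3, 13) gives roughly 0.0097, which is just under 1%.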


So, now I'm trying to see how it should *really* be fixed...


--- code --- code --- code --- code --- code --- code --- code --- code --- 
code ---


 // This loop is what is already there, in pni_map_ensure.  No change.
  while (map->capacity < capacity || pni_map_load(map) > map->load_factor) {
    map->capacity *= 2;
    map->addressable = (size_t) (0.86 * map->capacity);
  }

  /*---
    If ever we get past the above while-loop without 
    actually having changed map->cap, we are doomed 
    to eternal torment.  So, force it.
  ---*/
  // oldcap is assumed to hold map->capacity as saved before the loop.
  if ( oldcap == map->capacity )
  {
    fprintf ( stderr, "Fiery the angels fell; deep thunder rolled around their shores, burning with the fires of Orc!\n" );
    map->capacity *= 2;
    map->addressable = (size_t) (0.86 * map->capacity);
  }
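
To try the idea outside proton, here is a minimal standalone sketch of the same growth loop plus the forced doubling. The toy_map_t struct and the toy_* function names are stand-ins invented for this example; they are not proton's real internals, and only the fields the loop touches are modelled. It does not explain why the real loop can exit without growing, it only shows what the hack does when that happens.

```c
#include <stddef.h>

/* Hypothetical stand-in for proton's internal map. */
typedef struct {
  size_t capacity;     /* total slots allocated */
  size_t addressable;  /* slots reachable directly by hashing */
  size_t size;         /* entries currently stored */
  float  load_factor;  /* grow once size/addressable exceeds this */
} toy_map_t;

static float toy_map_load(const toy_map_t *map)
{
  return (float) map->size / (float) map->addressable;
}

/* Returns 1 if the map grew, 0 if it did not.  When force is nonzero
   the capacity is doubled at least once, as in the hack. */
static int toy_map_ensure(toy_map_t *map, size_t capacity, int force)
{
  size_t oldcap = map->capacity;

  /* Same condition as the existing loop: too small, or too full. */
  while (map->capacity < capacity || toy_map_load(map) > map->load_factor) {
    map->capacity *= 2;
    map->addressable = (size_t) (0.86 * map->capacity);
  }

  /* The hack: if the loop changed nothing, grow anyway. */
  if (force && oldcap == map->capacity) {
    map->capacity *= 2;
    map->addressable = (size_t) (0.86 * map->capacity);
  }

  return map->capacity != oldcap;
}
```

With a map of capacity 16 holding 5 entries, a request for capacity 16 leaves the unforced version unchanged, while the forced version doubles it to 32.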


 Biggest Backtrace Ever!
 ---

 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.8
Reporter: michael goulish

 I am saving all my stuff so I can repro on demand.
 It doesn't happen every time, but it's about 50%.
 --
 On one box, I have a dispatch router.
 On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
 qpid-messaging-based senders.
 Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.
 100 messages will be sent to each address.
 I start the 5 receivers first.  They start OK.  Dispatch router happy & 
 stable.
 Wait a few seconds.
 I start the 5 senders, from a bash script.
 The first sender is already sending when the 2nd, 3rd, 4th start.
 After a few of them start, but before all have finished starting, a few 
 seconds into the script, the crash occurs.  ( If they all start up 
 successfully, no crash. )
 The crash occurs in the dispatch router.
 Here is the biggest backtrace ever:
 #0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
 malloc.c:4383
 #1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
 #2  0x0039c6c1650a in pni_map_allocate () from 
 /usr/lib64/libqpid-proton.so.2
 #3  0x0039c6c16a3a in pni_map_ensure () from 
 /usr/lib64/libqpid-proton.so.2
 #4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 .
 .
 .
 .
 #93549 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93550 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93551 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93552 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93553 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93554 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93555 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93556 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93557 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93558 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
 #93560 0x0039c6c17226 in pn_hash_put () from 
 /usr/lib64/libqpid-proton.so.2
 #93561 0x0039c6c2a643 in pn_delivery_map_push () from 
 /usr/lib64/libqpid-proton.so.2
 #93562 0x0039c6c2c44b 

[jira] [Commented] (PROTON-625) Biggest Backtrace Ever!

2014-07-01 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048569#comment-14048569
 ] 

michael goulish commented on PROTON-625:


BTW -- I kill and restart the router after each test.

 Biggest Backtrace Ever!
 ---

 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.7
Reporter: michael goulish

 I am saving all my stuff so I can repro on demand.
 It doesn't happen every time, but it's about 50%.
 --
 On one box, I have a dispatch router.
 On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
 qpid-messaging-based senders.
 Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.
 100 messages will be sent to each address.
 I start the 5 receivers first.  They start OK.  Dispatch router happy & 
 stable.
 Wait a few seconds.
 I start the 5 senders, from a bash script.
 The first sender is already sending when the 2nd, 3rd, 4th start.
 After a few of them start, but before all have finished starting, a few 
 seconds into the script, the crash occurs.  ( If they all start up 
 successfully, no crash. )
 The crash occurs in the dispatch router.
 Here is the biggest backtrace ever:
 #0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
 malloc.c:4383
 #1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
 #2  0x0039c6c1650a in pni_map_allocate () from 
 /usr/lib64/libqpid-proton.so.2
 #3  0x0039c6c16a3a in pni_map_ensure () from 
 /usr/lib64/libqpid-proton.so.2
 #4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 .
 .
 .
 .
 #93549 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93550 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93551 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93552 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93553 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93554 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93555 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93556 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93557 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93558 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
 #93560 0x0039c6c17226 in pn_hash_put () from 
 /usr/lib64/libqpid-proton.so.2
 #93561 0x0039c6c2a643 in pn_delivery_map_push () from 
 /usr/lib64/libqpid-proton.so.2
 #93562 0x0039c6c2c44b in pn_do_transfer () from 
 /usr/lib64/libqpid-proton.so.2
 #93563 0x0039c6c24385 in pn_dispatch_frame () from 
 /usr/lib64/libqpid-proton.so.2
 #93564 0x0039c6c2448f in pn_dispatcher_input () from 
 /usr/lib64/libqpid-proton.so.2
 #93565 0x0039c6c2d68b in pn_input_read_amqp () from 
 /usr/lib64/libqpid-proton.so.2
 #93566 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93567 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93568 0x0039c6c2d275 in transport_consume () from 
 /usr/lib64/libqpid-proton.so.2
 #93569 0x0039c6c304cd in pn_transport_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93570 0x0039c6c3e40c in pn_connector_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93571 0x7f1060c60460 in process_connector () from 
 /home/mick/dispatch/build/libqpid-dispatch.so.0
 #93572 0x7f1060c61017 in thread_run () from 
 /home/mick/dispatch/build/libqpid-dispatch.so.0
 #93573 0x003cf9c07851 in start_thread (arg=0x7f1052bfd700) at 
 pthread_create.c:301
 #93574 0x003cf98e890d in clone () at 
 

[jira] [Commented] (PROTON-625) Biggest Backtrace Ever!

2014-07-01 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048573#comment-14048573
 ] 

michael goulish commented on PROTON-625:


When I put usleep(1000) after each message sent, I have zero failures in 10 
tries.
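
For reference, the throttling amounts to something like the sketch below. The real senders are qpid-messaging based; the send_fn callback type, count_only, and send_throttled here are hypothetical names made up so the loop can stand alone in plain C. Only usleep() is a real call.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback: sends message number `index`, returns 0 on success. */
typedef int (*send_fn)(size_t index, void *context);

/* Example callback that only counts how many times it was called. */
static int count_only(size_t index, void *context)
{
  size_t *calls = context;
  (void) index;
  (*calls)++;
  return 0;
}

/* Sends `count` messages, pausing `pause_usec` microseconds after each.
   Returns the number of messages sent before the first failure. */
static size_t send_throttled(send_fn send_one, void *context,
                             size_t count, unsigned int pause_usec)
{
  size_t sent = 0;
  while (sent < count) {
    if (send_one(sent, context) != 0) {
      break;                 /* stop at the first failed send */
    }
    sent++;
    usleep(pause_usec);      /* the 1000 microsecond pause from the test */
  }
  return sent;
}
```

A pause of 1000 microseconds after every message caps each sender at under 1000 messages per second, which may be why the router survives. That is a guess, not a diagnosis.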

 Biggest Backtrace Ever!
 ---

 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.7
Reporter: michael goulish

 I am saving all my stuff so I can repro on demand.
 It doesn't happen every time, but it's about 50%.
 --
 On one box, I have a dispatch router.
 On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
 qpid-messaging-based senders.
 Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.
 100 messages will be sent to each address.
 I start the 5 receivers first.  They start OK.  Dispatch router happy & 
 stable.
 Wait a few seconds.
 I start the 5 senders, from a bash script.
 The first sender is already sending when the 2nd, 3rd, 4th start.
 After a few of them start, but before all have finished starting, a few 
 seconds into the script, the crash occurs.  ( If they all start up 
 successfully, no crash. )
 The crash occurs in the dispatch router.
 Here is the biggest backtrace ever:
 #0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
 malloc.c:4383
 #1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
 #2  0x0039c6c1650a in pni_map_allocate () from 
 /usr/lib64/libqpid-proton.so.2
 #3  0x0039c6c16a3a in pni_map_ensure () from 
 /usr/lib64/libqpid-proton.so.2
 #4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 #14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
 .
 .
 .
 .
 #93549 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93550 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93551 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93552 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93553 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93554 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93555 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93556 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93557 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93558 0x0039c6c16c64 in pni_map_entry () from 
 /usr/lib64/libqpid-proton.so.2
 #93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
 #93560 0x0039c6c17226 in pn_hash_put () from 
 /usr/lib64/libqpid-proton.so.2
 #93561 0x0039c6c2a643 in pn_delivery_map_push () from 
 /usr/lib64/libqpid-proton.so.2
 #93562 0x0039c6c2c44b in pn_do_transfer () from 
 /usr/lib64/libqpid-proton.so.2
 #93563 0x0039c6c24385 in pn_dispatch_frame () from 
 /usr/lib64/libqpid-proton.so.2
 #93564 0x0039c6c2448f in pn_dispatcher_input () from 
 /usr/lib64/libqpid-proton.so.2
 #93565 0x0039c6c2d68b in pn_input_read_amqp () from 
 /usr/lib64/libqpid-proton.so.2
 #93566 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93567 0x0039c6c3011a in pn_io_layer_input_passthru () from 
 /usr/lib64/libqpid-proton.so.2
 #93568 0x0039c6c2d275 in transport_consume () from 
 /usr/lib64/libqpid-proton.so.2
 #93569 0x0039c6c304cd in pn_transport_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93570 0x0039c6c3e40c in pn_connector_process () from 
 /usr/lib64/libqpid-proton.so.2
 #93571 0x7f1060c60460 in process_connector () from 
 /home/mick/dispatch/build/libqpid-dispatch.so.0
 #93572 0x7f1060c61017 in thread_run () from 
 /home/mick/dispatch/build/libqpid-dispatch.so.0
 #93573 0x003cf9c07851 in start_thread (arg=0x7f1052bfd700) at 
 pthread_create.c:301
 #93574 

[jira] [Created] (PROTON-625) Biggest Backtrace Ever!

2014-06-30 Thread michael goulish (JIRA)
michael goulish created PROTON-625:
--

 Summary: Biggest Backtrace Ever!
 Key: PROTON-625
 URL: https://issues.apache.org/jira/browse/PROTON-625
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Reporter: michael goulish


I am saving all my stuff so I can repro on demand.
It doesn't happen every time, but it's about 50%.

--

On one box, I have a dispatch router.
On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
qpid-messaging-based senders.

Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.

100 messages will be sent to each address.

I start the 5 receivers first.  They start OK.  Dispatch router happy & stable.

Wait a few seconds.

I start the 5 senders, from a bash script.
The first sender is already sending when the 2nd, 3rd, 4th start.

After a few of them start, but before all have finished starting, a few seconds 
into the script, the crash occurs.  ( If they all start up successfully, no 
crash. )

Here is the biggest backtrace ever:

#0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
malloc.c:4383
#1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
#2  0x0039c6c1650a in pni_map_allocate () from 
/usr/lib64/libqpid-proton.so.2
#3  0x0039c6c16a3a in pni_map_ensure () from /usr/lib64/libqpid-proton.so.2
#4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
.
.
.
.
#93549 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93550 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93551 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93552 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93553 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93554 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93555 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93556 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93557 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93558 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
#93560 0x0039c6c17226 in pn_hash_put () from /usr/lib64/libqpid-proton.so.2
#93561 0x0039c6c2a643 in pn_delivery_map_push () from 
/usr/lib64/libqpid-proton.so.2
#93562 0x0039c6c2c44b in pn_do_transfer () from 
/usr/lib64/libqpid-proton.so.2
#93563 0x0039c6c24385 in pn_dispatch_frame () from 
/usr/lib64/libqpid-proton.so.2
#93564 0x0039c6c2448f in pn_dispatcher_input () from 
/usr/lib64/libqpid-proton.so.2
#93565 0x0039c6c2d68b in pn_input_read_amqp () from 
/usr/lib64/libqpid-proton.so.2
#93566 0x0039c6c3011a in pn_io_layer_input_passthru () from 
/usr/lib64/libqpid-proton.so.2
#93567 0x0039c6c3011a in pn_io_layer_input_passthru () from 
/usr/lib64/libqpid-proton.so.2
#93568 0x0039c6c2d275 in transport_consume () from 
/usr/lib64/libqpid-proton.so.2
#93569 0x0039c6c304cd in pn_transport_process () from 
/usr/lib64/libqpid-proton.so.2
#93570 0x0039c6c3e40c in pn_connector_process () from 
/usr/lib64/libqpid-proton.so.2
#93571 0x7f1060c60460 in process_connector () from 
/home/mick/dispatch/build/libqpid-dispatch.so.0
#93572 0x7f1060c61017 in thread_run () from 
/home/mick/dispatch/build/libqpid-dispatch.so.0
#93573 0x003cf9c07851 in start_thread (arg=0x7f1052bfd700) at 
pthread_create.c:301
#93574 0x003cf98e890d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:115




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (PROTON-625) Biggest Backtrace Ever!

2014-06-30 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-625:
---

  Description: 
I am saving all my stuff so I can repro on demand.
It doesn't happen every time, but it's about 50%.

--

On one box, I have a dispatch router.
On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
qpid-messaging-based senders.

Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.

100 messages will be sent to each address.

I start the 5 receivers first.  They start OK.  Dispatch router happy & stable.

Wait a few seconds.

I start the 5 senders, from a bash script.
The first sender is already sending when the 2nd, 3rd, 4th start.

After a few of them start, but before all have finished starting, a few seconds 
into the script, the crash occurs.  ( If they all start up successfully, no 
crash. )

The crash occurs in the dispatch router.

Here is the biggest backtrace ever:

#0  0x003cf9879ad1 in _int_malloc (av=0x7f101c20, bytes=16384) at 
malloc.c:4383
#1  0x003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
#2  0x0039c6c1650a in pni_map_allocate () from 
/usr/lib64/libqpid-proton.so.2
#3  0x0039c6c16a3a in pni_map_ensure () from /usr/lib64/libqpid-proton.so.2
#4  0x0039c6c16c45 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#5  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#6  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#7  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#8  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#9  0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#10 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#11 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#12 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#13 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
#14 0x0039c6c16c64 in pni_map_entry () from /usr/lib64/libqpid-proton.so.2
.
.
.
.
#93549 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93550 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93551 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93552 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93553 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93554 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93555 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93556 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93557 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93558 0x0039c6c16c64 in pni_map_entry () from 
/usr/lib64/libqpid-proton.so.2
#93559 0x0039c6c16dc0 in pn_map_put () from /usr/lib64/libqpid-proton.so.2
#93560 0x0039c6c17226 in pn_hash_put () from /usr/lib64/libqpid-proton.so.2
#93561 0x0039c6c2a643 in pn_delivery_map_push () from 
/usr/lib64/libqpid-proton.so.2
#93562 0x0039c6c2c44b in pn_do_transfer () from 
/usr/lib64/libqpid-proton.so.2
#93563 0x0039c6c24385 in pn_dispatch_frame () from 
/usr/lib64/libqpid-proton.so.2
#93564 0x0039c6c2448f in pn_dispatcher_input () from 
/usr/lib64/libqpid-proton.so.2
#93565 0x0039c6c2d68b in pn_input_read_amqp () from 
/usr/lib64/libqpid-proton.so.2
#93566 0x0039c6c3011a in pn_io_layer_input_passthru () from 
/usr/lib64/libqpid-proton.so.2
#93567 0x0039c6c3011a in pn_io_layer_input_passthru () from 
/usr/lib64/libqpid-proton.so.2
#93568 0x0039c6c2d275 in transport_consume () from 
/usr/lib64/libqpid-proton.so.2
#93569 0x0039c6c304cd in pn_transport_process () from 
/usr/lib64/libqpid-proton.so.2
#93570 0x0039c6c3e40c in pn_connector_process () from 
/usr/lib64/libqpid-proton.so.2
#93571 0x7f1060c60460 in process_connector () from 
/home/mick/dispatch/build/libqpid-dispatch.so.0
#93572 0x7f1060c61017 in thread_run () from 
/home/mick/dispatch/build/libqpid-dispatch.so.0
#93573 0x003cf9c07851 in start_thread (arg=0x7f1052bfd700) at 
pthread_create.c:301
#93574 0x003cf98e890d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:115


  was:
I am saving all my stuff so I can repro on demand.
It doesn't happen every time, but it's about 50%.

--

On one box, I have a dispatch router.
On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
qpid-messaging-based senders.

Each client will handle 100 addresses, of the form mick/0 ... mick/1 ... &c.

100 messages will be sent to each address.

I start the 5 

[jira] [Commented] (PROTON-577) CollectorImpl creates a lot of unnecessary garbage

2014-05-05 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989829#comment-13989829
 ] 

michael goulish commented on PROTON-577:


What the engineer *means* to say is "superfluous paraphernalia".

 CollectorImpl creates a lot of unnecessary garbage
 --

 Key: PROTON-577
 URL: https://issues.apache.org/jira/browse/PROTON-577
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-j
Affects Versions: 0.7
Reporter: Rafael H. Schloming
Assignee: Rafael H. Schloming





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (PROTON-566) crash in pn_transport_set_max_frame

2014-04-16 Thread michael goulish (JIRA)
michael goulish created PROTON-566:
--

 Summary: crash in pn_transport_set_max_frame
 Key: PROTON-566
 URL: https://issues.apache.org/jira/browse/PROTON-566
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.7
 Environment: 3 boxes.  1 with senders, 1 with receivers, and 1 in the 
middle with a single router.
Reporter: michael goulish


Here's what I do:

( I have saved all relevant software so I can repro this. )

  1. On router box, start 1 router.
  2. On receiver box, start 1000 receivers.  With delays in between each group 
of 50, so as to avoid backlog problem.
  3. After receivers are all started, start 1000 senders also with delays.
 Senders start up but do not yet begin sending until I manually signal 
them by touching a file.
  4. A short time after the senders start sending, qdrouter crashes in proton 
code, with this traceback:

  Core was generated by `/home/mick/dispatch/build/router/qdrouterd --config 
./config_1/X.conf'.
  Program terminated with signal 11, Segmentation fault.
  #0  0x7f29d3c0f3c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  1915        transport->local_max_frame = size;
  

  #0  0x7f8ad5a613c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  #1  0x7f8ad5cdd4bd in thread_process_listeners (qd_server=0x14f8e10) 
at /home/mick/dispatch/src/server.c:100
  #2  0x7f8ad5cddedb in thread_run (arg=0x1490bf0) at 
/home/mick/dispatch/src/server.c:416
  #3  0x003638c07de3 in start_thread () from /lib64/libpthread.so.0
  #4  0x0036388f616d in clone () from /lib64/libc.so.6




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (PROTON-566) crash in pn_transport_set_max_frame

2014-04-16 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-566:
---

Description: 
Here's what I do:

( I have saved all relevant software so I can repro this. )

  1. On router box, start 1 router.
  2. On receiver box, start 1000 receivers.  With delays in between each group 
of 50, so as to avoid backlog problem.
  3. After receivers are all started, start 1000 senders also with delays.
 Senders start up but do not yet begin sending until I manually signal 
them by touching a file.
  4. A short time after the senders start sending, qdrouter crashes in proton 
code, with this traceback:

  Core was generated by `/home/mick/dispatch/build/router/qdrouterd --config 
./config_1/X.conf'.
  Program terminated with signal 11, Segmentation fault.
  #0  0x7f29d3c0f3c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  1915        transport->local_max_frame = size;
  

  #0  0x7f8ad5a613c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  #1  0x7f8ad5cdd4bd in thread_process_listeners (qd_server=0x14f8e10) 
at /home/mick/dispatch/src/server.c:100
  #2  0x7f8ad5cddedb in thread_run (arg=0x1490bf0) at 
/home/mick/dispatch/src/server.c:416
  #3  0x003638c07de3 in start_thread () from /lib64/libpthread.so.0
  #4  0x0036388f616d in clone () from /lib64/libc.so.6



Looks like this is not a proton problem, but something in dispatch.
I'm closing this and moving it



  was:
Here's what I do:

( I have saved all relevant software so I can repro this. )

  1. On router box, start 1 router.
  2. On receiver box, start 1000 receivers.  With delays in between each group 
of 50, so as to avoid backlog problem.
  3. After receivers are all started, start 1000 senders also with delays.
 Senders start up but do not yet begin sending until I manually signal 
them by touching a file.
  4. A short time after the senders start sending, qdrouter crashes in proton 
code, with this traceback:

  Core was generated by `/home/mick/dispatch/build/router/qdrouterd --config 
./config_1/X.conf'.
  Program terminated with signal 11, Segmentation fault.
  #0  0x7f29d3c0f3c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  1915        transport->local_max_frame = size;
  

  #0  0x7f8ad5a613c0 in pn_transport_set_max_frame (transport=0x0, 
size=65536)
  at /home/mick/proton/proton-c/src/transport/transport.c:1915
  #1  0x7f8ad5cdd4bd in thread_process_listeners (qd_server=0x14f8e10) 
at /home/mick/dispatch/src/server.c:100
  #2  0x7f8ad5cddedb in thread_run (arg=0x1490bf0) at 
/home/mick/dispatch/src/server.c:416
  #3  0x003638c07de3 in start_thread () from /lib64/libpthread.so.0
  #4  0x0036388f616d in clone () from /lib64/libc.so.6



 crash in pn_transport_set_max_frame
 ---

 Key: PROTON-566
 URL: https://issues.apache.org/jira/browse/PROTON-566
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.7
 Environment: 3 boxes.  1 with senders, 1 with receivers, and 1 in the 
 middle with a single router.
Reporter: michael goulish

 Here's what I do:
 ( I have saved all relevant software so I can repro this. )
   1. On router box, start 1 router.
   2. On receiver box, start 1000 receivers.  With delays in between each 
 group of 50, so as to avoid backlog problem.
   3. After receivers are all started, start 1000 senders also with delays.
  Senders start up but do not yet begin sending until I manually signal 
 them by touching a file.
   4. A short time after the senders start sending, qdrouter crashes in proton 
 code, with this traceback:
   Core was generated by `/home/mick/dispatch/build/router/qdrouterd --config 
 ./config_1/X.conf'.
   Program terminated with signal 11, Segmentation fault.
   #0  0x7f29d3c0f3c0 in pn_transport_set_max_frame (transport=0x0, 
 size=65536)
   at /home/mick/proton/proton-c/src/transport/transport.c:1915
   1915        transport->local_max_frame = size;
   
   #0  0x7f8ad5a613c0 in pn_transport_set_max_frame (transport=0x0, 
 size=65536)
   at /home/mick/proton/proton-c/src/transport/transport.c:1915
   #1  0x7f8ad5cdd4bd in thread_process_listeners 
 (qd_server=0x14f8e10) at /home/mick/dispatch/src/server.c:100
   #2  0x7f8ad5cddedb in thread_run (arg=0x1490bf0) at 
 /home/mick/dispatch/src/server.c:416
   #3  0x003638c07de3 in start_thread () from /lib64/libpthread.so.0
   #4  0x0036388f616d in clone () from /lib64/libc.so.6
 Looks like this is not a 

[jira] [Closed] (PROTON-566) crash in pn_transport_set_max_frame

2014-04-16 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish closed PROTON-566.
--

Resolution: Fixed

It looks like this is not a proton issue, but a dispatch issue.
I'm closing this and moving it.
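
The traceback shows transport=0x0 arriving at the setter, so the missing piece is a check on the caller's side. A minimal sketch of that kind of guard follows. The toy_transport_t struct and both function names are hypothetical stand-ins so the example compiles on its own; this is not the actual dispatch code or the actual fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pn_transport_t, which is opaque to callers. */
typedef struct {
  uint32_t local_max_frame;
} toy_transport_t;

/* Mirrors the assignment at transport.c:1915 in the traceback:
   it dereferences the pointer without checking it. */
static void toy_transport_set_max_frame(toy_transport_t *transport, uint32_t size)
{
  transport->local_max_frame = size;
}

/* Caller-side guard: returns 0 on success, -1 if there is no transport. */
static int configure_max_frame(toy_transport_t *transport, uint32_t size)
{
  if (transport == NULL) {
    return -1;   /* nothing to configure; let the caller decide what to do */
  }
  toy_transport_set_max_frame(transport, size);
  return 0;
}
```

With a guard like this, a missing transport turns into an error return the listener thread can handle instead of a segmentation fault.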

 crash in pn_transport_set_max_frame
 ---

 Key: PROTON-566
 URL: https://issues.apache.org/jira/browse/PROTON-566
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.7
 Environment: 3 boxes.  1 with senders, 1 with receivers, and 1 in the 
 middle with a single router.
Reporter: michael goulish

 Here's what I do:
 ( I have saved all relevant software so I can repro this. )
   1. On router box, start 1 router.
   2. On receiver box, start 1000 receivers.  With delays in between each 
 group of 50, so as to avoid backlog problem.
   3. After receivers are all started, start 1000 senders also with delays.
  Senders start up but do not yet begin sending until I manually signal 
 them by touching a file.
   4. A short time after the senders start sending, qdrouter crashes in proton 
 code, with this traceback:
   Core was generated by `/home/mick/dispatch/build/router/qdrouterd --config 
 ./config_1/X.conf'.
   Program terminated with signal 11, Segmentation fault.
   #0  0x7f29d3c0f3c0 in pn_transport_set_max_frame (transport=0x0, 
 size=65536)
   at /home/mick/proton/proton-c/src/transport/transport.c:1915
   1915        transport->local_max_frame = size;
   
   #0  0x7f8ad5a613c0 in pn_transport_set_max_frame (transport=0x0, 
 size=65536)
   at /home/mick/proton/proton-c/src/transport/transport.c:1915
   #1  0x7f8ad5cdd4bd in thread_process_listeners 
 (qd_server=0x14f8e10) at /home/mick/dispatch/src/server.c:100
   #2  0x7f8ad5cddedb in thread_run (arg=0x1490bf0) at 
 /home/mick/dispatch/src/server.c:416
   #3  0x003638c07de3 in start_thread () from /lib64/libpthread.so.0
   #4  0x0036388f616d in clone () from /lib64/libc.so.6
 Looks like this is not a proton problem, but something in dispatch.
 I'm closing this and moving it



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (PROTON-452) Ruby API doesn't have pn_messenger_interrupt()

2013-11-13 Thread michael goulish (JIRA)
michael goulish created PROTON-452:
--

 Summary: Ruby API doesn't have pn_messenger_interrupt()
 Key: PROTON-452
 URL: https://issues.apache.org/jira/browse/PROTON-452
 Project: Qpid Proton
  Issue Type: Bug
Affects Versions: 0.5
Reporter: michael goulish


It looks like the Ruby binding doesn't cover the new-ish C function  
pn_messenger_interrupt().





--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (PROTON-260) Messenger Documentation

2013-10-16 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797027#comment-13797027
 ] 

michael goulish edited comment on PROTON-260 at 10/16/13 5:30 PM:
--

rev 152 -- checked in new C API doxygen comments in messenger.h



was (Author: mgoulish):
rev r152 -- checked in new C API doxygen comments in messenger.h


 Messenger Documentation
 ---

 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish
Assignee: michael goulish

 Write documentation for the Proton Messenger interface, to include:
   introduction
   API explanations
   theory of operation
   example programs
   programming idioms
   tutorials
   quickstarts
   troubleshooting
 Documents should use MarkDown markup language.





[jira] [Commented] (PROTON-260) Messenger Documentation

2013-10-16 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797027#comment-13797027
 ] 

michael goulish commented on PROTON-260:


rev r152 -- checked in new C API doxygen comments in messenger.h


 Messenger Documentation
 ---

 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish
Assignee: michael goulish

 Write documentation for the Proton Messenger interface, to include:
   introduction
   API explanations
   theory of operation
   example programs
   programming idioms
   tutorials
   quickstarts
   troubleshooting
 Documents should use MarkDown markup language.





[jira] [Created] (PROTON-300) qpidd --help should show sasl config path default

2013-04-19 Thread michael goulish (JIRA)
michael goulish created PROTON-300:
--

 Summary: qpidd --help should show sasl config path default
 Key: PROTON-300
 URL: https://issues.apache.org/jira/browse/PROTON-300
 Project: Qpid Proton
  Issue Type: Bug
Reporter: michael goulish
Priority: Minor


qpidd --help does not show the sasl config path default, which is /etc/sasl2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PROTON-300) qpidd --help should show sasl config path default

2013-04-19 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-300:
---

Assignee: michael goulish

 qpidd --help should show sasl config path default
 -

 Key: PROTON-300
 URL: https://issues.apache.org/jira/browse/PROTON-300
 Project: Qpid Proton
  Issue Type: Bug
Reporter: michael goulish
Assignee: michael goulish
Priority: Minor

 qpidd --help does not show the sasl config path default, which is /etc/sasl2  
   



[jira] [Created] (PROTON-295) recv(-1) + incoming_window == bad

2013-04-17 Thread michael goulish (JIRA)
michael goulish created PROTON-295:
--

 Summary: recv(-1) + incoming_window == bad
 Key: PROTON-295
 URL: https://issues.apache.org/jira/browse/PROTON-295
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.4
Reporter: michael goulish


Use of recv(-1) could receive enough messages that some would exceed the 
incoming window size and be automatically accepted -- with app logic never 
getting a say in the matter.
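
One way an application could sidestep this, sketched under assumptions: rather than passing -1, ask for no more messages than the incoming window can track. The helper name bounded_recv_count and its parameters are hypothetical and not part of the Proton API; it only computes the count that would then be handed to recv, and it assumes a positive incoming window has been configured.

```c
#include <assert.h>

/* Hypothetical helper, not a Proton function.
   Returns how many messages to ask for so that the request never
   exceeds the incoming window.  A negative `requested` means
   "no limit", mirroring recv(-1). */
int bounded_recv_count(int requested, int incoming_window)
{
  /* Assumption: the caller has configured a positive window. */
  assert(incoming_window > 0);

  if (requested < 0 || requested > incoming_window)
    return incoming_window;   /* clamp unlimited or oversized requests */

  return requested;           /* already within the window */
}
```

With a window of 10, a request of -1 or 50 is clamped to 10, while a request of 5 passes through unchanged, so app logic gets to see every message before it leaves the window.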



[jira] [Commented] (PROTON-260) Messenger Documentation

2013-04-03 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621232#comment-13621232
 ] 

michael goulish commented on PROTON-260:


rev 1464126 -- new version of  message_disposition.md based on Rafi's and 
Alan's comments.

 Messenger Documentation
 ---

 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish
Assignee: michael goulish

 Write documentation for the Proton Messenger interface, to include:
   introduction
   API explanations
   theory of operation
   example programs
   programming idioms
   tutorials
   quickstarts
   troubleshooting
 Documents should use MarkDown markup language.



[jira] [Commented] (PROTON-260) Messenger Documentation

2013-03-14 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602578#comment-13602578
 ] 

michael goulish commented on PROTON-260:


Checked in trunk/docs/messenger message_disposition.md -- rev 1456600.

 Messenger Documentation
 ---

 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish
Assignee: michael goulish

 Write documentation for the Proton Messenger interface, to include:
   introduction
   API explanations
   theory of operation
   example programs
   programming idioms
   tutorials
   quickstarts
   troubleshooting
 Documents should use MarkDown markup language.



[jira] [Created] (PROTON-260) Messenger Documentation

2013-03-05 Thread michael goulish (JIRA)
michael goulish created PROTON-260:
--

 Summary: Messenger Documentation
 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish


Write documentation for the Proton Messenger interface, to include:


  introduction

  API explanations

  theory of operation

  example programs

  programming idioms

  tutorials

  quickstarts

  troubleshooting






[jira] [Updated] (PROTON-260) Messenger Documentation

2013-03-05 Thread michael goulish (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

michael goulish updated PROTON-260:
---

Description: 
Write documentation for the Proton Messenger interface, to include:


  introduction

  API explanations

  theory of operation

  example programs

  programming idioms

  tutorials

  quickstarts

  troubleshooting


Documents should use MarkDown markup language.


  was:
Write documentation for the Proton Messenger interface, to include:


  introduction

  API explanations

  theory of operation

  example programs

  programming idioms

  tutorials

  quickstarts

  troubleshooting





 Messenger Documentation
 ---

 Key: PROTON-260
 URL: https://issues.apache.org/jira/browse/PROTON-260
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.5
Reporter: michael goulish

 Write documentation for the Proton Messenger interface, to include:
   introduction
   API explanations
   theory of operation
   example programs
   programming idioms
   tutorials
   quickstarts
   troubleshooting
 Documents should use MarkDown markup language.



[jira] [Created] (PROTON-243) 0.4 RC1 libqpid-proton not found

2013-02-18 Thread michael goulish (JIRA)
michael goulish created PROTON-243:
--

 Summary: 0.4 RC1 libqpid-proton not found 
 Key: PROTON-243
 URL: https://issues.apache.org/jira/browse/PROTON-243
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.4
 Environment: Fedora 18
Reporter: michael goulish


All build steps went well, following the README directions, until I got to 
building the C examples.

Here is what happened then:

( executive summary: I had to set LD_LIBRARY_PATH to get libqpid-proton to be 
findable when the example was run. )

  cd ../examples/messenger/c
  cmake .
  make

  ./recv 

  ./recv: error while loading shared libraries: 
libqpid-proton.so.1: cannot open shared object file: 
No such file or directory

  # Uh-oh.

  ldd recv
    linux-vdso.so.1 =>  (0x7fff0396)
    libqpid-proton.so.1 => not found
    libc.so.6 => /lib64/libc.so.6 (0x7f5dfc48f000)
    /lib64/ld-linux-x86-64.so.2 (0x7f5dfc851000)

  export LD_LIBRARY_PATH=/usr/lib

  ./recv 
  # It's Happy !
  ./send
  Address: amqp://0.0.0.0
  Subject: (no subject)
  Content: Hello World!
  # Hooray !




[jira] [Commented] (PROTON-200) [Proton-c] Credit distribution by messenger is not balanced across all links

2013-02-17 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580317#comment-13580317
 ] 

michael goulish commented on PROTON-200:


I think this bug is a release blocker.

An implication that is not immediately obvious is that if you have one 
receiver with two senders, one of the senders will hang until the other 
sender calls 'stop'.
This makes it impossible to set up any topology except the simplest possible -- 
one sender, one receiver.



 [Proton-c] Credit distribution by messenger is not balanced across all links
 

 Key: PROTON-200
 URL: https://issues.apache.org/jira/browse/PROTON-200
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.3
Reporter: Ken Giusti
Assignee: Ken Giusti
 Fix For: 0.4


 The method used to distribute credit to receiving links may lead to 
 starvation when the number of receiving links is > the available credit.
 The distribution algorithm always starts with the same link - see 
 messenger.c::pn_messenger_flow()
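
 To illustrate the starvation and one possible remedy, here is a minimal sketch that hands out credit one unit at a time and remembers where it stopped, so that the next round starts at a different link. The credit_pool_t type, the distribute_credit name, and the fixed limit of 8 links are all hypothetical stand-ins; this is not the code in messenger.c, nor necessarily the fix that was applied there.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 8   /* arbitrary limit for this sketch */

/* Hypothetical stand-in for per-link credit state;
   not Proton's actual data structure. */
typedef struct {
  int    credit[MAX_LINKS];  /* credit granted to each receiving link */
  size_t n_links;            /* number of receiving links in use      */
  size_t next;               /* link that receives the next unit      */
} credit_pool_t;

/* Hand out `available` units of credit one at a time, starting at
   pool->next instead of always at link 0, and remember the stopping
   point so later calls continue around the ring. */
void distribute_credit(credit_pool_t *pool, int available)
{
  assert(pool->n_links > 0 && pool->n_links <= MAX_LINKS);

  while (available > 0) {
    pool->credit[pool->next] += 1;
    pool->next = (pool->next + 1) % pool->n_links;
    --available;
  }
}
```

 With 3 links and only 2 units of credit per call, the first call feeds links 0 and 1 and the second call feeds links 2 and 0. If every call started at link 0 instead, link 2 would never receive credit, which is the starvation described above.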



[jira] [Commented] (PROTON-222) pn_messenger_send returns before message data has been written to the wire

2013-02-12 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576893#comment-13576893
 ] 

michael goulish commented on PROTON-222:


I am able to get my example working the way I want to by using a tracker, with 
window size 1, on the sender, and calling  pn_messenger_status() after every 
message sent.

new code:

======== sender ========

#include <proton/message.h>
#include <proton/messenger.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* sleep() */



int
main(int argc, char** argv)
{
  char addr [ 1000 ];
  char content [ 1000 ];
  char subject [ 1000 ];

  /* The port to send to is the first command-line argument. */
  if ( argc < 2 )
  {
    fprintf ( stderr, "usage: %s port\n", argv[0] );
    return 1;
  }

  sprintf ( addr, "amqp://0.0.0.0:%s", argv[1] );

  pn_message_t * message;
  pn_messenger_t * messenger;

  message = pn_message();
  messenger = pn_messenger(NULL);
  /* Outgoing window of 1, so the most recent message can be tracked. */
  pn_messenger_set_outgoing_window ( messenger, 1 );

  pn_messenger_start(messenger);

  int n_messages = 10;
  int sent_count;

  /*--
    Put and send a message every 1 second.
  --*/
  for ( sent_count = 0 ; sent_count < n_messages; ++ sent_count )
  {
    sleep ( 1 );
    sprintf ( subject, "This is message %d.", sent_count + 1 );
    pn_message_set_address ( message, addr );
    pn_message_set_subject ( message, subject );
    pn_data_t *body = pn_message_body(message);
    sprintf ( content, "Hello, Proton!" );
    pn_data_put_string(body, pn_bytes(strlen(content), content));
    pn_messenger_put(messenger, message);

    /* Get the tracker for the message just put, send, and then
       query the status of that tracker after every message. */
    pn_tracker_t tracker;
    tracker = pn_messenger_outgoing_tracker ( messenger );

    pn_messenger_send(messenger);

    pn_messenger_status ( messenger, tracker );

    fprintf ( stderr, "sent %d messages.\n", sent_count + 1 );
  }


  // Countdown to stop, to give me time to see it 
  fprintf ( stderr, "Calling stop in ...\n" );
  for ( int i = 5; i > 0; -- i )
  {
    fprintf ( stderr, "%d\n", i );
    sleep ( 1 );
  }
  fprintf ( stderr, "stop.\n");

  pn_messenger_stop(messenger);
  pn_messenger_free(messenger);
  pn_message_free(message);

  return 0;
}



======== receiver ========

#include <proton/message.h>
#include <proton/messenger.h>

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>



#define BUFSIZE 1024


/* Get n messages from the messenger's incoming queue and print each one. */
void
consume_messages ( pn_messenger_t * messenger, int n, pn_message_t * message )
{
  for ( int consume_count = 0; consume_count < n; ++ consume_count )
  {
    pn_messenger_get ( messenger, message );

    size_t bufsize = BUFSIZE;
    char buffer [ bufsize ];
    pn_data_t * body = pn_message_body ( message );
    pn_data_format ( body, buffer, & bufsize );

    printf ( "\n\nMessage \n");
    printf ( "Address: %s\n", pn_message_get_address ( message ) );
    char const * subject = pn_message_get_subject(message);
    printf ( "Subject: %s\n", subject ? subject : "(no subject)" );
    printf("Content: %s\n\n", buffer);
  }
}


int
main(int argc, char** argv)
{
  char addr [ 1000 ];

  /* The port to listen on is the first command-line argument. */
  if ( argc < 2 )
  {
    fprintf ( stderr, "usage: %s port\n", argv[0] );
    return 1;
  }

  sprintf ( addr, "amqp://~0.0.0.0:%s", argv[1] );
  pn_message_t   * message;
  pn_messenger_t * messenger;

  message = pn_message();
  messenger = pn_messenger ( NULL );

  pn_messenger_start(messenger);
  pn_messenger_subscribe ( messenger, addr );

  int messages_wanted= 10;
  int total_received =  0;
  int received_this_time;

  pn_messenger_set_timeout ( messenger, 700 );

  int tries = 0;
  while ( total_received < messages_wanted )
  {
    ++ tries;
    pn_messenger_recv ( messenger, BUFSIZE );
    received_this_time = pn_messenger_incoming ( messenger );
    fprintf ( stderr,
              "try: %d received: %d   total: %d\n",
              tries,
              received_this_time,
              total_received
            );
    consume_messages ( messenger, received_this_time, message );
    total_received += received_this_time;
  }

  pn_messenger_stop(messenger);
  pn_messenger_free(messenger);
  pn_message_free(message);

  return 0;
}









 pn_messenger_send returns before message data has been written to the wire
 --

 Key: PROTON-222
 URL: https://issues.apache.org/jira/browse/PROTON-222
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c, proton-j
Affects Versions: 0.3
Reporter: Rafael H. Schloming
Assignee: Ken Giusti
 Fix For: 0.4

 Attachments: transport.patch


 Currently, pn_messenger_send will block until the engine reports there are no 
 queued messages being held. The problem arises because the queued message 
 count only reports message data that is being held by the engine due to 
 insufficient credit to send the messages. Messages may also be sitting in the 
 transport's encoded frame buffer waiting to be