[ 
https://issues.apache.org/jira/browse/QPID-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572050#comment-16572050
 ] 

dan clark commented on QPID-8134:
---------------------------------

Hi Alan!
Thanks for your engagement with this issue; I appreciate it.  I will do
some additional work with the smaller test programs to see if I can get a
better standalone use case.  I will also try massif with the visualization
tool, although under valgrind performance degrades too significantly to
test at much load, so I fear that massif may contribute yet another
performance drop...




Agreed, closing the connection generally does the correct thing and
frees the memory cached for the sender.

Thanks.

Generally the application grabs a message and immediately acknowledges it;
unfortunately there are bursts of activity, and as a result the queues can
get fairly large for brief periods of time.  Thus, to have message
guarantees it was necessary to have a fairly large capacity.  I'll try
tuning it down.  Is it possible we could consider decoupling the caching
policy from the message count?  The problem is fourfold:
a) it might be nice to have no caching at all - preferred in my case
b) message sizes can vary by a large amount (< 100 bytes to > 10k bytes),
so using a count would create a very oddly sized cache
c) there may be a large number of daemons providing different services, so
having a large memory cache that is mostly unused except at peak times is
not a judicious use of memory
d) malloc tends to do a much better job of managing memory than most
internal library methods, thus avoiding memory fragmentation and other
issues.

Agreed, although the capacity is quite large.

Yes, default use of the 0-10 protocol.  I did not see a simple path to move
over to the 1.0 protocol while maintaining the application library's
mechanism for creating and configuring the topic queues and send
parameters.

sender set capacity 1000000
receiver set capacity 1000
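
For reference, here is a minimal sketch (not our production code; the
broker URL and topic address are placeholders) of how those capacities are
applied through the qpid::messaging C++ API:

    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Session.h>
    #include <qpid/messaging/Sender.h>
    #include <qpid/messaging/Receiver.h>

    using namespace qpid::messaging;

    int main() {
        Connection connection("localhost:5672");   // placeholder broker URL
        connection.open();
        Session session = connection.createSession();

        // Capacity controls how many messages the client library may buffer
        // locally for the link; these mirror the values quoted above.
        Sender sender = session.createSender("amq.topic/example.key");
        sender.setCapacity(1000000);    // the sender-side cache in question

        Receiver receiver = session.createReceiver("amq.topic/example.key");
        receiver.setCapacity(1000);

        // ... long-running send/receive loop elided ...

        connection.close();
        return 0;
    }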

The sender has the problem in general: it requires a large elasticity, as
the sender must tolerate spikes of activity that the receiver then catches
up to.

Generally the application is receiving and immediately acknowledging, but
during activity spikes delays may be a few seconds, thus the build-up of
the queue.


May not be able to provide this as there is too much unrelated activity.




Likewise, I am clearly hoping to tune DOWN the sending capacity because it
is directly tied to a cache I would prefer to disable.

Is there a potential of this cache also being used when:

link: {name: send-link, durable: false, reliability: unreliable, timeout: 1}

as well as

link: {name: send-link, durable: false, reliability: at-least-once, timeout: 1}
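
For clarity, a rough sketch of how I would pass those link options via the
address string when creating the senders (session setup omitted; the topic
and link names are placeholders):

    // Assumes an open qpid::messaging Session named 'session'.
    // First form: unreliable link.
    Sender unreliableSender = session.createSender(
        "amq.topic/example.key; "
        "{link: {name: send-link, durable: false, reliability: unreliable, timeout: 1}}");

    // Second form: at-least-once link.
    Sender atLeastOnceSender = session.createSender(
        "amq.topic/example.key; "
        "{link: {name: send-link-alo, durable: false, reliability: at-least-once, timeout: 1}}");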





-- 
Dan Clark   503-915-3646


> qpid::client::Message::send multiple memory leaks
> -------------------------------------------------
>
>                 Key: QPID-8134
>                 URL: https://issues.apache.org/jira/browse/QPID-8134
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: qpid-cpp-1.37.0, qpid-cpp-1.38.0
>         Environment: *CentOS* Linux release 7.4.1708 (Core)
> Linux localhost.novalocal 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 
> UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> *qpid*-qmf-1.37.0-1.el7.x86_64
> *qpid*-dispatch-debuginfo-1.0.0-1.el7.x86_64
> python-*qpid*-1.37.0-1.el7.noarch
> *qpid*-proton-c-0.18.1-1.el7.x86_64
> python-*qpid*-qmf-1.37.0-1.el7.x86_64
> *qpid*-proton-debuginfo-0.18.1-1.el7.x86_64
> *qpid*-cpp-debuginfo-1.37.0-1.el7.x86_64
> *qpid*-cpp-client-devel-1.37.0-1.el7.x86_64
> *qpid*-cpp-server-1.37.0-1.el7.x86_64
> *qpid*-cpp-client-1.37.0-1.el7.x86_64
>  
>            Reporter: dan clark
>            Assignee: Alan Conway
>            Priority: Blocker
>              Labels: leak, maven
>             Fix For: qpid-cpp-1.39.0
>
>         Attachments: drain.cpp, godrain.sh, gospout.sh, qpid-8134.tgz, 
> qpid-stat.out, spout.cpp, spout.log
>
>   Original Estimate: 40h
>  Remaining Estimate: 40h
>
> There may be multiple leaks of the outgoing message structure and associated 
> fields when using the qpid::client::amqp0_10::SenderImpl::send function to 
> publish messages under certain setups. I will concede that there may be 
> options beyond my ken to ameliorate the leak of message structures, 
> especially since there is an indication that under prolonged runs (a 
> daemonized version of an application like spout) the statistics for qpidd 
> indicate increased acquires with zero releases.
> The basic notion is illustrated with the test application spout (and drain).  
> Consider a long-running daemon reducing the overhead of open/send/close by 
> keeping the message connection open for long periods of time.  The logic 
> would then be: start the application and open the connection, then in a 
> loop send data (and never reach a close).  Thus the drain application 
> illustrates the behavior and demonstrates the leak under valgrind by 
> sending the data followed by an exit(0).  
> Note also the lack of 'releases' associated with the 'acquires' in the stats 
> output.
> Capturing the leaks using the test applications spout/drain required adding 
> an 'exit()' prior to the close, since during normal operation of a daemon the 
> connection remains open for a sustained period of time; thus the leaked 
> structures within the C++ client library show up as structures still 
> tracked by the library and cleaned up on 'connection.close()', but they 
> should be cleaned up as a result of the completion of the send/receive ack or 
> the termination of the life of the message based on its TTL, whichever 
> comes first.  I have witnessed growth of the leaked structures into the 
> millions of messages lasting more than 24 hours with a short (300 sec) 
> message TTL, based on the attached scenarios using spout/drain as the test 
> vehicle.
> The attached spout.log uses a short 10-message test and contains 5 sets of 
> different leaked structures (found via the 'bytes in 10 blocks are still 
> reachable' lines) that are in line with much more sustained leaks when 
> running the application for multiple days with millions of messages.
> The leaks seem to be associated with structures allocating 'std::string's to 
> save the "subject" and the "payload" for string-based messages using send for 
> amq.topic output.
> Suggested workarounds are welcome, based on application-level changes to 
> spout/drain (if they are missing key components) or changes to the 
> address/setup of the queues for amq.topic messages (see the 'gospout.sh' and 
> 'godrain.sh' test drivers, which provide the specific address structures 
> being used).
> For example, the following is one of the 5 different categories of leaked 
> data from 'spout.log', from a valgrind analysis of the output after the send 
> and session.sync but prior to connection.close():
>  
> ==3388== 3,680 bytes in 10 blocks are still reachable in loss record 233 of 
> 234
> ==3388==    at 0x4C2A203: operator new(unsigned long) 
> (vg_replace_malloc.c:334)
> ==3388==    by 0x4EB046C: qpid::client::Message::Message(std::string const&, 
> std::string const&) (Message.cpp:31)
> ==3388==    by 0x51742C1: 
> qpid::client::amqp0_10::OutgoingMessage::OutgoingMessage() 
> (OutgoingMessage.cpp:167)
> ==3388==    by 0x5186200: 
> qpid::client::amqp0_10::SenderImpl::sendImpl(qpid::messaging::Message const&) 
> (SenderImpl.cpp:140)
> ==3388==    by 0x5186485: operator() (SenderImpl.h:114)
> ==3388==    by 0x5186485: execute<qpid::client::amqp0_10::SenderImpl::Send> 
> (SessionImpl.h:102)
> ==3388==    by 0x5186485: 
> qpid::client::amqp0_10::SenderImpl::send(qpid::messaging::Message const&, 
> bool) (SenderImpl.cpp:49)
> ==3388==    by 0x40438D: main (spout.cpp:185)
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
