[ https://issues.apache.org/jira/browse/QPID-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570940#comment-16570940 ]

dan clark commented on QPID-8134:
---------------------------------

I hope to refactor the code to provide a more realistic example that
closely mimics the 24x7 application.  Note that going through the normal
exit path via connection.close() is not a valid test, so the
'options.vexit' path must be used for analysis.

Let me try to provide a clearer example: given an application that echoes
back responses to messages on a topic, run the application 24x7 (therefore
never hitting the close path).  Have other applications, some long running,
some that start up and shut down, all of which send a request to the first
application listening on the topic and then get a reply.  The application
sending the replies continues to lose memory, all of it associated with the
'send' path.  A sketch of such a replier follows.
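
A minimal sketch of that 24x7 replier, assuming the qpid::messaging C++
API and placeholder broker/topic addresses (not the exact ones from my
application):

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Session.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Sender.h>
#include <qpid/messaging/Message.h>

using namespace qpid::messaging;

int main() {
    Connection connection("localhost:5672");  // placeholder broker URL
    connection.open();
    Session session = connection.createSession();
    Receiver receiver = session.createReceiver("amq.topic/requests");
    Sender sender = session.createSender("amq.topic/replies");
    for (;;) {                       // runs 24x7: the close path is never reached
        Message request = receiver.fetch();
        Message reply(request.getContent());
        sender.send(reply);          // memory retained per send accumulates here
        session.acknowledge();
    }
    // connection.close() is never called, so structures the library only
    // reclaims on close live for the life of the process.
}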

Spout and drain were used to provide very simple, already-written sample
applications that produce the exact same valgrind output pointing to the
location of the leak.  Note that the leak can be worked around by setting
the send path's LINK attribute to use a different 'reliability' attribute:

So: using the following link attributes on both sender and receiver leaks
memory on the send path:
link: {name:send-link, reliability: at-least-once, timeout:1}

Changing the link attributes on both sender and receiver no longer leaks
memory, but changes the reliability (see the address sketch below):
link: {name:send-link, reliability: at-most-once, timeout:1} // or
reliability: unreliable // documented as equivalent.
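
Concretely, these attributes are part of the address string handed to
createSender/createReceiver; a minimal sketch, assuming placeholder topic
and subject names:

// Leaks on the send path, per this report:
Sender leaky = session.createSender(
    "amq.topic/test; {link: {name: send-link, reliability: at-least-once, timeout: 1}}");
// Workaround observed here: weaker reliability, no leak:
Sender workaround = session.createSender(
    "amq.topic/test; {link: {name: send-link, reliability: at-most-once, timeout: 1}}");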

I apologize for the over-simplified example setup; it was chosen to provide
the most accessible application for analysis.  However, it did reproduce
the problem.

Note that due to the very high transaction load and the near-real-time
nature of the application, there are times when some elasticity is required
of the queues.  This means that the default queue limit is set quite high
to provide some buffering.

Is it possible to turn OFF the policy in the QPID C++ library that attempts
to outwit the underlying malloc implementation by caching unused memory?
This policy tends to reserve a very large cache of essentially unused
memory for every running application, which is not a good policy for every
daemon accessing the system.

For example, given the following qpidd.conf file, one would expect that each
application using such a policy might grow in a somewhat unbounded fashion
despite only needing high message elasticity during peak demand:

cat /etc/qpid/qpidd.conf

auth=yes
# no tracing (too verbose)
trace=no
# by default qpid sets worker threads to the number of processors
# worker-threads=n
log-enable=warning+
log-enable=info+:Broker
log-enable=info+:Queue
# send logging to a file (see logrotate)
log-to-stderr=no
log-to-stdout=no
log-to-file=/var/log/qpidd.log
log-to-syslog=no
# drop the purge interval; default is 10m (600s)
queue-purge-interval=300
# bump the queue limit; default (bytes) is 104857600
default-queue-limit=1048576000
# increase responsiveness
tcp-nodelay=yes
# bump the flow stop threshold; default is 80%
default-flow-stop-threshold=90
# maintain the flow resume threshold; default is 70%
default-flow-resume-threshold=85
# start getting events earlier, at the 75% level (default 80%)
default-event-threshold-ratio=75
# send errant messages to the standard topic but allow tuning (default qpid.no-group)
default-message-group=qpid.no-group
# allow adjustments to set the receive timestamp (default no)
enable-timestamp=no

-- 
Dan Clark   503-915-3646


> qpid::client::Message::send multiple memory leaks
> -------------------------------------------------
>
>                 Key: QPID-8134
>                 URL: https://issues.apache.org/jira/browse/QPID-8134
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: qpid-cpp-1.37.0, qpid-cpp-1.38.0
>         Environment: *CentOS* Linux release 7.4.1708 (Core)
> Linux localhost.novalocal 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 
> UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> *qpid*-qmf-1.37.0-1.el7.x86_64
> *qpid*-dispatch-debuginfo-1.0.0-1.el7.x86_64
> python-*qpid*-1.37.0-1.el7.noarch
> *qpid*-proton-c-0.18.1-1.el7.x86_64
> python-*qpid*-qmf-1.37.0-1.el7.x86_64
> *qpid*-proton-debuginfo-0.18.1-1.el7.x86_64
> *qpid*-cpp-debuginfo-1.37.0-1.el7.x86_64
> *qpid*-cpp-client-devel-1.37.0-1.el7.x86_64
> *qpid*-cpp-server-1.37.0-1.el7.x86_64
> *qpid*-cpp-client-1.37.0-1.el7.x86_64
>  
>            Reporter: dan clark
>            Assignee: Alan Conway
>            Priority: Blocker
>              Labels: leak, maven
>             Fix For: qpid-cpp-1.39.0
>
>         Attachments: drain.cpp, godrain.sh, gospout.sh, qpid-8134.tgz, 
> qpid-stat.out, spout.cpp, spout.log
>
>   Original Estimate: 40h
>  Remaining Estimate: 40h
>
> There may be multiple leaks of the outgoing message structure and associated 
> fields when using the qpid::client::amqp0_10::SenderImpl::send function to 
> publish messages under certain setups. I will concede that there may be 
> options beyond my ken to ameliorate the leak of message structures, 
> especially since there is an indication that under prolonged runs (a 
> daemonized version of an application like spout) the statistics for qpidd 
> indicate increased acquires with zero releases.
> The basic notion is illustrated with the test applications spout (and drain).  
> Consider a long-running daemon that reduces the overhead of open/send/close by 
> keeping the message connection open for long periods of time.  The logic would 
> then be: start the application and open a connection, then send data in a loop 
> (and never reach a close).  Thus the spout application illustrates the 
> behavior and demonstrates the leak under valgrind by sending the data followed 
> by an exit(0).
> Note also the lack of 'releases' associated with the 'acquires' in the stats 
> output.
> Capturing the leaks using the test applications spout/drain required adding 
> an 'exit()' prior to the close, because during normal operation of a daemon 
> the connection remains open for a sustained period of time.  The leaked 
> structures within the C++ client library are thus found as structures still 
> tracked by the library and cleaned up on 'connection.close()'; however, they 
> should be cleaned up when the send/receive ack completes or when the 
> message's TTL expires, whichever comes first.  I have witnessed the leaked 
> structures grow to millions of messages over more than 24 hours with a short 
> (300s) TTL on the messages, based on the attached scenarios using spout/drain 
> as the test vehicle.
> The attached spout.log uses a short 10-message test and contains 5 sets of 
> different leaked structures (found via the 'bytes in 10 blocks are still 
> reachable' lines) that are in line with the much more sustained leaks seen 
> when running the application for multiple days with millions of messages.
> The leaks seem to be associated with structures allocating std::strings to 
> save the "subject" and the "payload" for string-based messages using send for 
> amq.topic output.
> Suggested workarounds are welcome, based either on application-level changes 
> to spout/drain (if they are missing key components) or on changes to the 
> address/setup of the queues for amq.topic messages (see the gospout.sh and 
> godrain.sh test drivers, which provide the specific address structures being 
> used).
> For example, the following is one of the 5 different categories of leaked 
> data from 'spout.log', from a valgrind analysis taken after the send and 
> session.sync but prior to connection.close():
>  
> ==3388== 3,680 bytes in 10 blocks are still reachable in loss record 233 of 234
> ==3388==    at 0x4C2A203: operator new(unsigned long) (vg_replace_malloc.c:334)
> ==3388==    by 0x4EB046C: qpid::client::Message::Message(std::string const&, std::string const&) (Message.cpp:31)
> ==3388==    by 0x51742C1: qpid::client::amqp0_10::OutgoingMessage::OutgoingMessage() (OutgoingMessage.cpp:167)
> ==3388==    by 0x5186200: qpid::client::amqp0_10::SenderImpl::sendImpl(qpid::messaging::Message const&) (SenderImpl.cpp:140)
> ==3388==    by 0x5186485: operator() (SenderImpl.h:114)
> ==3388==    by 0x5186485: execute<qpid::client::amqp0_10::SenderImpl::Send> (SessionImpl.h:102)
> ==3388==    by 0x5186485: qpid::client::amqp0_10::SenderImpl::send(qpid::messaging::Message const&, bool) (SenderImpl.cpp:49)
> ==3388==    by 0x40438D: main (spout.cpp:185)
>  
>  


