On Thu, 2017-08-17 at 21:25 +0000, Juergen Gnoss wrote:
> The problem is that my program crashes after half an hour or so,
> saying:
> 
> FATAL ERROR at src/zhashx.c:210
> OUT OF MEMORY (malloc returned NULL)
> Aborted
> 
> Strangely, when I run it under valgrind for a few minutes and
> terminate it with Ctrl-C, all is OK; at least valgrind says
> that all is OK.
> 
> ==29022==
> ==29022== HEAP SUMMARY:
> ==29022== in use at exit: 0 bytes in 0 blocks
> ==29022== total heap usage: 2,665 allocs, 2,665 frees, 965,174 bytes allocated
> ==29022==
> ==29022== All heap blocks were freed -- no leaks are possible
> ==29022==
> ==29022== For counts of detected and suppressed errors, rerun with: -v
> ==29022== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> 
> So I rolled up my sleeves to find a way to see what's going on.
> 
> Watching the memory shows a slight growth in usage over time.
> 
> I tried to get familiar with Wireshark …
> 
> and I tracked it down: a mobile client that connects to the server
> changes its outgoing port at will, and sometimes even its IP.
> Sometimes it terminates the connection correctly before coming back
> over the new port or IP, sometimes not.
> 
> I put an interesting logfile on pastebin for people who are
> interested to see what's going on in my case.
> 
> https://pastebin.com/hiRc3AfD
> 
> I tried to play with heartbeat options on the socket.
> 
> In the last few days I saw other people on the list having similar
> problems with lost clients on stream sockets. If I remember well, Luca
> recommended using socket options, so the kernel will take over on
> lost clients and the library will do the rest.
> 
> In my case, I surely don't use the right combination of options,
> because the connections stay open. When my program crashes, the
> kernel still tries to notify the clients to close.
> 
> Here is what I use to create the sockets.
> 
> ```c
> /* Create the STREAM socket; zsock_new_stream binds by default. */
> self->deviceSocket = zsock_new_stream(connstr);
> if (!self->deviceSocket) {
>     zsys_error( "Error getting BSD Socket socket\n%s\n", zmq_strerror(errno));
>     free_DBPool(self);
>     return -1;
> }
> zsys_info( "BSD Socket bind to : '%s'\n", connstr);
> 
> /* Heartbeat interval */
> int hbi = zsock_heartbeat_ivl (self->deviceSocket);
> if (hbi < 300) {
>     zsys_info( "BSD Socket heartbeat is '%d' --> too low \n", hbi);
>     zsock_set_heartbeat_ivl (self->deviceSocket, 30);
>     hbi = zsock_heartbeat_ivl (self->deviceSocket);
>     zsys_info( "BSD Socket heartbeat is now : '%d'\n", hbi);
> }
> 
> /* Heartbeat timeout: compare the timeout value itself */
> int hbto = zsock_heartbeat_timeout(self->deviceSocket);
> if (hbto < 120) {
>     zsys_info( "BSD Socket heartbeat timeout is '%d' --> too low \n", hbto);
>     zsock_set_heartbeat_timeout(self->deviceSocket, 120);
>     hbto = zsock_heartbeat_timeout(self->deviceSocket);
>     zsys_info( "BSD Socket heartbeat timeout is now : '%d'\n", hbto);
> }
> 
> zpoller_add (self->devActor_poller, self->deviceSocket);
> ```
> 
> 
> Question one is now: what is the right option to use?
> Or should I take care of disconnecting clients myself, by sending a
> zero-length frame to the socket's client ID?
> 
> Question two is: why does the program crash?
> The operating system is far from out of memory at the time the
> program crashes.
> 
> thanks
> Ju
> 
> PS:
> The logfile on pastebin combines logs from 3 sources into one
> logfile:
> 
>   1.  the program running the socket
>   2.  tshark watching the port in question
>   3.  a czmq dish listening on the program's radio socket
> 
> All connections come from the same mobile client (only one client
> in that case), going to a server with a public IP (IP replaced with
> "my.ser.vers.ip"), with no (known) firewall or NAT in the way.

Hi,

A few things:

1) When using Wireshark with ZMQ, there's a great dissector for the
protocol: https://github.com/whitequark/zmtp-wireshark
I recommend using it, as it makes debugging connections much easier.

2) If a malloc is failing, then your program IS running out of memory,
as there's really no other reason why it would fail. It might not be
the system memory, but only what that process is allowed to use
(cgroups or other limitations?).

3) If you want to analyse the program's memory utilisation, I suggest
valgrind with --tool=massif; it's a great profiler.

4) As the manpage for zmq_setsockopt says, the heartbeat options only
apply to subsequent connections, so you need to create the socket
first, apply the options, and then connect it, e.g.:

s = zsock_new(ZMQ_STREAM);
zsock_set...
zsock_connect(s, "%s", connstr);
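
On point 2, here is a minimal sketch for printing the per-process
limits that can make malloc return NULL while the system as a whole
still has free memory. It assumes a POSIX system; print_limit and
report_memory_limits are hypothetical names chosen for illustration.
cgroup limits are not visible through getrlimit and would have to be
checked separately (e.g. memory.max under /sys/fs/cgroup on cgroup v2).

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print the soft and hard limit of one resource.
   Returns 0 on success, -1 if getrlimit() fails. */
static int print_limit(const char *name, int resource)
{
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        printf("%s soft: unlimited\n", name);
    else
        printf("%s soft: %llu bytes\n", name,
               (unsigned long long) lim.rlim_cur);
    if (lim.rlim_max == RLIM_INFINITY)
        printf("%s hard: unlimited\n", name);
    else
        printf("%s hard: %llu bytes\n", name,
               (unsigned long long) lim.rlim_max);
    return 0;
}

/* Hypothetical helper: report the limits most likely to affect malloc.
   Returns 0 if both limits could be read, non-zero otherwise. */
int report_memory_limits(void)
{
    int rc = 0;
    rc |= print_limit("RLIMIT_AS", RLIMIT_AS);     /* address space */
    rc |= print_limit("RLIMIT_DATA", RLIMIT_DATA); /* data segment  */
    return rc;
}
```

Calling report_memory_limits() once at startup and logging the result
would show whether the process runs under a tighter limit than the
machine's total memory.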

Finally and most importantly, heartbeats are a ZMTP feature, so they
won't work for ZMQ_STREAM, because in that case the peers are not ZMTP
sockets but plain TCP sockets.

So, as you guessed, I think you should take care of gracefully
terminating the connections yourself, but I don't use ZMQ_STREAM that
much, so there might be more to it.

-- 
Kind regards,
Luca Boccassi


_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
