Hi Itay:

Pls. see embedded comments below …

Best Regards,

Bill

P.S.  Good luck with your product!  My mom had macular degeneration and it made 
her later years really difficult.  Something like an OrCam would have been a 
blessing!

> On Jan 5, 2021, at 10:06 AM, Itay Chamiel <itay.cham...@orcam.com> wrote:
> 
> Hi Bill,
> 
> Thanks for responding. We have by now found a workaround that doesn't require 
> creating a socket each time, so this issue doesn't affect us anymore. I did 
> however continue investigating out of curiosity.

Fantastic!  That’s the only way things get better.


> 
> After reading the linked thread I see that your bottom-line question there is 
> how to get process_commands to run on these sockets - and that question is 
> unanswered. I've tried an approach suggested by one of the other 
> participants, to call zmq_getsockopt for ZMQ_EVENTS, but this seemed to have 
> no effect on the outcome.

I’ve noticed the same thing — ZMQ_EVENTS doesn’t seem to do the trick, but 
zmq_poll(..., ZMQ_POLLIN | ZMQ_POLLOUT) does seem to trigger the required 
cleanup.  I should really dig down and see if I can figure out why.
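
For what it’s worth, here’s roughly what I mean. This is a quick, untested 
sketch based on your C example below; the only change is a zero-timeout 
zmq_poll on the socket right before closing it:

#include <zmq.h>
#include <stdio.h>

int main() {
  void* ctx = zmq_ctx_new();
  for (int i = 0; ; i++) {  // should run indefinitely if the fd leak is gone
    void* zmq_sock = zmq_socket(ctx, ZMQ_SUB);
    if (!zmq_sock) { printf("fail after %d iterations\n", i); return -1; }
    if (zmq_connect(zmq_sock, "inproc://some_name") != 0) return -1;

    // Zero-timeout poll before close -- in my experience this seems to give
    // the socket a chance to process pending commands and release resources.
    zmq_pollitem_t item = { zmq_sock, 0, ZMQ_POLLIN | ZMQ_POLLOUT, 0 };
    zmq_poll(&item, 1, 0);

    if (zmq_close(zmq_sock) != 0) return -1;
  }
}

I haven’t verified this against your exact setup, so treat it as a starting 
point rather than a definitive fix.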


> 
> I have noticed that closing the context does free up the resources and this 
> was in fact a workaround we used for a while, but then we started getting 
> occasional segfaults within the context destructor, at which point I decided 
> that's a bad direction.

Just so I understand, you’re seeing SEGV in the dtor, not the ctor — correct?


> 
> I've tried to reproduce that crash in the context of this simple test, and 
> ran into some more weird behavior. Here is the code:
> 
> #include "zmq.hpp"
> #include <stdio.h>
> 
> int main() {
>   while (true) {
>     printf("context create\n");
>     zmq::context_t context;
>     const int max_cnt = 900;
>     for (int i=0; i<max_cnt; i++) {
>       zmq::socket_t* socket = new zmq::socket_t(context, ZMQ_SUB);
>       socket->connect("inproc://some_name");
>       delete socket;
>     }
>   }
> }
> 
> Note the max_cnt constant. If it is set to 508 or lower (on my machine, yours 
> may vary), this test runs indefinitely. At 1017 or above I get the "Too many 
> open files" error, as in my original code sample, as expected. But at any 
> value in between I get this output:
> context create
> Assertion failed: s (src/ctx.cpp:148)
> Aborted (core dumped)
> 
> Now, src/ctx.cpp:148 is within the ctx destructor but there is nothing there 
> that looks like an assertion (I see: "_tag = ZMQ_CTX_TAG_VALUE_BAD;") so this 
> is where my investigation stops. I'm going to guess that this is another 
> resource error (it's probably not a coincidence that 508 is half of the upper 
> limit) and not related to the segfault I saw.

I’m guessing that the assert is actually the initial "zmq_assert 
(_sockets.empty ());", and that line number 148 is somehow being picked up 
because it’s the last line before the return?
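
If you want to pin it down, running the test under a debugger and grabbing a 
backtrace when the abort fires should show exactly which zmq_assert it is 
(assuming libzmq is built with debug symbols; "./test" below is just a 
placeholder for however your test binary is named):

$ gdb ./test
(gdb) run
... the run aborts on the failed assertion ...
(gdb) bt

The backtrace should point at the real assert regardless of what line number 
the message reports.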

> 
> To reiterate, I'm just sharing things I found while poking around; these 
> issues don't affect us at present.

That’s great, and thanks for that!  

I’ll give the above a try and report back what I find out.

A suggestion for the future: consider using GitHub issues instead of email.  
At least in my experience, issues tend to get more visibility and are easier 
to refer back to than email threads.


> 
> Best regards,
> 
> Itay Chamiel, OrCam
> 
> 
> On Mon, Jan 4, 2021 at 6:07 PM Bill Torpey <wallstp...@gmail.com> wrote:
> Hi Itay:
> 
> Take a look at https://github.com/zeromq/libzmq/issues/3186 — it may be 
> relevant to the behavior you’re seeing.
> 
> The short version is that process_commands needs to get a chance to run on 
> the socket to clean up resources.  If that isn’t done, resources (in this 
> case memory, but in your case potentially fd’s) can appear to leak until the 
> context is shut down.
> 
> Hope this helps…
> 
> Bill
> 
>> On Dec 29, 2020, at 7:12 AM, Itay Chamiel <itay.cham...@orcam.com> wrote:
>> 
>> Hi, we have a client thread that is supposed to receive data from a parent 
>> thread, then disconnect when done. We've noticed that when the socket is 
>> closed, an eventfd (file descriptor) is leaked, so we leak one every time 
>> such a client is created and destroyed - even if no data is transferred.
>> 
>> Here is a quick C++ program to reproduce it. I'm running on a Ubuntu 18 
>> desktop with ZMQ 4.1.6 or 4.3.3. This loop is expected to run forever but 
>> crashes a little after 1000 iterations due to too many open files.
>> 
>> #include "zmq.hpp"
>> 
>> int main() {
>>   zmq::context_t context;
>>   while(1) {
>>     zmq::socket_t* socket = new zmq::socket_t(context, ZMQ_SUB);
>>     socket->connect("inproc://some_name");
>>     delete socket;
>>   }
>> }
>> 
>> The problem does not occur for other connection types (e.g. replace inproc 
>> with ipc and the leak goes away).
>> In case you want it without the C++ bindings, here is a slightly more 
>> elaborate C example which also sets LINGER to zero (with no effect) and 
>> displays the number of FDs in use by the process each iteration.
>> 
>> #include <zmq.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <errno.h>
>> #include <sys/types.h>
>> #include <unistd.h>
>> 
>> int main() {
>>   void* ctx = zmq_ctx_new();
>>   for (int i=0; ; i++) {
>>     void* zmq_sock = zmq_socket(ctx, ZMQ_SUB);
>>     if (!zmq_sock) { printf("fail after %d iterations: %s\n", i, zmq_strerror(errno)); exit(-1); }
>>     int linger = 0;
>>     int rc = zmq_setsockopt(zmq_sock, ZMQ_LINGER, &linger, sizeof(linger)); // this doesn't actually help
>>     if (rc != 0) exit(-1);
>>     rc = zmq_connect(zmq_sock, "inproc://some_name");
>>     if (rc != 0) exit(-1);
>>     rc = zmq_close(zmq_sock);
>>     if (rc != 0) exit(-1);
>>     // show the number of used FDs
>>     char cmd[100];
>>     sprintf(cmd, "ls -l -v /proc/%d/fd | wc -l", (int)getpid());
>>     system(cmd);
>>     // test is hard to abort without a sleep
>>     usleep(100*1000);
>>   }
>> }
>> 
>> Thank you,
>> 
>> Itay Chamiel, OrCam
>> 

_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
