Re: [zeromq-dev] Open file leak during DNS resolution while network is down (regression in libzmq-4.1.0 and libzmq-master) #1302
OK, thanks for the simple test case. I'm not familiar enough with the internals of libzmq to debug this. I see that tcp_connecter.cpp closes the socket if it can't resolve the hostname. However, it's possible that some error handling isn't right here.

On Tue, Jan 20, 2015 at 5:04 AM, Tomas Krajca to...@repositpower.com wrote:
> Hi Pieter,
>
> It's actually really simple, I have posted example C code to github at
> https://github.com/zeromq/libzmq/issues/1302.
>
> Am I doing something wrong or is it that obvious? There is no need for a
> poller or anything like that. It seems that DNS resolution during
> zmq_connect() somehow does not release the file handle if the network is
> down.
>
> Cheers,
> Tomas
>
>> Date: Mon, 19 Jan 2015 10:32:47 +0100
>> From: Pieter Hintjens p...@imatix.com
>> To: ZeroMQ development list zeromq-dev@lists.zeromq.org
>> Cc: Mark Burgess m...@repositpower.com
>>
>> Can your C++ programmer make a minimal test case in C that reproduces
>> the problem?
>>
>> On Mon, Jan 19, 2015 at 1:55 AM, Tomas Krajca to...@repositpower.com wrote:
>>> Hi,
>>>
>>> I've reported this weird bug
>>> https://github.com/zeromq/libzmq/issues/1302 that we hit last week. I
>>> wonder if anybody has experienced the same thing or can reproduce it.
>>>
>>> Basically, we saw a progressive file handle leak that crashed our
>>> application after about an hour of network outage.
>>>
>>> Any thoughts on which part of the code the bug could be in? We've got
>>> a C++ programmer in our team but don't know enough about libzmq
>>> internals to try to fix this.
>>>
>>> Thanks,
>>> Tomas

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] ZMQ vs SPI: FD shenanigans
Hello,

I was trying to use ZMQ in a C++ program that also uses the STL to open regular files and that uses the open(2) and ioctl(2) syscalls of the Linux kernel for SPI communication (on an ARM platform). For certain combinations of opening and closing a regular file and binding to a TCP or inproc socket, a subsequent ioctl call fails with either "No such device" or "Resource temporarily unavailable".

My first guess is that somehow ZMQ is messing up the file descriptors it uses, but I am certainly not ready to rule out bugs in the STL, the kernel or (least probably) my own code :-)

Could anyone give me a suggestion on how to continue debugging this?

This is my demo code of the problem. The code uses the C++ binding for ZMQ (cppzmq) for brevity, but it should work the same for the native C interface.

    #include <sys/ioctl.h>
    #include <zmq.hpp>

    //#define ADDR "tcp://127.0.0.1:8000"
    #define ADDR "inproc://addr"

    /*
     * The program works fine if D is present and it shows the problem if
     * D is commented out for the following permutations of the blocks:
     * - A B C D
     * - A C B D
     * - C A B D
     *
     * The program always shows the problem (independent of the presence
     * of D) for the following permutations of the blocks:
     * - A C D B
     * - C A D B
     * - C D A B
     *
     * With the constraint that A must be before B and C before D, there
     * are no further valid permutations.
     *
     * For the TCP address, the error message is: No such device
     * For the inproc address, it is: Resource temporarily unavailable
     */
    int main()
    {
        zmq::context_t ctx;                    /* A */
        zmq::socket_t skt(ctx, ZMQ_PUB);       /* B */
        skt.bind(ADDR);
        std::fstream fs("log", std::ios::out); /* C */
        fs.close();                            /* D */

        /* The following always last */
        int fd = open("/dev/spidev32766.0", O_RDWR);
        struct spi_ioc_transfer pcks;
        memset(&pcks, 0, sizeof(struct spi_ioc_transfer));
        ioctl(fd, SPI_IOC_MESSAGE(1), &pcks);
        std::cout << "errno: " << strerror(errno) << std::endl;
        return 0;
    }

Thank you for any suggestion,
Olaf Mandel

-- 
Olaf Mandel
phone: +49-89-189166-250
fax: +49-89-189166-111
Menlo Systems GmbH
Am Klopferspitz 19a, D-82152 Martinsried
Amtsgericht München HRB 138145
Geschäftsführung: Dr Michael Mei, Dr Ronald Holzwarth
USt-IdNr. DE217772017, St.-Nr. 14316170324
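One thing that may be worth ruling out: the demo above prints errno without first checking the return values of open() and ioctl(). errno is only meaningful after a call has reported failure; a successful call is allowed to leave behind whatever value an earlier call (possibly one made inside a library) stored there. Whether that explains the messages seen here is not established by the thread. The following is a minimal sketch of the usual checking pattern; the helper name report_call is hypothetical and is not part of libzmq or spidev.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libzmq or spidev: interpret errno only
 * when the call reported failure by returning -1.
 * Returns 0 on success, otherwise the errno value of the failed call.
 */
int report_call(const char *what, int rc)
{
    int saved = errno; /* copy first: the stdio calls below may modify errno */

    if (rc != -1) {
        printf("%s: ok (returned %d)\n", what, rc);
        return 0;
    }
    fprintf(stderr, "%s: failed: %s\n", what, strerror(saved));
    return saved;
}
```

In the demo this would be used as report_call("open", fd) directly after the open() call and as report_call("ioctl", ioctl(fd, SPI_IOC_MESSAGE(1), &pcks)) for the transfer, so that a stale errno value is never reported as an error.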
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
Hello Thomas,

thank you for the quick answer.

On 20.01.2015 15:11, Thomas Rodgers wrote:
> Do you see the same behavior if you replace C with fopen() ?
[snip]

Not quite: if the error occurs, the behaviour is the same as before. But now the error _always_ happens: closing the file descriptor in the C version of the test makes no difference. New test below:

    /*
     * Compile with:
     * g++ -Wall -Werror -Wextra -pedantic -x c -o test test.c -lzmq
     */
    #include <fcntl.h>
    #include <linux/spi/spidev.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <zmq.h>

    /*#define ADDR "tcp://127.0.0.1:8000"*/
    #define ADDR "inproc://addr"

    /*
     * The program always shows the problem (independent of the presence
     * of D) for the following permutations of the blocks:
     * - A B C D
     * - A C B D
     * - A C D B
     * - C A B D
     * - C A D B
     * - C D A B
     *
     * With the constraint that A must be before B and C before D, there
     * are no further valid permutations.
     *
     * For the TCP address, the error message is: No such device
     * For the inproc address, it is: Resource temporarily unavailable
     */
    int main()
    {
        void* ctx;
        void* skt;
        FILE* f;
        int   fd;
        struct spi_ioc_transfer pcks;

        ctx = zmq_ctx_new();            /* A */
        skt = zmq_socket(ctx, ZMQ_PUB); /* B */
        zmq_bind(skt, ADDR);
        f = fopen("log", "w");          /* C */
        (void)sizeof(f);
        fclose(f);                      /* D */

        /* The following always last */
        fd = open("/dev/spidev32766.0", O_RDWR);
        memset(&pcks, 0, sizeof(struct spi_ioc_transfer));
        ioctl(fd, SPI_IOC_MESSAGE(1), &pcks);
        printf("errno: %s\n", strerror(errno));
        return 0;
    }

The one advantage of this is that I can now at least see the FILE structure and the contained _fileno member.

Any further thoughts?
Olaf Mandel
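As a side note on inspecting the FILE structure: the _fileno member is a glibc implementation detail, while POSIX provides fileno() to obtain the same descriptor number. Printing the descriptor numbers of the log file and of the SPI device could show whether open() reuses the number released by fclose(), since open() returns the lowest descriptor not currently in use. A minimal sketch, with a hypothetical helper name (stream_fd) chosen only for illustration:

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the descriptor number behind a stdio
 * stream, or -1 if no stream was given.
 */
int stream_fd(FILE *f)
{
    return f == NULL ? -1 : fileno(f);
}
```

In the C test this could be called as stream_fd(f) right after fopen(), and the result compared with the fd value returned by the later open() call.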
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
Do you see the same behavior if you replace C with fopen() ?
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
On 20.01.2015 14:35, Olaf Mandel wrote:
[snip]
> #include <sys/ioctl.h>
> #include <zmq.hpp>
[snip]

Shoot: that was missing a few include statements at the top of the program, as well as the compile instructions. Correct start of the demo program:

    /*
     * Compile with:
     * g++ -Wall -Werror -Wextra -o test test.cpp -lzmq
     */
    #include <fcntl.h>
    #include <fstream>
    #include <iostream>
    #include <linux/spi/spidev.h>
    #include <sys/ioctl.h>
    #include <zmq.hpp>

Sorry about that,
Olaf Mandel
Re: [zeromq-dev] Open file leak during DNS resolution while network is down (regression in libzmq-4.1.0 and libzmq-master) #1302
I'm running your sample against current libzmq trunk, Ubuntu 14.04 and I am unable to reproduce any leak.
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
Hello,

I don't really see anything in the small example that could cause a bug. I am myself using libzmq (with zmqpp as a wrapper) on a Raspberry Pi with a Piface digital card (which is an SPI device), and I have no trouble.

My kernel version is 3.12.28+ and I use a recent (a few commits behind HEAD) version of libzmq.

-- 
Kapp Arnaud - Xaqq
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
Hello,

On 20.01.2015 18:07, Arnaud Kapp wrote:
> I don't really see anything in the small example that could cause a bug.
> I am myself using libzmq (with zmqpp as a wrapper) on a Raspberry Pi + a
> Piface digital card (which is a SPI device), and I have no trouble.

At least in the C++ version, the problem is relatively fragile: it only occurs when mixing ZMQ socket operations (binds/connects) with opening and closing files. There are cases where I also don't see the problem. Weirdly, I always see the problem in my pure-C test...

> My kernel version is 3.12.28+ and I use a recent (a few commits behind
> HEAD) version of libzmq.
[snip]

Good point, my machine and version numbers are:

CPU:   Freescale i.MX537 (Cortex-A8, NEON)
GCC:   4.8.1, cross-compiling
Linux: 3.10.28 + many platform patches
ZMQ:   zeromq/libzmq.git @ be23e699c

Any other info of interest?

Best regards,
Olaf Mandel
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
No, not really. libc++ doesn't do anything particularly interesting with fd's but I just wanted to formally rule it out. Unfortunately I don't have a system (or any experience dealing) with SPI devices. I don't think libzmq does anything particularly 'funky' with any of the file descriptors it manages though (at least based on my reading of the source).
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
> At least in the C++ version, the problem is relatively fragile
> I always see the problem in my pure-C test

There are those that maintain that C is a simpler and more robust language than C++ :-)~
Re: [zeromq-dev] ZMQ vs SPI: FD shenanigans
Haha, be careful Thomas or you will wake Pieter :)

Well, I am mixing socket operations (binds/connects) with opening and closing files (and reading from and writing to files). I also use ZMQ to poll on my files' descriptors. So at first sight it doesn't look like a ZMQ bug.

I see two things to try:
* Try a more stable version of libzmq - https://github.com/zeromq/zeromq4-1
* Upgrade to a more recent kernel and to gcc 4.9 if those are available.

Good luck!

-- 
Kapp Arnaud - Xaqq