[ https://issues.apache.org/jira/browse/ZOOKEEPER-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985978#comment-14985978 ]
Marshall McMullen commented on ZOOKEEPER-2311:
----------------------------------------------

Another interesting link related to this:

https://bugzilla.kernel.org/show_bug.cgi?id=80981

> assert in setup_random
> ----------------------
>
>                 Key: ZOOKEEPER-2311
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2311
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>            Reporter: Marshall McMullen
>
> We've started seeing an assert failing inside setup_random at line 537:
>
> 528 static void setup_random()
> 529 {
> 530 #ifndef _WIN32 // TODO: better seed
> 531     int seed;
> 532     int fd = open("/dev/urandom", O_RDONLY);
> 533     if (fd == -1) {
> 534         seed = getpid();
> 535     } else {
> 536         int rc = read(fd, &seed, sizeof(seed));
> 537         assert(rc == sizeof(seed));
> 538         close(fd);
> 539     }
> 540     srandom(seed);
> 541     srand48(seed);
> 542 #endif
>
> The core files show:
>
> Program terminated with signal 6, Aborted.
> #0  0x00007f9ff665a0d5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #0  0x00007f9ff665a0d5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007f9ff665d83b in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007f9ff6652d9e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007f9ff6652e42 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x00007f9ff8e4070a in setup_random () at src/zookeeper.c:476
> #5  0x00007f9ff8e40d76 in resolve_hosts (zh=0x7f9fe14de400, hosts_in=0x7f9fd700f400 "10.26.200.6:2181,10.26.200.7:2181,10.26.200.8:2181", avec=0x7f9fd87fab60) at src/zookeeper.c:730
> #6  0x00007f9ff8e40e87 in update_addrs (zh=0x7f9fe14de400) at src/zookeeper.c:801
> #7  0x00007f9ff8e44176 in zookeeper_interest (zh=0x7f9fe14de400, fd=0x7f9fd87fac4c, interest=0x7f9fd87fac50, tv=0x7f9fd87fac80) at src/zookeeper.c:1980
> #8  0x00007f9ff8e553f5 in do_io (v=0x7f9fe14de400) at src/mt_adaptor.c:379
> #9  0x00000000020170ac in solidfire::ThreadBacktraces::LaunchThread (this=0x7f9ff0c8d500, args=<optimized out>) at shared/ThreadBacktraces.cpp:497
> #10 0x00007f9ff804de9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> #11 0x00007f9ff671738d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #12 0x0000000000000000 in ?? ()
>
> I'm not sure what the underlying cause of this is... But POSIX always allows
> for a short read(2), and any program MUST check for short reads...
>
> Has anyone else encountered this issue? We are seeing it rather frequently
> which is concerning.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
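
For illustration, the short-read handling the report calls for might look roughly like the sketch below. This is not the ZooKeeper fix; read_full and get_seed are hypothetical names introduced here, and the getpid() fallback on a failed read is an assumption that simply mirrors what the quoted code already does when open() fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from fd into buf.
 * Returns 0 on success, -1 on error or premature end-of-file. */
int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real I/O error */
        }
        if (n == 0)
            return -1;          /* EOF before len bytes arrived */
        p += n;                 /* short read: advance and keep going */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical seed routine: fall back to getpid() instead of asserting
 * when /dev/urandom cannot be opened or fully read. */
unsigned int get_seed(void)
{
    unsigned int seed = 0;
    int ok = 0;
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd != -1) {
        ok = (read_full(fd, &seed, sizeof(seed)) == 0);
        close(fd);
    }
    if (!ok)
        seed = (unsigned int)getpid();
    return seed;
}
```

With something along these lines, setup_random could pass the result of get_seed() to srandom() and srand48(), so a short or interrupted read would cost only seed quality rather than aborting the process.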