Re: libgo patch committed: Update to current Go library
Ian Lance Taylor i...@google.com writes: There is a decent change that this will break something on non-x86 systems. I will do what testing I am able to do after the commit. As expected, it did break the Solaris libgo build: * udpsock_posix.go lacked definitions of joinIPv4Group, joinIPv6Group, setIPv6MulticastInterface, setIPv6MulticastLoopback. It turned out that sockoptip_solaris.go isn't needed any longer, but sockoptip_posix.go can be used instead. * Solaris lacked some stat_atim*.go in go/archive/tar initially. With stat_atim.go used, it didn't build initially. Since Stat_t.[AC]tim are Timestruc, we need a corresponding Unix(), now provided in syscall_solaris.go. With those changes, libgo builds again, and Solaris/x86 testsuite results are reasonable: FAIL: net --- FAIL: TestMulticastListener (0.03 seconds) multicast_posix_test.go:72: 224.0.0.254:12345 not found in RIB FAIL: syscall creds_test.go:22:41: error: reference to undefined identifier 'syscall.AF_LOCAL' fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0) ^ creds_test.go:29:66: error: reference to undefined identifier 'syscall.SO_PASSCRED' err = syscall.SetsockoptInt(fds[0], syscall.SOL_SOCKET, syscall.SO_PASSCRED, 1) ^ creds_test.go:48:12: error: reference to undefined identifier 'syscall.Ucred' var ucred syscall.Ucred ^ creds_test.go:53:18: error: reference to undefined identifier 'syscall.UnixCredentials' oob := syscall.UnixCredentials(ucred) ^ creds_test.go:63:17: error: reference to undefined identifier 'syscall.UnixCredentials' oob := syscall.UnixCredentials(ucred) ^ creds_test.go:102:27: error: reference to undefined identifier 'syscall.ParseUnixCredentials' newUcred, err := syscall.ParseUnixCredentials(scm[0]) ^ passfd_test.go:40:41: error: reference to undefined identifier 'syscall.AF_LOCAL' fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0) ^ Needs some porting work, it seems. FAIL: net/http unexpected fault address unexpected fault address 0x7f079d14be59unexpected fault address 0x7d140410 0x7f079bb4bed9unexpected fault address unexpected fault address Illegal Instruction 0x7d140410 0x7f079d54be59 goroutine 1 [chan receive]: main.main /var/gcc/regression/trunk/10-gcc/build/i386-pc-solaris2.10/amd64/libgo/gotest29238/test/_testmain.go:159 and many more goroutine stacks... FAIL: runtime/pprof Reported as PR go/54873 already. On the other hand, many Solaris/SPARC tests are failing: unexpected fault address 0xfcb70dec throw: fault [signal 0xa code=0x1 addr=0xfcb70dec] goroutine 4 [running]: unexpected fault address 0xfcb71864 panic during panic FAIL: log Still have to investigate what's going on. Rainer 2012-10-23 Rainer Orth r...@cebitec.uni-bielefeld.de * Makefile.am [LIBGO_IS_SOLARIS] (go_net_sockoptip_file): Replace go/net/sockoptip_solaris.go by go/net/sockoptip_posix.go. [LIBGO_IS_SOLARIS] (archive_tar_atim_file): Use go/archive/tar/stat_atim.go. * Makefile.in: Regenerate. * go/net/sockoptip_solaris.go: Remove. * go/syscall/syscall_solaris.go: New file. # HG changeset patch # Parent e8a7c94c7461ab7d5fc572468b69feddbb338069 Restore Solaris bootstrap diff --git a/libgo/Makefile.am b/libgo/Makefile.am --- a/libgo/Makefile.am +++ b/libgo/Makefile.am @@ -704,7 +704,7 @@ if LIBGO_IS_SOLARIS go_net_cgo_file = go/net/cgo_linux.go go_net_sock_file = go/net/sock_solaris.go go_net_sockopt_file = go/net/sockopt_bsd.go -go_net_sockoptip_file = go/net/sockoptip_bsd.go go/net/sockoptip_solaris.go +go_net_sockoptip_file = go/net/sockoptip_bsd.go go/net/sockoptip_posix.go else if LIBGO_IS_FREEBSD go_net_cgo_file = go/net/cgo_bsd.go @@ -968,6 +968,9 @@ endif if LIBGO_IS_OPENBSD archive_tar_atim_file = go/archive/tar/stat_atim.go endif +if LIBGO_IS_SOLARIS +archive_tar_atim_file = go/archive/tar/stat_atim.go +endif if LIBGO_IS_DARWIN archive_tar_atim_file = go/archive/tar/stat_atimespec.go endif diff --git a/libgo/go/syscall/syscall_solaris.go b/libgo/go/syscall/syscall_solaris.go new file mode 100644 --- /dev/null +++ b/libgo/go/syscall/syscall_solaris.go @@ -0,0 +1,13 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall + +func (ts *Timestruc) Unix() (sec int64, nsec int64) { + return int64(ts.Sec), int64(ts.Nsec) +} + +func (ts *Timestruc) Nano() int64 { + return int64(ts.Sec)*1e9 + int64(ts.Nsec) +} -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: libgo patch committed: Update to current Go library
On Thu, Oct 25, 2012 at 2:36 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: There is a decent change that this will break something on non-x86 systems. I will do what testing I am able to do after the commit. As expected, it did break the Solaris libgo build: * udpsock_posix.go lacked definitions of joinIPv4Group, joinIPv6Group, setIPv6MulticastInterface, setIPv6MulticastLoopback. It turned out that sockoptip_solaris.go isn't needed any longer, but sockoptip_posix.go can be used instead. * Solaris lacked some stat_atim*.go in go/archive/tar initially. With stat_atim.go used, it didn't build initially. Since Stat_t.[AC]tim are Timestruc, we need a corresponding Unix(), now provided in syscall_solaris.go. With those changes, libgo builds again, and Solaris/x86 testsuite results are reasonable: Thanks. I committed your patch to mainline. Ian
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 11:03 AM, Ian Lance Taylor i...@google.com wrote: OK, so it's a garbage collector problem. Can you e-mail me the test binary offlist? I will try to figure out where readFile lives. The fact that I'm not seeing any GC problems on x86 or x86_64 suggests that this is something Alpha-specific. Or, it could be something specific to the non-split-stack code. What's weird is that it's unlikely to be a major error, since most of your tests pass. Thanks for sending the binaries. Unfortunately I don't see any problems. All the code looks reasonable, the registers should be on the stack, and the garbage collector should see the stack. I don't see any way to make progress without doing some actual debugging. Ian
Re: libgo patch committed: Update to current Go library
Uros Bizjak ubiz...@gmail.com writes: Yes, I am running under gdb and all FDs are offset by +4 for some reason. So, FD 8 corresponds to FD4 in the strace log. This is normal, gdb is leaking some fds to the inferior (which is a bug). Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: libgo patch committed: Update to current Go library
On Tue, Oct 23, 2012 at 8:09 PM, Ian Lance Taylor i...@google.com wrote: On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: Additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall As far as I can see this error message occurs when calling dup on the second file descriptor returned by socketpair. But I can't see why that would fail on Alpha. For some reason, the second dup on the same file descriptor fails. Attached, please find strace dump, where: ... gettimeofday({1351064744, 381316}, NULL) = 0 gettimeofday({1351064744, 381316}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 dup(4) = 9 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) mmap(NULL, 24704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20003076000 mprotect(0x20003076000, 8192, PROT_NONE) = 0 clone(child_stack=0x2000307ab70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2000307b350, tls=0x2000307b970, child_tidptr=0x2000307b350) = 21544 fcntl(9, F_SETFD, FD_CLOEXEC) = 0 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) sigreturn() (mask [SYS STOP TSTP IO USR2]) = 0 mmap(NULL, 24704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000307e000 mprotect(0x2000307e000, 8192, PROT_NONE) = 0 clone(child_stack=0x20003082b70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x20003083350, tls=0x20003083970, child_tidptr=0x20003083350) = 21545 futex(0x2c4a450, FUTEX_WAIT, 0, NULL) = 0 sigreturn() (mask [ILL ABRT FPE KILL BUS ALRM URG CONT IO XCPU]) = 0 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) futex(0xf84042b248, FUTEX_WAKE, 1) = 1 wait4(21546, [{WIFEXITED(s) WEXITSTATUS(s) == 0}], 0, {ru_utime={0, 51757}, ru_stime={0, 21484}, ...}) = 21546 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) open(./a.out, O_RDONLY|O_CLOEXEC) = 3 ... Is it OK to call dup on the same FD the second time? Uros.
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 10:01 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Oct 23, 2012 at 8:09 PM, Ian Lance Taylor i...@google.com wrote: On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: Additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall As far as I can see this error message occurs when calling dup on the second file descriptor returned by socketpair. But I can't see why that would fail on Alpha. For some reason, the second dup on the same file descriptor fails. Attached, please find strace dump, where: ... Now attached for real. Uros. strace.txt.gz Description: GNU Zip compressed data
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 10:01 AM, Uros Bizjak ubiz...@gmail.com wrote: Additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall As far as I can see this error message occurs when calling dup on the second file descriptor returned by socketpair. But I can't see why that would fail on Alpha. For some reason, the second dup on the same file descriptor fails. Attached, please find strace dump, where: ... gettimeofday({1351064744, 381316}, NULL) = 0 gettimeofday({1351064744, 381316}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 gettimeofday({1351064744, 382293}, NULL) = 0 dup(4) = 9 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) mmap(NULL, 24704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20003076000 mprotect(0x20003076000, 8192, PROT_NONE) = 0 clone(child_stack=0x2000307ab70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2000307b350, tls=0x2000307b970, child_tidptr=0x2000307b350) = 21544 fcntl(9, F_SETFD, FD_CLOEXEC) = 0 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) sigreturn() (mask [SYS STOP TSTP IO USR2]) = 0 mmap(NULL, 24704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000307e000 mprotect(0x2000307e000, 8192, PROT_NONE) = 0 clone(child_stack=0x20003082b70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x20003083350, tls=0x20003083970, child_tidptr=0x20003083350) = 21545 futex(0x2c4a450, FUTEX_WAIT, 0, NULL) = 0 sigreturn() (mask [ILL ABRT FPE KILL BUS ALRM URG CONT IO XCPU]) = 0 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) futex(0xf84042b248, FUTEX_WAKE, 1) = 1 wait4(21546, [{WIFEXITED(s) WEXITSTATUS(s) == 0}], 0, {ru_utime={0, 51757}, ru_stime={0, 21484}, ...}) = 21546 osf_sigprocmask(SIG_BLOCK, []) = 0 (old mask []) close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) open(./a.out, O_RDONLY|O_CLOEXEC) = 3 ... Is it OK to call dup on the same FD the second time? To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 and 9 share file status flags. Uros.
Re: libgo patch committed: Update to current Go library
On 10/24/2012 10:12 AM, Uros Bizjak wrote: Is it OK to call dup on the same FD the second time? To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 and 9 share file status flags. I suspect that this is something else because this works: open(/dev/null, O_RDONLY) = 3 dup(3) = 4 close(4)= 0 dup(3) = 4 #include fcntl.h #include unistd.h int main(void) { int fd1 = open(/dev/null, O_RDONLY); if (fd1 0) { perror(open); return 1; } int fd2 = dup(fd1); if (fd2 0) { perror(dup); return 1; } if (close(fd2) 0) { perror(close); return 1; } int fd3 = dup(fd1); if (fd3 0) { perror(dup); return 1; } return 0; } -- Florian Weimer / Red Hat Product Security Team
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote: Is it OK to call dup on the same FD the second time? To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 and 9 share file status flags. I suspect that this is something else because this works: [...] You are right, this example works on alpha, too. Thanks, Uros.
Re: libgo patch committed: Update to current Go library
Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 2:18 PM, Andreas Schwab sch...@linux-m68k.org wrote: Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Yes, indeed! Attached strace -f record confirms this on alpha. The same happens on x86_64, but for some reason x86_64 doesn't complain when executing dup(2) on closed FD. Uros.
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 5:31 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 2:18 PM, Andreas Schwab sch...@linux-m68k.org wrote: Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Yes, indeed! Attached strace -f record confirms this on alpha. The same happens on x86_64, but for some reason x86_64 doesn't complain when executing dup(2) on closed FD. The test execs itself by calling the fork and execve functions in libc. It is the child process that closes the FD after it forks. From the point of view of the parent process, the FD should still be open. I don't think you attached the strace -f output so I can't confirm this. Ian
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 3:10 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 5:31 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 2:18 PM, Andreas Schwab sch...@linux-m68k.org wrote: Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Yes, indeed! Attached strace -f record confirms this on alpha. The same happens on x86_64, but for some reason x86_64 doesn't complain when executing dup(2) on closed FD. The test execs itself by calling the fork and execve functions in libc. It is the child process that closes the FD after it forks. From the point of view of the parent process, the FD should still be open. I don't think you attached the strace -f output so I can't confirm this. Eh, sorry, attached now. After the second socketpair and before dup, there is a close in the same thread. Uros. strace.txt.gz Description: GNU Zip compressed data
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 6:19 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 3:10 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 5:31 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 2:18 PM, Andreas Schwab sch...@linux-m68k.org wrote: Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Yes, indeed! Attached strace -f record confirms this on alpha. The same happens on x86_64, but for some reason x86_64 doesn't complain when executing dup(2) on closed FD. The test execs itself by calling the fork and execve functions in libc. It is the child process that closes the FD after it forks. From the point of view of the parent process, the FD should still be open. I don't think you attached the strace -f output so I can't confirm this. Eh, sorry, attached now. After the second socketpair and before dup, there is a close in the same thread. Thanks. I agree. Unfortunately, I can't figure out what is causing it. These seem to be the relevant calls. 16252 socketpair(PF_FILE, SOCK_STREAM, 0, [3, 4]) = 0 16252 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x20002f0ecd0) = 16259 16259 execve(./a.out, [./a.out, -test.run=^TestPassFD$, --, /tmp/TestPassFD684357043], [/* 33 vars */] unfinished ... 16252 clone( unfinished ... 16252 ... clone resumed child_stack=0x200031e2b70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x200031e3350, tls=0x200031e3970, child_tidptr=0x200031e3350) = 16260 * This is the bad call, but where does this come from? 16252 close(4 unfinished ... 16252 ... close resumed ) = 0 16260 wait4(16259, unfinished ... 16261 read(9, unfinished ... 16259 sendmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{x, 1}], msg_controllen=24, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {9}}, msg_flags=0}, 0) = 1 16259 exit_group(0) = ? 16261 ... read resumed , 512) = 0 16260 ... wait4 resumed [{WIFEXITED(s) WEXITSTATUS(s) == 0}], 0, {ru_utime={0, 46875}, ru_stime={0, 26367}, ...}) = 16259 16261 dup(4)= -1 EBADF (Bad file descriptor) Here is an approachj that might help. Set a breakpoint on socketpair. The failure comes after the second call to socketpair, the one from the function TestPassFD. After that breakpoint is reached, set a breakpoint on close. Look for the call to close(4). Get a backtrace from there so we can see what is calling it. Ian
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 4:30 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 6:19 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 3:10 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 5:31 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 2:18 PM, Andreas Schwab sch...@linux-m68k.org wrote: Uros Bizjak ubiz...@gmail.com writes: To answer my own question: dup(4) = 9 ... close(9)= 0 dup(4) = -1 EBADF (Bad file descriptor) Test is calling dup on a closed file descriptor. FD 4 is most likely closed by one of the cloned threads. Use strace -f to follow them. Yes, indeed! Attached strace -f record confirms this on alpha. The same happens on x86_64, but for some reason x86_64 doesn't complain when executing dup(2) on closed FD. The test execs itself by calling the fork and execve functions in libc. It is the child process that closes the FD after it forks. From the point of view of the parent process, the FD should still be open. I don't think you attached the strace -f output so I can't confirm this. Eh, sorry, attached now. After the second socketpair and before dup, there is a close in the same thread. Thanks. I agree. Unfortunately, I can't figure out what is causing it. These seem to be the relevant calls. 16252 socketpair(PF_FILE, SOCK_STREAM, 0, [3, 4]) = 0 16252 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x20002f0ecd0) = 16259 16259 execve(./a.out, [./a.out, -test.run=^TestPassFD$, --, /tmp/TestPassFD684357043], [/* 33 vars */] unfinished ... 16252 clone( unfinished ... 16252 ... clone resumed child_stack=0x200031e2b70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x200031e3350, tls=0x200031e3970, child_tidptr=0x200031e3350) = 16260 * This is the bad call, but where does this come from? 16252 close(4 unfinished ... 16252 ... close resumed ) = 0 16260 wait4(16259, unfinished ... 16261 read(9, unfinished ... 16259 sendmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{x, 1}], msg_controllen=24, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {9}}, msg_flags=0}, 0) = 1 16259 exit_group(0) = ? 16261 ... read resumed , 512) = 0 16260 ... wait4 resumed [{WIFEXITED(s) WEXITSTATUS(s) == 0}], 0, {ru_utime={0, 46875}, ru_stime={0, 26367}, ...}) = 16259 16261 dup(4)= -1 EBADF (Bad file descriptor) Here is an approachj that might help. Set a breakpoint on socketpair. The failure comes after the second call to socketpair, the one from the function TestPassFD. After that breakpoint is reached, set a breakpoint on close. Look for the call to close(4). Get a backtrace from there so we can see what is calling it. Continuing. [New Thread 0x2000307b280 (LWP 8059)] Breakpoint 18, 0x020002e378c0 in socketpair () from /lib/libc.so.6.1 Continuing. [New Thread 0x20003083280 (LWP 8065)] [Switching to Thread 0x20003083280 (LWP 8065)] [...] The first call with relevant FD is from: Breakpoint 21, 0x020002e243f8 in close () from /lib/libc.so.6.1 (gdb) i r a0 a0 0x8 8 (gdb) bt #0 0x020002e243f8 in close () from /lib/libc.so.6.1 #1 0x00012003559c in syscall.Close (fd=optimized out) at libcalls.go:271 #2 0x025d3cfc in os.close.pN7_os.file (file=0xf840414b70) at ../../../gcc-svn/trunk/libgo/go/os/file_unix.go:106 #3 0x02888f18 in ffi_call_osf () at ../../../gcc-svn/trunk/libffi/src/alpha/osf.S:79 #4 0x028889c4 in ffi_call (cif=optimized out, fn=optimized out, rvalue=optimized out, avalue=0xf840c87fe8) at ../../../gcc-svn/trunk/libffi/src/alpha/ffi.c:169 #5 0x02558204 in reflect.call (func_type=0x29e9650 __go_td_FppN7_os.fileerN5_erroree, func_addr=0x25d3c60 os.close.pN7_os.file, is_interface=optimized out, is_method=optimized out, params=0xf840c87fe0, results=0x0) at ../../../gcc-svn/trunk/libgo/runtime/go-reflect-call.c:498 #6 0x025620b8 in runfinq (dummy=optimized out) at ../../../gcc-svn/trunk/libgo/runtime/mgc0.c:1168 #7 0x02566b20 in kickoff () at ../../../gcc-svn/trunk/libgo/runtime/proc.c:338 #8 0x020002d8d024 in ?? () from /lib/libc.so.6.1 [... dup call follows from the same thread ...] Breakpoint 22, 0x020002e24e90 in dup () from /lib/libc.so.6.1 (gdb) i r a0 a0 0x8 8 (gdb) bt #0 0x020002e24e90 in dup () from /lib/libc.so.6.1 #1 0x00012003578c in syscall.Dup (oldfd=8) at libcalls.go:314 #2 0x025b2794 in net.newFileFD (f=0xf844a0) at ../../../gcc-svn/trunk/libgo/go/net/file_unix.go:16 #3 0x025c0dcc in net.FileConn (f=0x8) at ../../../gcc-svn/trunk/libgo/go/net/file_unix.go:78 #4
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 9:34 AM, Uros Bizjak ubiz...@gmail.com wrote: Continuing. [New Thread 0x2000307b280 (LWP 8059)] Breakpoint 18, 0x020002e378c0 in socketpair () from /lib/libc.so.6.1 Continuing. [New Thread 0x20003083280 (LWP 8065)] [Switching to Thread 0x20003083280 (LWP 8065)] [...] The first call with relevant FD is from: Breakpoint 21, 0x020002e243f8 in close () from /lib/libc.so.6.1 (gdb) i r a0 a0 0x8 8 Does this mean that this is a call to close file descriptor 8? According to the strace log, the file descriptor we care about is 4. Although it is also true that I don't see a close of file descriptor 8 at all in the strace log. Or is the change from 4 to 8 due somehow to running the program under gdb? (gdb) bt #0 0x020002e243f8 in close () from /lib/libc.so.6.1 #1 0x00012003559c in syscall.Close (fd=optimized out) at libcalls.go:271 #2 0x025d3cfc in os.close.pN7_os.file (file=0xf840414b70) at ../../../gcc-svn/trunk/libgo/go/os/file_unix.go:106 #3 0x02888f18 in ffi_call_osf () at ../../../gcc-svn/trunk/libffi/src/alpha/osf.S:79 #4 0x028889c4 in ffi_call (cif=optimized out, fn=optimized out, rvalue=optimized out, avalue=0xf840c87fe8) at ../../../gcc-svn/trunk/libffi/src/alpha/ffi.c:169 #5 0x02558204 in reflect.call (func_type=0x29e9650 __go_td_FppN7_os.fileerN5_erroree, func_addr=0x25d3c60 os.close.pN7_os.file, is_interface=optimized out, is_method=optimized out, params=0xf840c87fe0, results=0x0) at ../../../gcc-svn/trunk/libgo/runtime/go-reflect-call.c:498 #6 0x025620b8 in runfinq (dummy=optimized out) at ../../../gcc-svn/trunk/libgo/runtime/mgc0.c:1168 #7 0x02566b20 in kickoff () at ../../../gcc-svn/trunk/libgo/runtime/proc.c:338 #8 0x020002d8d024 in ?? () from /lib/libc.so.6.1 If this is indeed the file descriptor we care about, then this is interesting, because it is being closed by a finalizer run by the garbage collector. That implies that the garbage collector collected the local variable readFile in TestPassFD in passfd_test.go, which would be clearly wrong. Unfortunately this could be rather difficult to debug. Ian
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 7:46 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 9:34 AM, Uros Bizjak ubiz...@gmail.com wrote: Continuing. [New Thread 0x2000307b280 (LWP 8059)] Breakpoint 18, 0x020002e378c0 in socketpair () from /lib/libc.so.6.1 Continuing. [New Thread 0x20003083280 (LWP 8065)] [Switching to Thread 0x20003083280 (LWP 8065)] [...] The first call with relevant FD is from: Breakpoint 21, 0x020002e243f8 in close () from /lib/libc.so.6.1 (gdb) i r a0 a0 0x8 8 Does this mean that this is a call to close file descriptor 8? According to the strace log, the file descriptor we care about is 4. Although it is also true that I don't see a close of file descriptor 8 at all in the strace log. Or is the change from 4 to 8 due somehow to running the program under gdb? Yes, I am running under gdb and all FDs are offset by +4 for some reason. So, FD 8 corresponds to FD4 in the strace log. (gdb) bt #0 0x020002e243f8 in close () from /lib/libc.so.6.1 #1 0x00012003559c in syscall.Close (fd=optimized out) at libcalls.go:271 #2 0x025d3cfc in os.close.pN7_os.file (file=0xf840414b70) at ../../../gcc-svn/trunk/libgo/go/os/file_unix.go:106 #3 0x02888f18 in ffi_call_osf () at ../../../gcc-svn/trunk/libffi/src/alpha/osf.S:79 #4 0x028889c4 in ffi_call (cif=optimized out, fn=optimized out, rvalue=optimized out, avalue=0xf840c87fe8) at ../../../gcc-svn/trunk/libffi/src/alpha/ffi.c:169 #5 0x02558204 in reflect.call (func_type=0x29e9650 __go_td_FppN7_os.fileerN5_erroree, func_addr=0x25d3c60 os.close.pN7_os.file, is_interface=optimized out, is_method=optimized out, params=0xf840c87fe0, results=0x0) at ../../../gcc-svn/trunk/libgo/runtime/go-reflect-call.c:498 #6 0x025620b8 in runfinq (dummy=optimized out) at ../../../gcc-svn/trunk/libgo/runtime/mgc0.c:1168 #7 0x02566b20 in kickoff () at ../../../gcc-svn/trunk/libgo/runtime/proc.c:338 #8 0x020002d8d024 in ?? () from /lib/libc.so.6.1 If this is indeed the file descriptor we care about, then this is interesting, because it is being closed by a finalizer run by the garbage collector. That implies that the garbage collector collected the local variable readFile in TestPassFD in passfd_test.go, which would be clearly wrong. Unfortunately this could be rather difficult to debug. Yes, this is correct descriptor. For added fun, a descriptor that corresponds to writeFile closes through the same mechanism. Uros.
Re: libgo patch committed: Update to current Go library
On Wed, Oct 24, 2012 at 10:52 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Oct 24, 2012 at 7:46 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Oct 24, 2012 at 9:34 AM, Uros Bizjak ubiz...@gmail.com wrote: Continuing. [New Thread 0x2000307b280 (LWP 8059)] Breakpoint 18, 0x020002e378c0 in socketpair () from /lib/libc.so.6.1 Continuing. [New Thread 0x20003083280 (LWP 8065)] [Switching to Thread 0x20003083280 (LWP 8065)] [...] The first call with relevant FD is from: Breakpoint 21, 0x020002e243f8 in close () from /lib/libc.so.6.1 (gdb) i r a0 a0 0x8 8 Does this mean that this is a call to close file descriptor 8? According to the strace log, the file descriptor we care about is 4. Although it is also true that I don't see a close of file descriptor 8 at all in the strace log. Or is the change from 4 to 8 due somehow to running the program under gdb? Yes, I am running under gdb and all FDs are offset by +4 for some reason. So, FD 8 corresponds to FD4 in the strace log. (gdb) bt #0 0x020002e243f8 in close () from /lib/libc.so.6.1 #1 0x00012003559c in syscall.Close (fd=optimized out) at libcalls.go:271 #2 0x025d3cfc in os.close.pN7_os.file (file=0xf840414b70) at ../../../gcc-svn/trunk/libgo/go/os/file_unix.go:106 #3 0x02888f18 in ffi_call_osf () at ../../../gcc-svn/trunk/libffi/src/alpha/osf.S:79 #4 0x028889c4 in ffi_call (cif=optimized out, fn=optimized out, rvalue=optimized out, avalue=0xf840c87fe8) at ../../../gcc-svn/trunk/libffi/src/alpha/ffi.c:169 #5 0x02558204 in reflect.call (func_type=0x29e9650 __go_td_FppN7_os.fileerN5_erroree, func_addr=0x25d3c60 os.close.pN7_os.file, is_interface=optimized out, is_method=optimized out, params=0xf840c87fe0, results=0x0) at ../../../gcc-svn/trunk/libgo/runtime/go-reflect-call.c:498 #6 0x025620b8 in runfinq (dummy=optimized out) at ../../../gcc-svn/trunk/libgo/runtime/mgc0.c:1168 #7 0x02566b20 in kickoff () at ../../../gcc-svn/trunk/libgo/runtime/proc.c:338 #8 0x020002d8d024 in ?? () from /lib/libc.so.6.1 If this is indeed the file descriptor we care about, then this is interesting, because it is being closed by a finalizer run by the garbage collector. That implies that the garbage collector collected the local variable readFile in TestPassFD in passfd_test.go, which would be clearly wrong. Unfortunately this could be rather difficult to debug. Yes, this is correct descriptor. For added fun, a descriptor that corresponds to writeFile closes through the same mechanism. OK, so it's a garbage collector problem. Can you e-mail me the test binary offlist? I will try to figure out where readFile lives. The fact that I'm not seeing any GC problems on x86 or x86_64 suggests that this is something Alpha-specific. Or, it could be something specific to the non-split-stack code. What's weird is that it's unlikely to be a major error, since most of your tests pass. Ian
Re: libgo patch committed: Update to current Go library
Hello! I have committed a patch to update the mainline version of libgo to the current master Go library sources. At this point I will only be updating the gcc 4.7 branch for bug fixes. This is a substantial patch that brings in several months of work. As usual I am not posting the complete patch here, as it is mostly simply copies of changes to the upstream repository. I have attached the changes to gccgo-specific files and files with lots of gccgo-specific changes. There is a decent change that this will break something on non-x86 systems. I will do what testing I am able to do after the commit. On my x86_64-linux-gnu (Fedora 18) libgo testsuite fails following test: --- FAIL: TestCgoCrashHandler (0.01 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH --- FAIL: TestCrashHandler (0.00 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH FAIL FAIL: runtime Probably some trivial test issue. Additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall I didn't yet debug this one, will do soon. Otherwise all other libgo tests pass on these two systems. Uros.
Re: libgo patch committed: Update to current Go library
On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: On my x86_64-linux-gnu (Fedora 18) libgo testsuite fails following test: --- FAIL: TestCgoCrashHandler (0.01 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH --- FAIL: TestCrashHandler (0.00 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH FAIL FAIL: runtime Thanks. Turns out this test is currently meaningless with gccgo, and was only working for me because I have the other Go compiler on my PATH as well. I committed this patch to mainline to disable it for gccgo. Ian foo.patch Description: Binary data
Re: libgo patch committed: Update to current Go library
On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: Additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall As far as I can see this error message occurs when calling dup on the second file descriptor returned by socketpair. But I can't see why that would fail on Alpha. Ian