Re: git status core dump with bad sector!
Hi, sorry for the delay...

On 22/04/16 01:11 AM, Jeff King wrote:
> On Thu, Apr 14, 2016 at 10:59:57AM -0400, Eric Chamberland wrote:
>
> > just cloned a repo and it checked-out without any error (with git 2.2.0) but got some corrupted files (because I got some sdd failures).
> >
> > Then, I get a git core dump when trying to "git status" into the repo which has a "bad sector" on sdd drive (crypted partition).
> >
> > I tried with git 2.2.0 AND git version 2.8.1.185.gdc0db2c.dirty (just modified the Makefile to remove STRIP part)
> >
> > In both cases, I have a Bus error (core dumped)
>
> Interesting. There was a known issue with reading corrupted pack .idx files, but it was fixed in v2.8.0. So this could be a new thing.
>
> SIGBUS is somewhat rare, though (usually just accessing unmapped memory should get us a SIGSEGV). What platform are you on? I seem to recall that hardware like ARM that cares about memory alignment is more likely to get a SIGBUS.

Linux ... 3.7.10-1.45-desktop #1 SMP PREEMPT Tue Dec 16 20:27:58 UTC 2014 (4c885a1) x86_64 x86_64 x86_64 GNU/Linux

df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/cr_ata-ST31000524AS_6VPCXHSW-part1 961430856 699476812 213116108 77% /pmi

model name : Intel(R) Xeon(R) CPU X5690 @ 3.47GHz

> > Program received signal SIGBUS, Bus error.
> > 0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
> > (gdb) bt
> > #0 0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
> > #1 0x3334d90d8c20f3f0 in ?? ()
> > #2 0xe59b5a6cd844a601 in ?? ()
> > #3 0xc587a53f67985ae7 in ?? ()
> > #4 0x3ce81893e5541777 in ?? ()
> > #5 0xdeb18349a4b042ea in ?? ()
> > #6 0x8254de489067ec4b in ?? ()
> > #7 0x6fbef2439704c81b in ?? ()
> > #8 0xe0eee2bb385a96da in ?? ()
> > #9 0x76e19ab3 in ?? ()
> > #10 0x7fffc4d0 in ?? ()
> > #11 0x001d in ?? ()
> > #12 0x77863f80 in SHA1_Update () from /lib64/libcrypto.so.1.0.0
> > #13 0x005102c0 in write_sha1_file_prepare (buf=buf@entry=0x76c81000, len=1673936, type=, sha1=sha1@entry=0x7fffc750 "\340_~", hdr=hdr@entry=0x7fffc570 "blob 1673936",
>
> So I'd assume here that the problem is in accessing the memory in "buf" to actually compute the sha1. That is mmap'd data, but the process is fairly bland (mmap however many bytes stat() tells us the file has, and then compute the sha1).
>
> You mentioned a bad sector; could it be that the filesystem is corrupted, and the OS is giving us SIGBUS when trying to read unavailable bytes from an mmap'd file?

Yes it could be that...

> That would explain the SIGBUS versus SIGSEGV. What happens if you "cat" the file in question:

hmmm, it shows the beginning of the file, then ends with:

cat: Avion.Quadratique.cont.vtu.etalon: Input/output error

also, this appears in /var/log/messages:

2016-05-04T16:33:19.243595-04:00 melkor kernel: [1096660.854161] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
2016-05-04T16:33:19.243609-04:00 melkor kernel: [1096660.854165] ata4.00: irq_stat 0x4008
2016-05-04T16:33:19.243610-04:00 melkor kernel: [1096660.854168] ata4.00: failed command: READ FPDMA QUEUED
2016-05-04T16:33:19.243611-04:00 melkor kernel: [1096660.854175] ata4.00: cmd 60/08:00:70:30:c6/00:00:53:00:00/40 tag 0 ncq 4096 in
2016-05-04T16:33:19.243612-04:00 melkor kernel: [1096660.854175] res 41/40:08:71:30:c6/00:00:53:00:00/00 Emask 0x409 (media error)
2016-05-04T16:33:19.243613-04:00 melkor kernel: [1096660.854178] ata4.00: status: { DRDY ERR }
2016-05-04T16:33:19.243614-04:00 melkor kernel: [1096660.854180] ata4.00: error: { UNC }
2016-05-04T16:33:19.340479-04:00 melkor kernel: [1096660.950794] ata4.00: configured for UDMA/133
2016-05-04T16:33:19.340484-04:00 melkor kernel: [1096660.950806] sd 3:0:0:0: [sdb] Unhandled sense code
2016-05-04T16:33:19.340485-04:00 melkor kernel: [1096660.950809] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340485-04:00 melkor kernel: [1096660.950811] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950814] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950815] Sense Key : Medium Error [current] [descriptor]
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950819] Descriptor sense data with sense descriptors (in hex):
2016-05-04T16:33:19.340487-04:00 melkor kernel: [1096660.950820] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
2016-05-04T16:33:19.340487-04:00 melkor kernel: [1096660.950829] 53 c6 30 71
2016-05-04T16:33:19.340488-04:00 melkor kernel: [1096660.950834] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340488-04:00 melkor kernel: [1096660.950836] Add. Sense: Unrecovered read error - auto reallocate failed
2016-05-04T16:33:19.340489-04:00 melkor kernel: [1096660.950839] sd 3:0:0:0: [sdb] CDB:
2016-05-04T16:33:19.340489-04:00 melkor kernel: [1096660.950840] Read(10): 28 00 53 c6 30 70 00 00 08 00
2016-05-04T16:33:19.340489-04:00 melkor kernel: [1096660.950848] end_request: I/O error, dev sdb, sector
Re: git status core dump with bad sector!
On Thu, Apr 14, 2016 at 10:59:57AM -0400, Eric Chamberland wrote:

> just cloned a repo and it checked-out without any error (with git 2.2.0) but
> got some corrupted files (because I got some sdd failures).
>
> Then, I get a git core dump when trying to "git status" into the repo which
> has a "bad sector" on sdd drive (crypted partition).
>
> I tried with git 2.2.0 AND git version 2.8.1.185.gdc0db2c.dirty (just
> modified the Makefile to remove STRIP part)
>
> In both cases, I have a Bus error (core dumped)

Interesting. There was a known issue with reading corrupted pack .idx files, but it was fixed in v2.8.0. So this could be a new thing.

SIGBUS is somewhat rare, though (usually just accessing unmapped memory should get us a SIGSEGV). What platform are you on? I seem to recall that hardware like ARM that cares about memory alignment is more likely to get a SIGBUS.

> Program received signal SIGBUS, Bus error.
> 0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
> (gdb) bt
> #0 0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
> #1 0x3334d90d8c20f3f0 in ?? ()
> #2 0xe59b5a6cd844a601 in ?? ()
> #3 0xc587a53f67985ae7 in ?? ()
> #4 0x3ce81893e5541777 in ?? ()
> #5 0xdeb18349a4b042ea in ?? ()
> #6 0x8254de489067ec4b in ?? ()
> #7 0x6fbef2439704c81b in ?? ()
> #8 0xe0eee2bb385a96da in ?? ()
> #9 0x76e19ab3 in ?? ()
> #10 0x7fffc4d0 in ?? ()
> #11 0x001d in ?? ()
> #12 0x77863f80 in SHA1_Update () from /lib64/libcrypto.so.1.0.0
> #13 0x005102c0 in write_sha1_file_prepare
> (buf=buf@entry=0x76c81000, len=1673936, type=,
> sha1=sha1@entry=0x7fffc750 "\340_~", hdr=hdr@entry=0x7fffc570 "blob
> 1673936",

So I'd assume here that the problem is in accessing the memory in "buf" to actually compute the sha1. That is mmap'd data, but the process is fairly bland (mmap however many bytes stat() tells us the file has, and then compute the sha1).

You mentioned a bad sector; could it be that the filesystem is corrupted, and the OS is giving us SIGBUS when trying to read unavailable bytes from an mmap'd file?

That would explain the SIGBUS versus SIGSEGV. What happens if you "cat" the file in question:

> #15 0x005159f8 in index_mem (sha1=sha1@entry=0x7fffc750
> "\340_~", buf=buf@entry=0x76c81000, size=1673936,
> type=type@entry=OBJ_BLOB,
> path=path@entry=0x80a818
> "Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon",
> flags=flags@entry=0) at sha1_file.c:3305

Can it show all of the bytes? I guess from the "size" field it's too big to manually verify, but "cat >/dev/null" should be enough to see if we can read the whole thing.

> I would have expected git to first give me an error when checking out the
> files!!! Here is the log:
>
> Checking out files: 99% (28645/28934)
> Checking out files: 100% (28934/28934)
> Checking out files: 100% (28934/28934), done.
> Already on 'master'
> Your branch is up-to-date with 'origin/master'.
> On valide le dépôt TestValidation avec la référence:
> 9b4a485202b2b52922377842c15bfd605d240667
> HEAD is now at 9b4a485 BUG: On spécifie bash comme shell...
>
> But at least 1 file is corrupted!
>
> I keep preciously this faulty repo to further investigation with someone who
> can help dig into the coredump and correct it...

So _if_ my guess is right that you have filesystem corruption, git may not even know about it. It wrote the file, and the OS said "OK, success", not knowing it had been partially corrupted.

And if that guess is right, it also means there's no git bug to fix. SIGBUS is the natural way for the OS to tell the process that mmap'd data isn't available.

-Peff
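The "cat >/dev/null" check suggested above can also be sketched in Python. This is only an illustration, not git's code: the helper name git_blob_sha1 and its chunk size are made up for this example. It hashes the file the way git hashes a blob (the "blob <size>" header visible in frame #13, followed by a NUL byte), but reads with read() instead of mmap, so a bad sector would be expected to surface as an I/O error (EIO) rather than a SIGBUS.

```python
import errno
import hashlib
import os


def git_blob_sha1(path, chunk_size=1 << 20):
    """Hypothetical helper: hash a file as a git blob using read(), not mmap.

    Returns (hex_digest, None) on success, or (None, offset) if the kernel
    reports an I/O error, where offset is the number of bytes read so far.
    """
    size = os.stat(path).st_size
    h = hashlib.sha1()
    # Same header as the "blob 1673936" string in the backtrace, plus a NUL.
    h.update(b"blob %d\0" % size)
    done = 0
    with open(path, "rb") as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError as e:
                if e.errno == errno.EIO:
                    # Unreadable region: report how far we got.
                    return None, done
                raise
            if not chunk:
                break
            h.update(chunk)
            done += len(chunk)
    return h.hexdigest(), None
```

On a healthy file the digest should match what "git hash-object <file>" prints. On the damaged file one would expect (None, offset), with the offset pointing near the unreadable region.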
git status core dump with bad sector!
Hi,

just cloned a repo and it checked-out without any error (with git 2.2.0) but got some corrupted files (because I got some sdd failures).

Then, I get a git core dump when trying to "git status" into the repo which has a "bad sector" on sdd drive (crypted partition).

I tried with git 2.2.0 AND git version 2.8.1.185.gdc0db2c.dirty (just modified the Makefile to remove STRIP part)

In both cases, I have a Bus error (core dumped)

Tried to make it more verbose:

GIT_TRACE=2 GIT_CURL_VERBOSE=2 GIT_TRACE_PERFORMANCE=2 GIT_TRACE_PACK_ACCESS=2 GIT_TRACE_PACKET=2 GIT_TRACE_PACKFILE=2 GIT_TRACE_SETUP=2 GIT_TRACE_SHALLOW=2 /opt/gitgit/bin/git status
10:54:30.644999 trace.c:318 setup: git_dir: .git
10:54:30.645094 trace.c:319 setup: git_common_dir: .git
10:54:30.645102 trace.c:320 setup: worktree: /pmi/cmpbib/compilation_BIB_gcc-4.5.1_64bit/TestValidation_avec_erreur_disque_git_core_dump_dans_dev_Test.ExportationVTK_Avion
10:54:30.645112 trace.c:321 setup: cwd: /pmi/cmpbib/compilation_BIB_gcc-4.5.1_64bit/TestValidation_avec_erreur_disque_git_core_dump_dans_dev_Test.ExportationVTK_Avion
10:54:30.645151 trace.c:322 setup: prefix: Ressources/dev/Test.ExportationVTK/
10:54:30.645181 git.c:350 trace: built-in: git 'status'
Bus error (core dumped)

started in gdb:

Program received signal SIGBUS, Bus error.
0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
(gdb) bt
#0 0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
#1 0x3334d90d8c20f3f0 in ?? ()
#2 0xe59b5a6cd844a601 in ?? ()
#3 0xc587a53f67985ae7 in ?? ()
#4 0x3ce81893e5541777 in ?? ()
#5 0xdeb18349a4b042ea in ?? ()
#6 0x8254de489067ec4b in ?? ()
#7 0x6fbef2439704c81b in ?? ()
#8 0xe0eee2bb385a96da in ?? ()
#9 0x76e19ab3 in ?? ()
#10 0x7fffc4d0 in ?? ()
#11 0x001d in ?? ()
#12 0x77863f80 in SHA1_Update () from /lib64/libcrypto.so.1.0.0
#13 0x005102c0 in write_sha1_file_prepare (buf=buf@entry=0x76c81000, len=1673936, type=, sha1=sha1@entry=0x7fffc750 "\340_~", hdr=hdr@entry=0x7fffc570 "blob 1673936", hdrlen=hdrlen@entry=0x7fffc56c) at sha1_file.c:2951
#14 0x0051567b in hash_sha1_file (buf=buf@entry=0x76c81000, len=, type=, sha1=sha1@entry=0x7fffc750 "\340_~") at sha1_file.c:3010
#15 0x005159f8 in index_mem (sha1=sha1@entry=0x7fffc750 "\340_~", buf=buf@entry=0x76c81000, size=1673936, type=type@entry=OBJ_BLOB, path=path@entry=0x80a818 "Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", flags=flags@entry=0) at sha1_file.c:3305
#16 0x005160ee in index_core (flags=0, path=0x80a818 "Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", type=OBJ_BLOB, size=, fd=7, sha1=0x7fffc750 "\340_~") at sha1_file.c:3367
#17 index_fd (sha1=sha1@entry=0x7fffc750 "\340_~", fd=7, st=st@entry=0x7fffc7c0, type=type@entry=OBJ_BLOB, path=path@entry=0x80a818 "Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", flags=flags@entry=0) at sha1_file.c:3410
#18 0x004eac66 in ce_compare_data (st=0x7fffc7c0, ce=0x80a7c0) at read-cache.c:166
#19 ce_modified_check_fs (ce=0x80a7c0, st=0x7fffc7c0) at read-cache.c:215
#20 0x004ebb6d in ie_modified (istate=istate@entry=0x7e5fe0 , ce=ce@entry=0x80a7c0, st=st@entry=0x7fffc7c0, options=options@entry=16) at read-cache.c:395
#21 0x004ebcfe in refresh_cache_ent (istate=istate@entry=0x7e5fe0 , ce=ce@entry=0x80a7c0, options=options@entry=16, err=err@entry=0x7fffc908, changed_ret=changed_ret@entry=0x7fffc90c) at read-cache.c:1130
#22 0x004ed59c in refresh_index (istate=0x7e5fe0 , flags=flags@entry=6, pathspec=pathspec@entry=0x7bb738, seen=seen@entry=0x0, header_msg=header_msg@entry=0x0) at read-cache.c:1221
#23 0x00429e3b in cmd_status (argc=, argv=0x7fffcca0, prefix=0x7e950f "Ressources/dev/Test.ExportationVTK/") at builtin/commit.c:1376
#24 0x004063b3 in run_builtin (argv=0x7fffcca0, argc=1, p=0x7b4030 ) at git.c:352
#25 handle_builtin (argc=1, argv=0x7fffcca0) at git.c:539
#26 0x004054a1 in run_argv (argv=0x7fffca80, argcp=0x7fffca6c) at git.c:593
#27 main (argc=1, av=) at git.c:698

I would have expected git to first give me an error when checking out the files!!! Here is the log:

Checking out files: 99% (28645/28934)
Checking out files: 100% (28934/28934)
Checking out files: 100% (28934/28934), done.
Already on 'master'
Your branch is up-to-date with 'origin/master'.
On valide le dépôt TestValidation avec la référence:
9b4a485202b2b52922377842c15bfd605d240667
HEAD is now at 9b4a485 BUG: On spécifie bash comme shell...

But at least 1 file is corrupted!

I keep preciously this faulty repo to further investigation with someone who can help dig